Pandas groupby and find the difference between max and min

Question

5

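I have a dataframe that has already been aggregated, as shown below. However, I want to take the difference of the two resulting columns, i.e. max minus min.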

dnm=df.groupby('Type').agg({'Vehicle_Age': ['max','min']})

Expected:

One value per Type: the maximum Vehicle_Age minus the minimum Vehicle_Age.

- user14815110

3 Answers

4

Just subtract the two:

grouping = df.groupby('Type')
dnm = grouping.max() - grouping.min()
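
As a minimal, self-contained sketch of this subtraction (not part of the original answer), the version below selects the single column from the question before aggregating. The sample rows are invented for illustration; only the `Type` and `Vehicle_Age` column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Type": ["Small", "Small", "Minivan", "Minivan", "Mid-size"],
    "Vehicle_Age": [13.0, 2.0, 14.0, 9.0, 15.0],
})

# Select the column first so that only Vehicle_Age is aggregated.
grouped = df.groupby("Type")["Vehicle_Age"]

# Both results are Series indexed by Type, so they align on subtraction.
age_range = grouped.max() - grouped.min()
print(age_range)
```

Selecting the column up front likely matters when the frame also holds non-numeric columns, since subtracting those would fail.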

@cs95's answer is the right approach, and its timing is better too:

Setup:

df = pd.DataFrame({'a':np.arange(100),'Type':[1 if i %2 ==0 else 0 for i in range(100)]})

@cs95's approach:

%timeit df.groupby('Type').agg({'a': np.ptp}) 

1.29 ms ± 39.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

vs

%%timeit  
grouping = df.groupby('Type') 
dnm = grouping.max() - grouping.min() 

1.57 ms ± 299 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
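
The timings above come from a 100-row frame inside IPython. A rough sketch for repeating the comparison as a plain script on a larger frame could look like this. The row count, the repeat count, and the helper names `with_ptp` and `with_max_min` are assumptions made for illustration, and `np.ptp` is wrapped in a lambda over a NumPy array here, which is a small change from the call timed above.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed benchmark size; the timings quoted above used only 100 rows.
n_rows = 100_000
df = pd.DataFrame({"a": np.arange(n_rows), "Type": np.arange(n_rows) % 2})


def with_ptp():
    # Apply np.ptp to the raw values of each group.
    return df.groupby("Type")["a"].agg(lambda s: np.ptp(np.asarray(s)))


def with_max_min():
    # Subtract the per-group minimum from the per-group maximum.
    grouping = df.groupby("Type")["a"]
    return grouping.max() - grouping.min()


# Both approaches should agree before their timings are compared.
assert (with_ptp() == with_max_min()).all()

repeats = 20
for func in (with_ptp, with_max_min):
    seconds = timeit.timeit(func, number=repeats) / repeats
    print(f"{func.__name__}: {seconds * 1e3:.2f} ms per call")
```

The numbers will vary by machine and by pandas and NumPy version, so they are best read as a relative comparison only.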

- adir abargil

1

It might also be worth including the setup where you create the large dataframe, so that others can compare timings. - cs95

3

You should perform a basic element-wise operation on the table's columns, which can be done like this:


import pandas as pd

# This is just setup to replicate your example
df = pd.DataFrame([[14, 7], [15, .25], [14, 9], [13, 2], [14, 4]], index=['Large SUV', 'Mid-size', 'Minivan', 'Small', 'Small SUV'], columns = ['max', 'min'])

print(df)

#             max   min
# Large SUV   14  7.00
# Mid-size    15  0.25
# Minivan     14  9.00
# Small       13  2.00
# Small SUV   14  4.00

# This is the operation that will give you the values you want
diff = df['max'] - df['min']

print(diff)

# Large SUV     7.00
# Mid-size     14.75
# Minivan       5.00
# Small        11.00
# Small SUV    10.00
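
The frame built above uses flat `max` and `min` columns. When the frame comes straight from the `agg` call in the question, the columns are a two-level MultiIndex instead, so the same subtraction needs tuple keys. A small sketch, with sample rows invented for illustration:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Type": ["Small", "Small", "Minivan", "Minivan"],
    "Vehicle_Age": [13.0, 2.0, 14.0, 9.0],
})

# Same aggregation as in the question.
dnm = df.groupby("Type").agg({"Vehicle_Age": ["max", "min"]})

# The columns are ('Vehicle_Age', 'max') and ('Vehicle_Age', 'min'),
# so each one is selected with a tuple before subtracting.
diff = dnm[("Vehicle_Age", "max")] - dnm[("Vehicle_Age", "min")]
print(diff)
```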

- Marceli Wac


- cs95 · Accepted Answer

10

You can use np.ptp, which computes max - min for you:

df.groupby('Type').agg({'Vehicle_Age': np.ptp})

Or,

df.groupby('Type')['Vehicle_Age'].agg(np.ptp)

if you need a Series as the output.
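
To make the two output shapes concrete, here is a hedged sketch on invented sample data. The helper `value_range` is hypothetical: it converts each group to a NumPy array before calling `np.ptp`, an assumption made here to sidestep differences in how pandas and NumPy versions handle `np.ptp` on a Series.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Type": ["Small", "Small", "Minivan", "Minivan"],
    "Vehicle_Age": [13.0, 2.0, 14.0, 9.0],
})


def value_range(values):
    # np.ptp ("peak to peak") returns max - min of the array.
    return np.ptp(np.asarray(values))


# Dict form: the result is a DataFrame with a Vehicle_Age column.
as_frame = df.groupby("Type").agg({"Vehicle_Age": value_range})

# Column selected first: the result is a Series.
as_series = df.groupby("Type")["Vehicle_Age"].agg(value_range)

print(as_frame)
print(as_series)
```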

- cs95

2

This is a better solution, and also faster than mine...

%timeit df.groupby('Type').agg({'a': np.ptp}): 1.29 ms ± 39.5 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

vs

%%timeit grouping = df.groupby('Type'); dnm = grouping.max() - grouping.min(): 1.57 ms ± 299 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

- adir abargil

@adirabargil, thanks for timing it. Please add it to your answer and I'll be happy to upvote it. - cs95