Pandas: normalizing values by group

Question

python pandas dataframe data-science data-wrangling

5

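I find it hard to put into words what I want to achieve, so please don't judge me for showing a simple example instead. I have a table that looks like this: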

main_col	some_metadata	value
this	True	10
this	False	3
that	True	50
that	False	10
other	True	20
other	False	5

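I want to normalize this data separately for each case in main_col. For example, if we choose min-max normalization and scale to the range [0; 100], I would like the output to look like this: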

main_col	some_metadata	value (normalized)
this	True	100
this	False	30
that	True	100
that	False	20
other	True	100
other	False	25

For each case in main_col, the highest value is scaled to 100 and the other value is scaled proportionally.

- Max Skoryk

2 Answers

1

The normalization formula you are looking for is 100 * (x / x.max()):

df.groupby(['main_col'])['value'].transform(lambda x: 100 * (x / x.max()))

Result:

0    100.0
1     30.0
2    100.0
3     20.0
4    100.0
5     25.0
Name: value, dtype: float64
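For anyone who wants to reproduce this, here is a minimal self-contained sketch. It assumes the table from the question is built by hand, and it stores the result in a new column named value_normalized (a name chosen here for illustration, not from the question) so the original values are preserved:

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "main_col": ["this", "this", "that", "that", "other", "other"],
    "some_metadata": [True, False, True, False, True, False],
    "value": [10, 3, 50, 10, 20, 5],
})

# transform() returns a Series aligned with df's index, so it can be
# assigned straight back as a column. Each value is divided by the
# maximum of its own main_col group and scaled to 100.
normalized = df.groupby("main_col")["value"].transform(
    lambda x: 100 * (x / x.max())
)

# "value_normalized" is a hypothetical column name used for this sketch
df["value_normalized"] = normalized
print(df)
```

Because transform keeps the original row order and index, no merge or sort is needed afterwards.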

- Nuri Taş

- mozway · Accepted Answer

You can use groupby.transform('max') to get the maximum per group, then normalize in place:

df['value'] /= df.groupby('main_col')['value'].transform('max').div(100)

Or:

df['value'] *= df.groupby('main_col')['value'].transform('max').rdiv(100)

Output:

  main_col  some_metadata  value
0     this           True  100.0
1     this          False   30.0
2     that           True  100.0
3     that          False   20.0
4    other           True  100.0
5    other          False   25.0
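One point that may be worth checking: the question mentions min-max normalization, but the expected output (and both answers) scale by the group maximum only. The sketch below compares the two, assuming the sample data from the question. The column names scaled_by_max and min_max are made up for this illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "main_col": ["this", "this", "that", "that", "other", "other"],
    "some_metadata": [True, False, True, False, True, False],
    "value": [10, 3, 50, 10, 20, 5],
})

grouped = df.groupby("main_col")["value"]
group_max = grouped.transform("max")  # per-row maximum of the row's group
group_min = grouped.transform("min")  # per-row minimum of the row's group

# Scaling by the group maximum, as in the answers (hypothetical column name)
df["scaled_by_max"] = df["value"] / group_max * 100

# Strict min-max scaling to [0, 100] (hypothetical column name).
# Note: a group whose values are all equal has max - min == 0, and the
# division then yields NaN, so such groups would need separate handling.
df["min_max"] = (df["value"] - group_min) / (group_max - group_min) * 100

print(df)
```

With only two rows per group, strict min-max maps the smaller value to 0 and the larger to 100, which is not what the question's expected output shows, so scaling by the maximum appears to be what was wanted.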