Pandas merge indicator: custom values

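What is the fastest way to update the indicator during a pandas merge so that it shows friendlier messages? By default, indicator=True produces left_only, right_only, and both. I would like to change these to "Only present in last month's data", "Only present in current month's data", and "Present in Both month's data". I would like to do this without using a lambda.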

It is unclear what you are asking. Please see the guidance on creating a Minimal, Complete, and Verifiable example: https://stackoverflow.com/help/mcve - fordy
You can use a dictionary to map the default values. For example: d = {'left_only': "Only present in last month's data", ...} - ALollz

1 Answer

Create a working example:

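import numpy as np
import pandas as pd

# Fix the seed so the random values are reproducible
np.random.seed(0)
left = pd.DataFrame({'key': ['A', 'B', 'C', 'D'], 'value': np.random.randn(4)})
right = pd.DataFrame({'key': ['B', 'D', 'E', 'F'], 'value': np.random.randn(4)})

# indicator=True adds a _merge column recording where each row came from
merged = left.merge(right, on='key', how='outer', indicator=True)
print(merged)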

  key   value_x   value_y      _merge
0   A  1.764052       NaN   left_only
1   B  0.400157  1.867558        both
2   C  0.978738       NaN   left_only
3   D  2.240893 -0.977278        both
4   E       NaN  0.950088  right_only
5   F       NaN -0.151357  right_only

To map the values:

d = {
    "left_only": "Only present in last month's data",
    "right_only": "Only present in current month's data",
    "both": "Present in Both month's data",
}

merged['_merge'] = merged['_merge'].map(d)
print(merged)

  key   value_x   value_y                                _merge
0   A  1.764052       NaN     Only present in last month's data
1   B  0.400157  1.867558          Present in Both month's data
2   C  0.978738       NaN     Only present in last month's data
3   D  2.240893 -0.977278          Present in Both month's data
4   E       NaN  0.950088  Only present in current month's data
5   F       NaN -0.151357  Only present in current month's data
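
As a possible alternative: the _merge column produced by indicator=True is a categorical, so the labels can also be renamed on the categories themselves instead of being mapped row by row. The sketch below assumes that Series.cat.rename_categories is given a dict from old labels to new ones. The sample values are made up for illustration and are not the random data from the example above.

```python
import pandas as pd

# Illustrative data (hypothetical values, not the seeded random numbers above)
left = pd.DataFrame({"key": ["A", "B", "C", "D"], "value": [1.0, 2.0, 3.0, 4.0]})
right = pd.DataFrame({"key": ["B", "D", "E", "F"], "value": [5.0, 6.0, 7.0, 8.0]})

merged = left.merge(right, on="key", how="outer", indicator=True)

labels = {
    "left_only": "Only present in last month's data",
    "right_only": "Only present in current month's data",
    "both": "Present in Both month's data",
}

# Rename the three category labels in one step; the column stays categorical
merged["_merge"] = merged["_merge"].cat.rename_categories(labels)
print(merged)
```

This touches only the three category labels rather than every row, which may matter on large frames, and it does not need a lambda. Whether it is actually faster than .map(d) for a given dataset would have to be measured.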
