Pandas: concatenate/merge two DataFrames, giving priority to one of them

Question


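How can I concatenate or merge two pandas DataFrames and, when the values in a particular column match, keep only the row from the DataFrame that has priority? Is there a type of join that describes this?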

import pandas as pd
Cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4'],
        'Price': [22000,25000,27000,35000]}
Cars2 = {'Brand': ['Honda CRV','Toyota Celica','Ford Explorer','Audi A8'],
        'Price': [40000,25000,37000,100000]}

df_priority = pd.DataFrame(Cars, columns= ['Brand', 'Price'])
df2 = pd.DataFrame(Cars2, columns= ['Brand', 'Price'])

# df_merge_with_priority = Merge dataframes and keep rows from df_priority if price matches

Expected output for df_merge_with_priority:

Brand: Honda CRV, Honda Civic, Toyota Corolla, Ford Explorer, Ford Focus, Audi A4, Audi A8

Price: 40000, 22000, 25000, 37000, 27000, 35000, 100000

Note that the Toyota Corolla and the Toyota Celica have the same price, but in this case we only want to keep the Corolla. Any ideas on how to set the priority?

- Cody Glickman

You can use pandas' concat function combined with a query condition. See this link: https://dev59.com/oZvga4cB1Zd3GeqP5KnN I hope it helps, especially the query part. - 4.Pi.n


If that is your requirement, you can try: pd.concat([df_priority, df2]).groupby(['Price']).aggregate('max'). - vb_rises


Are you looking for pd.concat((df_priority,df2)).sort_index().drop_duplicates('Price')? - anky

Hey @vb_rises, this is great! Thank you. - Cody Glickman

Thanks @Prof.Mo, but I'm not sure how to interpret the question you linked; to me it looks like a conditional join rather than a join by priority. - Cody Glickman

Hey @anky_91, this works well too, thank you. - Cody Glickman
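
The two one-liners suggested in the comments behave differently when prices tie, which the question's data does not make obvious. As a rough sketch: the groupby/max version keeps the alphabetically largest Brand per Price, while drop_duplicates keeps the first row it sees, which is the df_priority row because df_priority is concatenated first. The second brand and its price below are made up purely for illustration and are not from the question.

```python
import pandas as pd

# Illustrative data only: both rows share the same Price.
df_priority = pd.DataFrame({"Brand": ["Toyota Corolla"], "Price": [25000]})
df2 = pd.DataFrame({"Brand": ["Toyota Yaris"], "Price": [25000]})  # made-up price

combined = pd.concat([df_priority, df2])

# groupby + max compares the Brand strings, so priority plays no role here.
by_max = combined.groupby("Price")["Brand"].max()

# drop_duplicates keeps the first occurrence, i.e. the df_priority row.
by_first = combined.drop_duplicates("Price")

print(by_max.loc[25000])          # Toyota Yaris
print(by_first["Brand"].iloc[0])  # Toyota Corolla
```

With the question's data both approaches happen to agree, since "Toyota Corolla" also sorts after "Toyota Celica".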

1 Answer


- anky · Accepted Answer

If you want the rows of df_priority to take precedence and only the non-matching rows of df2 added, you can try:

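# df_priority comes first in the concat, so drop_duplicates keeps its row on a Price tie;
# reset_index gives the clean 0..6 index shown in the output below
pd.concat((df_priority,df2)).sort_index().drop_duplicates('Price').reset_index(drop=True)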

            Brand   Price
0     Honda Civic   22000
1       Honda CRV   40000
2  Toyota Corolla   25000
3      Ford Focus   27000
4   Ford Explorer   37000
5         Audi A4   35000
6         Audi A8  100000
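
On the "is there a join type for this" part of the question: one way to read it is as an anti-join on Price (keep only the df2 rows whose Price is absent from df_priority) followed by a concat. The sketch below is one possible variant under that reading, not the accepted answer's exact method. It also passes kind="mergesort" to sort_index, on the assumption that you want the df_priority row to reliably come first among rows sharing an index label; the default sort kind is not documented as stable.

```python
import pandas as pd

cars = {"Brand": ["Honda Civic", "Toyota Corolla", "Ford Focus", "Audi A4"],
        "Price": [22000, 25000, 27000, 35000]}
cars2 = {"Brand": ["Honda CRV", "Toyota Celica", "Ford Explorer", "Audi A8"],
         "Price": [40000, 25000, 37000, 100000]}

df_priority = pd.DataFrame(cars)
df2 = pd.DataFrame(cars2)

# Anti-join on Price: drop df2 rows whose Price already exists in df_priority.
df2_new = df2[~df2["Price"].isin(df_priority["Price"])]

result = (
    pd.concat([df_priority, df2_new])
    .sort_index(kind="mergesort")  # stable: df_priority rows stay ahead on equal labels
    .reset_index(drop=True)
)
print(result)
```

Unlike drop_duplicates('Price'), this variant leaves duplicate prices inside df_priority itself untouched; whether that matters depends on your data.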