Suppose we have the following two DataFrames:
df1:
   id  address                                      price
0   1  8563 Parker Ave. Lexington, NC 27292             3
1   2  242 Bellevue Lane Appleton, WI 54911             3
2   3  771 Greenview Rd. Greenfield, IN 46140           5
3   4  93 Hawthorne Street Lakeland, FL 33801           6
4   5  8952 Green Hill Street Gettysburg, PA 17325      3
5   6  7331 S. Sherwood Dr. New Castle, PA 16101        4
df2:
   state  street            quantity
0  PA     S. Sherwood             12
1  IN     Hawthorne Street         3
2  NC     Parker Ave.              7
I want to merge df2 into df1 whenever both the state and the street from df2 are contained in the address column of df1. How can this be done in Pandas? Thanks.
Expected result df:
   id  address                                      ...  street       quantity
0   1  8563 Parker Ave. Lexington, NC 27292         ...  Parker Ave.      7.00
1   2  242 Bellevue Lane Appleton, WI 54911         ...  NaN               NaN
2   3  771 Greenview Rd. Greenfield, IN 46140       ...  NaN               NaN
3   4  93 Hawthorne Street Lakeland, FL 33801       ...  NaN               NaN
4   5  8952 Green Hill Street Gettysburg, PA 17325  ...  NaN               NaN
5   6  7331 S. Sherwood Dr. New Castle, PA 16101    ...  S. Sherwood     12.00
[6 rows x 6 columns]
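One possible way to get this result, offered as a sketch rather than a definitive answer: pair every row of df1 with every row of df2, keep the pairs where the address contains both the street and the state, and left-merge those matches back onto df1. The names `_key`, `pairs`, `matches` and `both_contained` are illustrative only, and the sketch assumes the state appears in the address as a standalone two-letter token.

```python
import re
import pandas as pd

df1 = pd.DataFrame({
    "id": [1, 2, 3, 4, 5, 6],
    "address": [
        "8563 Parker Ave. Lexington, NC 27292",
        "242 Bellevue Lane Appleton, WI 54911",
        "771 Greenview Rd. Greenfield, IN 46140",
        "93 Hawthorne Street Lakeland, FL 33801",
        "8952 Green Hill Street Gettysburg, PA 17325",
        "7331 S. Sherwood Dr. New Castle, PA 16101",
    ],
    "price": [3, 3, 5, 6, 3, 4],
})
df2 = pd.DataFrame({
    "state": ["PA", "IN", "NC"],
    "street": ["S. Sherwood", "Hawthorne Street", "Parker Ave."],
    "quantity": [12, 3, 7],
})

# Cross join through a constant helper key: every df1 row meets every df2 row.
pairs = (
    df1.assign(_key=1)
    .merge(df2.assign(_key=1), on="_key")
    .drop(columns="_key")
)

def both_contained(row):
    """True when the address holds the street text and the state as a whole word."""
    street_ok = row["street"] in row["address"]
    state_ok = re.search(r"\b" + re.escape(row["state"]) + r"\b", row["address"])
    return street_ok and state_ok is not None

matches = pairs[pairs.apply(both_contained, axis=1)]

# Left-merge the matching df2 columns back; unmatched rows get NaN.
df = df1.merge(matches[["id", "state", "street", "quantity"]], on="id", how="left")
print(df)
```

The cross join grows as len(df1) * len(df2), so this is only reasonable for small frames, and an address that matches several df2 rows would appear once per match in the result.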
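My test code:
# Build a combined key from state + street (no separator)
df2['addr'] = df2['state'].astype(str) + df2['street'].astype(str)
# One alternation pattern over all keys, each wrapped in word boundaries
pat = '|'.join(r'\b{}\b'.format(x) for x in df2['addr'])
# Extract the first matching key from each address, then left-merge on it
df1['addr'] = df1['address'].str.extract('(' + pat + ')', expand=False)
df = df1.merge(df2, on='addr', how='left')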
Output:
   id  address                                      ...  street_y  quantity_y
0   1  8563 Parker Ave. Lexington, NC 27292         ...  NaN              nan
1   2  242 Bellevue Lane Appleton, WI 54911         ...  NaN              nan
2   3  771 Greenview Rd. Greenfield, IN 46140       ...  NaN              nan
3   4  93 Hawthorne Street Lakeland, FL 33801       ...  NaN              nan
4   5  8952 Green Hill Street Gettysburg, PA 17325  ...  NaN              nan
5   6  7331 S. Sherwood Dr. New Castle, PA 16101    ...  NaN              nan
[6 rows x 10 columns]
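A likely reason every row comes back as NaN, assuming the data is exactly as shown: the test code concatenates state and street into one key (for example "NCParker Ave."), and that string never occurs contiguously in any address, because the street comes first and the state follows later. The small check below uses only Python's re module on one sample address; the variable names are illustrative.

```python
import re

address = "8563 Parker Ave. Lexington, NC 27292"
key = "NC" + "Parker Ave."  # same concatenation as df2['state'] + df2['street']

# The concatenated key never occurs as one contiguous string in the address.
key_found = re.search(r"\b{}\b".format(key), address) is not None

# Checking the two parts separately does succeed.
street_found = "Parker Ave." in address
state_found = re.search(r"\bNC\b", address) is not None

# A trailing \b also fails right after the period in "Ave.", because a
# period followed by a space is not a word boundary.
street_with_boundary = re.search(r"\bParker Ave\.\b", address) is not None

print(key_found, street_found, state_found, street_with_boundary)
# False True True False
```

This suggests testing state and street as two separate conditions, and avoiding a trailing \b on street names that end in punctuation.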
df2['state'] ? - ah bon
address has no commas to split on; how should we modify your code? - ah bon
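On the case where address has no commas: the following is a minimal sketch of one comma-independent approach, not the code the comment refers to. It looks up, for each address, the first df2 row whose street and state both occur in it, so nothing is split on a delimiter. The comma-free addresses and the helper `lookup` are hypothetical, made up for illustration.

```python
import re
import pandas as pd

# Hypothetical comma-free variants of two of the sample addresses.
df1 = pd.DataFrame({
    "id": [1, 4],
    "address": [
        "8563 Parker Ave. Lexington NC 27292",
        "93 Hawthorne Street Lakeland FL 33801",
    ],
})
df2 = pd.DataFrame({
    "state": ["PA", "IN", "NC"],
    "street": ["S. Sherwood", "Hawthorne Street", "Parker Ave."],
    "quantity": [12, 3, 7],
})

records = df2.to_dict("records")

def lookup(address):
    """Return the first df2 record whose street and state both occur in address."""
    for rec in records:
        state_ok = re.search(r"\b" + re.escape(rec["state"]) + r"\b", address)
        if rec["street"] in address and state_ok is not None:
            return rec
    return {}  # no match: the row is filled with NaN below

extra = pd.DataFrame(
    [lookup(a) for a in df1["address"]],
    index=df1.index,
    columns=df2.columns,
)
df = pd.concat([df1, extra], axis=1)
print(df)
```

Because only the first matching record is taken, each df1 row appears exactly once in the result, at the cost of a Python-level loop over df2 for every address.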