我有两个如下所示的数据框:
A = pd.DataFrame({"ID":["A", "A", "C" ,"B", "B"], "date":["06/22/2014","07/02/2014","01/01/2015","01/01/1991","08/02/1999"]})
B = pd.DataFrame({"ID":["A", "A", "C" ,"B", "B"], "date":["02/15/2015","06/30/2014","07/02/1999","10/05/1990","06/24/2014"], "value": ["3","5","1","7","8"] })
看起来像下面这样:
>>> A
ID date
0 A 2014-06-22
1 A 2014-07-02
2 C 2015-01-01
3 B 1991-01-01
4 B 1999-08-02
>>> B
ID date value
0 A 2015-02-15 3
1 A 2014-06-30 5
2 C 1999-07-02 1
3 B 1990-10-05 7
4 B 2014-06-24 8
我希望将A和B的值合并,并使用最接近的日期。在这个例子中,没有日期匹配,但有可能会匹配。输出应该类似于这样:
>>> C
ID date value
0 A 06/22/2014 8
1 A 07/02/2014 5
2 C 01/01/2015 3
3 B 01/01/1991 7
4 B 08/02/1999 1
我认为pandas应该有一个原生函数可以实现这个功能。
注意:类似的问题已经在这里提出过:pandas.merge: match the nearest time stamp >= the series of timestamps