Subtracting a time interval from dates - pandas

4

I am trying to subtract a time delta from a given pandas Series.

date_current = hh.groupby('group').agg({'issue_date' : [np.min, np.max]})
date_current.issue_date.amax.head(5)

group
_101000000000_0.0   2017-01-03
_102000000000_1.0   2017-02-23
_102000000000_2.0   2017-03-20
_102000000000_3.0   2017-10-01
_103000000000_4.0   2017-01-24
Name: amax, dtype: datetime64[ns]

As you can see, I am already working with datetimes. However, when I try to do the subtraction, I get an error:

import datetime
months = 4
datetime.timedelta(weeks=4*months)
date_before = date_current.values - datetime.timedelta(weeks=4*months)

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-51-5a7f2a09bab6> in <module>()
      2 months = 4
      3 datetime.timedelta(weeks=4*months)
----> 4 date_before = date_current.values - datetime.timedelta(weeks=4*months)

TypeError: ufunc subtract cannot use operands with types dtype('<M8[ns]') and dtype('O')

What am I missing?

2 Answers

7

For me, the pandas Timedelta works:

date_before = date_current.values - pd.Timedelta(weeks=4*months)
print (date_before)
['2016-09-13T00:00:00.000000000' '2016-11-03T00:00:00.000000000'
 '2016-11-28T00:00:00.000000000' '2017-06-11T00:00:00.000000000'
 '2016-10-04T00:00:00.000000000']

date_before = date_current - pd.Timedelta(weeks=4*months)
print (date_before)
group
_101000000000_0.0   2016-09-13
_102000000000_1.0   2016-11-03
_102000000000_2.0   2016-11-28
_102000000000_3.0   2017-06-11
_103000000000_4.0   2016-10-04
Name: amax, dtype: datetime64[ns]

print (type(date_before.iloc[0]))
<class 'pandas._libs.tslib.Timestamp'>

I think the problem is that the Python timedelta is not converted to a pandas Timedelta, and that is what raises the error.
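A minimal sketch of that conversion, using made-up sample dates (the `dates` Series below is hypothetical, not the asker's data). It assumes the goal is to keep an existing standard-library timedelta and only wrap it before subtracting:

```python
import datetime

import pandas as pd

# Hypothetical sample data standing in for the aggregated issue dates.
dates = pd.Series(pd.to_datetime(["2017-01-03", "2017-02-23", "2017-03-20"]))

months = 4
delta = datetime.timedelta(weeks=4 * months)  # 16 weeks, as in the question

# Wrap the stdlib timedelta in a pandas Timedelta before subtracting
# from the Series.
before_series = dates - pd.Timedelta(delta)

# For the raw NumPy array, convert explicitly to a NumPy timedelta64 so both
# operands are NumPy date/time types.
before_array = dates.values - pd.Timedelta(delta).to_timedelta64()

print(before_series)
print(before_array)
```

Either form keeps the result as datetime64 values, so later `.dt` operations still work on it.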

But if you need dates, first convert the datetimes to Python date objects with .dt.date:

date_before = date_current.dt.date - datetime.timedelta(weeks=4*months)
print (date_before)
group
_101000000000_0.0    2016-09-13
_102000000000_1.0    2016-11-03
_102000000000_2.0    2016-11-28
_102000000000_3.0    2017-06-11
_103000000000_4.0    2016-10-04
Name: amax, dtype: object

print (type(date_before.iloc[0]))
<class 'datetime.date'>
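Note that the result above has dtype object, so the `.dt` accessor is no longer available on it. If only the final output needs to be date objects, one possible alternative (a sketch with hypothetical sample data, not something from the original answer) is to subtract while the values are still datetime64 and convert at the end:

```python
import datetime

import pandas as pd

# Hypothetical sample data; the real Series comes from the groupby above.
dates = pd.Series(pd.to_datetime(["2017-01-03", "2017-02-23"]))
months = 4

# Subtract first (still datetime64), then convert to datetime.date objects.
as_dates = (dates - pd.Timedelta(weeks=4 * months)).dt.date
print(as_dates.dtype)           # object: the elements are datetime.date
print(type(as_dates.iloc[0]))

# Going back to datetime64 is possible with pd.to_datetime if needed later.
back = pd.to_datetime(as_dates)
print(back.dtype)
```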

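Thanks. I thought I could use the timedelta function from datetime. - pceccon
1
I think that if you are using pandas, it is best to use the pandas functions, since those are what the pandas developers primarily implement. This looks like a bug, so it may be worth opening a new issue. - jezrael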

1

As jezrael points out, there is a pandas way to do this, but you can also use the .dt accessor to work with the values as datetimes:

df.dt.values - dt.timedelta(weeks=4 * months)

Test code:
import datetime as dt
import pandas as pd

df = pd.Series([dt.datetime.now()])
print(df)

months = 4
print(df.values - pd.Timedelta(weeks=4*months))
print(df.dt.values - dt.timedelta(weeks=4 * months))

Results:

0   2017-05-23 05:36:53.300
dtype: datetime64[ns]

['2017-01-31T05:36:53.300000000']

DatetimeIndex(['2017-01-31 05:36:53.300000'], dtype='datetime64[ns]', freq=None)
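The second result is a DatetimeIndex rather than a bare NumPy array, which suggests the standard-library timedelta is accepted as long as the left operand is a pandas object. A small sketch under that assumption, with a fixed timestamp in place of `now()` so the output is reproducible (the names `s`, `from_series` and `from_index` are illustrative only):

```python
import datetime as dt

import pandas as pd

# Fixed timestamp instead of dt.datetime.now(), so the result is repeatable.
s = pd.Series([dt.datetime(2017, 5, 23, 5, 36, 53)])

months = 4
delta = dt.timedelta(weeks=4 * months)

# Subtracting from the Series itself: pandas handles the stdlib timedelta.
from_series = s - delta

# Subtracting from a DatetimeIndex built from the same values.
from_index = pd.DatetimeIndex(s) - delta

print(from_series)
print(from_index)
```

Both stay inside pandas, so no `.values` call is needed.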
