How do I append to a DataFrame with a timezone-aware timestamp column?


I have a DataFrame with a timestamp column and a numeric column. If the timestamp column has no timezone, I can append a new row to it.

import pandas as pd

df = pd.DataFrame([[1,2],[3,4]], columns=['timestamp', 'number'])
df['timestamp']=pd.to_datetime(df['timestamp'])
df
#                       timestamp  number
# 0 1970-01-01 00:00:00.000000001       2
# 1 1970-01-01 00:00:00.000000003       4

df.append(df.loc[0])
#                       timestamp  number
# 0 1970-01-01 00:00:00.000000001       2
# 1 1970-01-01 00:00:00.000000003       4
# 0 1970-01-01 00:00:00.000000001       2

But if I set a timezone on the timestamp column and then try to append a new row, I get an error.

df['timestamp']=df['timestamp'].apply(lambda x: x.tz_localize('utc'))
df
#                             timestamp  number
# 0 1970-01-01 00:00:00.000000001+00:00       2
# 1 1970-01-01 00:00:00.000000003+00:00       4
df.append(df.loc[0])
# Traceback (most recent call last):
#   File "<stdin>", line 1, in <module>
#   File "/Library/Python/2.7/site-packages/pandas-0.17.1-py2.7-macosx-10.10-intel.egg/pandas/core/frame.py", line 4231, in append
#     verify_integrity=verify_integrity)
#   File "/Library/Python/2.7/site-packages/pandas-0.17.1-py2.7-macosx-10.10-intel.egg/pandas/tools/merge.py", line 813, in concat
#     return op.get_result()
#   File "/Library/Python/2.7/site-packages/pandas-0.17.1-py2.7-macosx-10.10-intel.egg/pandas/tools/merge.py", line 995, in get_result
#     mgrs_indexers, self.new_axes, concat_axis=self.axis, copy=self.copy)
#   File "/Library/Python/2.7/site-packages/pandas-0.17.1-py2.7-macosx-10.10-intel.egg/pandas/core/internals.py", line 4456, in concatenate_block_managers
#     for placement, join_units in concat_plan]
#   File "/Library/Python/2.7/site-packages/pandas-0.17.1-py2.7-macosx-10.10-intel.egg/pandas/core/internals.py", line 4561, in concatenate_join_units
#     concat_values = com._concat_compat(to_concat, axis=concat_axis)
#   File "/Library/Python/2.7/site-packages/pandas-0.17.1-py2.7-macosx-10.10-intel.egg/pandas/core/common.py", line 2548, in _concat_compat
#     return _concat_compat(to_concat, axis=axis)
#   File "/Library/Python/2.7/site-packages/pandas-0.17.1-py2.7-macosx-10.10-intel.egg/pandas/tseries/common.py", line 256, in _concat_compat
#     return DatetimeIndex(np.concatenate([ x.tz_localize(None).asi8 for x in to_concat ]), tz=list(tzs)[0])
# AttributeError: 'numpy.ndarray' object has no attribute 'tz_localize'

Any help on how to append a new row to a DataFrame with a timezone-aware timestamp column would be much appreciated.

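What version of pandas are you on? I can run this example fine on 0.16.1. Also, don't use apply(pd.to_datetime); just use pd.to_datetime(df) directly. This line: df[0]=df[0].apply(pd.to_datetime) also looks wrong; what you want is df['timestamp'] = df['timestamp']. - Chris
@Chris this. It is probably one of my biggest pet peeves with pandas code in the wild. I've seen things like df.apply(lambda x: x.sum()) and worse. :/ - Andy Hayden
@Chris, thanks for pointing out the error in the question. I am using pandas version 0.17.1. - yadu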
1 Answer


This is a bug in that version of pandas (credit to this answer). As stated there, a workaround could be:

df = df.astype(str).append(df.loc[0].astype(str))
df['timestamp'] = pd.to_datetime(df['timestamp'], utc=True)
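On recent pandas releases the same append would be written differently, since DataFrame.append was deprecated in 1.4 and removed in 2.0. The following is only a minimal sketch, not part of the original answer: it assumes a pandas version in which concatenating timezone-aware columns works, and it uses pd.concat with df.iloc[[0]] (double brackets) so the row stays a one-row DataFrame and each column keeps its dtype. The names row and result are just illustrative.

```python
import pandas as pd

# Rebuild the frame from the question and localize the column to UTC
# with the vectorized .dt accessor instead of apply().
df = pd.DataFrame([[1, 2], [3, 4]], columns=["timestamp", "number"])
df["timestamp"] = pd.to_datetime(df["timestamp"]).dt.tz_localize("UTC")

# Double brackets select a one-row DataFrame rather than a Series,
# so the tz-aware dtype is not collapsed to object.
row = df.iloc[[0]]

# Assumption: a pandas version where concat handles tz-aware columns.
result = pd.concat([df, row])

print(result)
print(result.dtypes)
```

Unlike the astype(str) round trip, this should leave the number column as integers rather than strings. Passing ignore_index=True to pd.concat renumbers the rows if the repeated index label 0 is not wanted.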
