Pandas reindex converts all values to NaN


I have the following DataFrame:

>>> a = pd.DataFrame({'values':[random.randint(-10,10) for i in range(10)]})
>>> a        
   values
0      -3
1      -8
2      -2
3       3
4       8
5       6
6      -5
7       0
8       8
9      -4

I want to replace the index entirely with datetimes. This is the code I used to build the new index and reindex:

>>> times = [datetime.datetime(2018,1,2,12,40,0) + datetime.timedelta(seconds=i) for i in range(10)]

>>> times

[datetime.datetime(2018, 1, 2, 12, 40), datetime.datetime(2018, 1, 2, 12, 40, 1), datetime.datetime(2018, 1, 2, 12, 40, 2), datetime.datetime(2018, 1, 2, 12, 40, 3), datetime.datetime(2018, 1, 2, 12, 40, 4), datetime.datetime(2018, 1, 2, 12, 40, 5), datetime.datetime(2018, 1, 2, 12, 40, 6), datetime.datetime(2018, 1, 2, 12, 40, 7), datetime.datetime(2018, 1, 2, 12, 40, 8), datetime.datetime(2018, 1, 2, 12, 40, 9)]
>>> a.reindex(times)

                     values
2018-01-02 12:40:00     NaN
2018-01-02 12:40:01     NaN
2018-01-02 12:40:02     NaN
2018-01-02 12:40:03     NaN
2018-01-02 12:40:04     NaN
2018-01-02 12:40:05     NaN
2018-01-02 12:40:06     NaN
2018-01-02 12:40:07     NaN
2018-01-02 12:40:08     NaN
2018-01-02 12:40:09     NaN

As you can see, it discards the values I had and puts NaN in their place. How should I reindex this DataFrame so that it looks like this:

                     values
2018-01-02 12:40:00    -3
2018-01-02 12:40:01    -8
2018-01-02 12:40:02    -2
2018-01-02 12:40:03     3
2018-01-02 12:40:04     8
2018-01-02 12:40:05     6
2018-01-02 12:40:06    -5
2018-01-02 12:40:07     0
2018-01-02 12:40:08     8
2018-01-02 12:40:09    -4

2 Answers

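As long as times has the same length as the DataFrame (one label per row; df here is the DataFrame called a in the question), you can pass it to set_index: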
df = df.set_index([times])

Out[64]:
                     values
2018-01-02 12:40:00      -3
2018-01-02 12:40:01      -8
2018-01-02 12:40:02      -2
2018-01-02 12:40:03       3
2018-01-02 12:40:04       8
2018-01-02 12:40:05       6
2018-01-02 12:40:06      -5
2018-01-02 12:40:07       0
2018-01-02 12:40:08       8
2018-01-02 12:40:09      -4
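
The length requirement above can be made explicit. The following is a minimal sketch, not part of the original answer: with_new_index is a hypothetical helper name, and the values are copied from the question so the result is reproducible.

```python
import datetime
import pandas as pd

def with_new_index(df, labels):
    """Hypothetical helper: relabel rows by position after checking the length."""
    if len(labels) != len(df):
        raise ValueError(f"expected {len(df)} labels, got {len(labels)}")
    # pd.Index turns a list of datetimes into a DatetimeIndex
    return df.set_index(pd.Index(labels))

# Values taken from the question instead of random numbers
df = pd.DataFrame({'values': [-3, -8, -2, 3, 8, 6, -5, 0, 8, -4]})
times = [datetime.datetime(2018, 1, 2, 12, 40, 0) + datetime.timedelta(seconds=i)
         for i in range(10)]

result = with_new_index(df, times)

# Too few labels: the helper refuses instead of relabeling
try:
    with_new_index(df, times[:5])
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

Checking the length up front is a design choice for clearer error messages; the data itself is left unchanged and only the row labels are replaced.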

Alternatively, you can assign it directly to the index:
In [67]: df.index = times

In [68]: df
Out[68]:
                     values
2018-01-02 12:40:00      -3
2018-01-02 12:40:01      -8
2018-01-02 12:40:02      -2
2018-01-02 12:40:03       3
2018-01-02 12:40:04       8
2018-01-02 12:40:05       6
2018-01-02 12:40:06      -5
2018-01-02 12:40:07       0
2018-01-02 12:40:08       8
2018-01-02 12:40:09      -4
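
For context on why the original attempt produced NaN: reindex aligns data to new labels by looking each one up in the existing index, whereas set_index and index assignment relabel rows by position. The sketch below contrasts the two; the variable names all_missing and relabeled are illustrative and the values are copied from the question.

```python
import datetime
import pandas as pd

df = pd.DataFrame({'values': [-3, -8, -2, 3, 8, 6, -5, 0, 8, -4]})
times = [datetime.datetime(2018, 1, 2, 12, 40, 0) + datetime.timedelta(seconds=i)
         for i in range(10)]

# reindex searches the existing index (0..9) for each datetime label.
# None of them are present, so every row comes back as missing.
reindexed = df.reindex(times)
all_missing = bool(reindexed['values'].isna().all())

# Relabeling by position keeps the data and only swaps the labels.
relabeled = df.copy()
relabeled.index = times
```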


Code

import random
import datetime
import pandas as pd

a = pd.DataFrame({'values':[random.randint(-10,10) for i in range(10)]})
a['times'] = [datetime.datetime(2018,1,2,12,40,0) + datetime.timedelta(seconds=i) for i in range(10)]
a = a.set_index('times')

Result

                     values
times                      
2018-01-02 12:40:00      -2
2018-01-02 12:40:01      -3
2018-01-02 12:40:02       5
2018-01-02 12:40:03      -9
2018-01-02 12:40:04      -6
2018-01-02 12:40:05       2
2018-01-02 12:40:06       1
2018-01-02 12:40:07      -1
2018-01-02 12:40:08       5
2018-01-02 12:40:09       3
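
As a possible variation on the code above (not from the original answer), the one-second timestamps could also be generated with pd.date_range instead of a list comprehension. This sketch assumes a fixed one-second spacing and uses the question's values rather than random ones.

```python
import pandas as pd

a = pd.DataFrame({'values': [-3, -8, -2, 3, 8, 6, -5, 0, 8, -4]})

# One timestamp per row, spaced one second apart
a['times'] = pd.date_range('2018-01-02 12:40:00', periods=len(a), freq='s')

# Move the column into the index, as in the answer above
a = a.set_index('times')
```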
