Output a Pandas DataFrame as JSON


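I have a Pandas DataFrame with a DatetimeIndex and columns of hourly values. I want to write a single column to a JSON file as an array of daily arrays, where each inner array holds that day's hourly values.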

Here is a simple example.

Suppose I have the following DataFrame:

In [106]: 
rng = pd.date_range('1/1/2011 01:00:00', periods=12, freq='H') 
df = pd.DataFrame(np.random.randn(12, 1), index=rng, columns=['A'])

In [107]:
df

Out[107]:
                     A
2011-01-01 01:00:00 -0.067214
2011-01-01 02:00:00  0.820595
2011-01-01 03:00:00  0.442557
2011-01-01 04:00:00 -1.000434
2011-01-01 05:00:00 -0.760783
2011-01-01 06:00:00 -0.106619
2011-01-01 07:00:00  0.786618
2011-01-01 08:00:00  0.144663
2011-01-01 09:00:00 -1.455017
2011-01-01 10:00:00  0.865593
2011-01-01 11:00:00  1.289754
2011-01-01 12:00:00  0.601067

I want this JSON file:
[    
 [-0.0672138259,0.8205950583,0.4425568167,-1.0004337373,-0.7607833867,-0.1066187698,0.7866183048,0.1446634381,-1.4550165851,0.8655931982,1.2897541164,0.6010672247]
]

My actual DataFrame is much longer than this, so the output would look roughly like this:

[
 [value@hour1day1, value@hour2day1.....value@hour24day1],
 [value@hour1day2, value@hour2day2.....value@hour24day2],
 [value@hour1day3, value@hour2day3.....value@hour24day3],
 ....
 [value@hour1LastDay, value@hour2LastDay.....value@hour24LastDay]
]
1 Answer

import json
import pandas as pd
import numpy as np

rng = pd.date_range('1/1/2011 01:00:00', periods=12, freq='H') 
df = pd.DataFrame(np.random.randn(12, 1), index=rng, columns=['A'])

# as_matrix() was removed in pandas 1.0; to_numpy() is its replacement
print(json.dumps(df.T.to_numpy().tolist(), indent=4))

Output:

[
    [
        -0.6916923670267555, 
        0.23075256008033393, 
        1.2390943452146521, 
        -0.9421708175530891, 
        -1.4622768586461448, 
        -0.3973987276444045, 
        -0.04983495806442656, 
        -1.9139530636627042, 
        1.9562147260518052, 
        -0.8296105620697014, 
        0.2888681009437529, 
        -2.3943000262784424
    ]
]
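
Note that the transpose yields a single outer row no matter how many days the index covers, so it only matches the desired layout when the data spans one day. A minimal sketch, assuming a hypothetical two-day frame whose values 0 to 47 are made up so the shape is easy to check:

```python
import numpy as np
import pandas as pd

# Hypothetical two-day hourly frame with predictable values 0..47 (not the asker's data).
rng = pd.date_range('2011-01-01 00:00:00', periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({'A': np.arange(48, dtype=float)}, index=rng)

# Transposing a one-column frame gives a 1 x 48 array: one outer list, not one per day.
flat = df.T.to_numpy().tolist()
print(len(flat), len(flat[0]))  # prints: 1 48
```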

Or, for a complete multi-day example, use groupby:

rng = pd.date_range('1/1/2011 01:00:00', periods=48, freq='H') 
df = pd.DataFrame(np.random.randn(48, 1), index=rng, columns=['A'])

# Group by calendar date; x.day alone would merge the same day-of-month across months
grouped = df.groupby(lambda x: x.date())
data = [group['A'].values.tolist() for day, group in grouped]
print(json.dumps(data, indent=4))

Output:

[
    [
        -0.8939584996681688, 
        ...
        -1.1332895023662326
    ], 
    [
        -0.1514553673781838, 
        ...
        -1.8380494963443343
    ], 
    [
        -1.8342085568898159
    ]
]
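
Since the question asks for a JSON file rather than printed text, the grouping step can be wrapped in a small helper and written out with `json.dump`. This is only a sketch: the helper name `daily_arrays`, the file name `hourly_by_day.json`, and the sample values are assumptions for illustration. It groups on the calendar date (via `DatetimeIndex.normalize()`), so data that crosses a month boundary stays in chronological order.

```python
import json
import os
import tempfile

import numpy as np
import pandas as pd


def daily_arrays(series):
    """Return one list of values per calendar day, in chronological order."""
    # normalize() sets each timestamp to midnight, which identifies its calendar day.
    days = series.index.normalize()
    return [group.tolist() for _, group in series.groupby(days)]


# Hypothetical data: 48 hourly values starting on 31 January, crossing into February.
rng = pd.date_range('2011-01-31 00:00:00', periods=48, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({'A': np.arange(48, dtype=float)}, index=rng)

data = daily_arrays(df['A'])

# Write the nested list to a file (the path is an arbitrary choice for this sketch).
path = os.path.join(tempfile.gettempdir(), 'hourly_by_day.json')
with open(path, 'w') as fh:
    json.dump(data, fh, indent=4)
```

If the column can contain missing values, keep in mind that Python's `json` module writes them as `NaN` by default, which strict JSON parsers reject.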

Very elegant solution. Thanks. - Clayton
