Creating a JSON object from a Pandas DataFrame

6
      Groups sub-groups selections
    0   sg1    csg1       sc1
    1   sg1    csg1       sc2
    2   sg1    csg2       sc3
    3   sg1    csg2       sc4
    4   sg2    csg3       sc5
    5   sg2    csg3       sc6
    6   sg2    csg4       sc7
    7   sg2    csg4       sc8

I have the DataFrame above and am trying to create the following JSON object:
{
  "sg1": {
    "csg1": ['sc1', 'sc2'],
    "csg2": ['sc3', 'sc4']
  },
  "sg2": {
    "csg3": ['sc5', 'sc6'],
    "csg4": ['sc7', 'sc8']
  }
}

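I tried pandas to_json and to_dict with the orient parameter, but did not get the expected result. I also tried grouping by the columns, building lists, and converting those to JSON.

Any help is much appreciated.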

3 Answers

5

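You can group by ['Groups','sub-groups'] and build a nested dictionary from the resulting MultiIndex Series:

s = df.groupby(['Groups','sub-groups']).selections.agg(list)

# setdefault creates each group's inner dict once, so every sub-group is kept.
d = {}
for (k1, k2), v in s.items():
    d.setdefault(k1, {})[k2] = v

print(d)
# {'sg1': {'csg1': ['sc1', 'sc2'], 'csg2': ['sc3', 'sc4']}, 'sg2': {'csg3': ['sc5', 'sc6'], 'csg4': ['sc7', 'sc8']}}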

0

You need to group by the columns of interest, for example:

import pandas as pd

data = {
        'Groups': ['sg1', 'sg1', 'sg1', 'sg1', 'sg2', 'sg2', 'sg2', 'sg2'],
        'sub-groups': ['csg1', 'csg1', 'csg2', 'csg2', 'csg3', 'csg3', 'csg4', 'csg4'],
        'selections': ['sc1', 'sc2', 'sc3', 'sc4', 'sc5', 'sc6', 'sc7', 'sc8']
}

df = pd.DataFrame(data)
print(df.groupby(['Groups', 'sub-groups'])['selections'].unique().to_dict())

Output:

{
    ('sg1', 'csg1'): array(['sc1', 'sc2'], dtype=object), 
    ('sg1', 'csg2'): array(['sc3', 'sc4'], dtype=object), 
    ('sg2', 'csg3'): array(['sc5', 'sc6'], dtype=object), 
    ('sg2', 'csg4'): array(['sc7', 'sc8'], dtype=object)
}
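
Note that this result is keyed by (Groups, sub-groups) tuples and holds NumPy arrays, so it is not yet the nested structure the question asks for, and json.dumps would reject the tuple keys as-is. Below is a minimal sketch of one way to re-nest it, assuming the same sample data. The names flat and nested are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    'Groups': ['sg1', 'sg1', 'sg1', 'sg1', 'sg2', 'sg2', 'sg2', 'sg2'],
    'sub-groups': ['csg1', 'csg1', 'csg2', 'csg2', 'csg3', 'csg3', 'csg4', 'csg4'],
    'selections': ['sc1', 'sc2', 'sc3', 'sc4', 'sc5', 'sc6', 'sc7', 'sc8'],
})

# Tuple-keyed dict of arrays, as produced by the answer above.
flat = df.groupby(['Groups', 'sub-groups'])['selections'].unique().to_dict()

# Re-nest: the first tuple element becomes the outer key, the second the
# inner key; list() turns each array into a plain Python list.
nested = {}
for (group, sub_group), values in flat.items():
    nested.setdefault(group, {})[sub_group] = list(values)

print(nested)
```

Keep in mind that unique() drops repeated selections within a sub-group, so use agg(list) instead if duplicates should be preserved.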

0
Let's try a "dictify" function that builds a nested dictionary whose top-level keys come from "Groups" and whose second-level keys come from "sub-groups":
from collections import defaultdict

def dictify():
    dct = defaultdict(dict)
    for (x, y), g in df.groupby(['Groups', 'sub-groups']):
        dct[x][y] = [*g['selections']]
    return dict(dct)

# dictify()
{
    "sg1": {
        "csg1": ["sc1","sc2"],
        "csg2": ["sc3","sc4"]
    },
    "sg2": {
        "csg3": ["sc5","sc6"],
        "csg4": ["sc7","sc8"]
    }
}
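
The function returns a Python dict rather than JSON text. If a JSON string is what is needed, the standard-library json module can serialize it. The following is a sketch under the assumption that the frame is passed in as a parameter (here called frame, a small change from the global df used above) and that all keys and values are plain strings.

```python
import json
from collections import defaultdict

import pandas as pd

# Sample frame copied from the question.
df = pd.DataFrame({
    'Groups': ['sg1', 'sg1', 'sg1', 'sg1', 'sg2', 'sg2', 'sg2', 'sg2'],
    'sub-groups': ['csg1', 'csg1', 'csg2', 'csg2', 'csg3', 'csg3', 'csg4', 'csg4'],
    'selections': ['sc1', 'sc2', 'sc3', 'sc4', 'sc5', 'sc6', 'sc7', 'sc8'],
})

def dictify(frame):
    # Same logic as above, but the DataFrame is an explicit argument.
    dct = defaultdict(dict)
    for (x, y), g in frame.groupby(['Groups', 'sub-groups']):
        dct[x][y] = list(g['selections'])
    return dict(dct)

# json.dumps turns the nested dict into a JSON-formatted string.
json_text = json.dumps(dictify(df), indent=2)
print(json_text)
```

json.dumps emits double-quoted strings, which is what valid JSON requires. The single-quoted lists in the question's target are Python notation.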
