I have a pandas dataframe (sample) as follows
df = pd.DataFrame({'Country':['India', 'China', 'Nepal'],
'Habitat':[[{'city1':'Ind1','city2':'Ind2'},{'town1':'IndT1','town2':'IndT2'}],
[{'city1':'Chi1','city2':'Chi2'},{'town1':'ChiT1','town2':'ChiT2'}],
[{'city1':'Nep1','city2':'Nep2'},{'town1':'NepT1','town2':'NepT2'}]],
'num':[1,2,3]
})
df
Country Habitat num
0 India [{'city1':'Ind1','city2':'Ind2'},{'town1':'IndT1','town2':'IndT2'}] 1
1 China [{'city1':'Chi1','city2':'Chi2'},{'town1':'ChiT1','town2':'ChiT2'}] 2
2 Nepal [{'city1':'Nep1','city2':'Nep2'},{'town1':'NepT1','town2':'NepT2'}] 3
我需要以这种格式将其展平。
result_df = pd.DataFrame({'Country':['India', 'China', 'Nepal'],
'Habitat.city1':['Ind1','Chi1','Nep1'],
'Habitat.city2':['Ind2','Chi2','Nep2'],
'Habitat.town1':['IndT1','ChiT1','NepT1'],
'Habitat.town2':['IndT2','ChiT2','NepT2'],
'num':[1,2,3]
})
result_df
Country Habitat.city1 Habitat.city2 Habitat.town1 Habitat.town2 num
India Ind1 Ind2 IndT1 IndT2 1
China Chi1 Chi2 ChiT1 ChiT2 2
Nepal Nep1 Nep2 NepT1 NepT2 3
我尝试过 pd.json_normalize(df.explode('Habitat')['Habitat'])
,但它会创建我不需要的新行。
我的观察:
某种形式的groupby和transpose可以很好地构建在pd.json_normalize(df.explode('Habitat')['Habitat])
的基础上来解决我的问题,但到目前为止我还没有找到合适的方法。