我看到了很多在stackoverflow上使用pandas读取json文件的问题,但是我仍然无法解决这个简单的问题。
数据
{"session_id":{"0":["X061RFWB06K9V"],"1":["5AZ2X2A9BHH5U"]},"unix_timestamp":{"0":[1442503708],"1":[1441353991]},"cities":{"0":["New York NY, Newark NJ"],"1":["New York NY, Jersey City NJ, Philadelphia PA"]},"user":{"0":[[{"user_id":2024,"joining_date":"2015-03-22","country":"UK"}]],"1":[[{"user_id":2853,"joining_date":"2015-03-28","country":"DE"}]]}}
我的尝试
import numpy as np
import pandas as pd
import json
from pandas.io.json import json_normalize
# attempt1
df = pd.read_json('a.json')
# attempt2
with open('a.json') as fi:
data = json.load(fi)
df = json_normalize(data,record_path='user',meta=['session_id','unix_timestamp','cities'])
Both of them do not give me the required output.
需要的输出
session_id unix_timestamp cities user_id joining_date country
0 X061RFWB06K9V 1442503708 New York NY 2024 2015-03-22 UK
0 X061RFWB06K9V 1442503708 Newark NJ 2024 2015-03-22 UK
首选方法
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.io.json.json_normalize.html
这是一个关于“json_normalize”函数的详细说明,该函数是用于将JSON数据规范化为扁平的表格格式。更多信息请访问以上链接。I would love to see implementation of pd.io.json.json_normalize
pandas.io.json.json_normalize(data: Union[Dict, List[Dict]], record_path: Union[str, List, NoneType] = None, meta: Union[str, List, NoneType] = None, meta_prefix: Union[str, NoneType] = None, record_prefix: Union[str, NoneType] = None, errors: Union[str, NoneType] = 'raise', sep: str = '.', max_level: Union[int, NoneType] = None)
cities
和user
? - Jonnyboi