我有一个文本文件,其中的数据格式如下:
{"date":"Jan 6"; "time":"07:00:01"; "ip":"178.41.163.99"; "user":"null"; "country":"Slovakia"; "city":"Miloslavov"; "lat":48.1059; "lon":17.3}
{"date":"Jan 6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"postgres"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}
{"date":"Jan 6"; "time":"07:05:26"; "ip":"37.123.163.124"; "user":"null"; "country":"Sweden"; "city":"Gothenburg"; "lat":57.7072; "lon":11.9668}
我需要将它读入pandas DataFrame中,使用键作为列名,值作为项。这是我用来读取数据的代码:
columns = ['date', 'time', 'ip', 'user', 'country', 'city', 'lat', 'lon']
df = pd.read_csv("log.txt", sep=';', header=None, names=columns)
有点沮丧,因为我所做的只有这个:
date time ... lat lon
0 {"date":"Jan 6" "time":"07:00:01" ... "lat":48.1059 "lon":17.3}
1 {"date":"Jan 6" "time":"07:05:26" ... "lat":57.7072 "lon":11.9668}
2 {"date":"Jan 6" "time":"07:05:26" ... "lat":57.7072 "lon":11.9668}
我已经从头到尾阅读了文档,但仍然无法达到所需的结果,就像下面这样:
date time ... lat lon
0 Jan 6 07:00:01 ... 48.1059 17.3
1 Jan 6 07:05:26 ... 57.7072 11.9668
2 Jan 6 07:05:26 ... 57.7072 11.9668
这是否有可能呢?任何建议都将不胜感激。谢谢。
read_json
,但是因为没有使用lines=True
,所以它对我没起作用。现在它像魔法一样奏效了,因为我自己生成源文件并将;
改为,
:) 问题是一些城市名字包含,
,而read_json
无法正确解析它们...不管怎样,非常感谢! - Denis Rasulev