How to merge two JSON files using pandas

Question


I am trying to write a Python script that merges two JSON files, for example:

First file: students.json

{"John Smith":{"age":16, "id": 1}, ...., "Paul abercom":{"age":18, "id": 764}}

Second file: teacher.json

{"Agathe Magesti":{"age":36, "id": 765}, ...., "Tom Ranliver":{"age":54, "id": 801}}

So first, to avoid losing any information, I modify the files to add each person's status, like this:

{"John Smith":{"age":16, "id": 1, "status":"student"}, ...., "Paul abercom":{"age":18, "id": 764, "status":"student"}}

{"Agathe Magesti":{"age":36, "id": 765, "status":"teacher"}, ...., "Tom Ranliver":{"age":54, "id": 801, "status":"teacher"}}

I wrote the following code to do this:

import json

import pandas as pd

# Each person is loaded as a column, so .loc["status"] adds a "status" row.
type_student = pd.read_json('students.json')
type_student.loc["status"] = "student"
type_student.to_json("testStudent.json")

type_teacher = pd.read_json('teacher.json')
type_teacher.loc["status"] = "teacher"
type_teacher.to_json("testTeacher.json")

# Reload the tagged files as plain dictionaries.
with open("testStudent.json") as data_file:
    data_student = json.load(data_file)
with open("testTeacher.json") as data_file:
    data_teacher = json.load(data_file)

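What I want to do is merge data_student and data_teacher and write the resulting JSON to a JSON file, but I can only use the standard library, pandas, numpy and scipy.

After some tests, I realized that some teachers are also students, which may cause problems for the merge.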

- mel


You don't need pandas to process JSON data. - OneCricketeer

2 Answers


You need to concatenate the two DataFrames before converting them to JSON:

pd.concat([data_teacher, data_student], axis=1).to_json()

- Régis B.

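I can't edit your post because the change is fewer than 6 characters, but could you correct type_pd.concat to pd.concat? I get the following error: ValueError: DataFrame columns must be unique for orient='columns'. I think this is because some teachers are also students. Also, when I load my JSON into a DataFrame, pandas swaps the index and the columns, I guess because the JSON is nested. - mel

Without seeing the actual data, it is hard to know the cause of the problem. I suggest using type_student and type_teacher instead of data_student and data_teacher. I also suggest you look at the documentation for concat. - Régis B.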
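
Following the suggestion to work with the DataFrames rather than the reloaded dictionaries, here is a minimal sketch of one way concat might be applied. It is an assumption, not the answerer's code. The sample data is hypothetical and deliberately includes a name that appears in both files. People are placed in rows instead of columns, and the output uses orient="records", so a repeated name never has to serve as a unique label.

```python
import json

import pandas as pd

# Hypothetical sample data standing in for students.json and teacher.json.
# "Tom Ranliver" is deliberately present in both to mimic a teacher who is also a student.
students = json.loads('{"John Smith": {"age": 16, "id": 1}, "Tom Ranliver": {"age": 17, "id": 2}}')
teachers = json.loads('{"Agathe Magesti": {"age": 36, "id": 765}, "Tom Ranliver": {"age": 54, "id": 801}}')

# orient="index" turns each person into a row, with "age" and "id" as columns.
type_student = pd.DataFrame.from_dict(students, orient="index")
type_teacher = pd.DataFrame.from_dict(teachers, orient="index")
type_student["status"] = "student"
type_teacher["status"] = "teacher"

# Stack the rows; duplicate names are allowed in the index at this point.
combined = pd.concat([type_student, type_teacher], axis=0)

# Move the names into an ordinary column so the JSON output does not rely on unique labels.
combined = combined.rename_axis("name").reset_index()
result = combined.to_json(orient="records")
print(result)
```

The result is a JSON list with one record per person and role, instead of the name-keyed object used in the question, which is the trade-off for keeping both entries of anyone who appears twice.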


- Keith · Accepted Answer

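It looks like your JSON files contain "objects" as their top-level structure. These correspond to Python dictionaries, so this is easy to do with plain Python: just update the first dictionary with the second.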

import json

with open("mel1.json") as fo:
    data1 = json.load(fo)

with open("mel2.json") as fo:
    data2 = json.load(fo)

data1.update(data2)

with open("melout.json", "w") as fo:
    json.dump(data1, fo)
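
Note that dict.update overwrites the first dictionary's entry whenever the same key appears in both, so a teacher who is also a student would end up with a single record. Below is a minimal sketch of one possible alternative that uses only the standard library and keeps both records. The sample data and the merge_people helper are hypothetical, and storing a list of records per name is an assumption about the desired output shape.

```python
import json

# Hypothetical sample data; "Tom Ranliver" appears in both groups.
students = {"John Smith": {"age": 16, "id": 1}, "Tom Ranliver": {"age": 17, "id": 2}}
teachers = {"Agathe Magesti": {"age": 36, "id": 765}, "Tom Ranliver": {"age": 54, "id": 801}}


def merge_people(groups):
    """Merge {status: {name: record}} into {name: [record plus status, ...]}."""
    merged = {}
    for status, people in groups.items():
        for name, record in people.items():
            # Copy the record and tag it with its status instead of overwriting earlier entries.
            merged.setdefault(name, []).append({**record, "status": status})
    return merged


merged = merge_people({"student": students, "teacher": teachers})
print(json.dumps(merged, indent=2))
```

To write the result to a file, json.dump(merged, fo) can be used in the same way as in the answer above.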