Python: Convert CSV to a dictionary - using the header as keys

Question

4

Python: 3.x

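Hello. I have a CSV file with a header and rows; the number of rows may vary from file to file. I am trying to convert this CSV to a dictionary format, but the data from the first row is repeated.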

"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"
1,3,9294899
1,3,9294933

Code:

parserd_list = []
output_dict = {}
with open("files\\CUCMdummy.csv") as myfile:
    firstline = True
    for line in myfile:
        if firstline:
            mykeys = ''.join(line.split()).split(',')
            firstline = False
        else:
            values = ''.join(line.split()).split(',')
            for n in range(len(mykeys)):
                output_dict[mykeys[n].rstrip('"').lstrip('"')] = values[n].rstrip('"').lstrip('"')
                print(output_dict)
                parserd_list.append(output_dict)
#print(parserd_list)

My real CSV usually has more than 20 columns, but I have provided a sample file.

I used rstrip/lstrip to remove the double quotes.

The output is as follows:

{'cdrRecordType': '1'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}

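This is the output of the print inside the for loop. The final output is the same.

I don't know what mistake I made. Could someone please help correct it?

Thanks in advance.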

- Maria628

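You are actually reusing and adding to the same dictionary over and over. Move output_dict = {} to directly before the for n in range(len(mykeys)): loop. Also, you should append the dictionary to the list after that loop, not on every iteration. - Michael Butscher

Hi Michael, thanks for your help. I moved out_dict out of the FOR loop, and it says "n" is not defined. I replaced "n" with p = len(mykeys), but then it reports "list index out of range". - Maria628

You should not move output_dict[mykeys[n]... out of the for loop; move the statement that appends the dictionary to the list instead. - Michael Butscher

Great, now I get it. After moving the append step out of the FOR loop, it works... Thanks. - Maria628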
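
The fix worked out in the comments above can be sketched roughly as follows. This is only an illustration: it reads the sample data from an in-memory `io.StringIO` buffer instead of `files\\CUCMdummy.csv` (an assumption made so the snippet runs on its own), and it keeps the question's variable names.

```python
import io

# Sample data from the question, held in memory so the sketch is self-contained.
sample = io.StringIO(
    '"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"\n'
    '1,3,9294899\n'
    '1,3,9294933\n'
)

parserd_list = []
firstline = True
for line in sample:
    if firstline:
        mykeys = ''.join(line.split()).split(',')
        firstline = False
    else:
        values = ''.join(line.split()).split(',')
        # Create a fresh dictionary for every data row.
        output_dict = {}
        for n in range(len(mykeys)):
            output_dict[mykeys[n].strip('"')] = values[n].strip('"')
        # Append once per row, after all keys have been filled in.
        parserd_list.append(output_dict)

print(parserd_list)
```

With both changes in place, each data row produces exactly one dictionary in the list.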

3 Answers

3

Use `csv.DictReader`.

import csv

# The csv module expects files to be opened with newline=''.
with open("files\\CUCMdummy.csv", mode='r', newline='') as myFile:
    reader = list(csv.DictReader(myFile, delimiter=',', quotechar='"'))

- Lambo
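
As a rough, self-contained illustration of the answer above, the snippet below feeds the question's sample data to `csv.DictReader` from an in-memory `io.StringIO` buffer (an assumption that stands in for the real file). Each row is passed through `dict()` so the result prints as plain dictionaries regardless of the Python 3 version.

```python
import csv
import io

# Sample data from the question; StringIO stands in for the opened file.
sample = io.StringIO(
    '"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"\n'
    '1,3,9294899\n'
    '1,3,9294933\n'
)

# DictReader takes the first line as field names and strips the quotes itself.
rows = [dict(row) for row in csv.DictReader(sample)]

for row in rows:
    print(row)
```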

0

Your code is indented incorrectly. These two lines:

  print(output_dict)
  parserd_list.append(output_dict)

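should be dedented so that they line up with the for loop above them. In addition, you need a new dictionary for each new line of the file.

You can create it right before the for loop over the keys: output_dict = {}.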
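
To see why a new dictionary is needed per row: a list stores references, so appending the same dictionary object repeatedly yields several entries that all show the latest values. A small sketch (the names `shared`, `aliased`, and `separate` are made up for illustration):

```python
# Appending the same dict object twice stores two references to one object.
shared = {}
aliased = []
for call_id in ('9294899', '9294933'):
    shared['globalCallID_callId'] = call_id
    aliased.append(shared)

print(aliased)                   # both entries show the last value written
print(aliased[0] is aliased[1])  # the two entries are the same object

# Creating the dict inside the loop gives each row its own object.
separate = []
for call_id in ('9294899', '9294933'):
    row = {'globalCallID_callId': call_id}
    separate.append(row)

print(separate)
```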

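As mentioned above, there are libraries that make life easier. However, if you want to stick with appending dictionaries, you can also load the file's lines, close the file, and process the lines like this: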

with open("scratch.txt") as myfile:
    data = myfile.readlines()

keys = data[0].replace('"','').strip().split(',')

output_dicts = []
for line in data[1:]:
    values = line.strip().split(',')
    output_dicts.append(dict(zip(keys, values)))

print(output_dicts)


[{'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899', 'cdrRecordType': '1'}, {'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933', 'cdrRecordType': '1'}]

- LeKhan9

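Hi LeKhan, awesome. I got the output I needed. But could you please help me get rid of the double quotes in the final output? - Maria628

You did that before with rstrip and lstrip. My personal approach would be this: keys = data[0].replace('"','').strip().split(','). Hope that helps :) - LeKhan9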
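
If the data rows are quoted as well, the same idea can be applied to every line rather than only the header. A minimal sketch, assuming quoted values in the sample (the question's own data rows are unquoted) and a made-up helper name `split_line`:

```python
def split_line(line):
    """Split one CSV line on commas and drop surrounding double quotes."""
    return [field.strip('"') for field in line.strip().split(',')]

# Hypothetical variant of the sample file in which every value is quoted.
data = [
    '"cdrRecordType","globalCallID_callManagerId","globalCallID_callId"\n',
    '"1","3","9294899"\n',
    '"1","3","9294933"\n',
]

keys = split_line(data[0])
output_dicts = [dict(zip(keys, split_line(line))) for line in data[1:]]
print(output_dicts)
```

This simple split still breaks on fields that contain a comma inside the quotes; the `csv` module handles that case.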

- chuckx · Accepted Answer

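Don't parse CSV files by hand; use the csv module.

This leads to a simpler script and handles edge cases gracefully (e.g. header rows, inconsistently quoted fields, etc.).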

import csv

with open('example.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row)

Output:

$ python3 parse-csv.py
OrderedDict([('cdrRecordType', '1'), ('globalCallID_callManagerId', '3'), ('globalCallID_callId', '9294899')])
OrderedDict([('cdrRecordType', '1'), ('globalCallID_callManagerId', '3'), ('globalCallID_callId', '9294933')])
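
To illustrate the edge-case point made above, here is a small sketch with one made-up row whose quoted field contains a comma (the `name` and `city` columns are hypothetical, not from the question). A plain `split(',')` cuts that field in two, while `csv.DictReader` keeps it intact. Note also that on Python 3.8 and later, `DictReader` yields plain `dict` rows rather than `OrderedDict`, so the printed form may differ from the output shown above.

```python
import csv
import io

# Hypothetical data: the quoted field contains the delimiter.
text = 'name,city\n"Doe, Jane",Berlin\n'

# Naive splitting treats the comma inside the quotes as a separator.
naive = text.splitlines()[1].split(',')
print(naive)  # three pieces instead of two

# The csv module honours the quoting rules and keeps the field whole.
rows = [dict(row) for row in csv.DictReader(io.StringIO(text))]
print(rows)
```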

If you want to parse manually, here is one workable approach:

parsed_list = []
with open('example.csv') as myfile:
    firstline = True
    for line in myfile:
        # Strip leading/trailing whitespace and split into a list of values.
        values = line.strip().split(',')

        # Remove surrounding double quotes from each value, if they exist.
        values = [v.strip('"') for v in values]

        # Use the first line as keys.
        if firstline:
            keys = values
            firstline = False
            # Skip to the next iteration of the for loop.
            continue

        parsed_list.append(dict(zip(keys, values)))

for p in parsed_list:
    print(p)

Output:

$ python3 manual-parse-csv.py
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294899'}
{'cdrRecordType': '1', 'globalCallID_callManagerId': '3', 'globalCallID_callId': '9294933'}