Writing a nested dictionary to a CSV file

Question

5

I have a dictionary:

dic = {"Location1":{"a":1,"b":2,"c":3},"Location2":{"a":4,"b":5,"c":6}}

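I want to turn this dictionary into a CSV table in which the top-level keys go in the left-hand column, the sub-keys form the header row, and each subsequent row is filled with the sub-key values, like this: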

Location    a   b   c
Location1   1   2   3
Location2   4   5   6

I got this working with the following script:

import csv

dic = {"Location1":{"a":1,"b":2,"c":3},"Location2":{"a":4,"b":5,"c":6}}
fields = ["Location","a","b","c"]

with open(r"C:\Users\tyler.cowan\Desktop\tabulated.csv", "w", newline='') as f:
    w = csv.DictWriter(f, extrasaction='ignore', fieldnames = fields)
    w.writeheader()
    for k in dic:
        w.writerow({field: dic[k].get(field) or k for field in fields})

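Interestingly, when I moved from this test case to my real data, the location key ended up spread into other columns. At first I assumed I had built the dictionary incorrectly, but on inspection it has exactly the same structure, just with more keys. The output, however, looks like this: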

Location    a   b   c   d           e   f   g   h
Location1   1   2   3   Location1   7   8   9   10
Location2   4   5   6   Location2   2   3   4   5

Here is my complete script.

# -*- coding: utf-8 -*-

import os
import csv


def pretty(d, indent=0):
    # Prettify a dict for visual inspection
    for key, value in d.items():
        print('\t' * indent + str(key))
        if isinstance(value, dict):
            pretty(value, indent + 1)
        else:
            if value == "":
                print("fubar")
            print('\t' * (indent + 1) + str(value))



inFolder = "Folder"
dirList = os.listdir(inFolder)

#print(dirList)
fields = [ 'Lat-Long']
allData = {}
for file in dirList:
    fname, ext = os.path.splitext(file)
    if fname not in fields:
        fields.append(fname)

    #handle .dat in this block
    if ext.lower() == ".dat":
        #print("found dat ext: " + str(ext))
        with open(os.path.join(inFolder,file), "r") as f:
            for row in f:
                try:
                    row1 = row.split(" ")
                    if str(row1[0])+"-"+str(row1[1]) not in allData:
                        allData[str(row1[0])+"-"+str(row1[1])] = {}
                    else:
                        allData[str(row1[0])+"-"+str(row1[1])][fname] = row1[2]

                except IndexError:
                    row2 = row.split("\t")
                    if str(row2[0])+"-"+str(row2[1]) not in allData:
                        allData[str(row2[0])+"-"+str(row2[1])] = {}
                    else:
                        allData[str(row2[0])+"-"+str(row2[1])][fname] = "NA"

    elif ext.lower() == ".csv":
        with open(os.path.join(inFolder,file), "r") as f:
            for row in f:
                row1 = row.split(",")
                if str(row1[0])+"-"+str(row1[1]) not in allData:
                    allData[str(row1[0])+"-"+str(row1[1])] = {}
                else:
                    allData[str(row1[0])+"-"+str(row1[1])][fname] = row1[2]



pretty(allData)

with open("testBS.csv", "w", newline='') as f:
    w = csv.DictWriter(f, extrasaction='ignore', fieldnames = fields)
    w.writeheader()
    for k in allData:
        w.writerow({field: allData[k].get(field) or k for field in fields})

The input data looks like this:

"example.dat"

32.1    101.3   65
32.1    101.3   66
32.1    101.3   67
32.1    101.3   68
32.1    101.3   69
32.1    101.3   70
32.1    101.3   71

I am hoping to find a way to diagnose and fix this behavior, because I cannot work out what differs between the test case and the real one.

- Tyler Cowan

1

If you have it available, I would recommend using pandas. - cs95

I do have pandas, but I would rather keep it as a fallback; I want to understand the original solution. - Tyler Cowan

2 Answers

2

You can use pandas for this.

import pandas as pd
dic = {"Location1":{"a":1,"b":2,"c":3},"Location2":{"a":4,"b":5,"c":6}, "Location3":{'e':7,'f':8, 'g':9, 'h':10}, "Location4":{'e': 2, 'f': 3, 'g': 4, 'h': 5}}
pd.DataFrame.from_dict(dic, orient='index').to_csv('temp.csv')

Output:

,a,b,c,e,f,g,h
Location1,1.0,2.0,3.0,,,,
Location2,4.0,5.0,6.0,,,,
Location3,,,,7.0,8.0,9.0,10.0
Location4,,,,2.0,3.0,4.0,5.0

- wcsit

- Ajax1234 · Accepted Answer

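One possible solution is to build a CSV header that contains the location column plus the full list of all sub-dictionary keys. That way, every sub-dictionary value can be written under its correct key column: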

import csv
dic = {"Location1":{"a":1,"b":2,"c":3},"Location2":{"a":4,"b":5,"c":6}, "Location3":{'e':7,'f':8, 'g':9, 'h':10}, "Location4":{'e': 2, 'f': 3, 'g': 4, 'h': 5}}
header = sorted(set(i for b in map(dict.keys, dic.values()) for i in b))
with open('filename.csv', 'w', newline="") as f:
    write = csv.writer(f)
    write.writerow(['location', *header])
    for a, b in dic.items():
        write.writerow([a] + [b.get(i, '') for i in header])

Output:

location,a,b,c,e,f,g,h
Location1,1,2,3,,,,
Location2,4,5,6,,,,
Location3,,,,7,8,9,10
Location4,,,,2,3,4,5
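
As for the diagnosis the question asks about: in the original script, `dic[k].get(field) or k` falls back to the location key whenever a sub-key is missing or its value is falsy (`None`, `0`, `""`), which would put the location under any column that a row lacks. The reading loops probably add to this, since they store a value only when the key already exists (the first row seen for each key just creates an empty dict) and every file name is appended to `fields`, whether or not the file was parsed. The sketch below first reproduces the fallback and then writes the table with `csv.DictWriter` and `restval`, so missing cells stay empty. The helper name `write_nested`, the `key_field` parameter, and the sample data are assumptions made for illustration, and the sketch assumes no sub-key is itself named "Location".

```python
import csv
import io


def write_nested(dic, out, key_field="Location"):
    """Write {row_key: {column: value}} as CSV, one row per row_key."""
    # Collect every sub-key once, keeping first-seen order.
    columns = []
    for sub in dic.values():
        for name in sub:
            if name not in columns:
                columns.append(name)
    # restval fills the columns a row lacks, so no `or` fallback is needed.
    w = csv.DictWriter(out, fieldnames=[key_field, *columns], restval="")
    w.writeheader()
    for key, sub in dic.items():
        w.writerow({key_field: key, **sub})


# Hypothetical sample data: "d" is missing from Location1 and is 0 in Location2.
dic = {"Location1": {"a": 1, "b": 2, "c": 3, "e": 7},
       "Location2": {"a": 4, "b": 5, "c": 6, "d": 0, "e": 2}}

# The question's expression: a missing or falsy value is replaced by the key.
row = {f: dic["Location1"].get(f) or "Location1" for f in ["Location", "a", "d"]}
print(row)  # {'Location': 'Location1', 'a': 1, 'd': 'Location1'}

# Write to an in-memory buffer instead of a file to keep the example self-contained.
buf = io.StringIO()
write_nested(dic, buf)
print(buf.getvalue())
```

To write to disk instead, pass a file opened with `open(path, "w", newline="")` as `out`. On the reading side, `allData.setdefault(key, {})[fname] = value` would record the first row for each key as well as the later ones.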