KeyError when reading a CSV file and converting columns to arrays

5

I have a sample csv file named 'r2.csv':

Factory | Product_Number |   Date     | Avg_Noshow | Walk_Cost | Room_Rev
-------------------------------------------------------------------------
   A    |      1         | 01APR2017  |   5.6      |  125      |  275
-------------------------------------------------------------------------
   A    |      1         | 02APR2017  |   4.5      |  200      |  300
-------------------------------------------------------------------------
   A    |      1         | 03APR2017  |   6.6      |  150      |  250
-------------------------------------------------------------------------
   A    |      1         | 04APR2017  |   7.5      |  175      |  325
-------------------------------------------------------------------------

I have the following Python code to read the CSV file and convert the columns to arrays:

# Read csv file
import csv
with open('r2.csv', 'r') as infile:
    reader = csv.DictReader(infile)
    data = {}
    for row in reader:
        for header, value in row.items():
            try:
                data[header].append(value)
            except KeyError:
                data[header] = [value]

# Transfer the column from list to arrays for later computation.

mu = data['Avg_Noshow']
cs = data['Walk_Cost']
co = data['Room_Rev']

mu = map(float,mu)
cs = map(float,cs)
co = map(float,co)

It runs fine except for the last column lookup, which fails with the following error message:

File "<stdin>", line 1, in <module>
  KeyError: 'Room_Rev'

How can I avoid it?


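At first glance, shouldn't it be using comma-separated values rather than pipes? - Garrett Kadillak
@GarrettKadillak Do you mean the CSV file? It is a comma-separated values file. - Chenxi
2 Answers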

1
I only used the first two rows of your CSV file, but this gives you the output you want.
import csv

with open('r2.csv', 'r') as fin:
    reader = csv.DictReader(fin)
    data = {}
    for row in reader:
        for k, v in row.items():
            # Append to the column's list, creating it on first use.
            data.setdefault(k, []).append(v)

And this returns:

{'Avg_Noshow': ['5.6', '4.5'],
 'Date': ['1-Apr-17', '2-Apr-17'],
 'Factory': ['A', 'A'],
 'Product_Number': ['1', '1'],
 'Room_Rev': ['275', '300'],
 'Walk_Cost': ['125', '200']}

Thanks @Dmitry Polonskiy, I tried this code, but it returns the same error message I posted. - Chenxi
Thanks! The error came from my input data. - Chenxi

0

I can't reproduce the problem with this cleaned-up version of your code:

# Read csv file
import csv
with open('r2.csv', 'r') as infile:
    reader = csv.DictReader(infile)
    data = {}
    for row in reader:
        print('row: {}'.format(row))
        for header, value in row.items():
            try:
                data[header].append(value)
            except KeyError:
                data[header] = [value]

print('')
from pprint import pprint
pprint(data)

# Transfer the column from list to arrays for later computation.

mu = data['Avg_Noshow']
cs = data['Walk_Cost']
co = data['Room_Rev']

mu = map(float, mu)
cs = map(float, cs)
co = map(float, co)

Here is the printed output it produces:

 row: {'Walk_Cost': '125', 'Factory': 'A', 'Avg_Noshow': '5.6', 'Product_Number': '1', 'Date': '01APR2017', 'Room_Rev': '275'}
 row: {'Walk_Cost': '200', 'Factory': 'A', 'Avg_Noshow': '4.5', 'Product_Number': '1', 'Date': '02APR2017', 'Room_Rev': '300'}
 row: {'Walk_Cost': '150', 'Factory': 'A', 'Avg_Noshow': '6.6', 'Product_Number': '1', 'Date': '03APR2017', 'Room_Rev': '250'}
 row: {'Walk_Cost': '175', 'Factory': 'A', 'Avg_Noshow': '7.5', 'Product_Number': '1', 'Date': '04APR2017', 'Room_Rev': '325'}

 {'Avg_Noshow': ['5.6', '4.5', '6.6', '7.5'],
  'Date': ['01APR2017', '02APR2017', '03APR2017', '04APR2017'],
  'Factory': ['A', 'A', 'A', 'A'],
  'Product_Number': ['1', '1', '1', '1'],
  'Room_Rev': ['275', '300', '250', '325'],
  'Walk_Cost': ['125', '200', '150', '175']}

This is the r2.csv test file I created and used, since you didn't provide one:

Factory,Product_Number,Date,Avg_Noshow,Walk_Cost,Room_Rev
A,1,01APR2017,5.6,125,275
A,1,02APR2017,4.5,200,300
A,1,03APR2017,6.6,150,250
A,1,04APR2017,7.5,175,325
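
One note on the conversion step: on Python 3, `map(float, mu)` returns a lazy iterator rather than a list, so an explicit list may be needed if the values are indexed or reused later. A minimal sketch, assuming the same `data` dictionary as above (the literal values are copied from the printed output, and `to_floats` is a hypothetical helper name, not part of the original code):

```python
# Column data as produced by the reading loop (strings taken from the CSV).
data = {
    'Avg_Noshow': ['5.6', '4.5', '6.6', '7.5'],
    'Walk_Cost': ['125', '200', '150', '175'],
    'Room_Rev': ['275', '300', '250', '325'],
}


def to_floats(column):
    """Convert a list of numeric strings to a list of floats."""
    return [float(value) for value in column]


mu = to_floats(data['Avg_Noshow'])
cs = to_floats(data['Walk_Cost'])
co = to_floats(data['Room_Rev'])

print(mu)
```

Unlike a `map` object on Python 3, these lists can be indexed and iterated more than once.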

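Thanks @martineau. With the diagnostic output you provided, I found that I had typed an extra space in the column name "Room_rev". It is solved now. - Chenxi
Chenxi: You're welcome. Although sometimes ridiculed, printing things at key points in your code is a very old and underrated debugging technique. It suits Python especially well because there is no lengthy compile-and-link step to go through. - martineau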
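
For anyone hitting the same KeyError, here is a rough sketch of one way to spot and tolerate stray whitespace in header names. The in-memory `sample` text is a made-up stand-in for the real file, with a trailing space deliberately added after `Room_Rev`; stripping every key and value is an assumption about what the data should look like, not something the original code did.

```python
import csv
import io

# Hypothetical sample with a stray trailing space in the last header name.
sample = (
    "Factory,Product_Number,Date,Avg_Noshow,Walk_Cost,Room_Rev \n"
    "A,1,01APR2017,5.6,125,275\n"
    "A,1,02APR2017,4.5,200,300\n"
)

reader = csv.DictReader(io.StringIO(sample))

# repr() makes otherwise invisible whitespace in the header names visible.
print([repr(name) for name in reader.fieldnames])

data = {}
for row in reader:
    for header, value in row.items():
        # Strip surrounding whitespace so 'Room_Rev ' is stored as 'Room_Rev'.
        data.setdefault(header.strip(), []).append(value.strip())

co = [float(v) for v in data['Room_Rev']]
print(co)
```

Printing the header names with `repr()` is the same print-based debugging idea described above, applied to the keys rather than the rows.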
