'ascii' codec can't decode byte 0xef

Question

I'm getting an annoying error on this line:

    r += '\n<Placemark><name>'+row[3].encode('utf-8','xmlcharrefreplace')+'</name>' \
         '\n<description>'+desc.encode('utf-8','xmlcharrefreplace')+'</description>\n' \
         '<Point><coordinates>'+row[clat].encode('utf-8','xmlcharrefreplace')+ \
         ','+row[clongitude].encode('utf-8','xmlcharrefreplace')+'</coordinates></Point>\n' \
         '<address>'+row[4].encode('utf-8','xmlcharrefreplace')+'</address>\n' \
         '<styleUrl>'+row[cstyleID].encode('utf-8','xmlcharrefreplace')+'</styleUrl>\n' \
         '</Placemark>'

Here is the error:

Traceback (most recent call last):
  File "<pyshell#38>", line 1, in <module>
    doStuff()
  File "C:\Python27\work\GenerateKML.py", line 5, in doStuff
    createFiles('together.csv')
  File "C:\Python27\work\GenerateKML.py", line 55, in createFiles
    '<styleUrl>'+row[cstyleID].encode('utf-8','xmlcharrefreplace')+'</styleUrl>\n' \
UnicodeDecodeError: 'ascii' codec can't decode byte 0xef in position 60: ordinal not in range(128)

What am I doing wrong?

Thanks for your help.

Here is the complete source code:

import hashlib
import csv

def doStuff():
  createFiles('together.csv')

def readFile(fileName):
  a=open(fileName)
  fileContents=a.read()
  a.close()
  return fileContents

def readCSVFile(fileName):
  return list(csv.reader(open(fileName, 'rb'), delimiter=',', quotechar='"'))

def GetDistinctValues(theFile, theColumn):
    with open(theFile, "rb") as fp:
        reader = csv.reader(fp)
        return list(set(line[theColumn] for line in reader))

def createFiles(inputFile):
  cNAME=3
  clat=0
  clongitude=1
  caddress1=4
  caddress2=5
  cplace=6
  ccity=7
  cstate=8
  czip=9
  cphone=10
  cwebsite=11
  cstyleID=18
  inputFileText=readCSVFile(inputFile)
  headerFile = readFile('header.txt')
  footerFile = readFile('footer.txt')
  r=headerFile
  DISTINCTCOLUMN=12
  dValues = GetDistinctValues(inputFile,DISTINCTCOLUMN)
  counter=0

  for uniqueValue in dValues:
    counter+=1
    print uniqueValue
    theHash=hashlib.sha224(uniqueValue).hexdigest()
    for row in inputFileText:
      if uniqueValue==row[DISTINCTCOLUMN]:
        for eachElement in row:
          eachElement=eachElement.replace('&','&amp;')            
        desc = ' '.join(row[3:])
        r += '\n<Placemark><name>'+row[3].encode('utf-8','xmlcharrefreplace')+'</name>' \
             '\n<description>'+desc.encode('utf-8','xmlcharrefreplace')+'</description>\n' \
             '<Point><coordinates>'+row[clat].encode('utf-8','xmlcharrefreplace')+ \
             ','+row[clongitude].encode('utf-8','xmlcharrefreplace')+'</coordinates></Point>\n' \
             '<address>'+row[4].encode('utf-8','xmlcharrefreplace')+'</address>\n' \
             '<styleUrl>'+row[cstyleID].encode('utf-8','xmlcharrefreplace')+'</styleUrl>\n' \
             '</Placemark>'      
    r += footerFile

    f = open(theHash+'.kml','w')
    f.write(r)
    f.close()
    r=headerFile

- Alex Gordon

The for eachElement in row loop doesn't change row. - jfs

Check row before and after the loop. Note that row stays unchanged even if it contains &. - jfs
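The point in the two comments above can be checked with a small sketch. The sample row below is hypothetical; the list comprehension is one possible way to keep the replacements, not necessarily what the commenter had in mind.

```python
# Hypothetical sample row; the real rows come from together.csv.
row = ['Tom & Jerry', 'A&B']

# Assigning to the loop variable only rebinds the name each_element;
# the strings stored in the list are not replaced.
for each_element in row:
    each_element = each_element.replace('&', '&amp;')
unchanged = list(row)

# Building a new list keeps the escaped values.
escaped_row = [element.replace('&', '&amp;') for element in row]
```

After this runs, unchanged still holds the original strings, while escaped_row holds the escaped ones.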

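Have you tried printing row[cstyleID]? Could there be non-ASCII characters in it? (If the exception is a UnicodeDecodeError, it has to involve Unicode or ASCII somehow, right?) You should make sure the value is a Unicode string. - Pierre GM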

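For what it's worth, that line of code is too long. If you had split it into several shorter lines, you might already have found the problem. Also, concatenating strings like x + y + z + a + b + c is O(n^2) in some Python interpreters; it is better to put the pieces in a list and then use ''.join(list_). - dstromberg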
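A minimal sketch of the ''.join(list_) suggestion, using hypothetical field values in place of the CSV columns. Each piece sits on its own line, so a traceback would point closer to the offending value.

```python
# Hypothetical field values standing in for one CSV row.
name, description, address = 'Cafe', 'Open daily', '1 Main St'

# Collect the fragments in a list, then join them once at the end.
parts = [
    '\n<Placemark><name>', name, '</name>',
    '\n<description>', description, '</description>\n',
    '<address>', address, '</address>\n',
    '</Placemark>',
]
placemark = ''.join(parts)
```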

1 Answer

- jfs · Accepted Answer

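You are probably trying to encode bytes (in Python 2, csv.reader returns byte strings). In that case, Python first decodes the bytes using the default encoding (ASCII) and then encodes the resulting Unicode string using the encoding you provided. The implicit decoding step is what raises the UnicodeDecodeError.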
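The hidden decoding step can be written out explicitly. This is a sketch with a hypothetical byte string that contains 0xef; it performs by hand the ASCII decode that Python 2 attempts implicitly when encode() is called on a byte string.

```python
# Hypothetical UTF-8 bytes: 0xef 0xbc 0xa1 encodes U+FF21.
raw = b'\xef\xbc\xa1 style'

# This is the step Python 2 performs behind the scenes before encoding.
try:
    raw.decode('ascii')
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)
```

Here error_message has the same shape as the message in the traceback, with the position pointing at the first non-ASCII byte. In Python 3, bytes objects have no encode() method at all, so the mistake surfaces as an AttributeError instead.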

The solution: don't encode bytes, i.e., call encode() only on Unicode strings. In your case, don't use it at all.
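As an illustration of calling encode() only on Unicode strings, the sketch below decodes a hypothetical byte string once and then encodes the resulting text. It assumes the input is UTF-8, which the question does not state.

```python
# Hypothetical UTF-8 input bytes (assumed encoding).
raw = b'\xef\xbc\xa1 style'

# Decode once, explicitly, with the encoding the data really uses.
text = raw.decode('utf-8')

# Encoding text works; UTF-8 can represent the character directly,
# so the 'xmlcharrefreplace' handler would have nothing to replace.
as_utf8 = text.encode('utf-8')

# With an ASCII target, the handler emits a numeric character reference.
as_ascii_xml = text.encode('ascii', 'xmlcharrefreplace')
```

Since the bytes read from the CSV file are already encoded, writing them to the output file unchanged is enough for the question's script, which is why the answer recommends dropping encode() entirely.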

To create a valid XML document, you could use the xml.etree.ElementTree module.
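A minimal sketch of that suggestion follows. The helper build_placemark and its sample arguments are hypothetical, and the element layout only mirrors part of the question's string template; it is not a complete KML writer. ElementTree escapes characters such as & itself, so no manual replace('&', '&amp;') is needed.

```python
import xml.etree.ElementTree as ET


def build_placemark(name, description, lat, lon):
    """Build one Placemark element (hypothetical helper for illustration)."""
    placemark = ET.Element('Placemark')
    ET.SubElement(placemark, 'name').text = name
    ET.SubElement(placemark, 'description').text = description
    point = ET.SubElement(placemark, 'Point')
    # Same coordinate order as the question's code.
    ET.SubElement(point, 'coordinates').text = lat + ',' + lon
    return placemark


# Unicode text in, encoded bytes out; escaping is handled by ElementTree.
placemark = build_placemark(u'Tom & Jerry', u'Caf\xe9', u'40.0', u'-75.0')
xml_bytes = ET.tostring(placemark, encoding='utf-8')
```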