Catching errors in a Python for loop

4
I have a for loop over an avro data reader object.
for i in reader:
    print i

I then get a Unicode decode error in the for statement itself, so I want to skip that particular record. I tried the following:

try:
    for i in reader:
        print i
except:
    pass

But it does not continue past the failing record. How can I solve this?
Edit: added the traceback.
Traceback (most recent call last):
  File "modify.py", line 22, in <module>
    for record in reader:
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/datafile.py", line 362, in next
    datum = self.datum_reader.read(self.datum_decoder) 
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 445, in read
    return self.read_data(self.writers_schema, self.readers_schema, decoder)
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 490, in read_data
    return self.read_record(writers_schema, readers_schema, decoder)
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 690, in read_record
    field_val = self.read_data(field.type, readers_field.type, decoder)
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 468, in read_data
    return decoder.read_utf8()
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 233, in read_utf8
    return unicode(self.read_bytes(), "utf-8")
UnicodeDecodeError: 'utf8' codec can't decode byte 0xb4 in position 14: invalid start byte

Could this be because the file is corrupted?

Edit 2: following the suggestion in the answers, I changed the code and got this error.

Traceback (most recent call last):
  File "modify.py", line 28, in <module>
    print next(iterobject)["filepath"]
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/datafile.py", line 362, in next
    datum = self.datum_reader.read(self.datum_decoder) 
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 445, in read
    return self.read_data(self.writers_schema, self.readers_schema, decoder)
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 490, in read_data
    return self.read_record(writers_schema, readers_schema, decoder)
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 690, in read_record
    field_val = self.read_data(field.type, readers_field.type, decoder)
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 468, in read_data
    return decoder.read_utf8()
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 233, in read_utf8
    return unicode(self.read_bytes(), "utf-8")
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 226, in read_bytes
    return self.read(self.read_long())
  File "/usr/lib/python2.6/site-packages/avro-1.7.7-py2.6.egg/avro/io.py", line 184, in read_long
    b = ord(self.read(1))
TypeError: ord() expected a character, but string of length 0 found

1
Could you also include the exception traceback? And please fix the indentation of your code snippet. - Tanveer Alam
Could you post the values in your reader? - user4711157
3 Answers

4
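You need the try/except inside the loop:
for i in reader:
    try:
        print i
    except UnicodeEncodeError:
        pass

By the way, naming the specific error type you want to catch (as I did with except UnicodeEncodeError:) is good practice; otherwise your code can become very hard to debug!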
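To illustrate why the placement of the try/except matters, here is a minimal sketch written for Python 3 (the thread itself uses Python 2 syntax). It assumes the failure happens in the loop body, not in the reader's iteration step; the function names and the sample byte strings are made up for the example.

```python
def decode_all_outer(raw_records):
    """try/except around the whole loop: the first bad record ends the loop."""
    decoded = []
    try:
        for raw in raw_records:
            decoded.append(raw.decode("utf-8"))
    except UnicodeDecodeError:
        pass  # the exception has already unwound out of the for loop
    return decoded


def decode_all_inner(raw_records):
    """try/except inside the loop: only the bad record is skipped."""
    decoded = []
    for raw in raw_records:
        try:
            decoded.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            continue  # move on to the next record
    return decoded


# Hypothetical sample data: the second record is not valid UTF-8.
records = [b"first", b"\xb4 broken", b"third"]
print(decode_all_outer(records))
print(decode_all_inner(records))
```

With the outer version, everything after the bad record is lost; with the inner version only the bad record is dropped. Note that this only helps when the exception is raised by the body of the loop.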


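@iLoveCamelCase If so, then there is a bug in the avro reader, and unless you want to modify it (fix the bug), there is little you can do about it in your script. - user707650
I wonder whether the Avro file is corrupted. - iLoveCamelCase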

4
If your error occurs in the for i in statement itself, try this; it skips the elements for which the iterator raises UnicodeDecodeError.
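iterobject = iter(reader)
while True:
    try:
        print(next(iterobject))
    except StopIteration:
        # The iterator is exhausted, so stop looping.
        break
    except UnicodeDecodeError:
        # Skip the record that failed to decode and try the next one.
        pass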
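The same idea can be wrapped in a reusable generator so that the calling code keeps an ordinary for loop. This is only a sketch for Python 3: FlakyReader and skip_bad_records are hypothetical names invented here and are not part of the avro package. FlakyReader stands in for a reader whose iteration step can raise on one record and still produce the next one.

```python
class FlakyReader:
    """Hypothetical stand-in for a reader whose iteration step can raise."""

    def __init__(self, raw_records):
        self._raw = iter(raw_records)

    def __iter__(self):
        return self

    def __next__(self):
        raw = next(self._raw)       # StopIteration ends the iteration
        return raw.decode("utf-8")  # may raise UnicodeDecodeError


def skip_bad_records(iterable, errors=(UnicodeDecodeError,)):
    """Yield items from iterable, skipping those whose retrieval raises."""
    iterator = iter(iterable)
    while True:
        try:
            item = next(iterator)
        except StopIteration:
            return  # source exhausted
        except errors:
            continue  # drop this record and ask for the next one
        yield item


reader = FlakyReader([b"first", b"\xb4 broken", b"third"])
good = list(skip_bad_records(reader))
print(good)
```

This works only if the underlying iterator can keep going after it has raised. A generator is finished once an exception escapes from it, and a reader over a damaged binary file may have lost its position in the stream, which is possibly what the TypeError in Edit 2 of the question reflects.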

1

You can catch the specific error you expect, so that unknown errors are not silently ignored.

Python 3.x:

try:
    for i in reader:
        print(i)
except UnicodeDecodeError as ue:
    print(str(ue))

Python 2.x:

try:
    for i in reader:
        print i
except UnicodeDecodeError, ue:
    print(str(ue))

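By printing the error message you can see what happened. With a bare except you accept any exception (including an obscure RuntimeError) and never find out what went wrong. Occasionally that is useful, but it is dangerous and usually bad practice.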
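As a rough illustration of reporting the error while still catching only the expected type, the sketch below (Python 3) logs each skipped record and lets any other exception propagate. The function name decode_records, the logger name, and the sample data are assumptions made for the example.

```python
import logging

logging.basicConfig(level=logging.WARNING)
log = logging.getLogger("reader")  # hypothetical logger name


def decode_records(raw_records):
    """Decode records, logging and skipping only UnicodeDecodeError."""
    decoded, skipped = [], []
    for index, raw in enumerate(raw_records):
        try:
            decoded.append(raw.decode("utf-8"))
        except UnicodeDecodeError as err:
            # Record which item failed and why, then carry on.
            log.warning("skipping record %d: %s", index, err)
            skipped.append(index)
    return decoded, skipped


decoded, skipped = decode_records([b"ok", b"\xb4", b"fine"])
print(decoded, skipped)
```

Because only UnicodeDecodeError is caught, an unexpected problem (for example a None where bytes were expected) still surfaces as an exception and is not swallowed silently.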
