UnboundLocalError when catching exceptions

Question

7

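I wrote a crawler that fetches information from a Q&A website. Since not every field is always present on a page, I used several try-except blocks to handle the missing ones.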

def answerContentExtractor( loginSession, questionLinkQueue , answerContentList) :
    while True:
        URL = questionLinkQueue.get()
        try:
            response   = loginSession.get(URL,timeout = MAX_WAIT_TIME)
            raw_data   = response.text

            #These fields must exist, or something went wrong...
            questionId = re.findall(REGEX,raw_data)[0]
            answerId   = re.findall(REGEX,raw_data)[0]
            title      = re.findall(REGEX,raw_data)[0]

        except requests.exceptions.Timeout ,IndexError:
            print >> sys.stderr, URL + " extraction error..."
            questionLinkQueue.task_done()
            continue

        try:
            questionInfo = re.findall(REGEX,raw_data)[0]
        except IndexError:
            questionInfo = ""

        try:
            answerContent = re.findall(REGEX,raw_data)[0]
        except IndexError:
            answerContent = ""

        result = {
                  'questionId'   : questionId,
                  'answerId'     : answerId,
                  'title'        : title,
                  'questionInfo' : questionInfo,
                  'answerContent': answerContent
                  }

        answerContentList.append(result)
        questionLinkQueue.task_done()

Sometimes, running this code raises the following exception:

UnboundLocalError: local variable 'IndexError' referenced before assignment

The line number shows that the error occurs at the second except IndexError:.

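Thanks for the suggestions, everyone. I'd love to give all of you the credit you deserve, but unfortunately I can only mark one answer as correct...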

- Paul Liang

Typo: I had typed in some unnecessary lines by hand. Edited. - Paul Liang

1

Related: Catch multiple exceptions in one line (except block) - thefourtheye

The specific problem here is unique to 2.x, because in 3.x you must use the as keyword to bind the caught exception to a name. - Karl Knechtel

3 Answers

2

When you say

except requests.exceptions.Timeout ,IndexError:

Python catches only requests.exceptions.Timeout, and the caught exception object is bound to the name IndexError. It should have been something like this:

except (requests.exceptions.Timeout ,IndexError) as e:

- thefourtheye

1

except requests.exceptions.Timeout ,IndexError:

means the same as except requests.exceptions.Timeout as IndexError.

You should use

except (requests.exceptions.Timeout, IndexError):

instead.

- Kimvais


- Ashwini Chaudhary · Accepted Answer

In Python 2.x, the line

except requests.exceptions.Timeout, IndexError:

is equivalent to

except requests.exceptions.Timeout as IndexError:

So the exception caught by requests.exceptions.Timeout is assigned to the name IndexError. A simpler example:

try:
    true
except NameError, IndexError:
    print IndexError
    #name 'true' is not defined

To catch multiple exceptions, put the names in parentheses:

except (requests.exceptions.Timeout, IndexError):
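For readers on Python 3, the same idea can be sketched as follows. This is only an illustration: first_or_default is a hypothetical helper, not part of the question's code, and IndexError/TypeError stand in for whichever exceptions you actually need to handle together.

```python
def first_or_default(seq, default=""):
    """Return seq[0], or `default` if the lookup fails in an expected way."""
    try:
        return seq[0]
    except (IndexError, TypeError) as exc:
        # The tuple lists the exception types to catch; `exc` is the caught
        # instance. The builtin name IndexError is left untouched.
        return default


print(first_or_default(["title"]))  # element exists
print(first_or_default([]))         # IndexError is caught
print(first_or_default(None))       # TypeError is caught
```

Because the exception types are grouped in a tuple and the instance is bound with as, no builtin name gets rebound inside the function.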

Later on, an UnboundLocalError can occur because the assignment to IndexError makes it a local variable (shadowing the builtin name):

>>> 'IndexError' in answerContentExtractor.func_code.co_varnames
True

So if requests.exceptions.Timeout is never raised, the local name IndexError has not been (mis)assigned by the time the code reaches except IndexError:, and looking it up fails.
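The same check can be done in Python 3, where the code object is reached through __code__ rather than func_code. A minimal sketch, with shadowing and plain as hypothetical function names:

```python
def shadowing():
    try:
        pass
    except ValueError as IndexError:  # makes IndexError a local of this function
        pass


def plain():
    try:
        pass
    except (ValueError, IndexError):  # only reads the builtin name
        pass


# co_varnames lists the names the compiler treats as local variables.
print('IndexError' in shadowing.__code__.co_varnames)
print('IndexError' in plain.__code__.co_varnames)
```

The name is classified as local at compile time, whether or not the except block ever runs, which is why the later lookup never falls back to the builtin.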

Another simple example:

def func():
    try:
        func # defined, so the except block doesn't run,
    except NameError, IndexError: # so the local `IndexError` isn't assigned
        pass
    try:
        [][1]
    except IndexError:
        pass
func()
#UnboundLocalError: local variable 'IndexError' referenced before assignment

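In 3.x, the problem occurs even if the first exception is caught (once the except syntax has been fixed to use as, which makes the mistake more obvious). This is because the local name IndexError is explicitly deleted (del) when the first except block finishes.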
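A rough Python 3 illustration of that last point, assuming nothing beyond the standard library (shadowed and run_shadowed are made-up names for this sketch):

```python
def shadowed():
    try:
        int("not a number")             # raises ValueError
    except ValueError as IndexError:    # binds the local name IndexError...
        pass                            # ...which is deleted when this block ends
    try:
        [][1]                           # raises the builtin IndexError
    except IndexError:                  # looks up the local name, now unbound
        return "caught"


def run_shadowed():
    """Call shadowed() and report which exception type escapes, if any."""
    try:
        return shadowed()
    except UnboundLocalError as exc:
        return type(exc).__name__


print(run_shadowed())
```

Here the first handler does run, yet the second except IndexError: still fails, because the name was unbound again when the first handler exited.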