使用Python库`requests`时结果不可预测

3
为什么这段代码会出现问题:

import requests


response = requests.post('http://evds.tcmb.gov.tr/cgi-bin/famecgi', data={
    'cgi': '$ozetweb',
    'ARAVERIGRUP': 'bie_yymkpyuk.db',
    'DIL': 'UK',
    'ONDALIK': '5',
    'wfmultiple_selection': 'ZAMANSERILERI',
    'f_begdt': '07-01-2005',
    'f_enddt': '07-10-2016',
    'ZAMANSERILERI': ['TP.PYUK1', 'TP.PYUK2', 'TP.PYUK21', 'TP.PYUK22', 'TP.PYUK3', 'TP.PYUK4', 'TP.PYUK5', 'TP.PYUK6'],
    'YON': '3',
    'SUBMITDEG': 'Report',
    'GRTYPE': '1',
    'EPOSTA': 'xxx',
    'RESIMPOSTA': '***',
})

print(response.text)

在Python 2(2.7.12)和Python 3(3.5.2)中,使用requests==2.11.1库的相同API产生不同的结果?由于requests库支持两个Python版本,因此我认为结果应该是相同的。
预期结果是在Python 2中运行代码时获得的结果。它每次都有效。当在Python 3上运行时,服务器有时会返回错误,有时会正常工作。(这是有趣的部分。)
由于它在Python 2中有效,我认为错误必须发生在客户端。 Python 3处理编码或通过套接字发送数据的方式是否存在任何注意事项,我应该知道吗? 编辑: 在下面的评论中,一个人能够重现这个问题并确认存在这个问题。

1
什么版本的requests? - Padraic Cunningham
2
POST 通常会修改服务器上的数据。你为什么期望它每次都返回相同的结果呢? - wim
2
你可以尝试使用 from __future__ import unicode_literals,然后再次检查 Python2 的行为。因为如果不这样做,实际上请求并不相同(Python2 具有字节数据,而 Python3 具有字符串数据)。 - wim
2
请求的主体在编码等方面是相同的,除非由于某种原因顺序影响了输出,否则我看不出还有什么不同。 - Padraic Cunningham
2
我认为我们可能有线索了,运行这两个 http://pastebin.com/BRFfEKdE,在底部添加 print(good.text)print(bad.text),我敢打赌只有好的会输出你想要的结果。我认为顺序确实很重要,因为Python3和Python2中哈希算法的工作方式不同,这就是你看到差异的原因。 - Padraic Cunningham
显示剩余18条评论
1个回答

1

在python2和python3中,似乎存在不同之处,涉及到默认情况下启用哈希随机化,自python3.3以来,服务器需要至少将cgi字段放在最前面,以下代码可以复现:

good = requests.post('http://evds.tcmb.gov.tr/cgi-bin/famecgi', data=([
    ('cgi', '$ozetweb'),
    ('ARAVERIGRUP', 'bie_yymkpyuk.db'),
    ('DIL', 'UK'),
    ('ONDALIK', '5'),
    ('wfmultiple_selection', 'ZAMANSERILERI'),
    ('f_begdt', '07-01-2005'),
    ('f_enddt', '07-10-2016'),
    ('ZAMANSERILERI',
     ['TP.PYUK1', 'TP.PYUK2', 'TP.PYUK21', 'TP.PYUK22', 'TP.PYUK3', 'TP.PYUK4', 'TP.PYUK5', 'TP.PYUK6']),
    ('YON', '3'),
    ('SUBMITDEG', 'Report'),
    ('GRTYPE', '1'),
    ('EPOSTA', 'xxx'),
    ('RESIMPOSTA', '***')]))


bad = requests.post('http://evds.tcmb.gov.tr/cgi-bin/famecgi', data=([
    ('ARAVERIGRUP', 'bie_yymkpyuk.db'),
    ('cgi', '$ozetweb'),
    ('DIL', 'UK'),
    ('wfmultiple_selection', 'ZAMANSERILERI'),
    ('ONDALIK', '5'),
    ('f_begdt', '07-01-2005'),
    ('f_enddt', '07-10-2016'),
    ('ZAMANSERILERI',
     ['TP.PYUK1', 'TP.PYUK2', 'TP.PYUK21', 'TP.PYUK22', 'TP.PYUK3', 'TP.PYUK4', 'TP.PYUK5', 'TP.PYUK6']),
    ('YON', '3'),
    ('SUBMITDEG', 'Report'),
    ('GRTYPE', '1'),
    ('EPOSTA', 'xxx'),
    ('RESIMPOSTA', '***')]))

使用Python2运行上述代码:
In [6]: print(good.request.body)
   ...: print(bad.request.body)
   ...: 
   ...: print(len(good.text), len(bad.text))
   ...: 
cgi=%24ozetweb&ARAVERIGRUP=bie_yymkpyuk.db&DIL=UK&ONDALIK=5&wfmultiple_selection=ZAMANSERILERI&f_begdt=07-01-2005&f_enddt=07-10-2016&ZAMANSERILERI=TP.PYUK1&ZAMANSERILERI=TP.PYUK2&ZAMANSERILERI=TP.PYUK21&ZAMANSERILERI=TP.PYUK22&ZAMANSERILERI=TP.PYUK3&ZAMANSERILERI=TP.PYUK4&ZAMANSERILERI=TP.PYUK5&ZAMANSERILERI=TP.PYUK6&YON=3&SUBMITDEG=Report&GRTYPE=1&EPOSTA=xxx&RESIMPOSTA=%2A%2A%2A
ARAVERIGRUP=bie_yymkpyuk.db&cgi=%24ozetweb&DIL=UK&wfmultiple_selection=ZAMANSERILERI&ONDALIK=5&f_begdt=07-01-2005&f_enddt=07-10-2016&ZAMANSERILERI=TP.PYUK1&ZAMANSERILERI=TP.PYUK2&ZAMANSERILERI=TP.PYUK21&ZAMANSERILERI=TP.PYUK22&ZAMANSERILERI=TP.PYUK3&ZAMANSERILERI=TP.PYUK4&ZAMANSERILERI=TP.PYUK5&ZAMANSERILERI=TP.PYUK6&YON=3&SUBMITDEG=Report&GRTYPE=1&EPOSTA=xxx&RESIMPOSTA=%2A%2A%2A
(71299, 134)

还有 Python3:

In [4]: print(good.request.body)
   ...: print(bad.request.body)
   ...: 
   ...: print(len(good.text), len(bad.text))
   ...: 
cgi=%24ozetweb&ARAVERIGRUP=bie_yymkpyuk.db&DIL=UK&ONDALIK=5&wfmultiple_selection=ZAMANSERILERI&f_begdt=07-01-2005&f_enddt=07-10-2016&ZAMANSERILERI=TP.PYUK1&ZAMANSERILERI=TP.PYUK2&ZAMANSERILERI=TP.PYUK21&ZAMANSERILERI=TP.PYUK22&ZAMANSERILERI=TP.PYUK3&ZAMANSERILERI=TP.PYUK4&ZAMANSERILERI=TP.PYUK5&ZAMANSERILERI=TP.PYUK6&YON=3&SUBMITDEG=Report&GRTYPE=1&EPOSTA=xxx&RESIMPOSTA=%2A%2A%2A
ARAVERIGRUP=bie_yymkpyuk.db&cgi=%24ozetweb&DIL=UK&wfmultiple_selection=ZAMANSERILERI&ONDALIK=5&f_begdt=07-01-2005&f_enddt=07-10-2016&ZAMANSERILERI=TP.PYUK1&ZAMANSERILERI=TP.PYUK2&ZAMANSERILERI=TP.PYUK21&ZAMANSERILERI=TP.PYUK22&ZAMANSERILERI=TP.PYUK3&ZAMANSERILERI=TP.PYUK4&ZAMANSERILERI=TP.PYUK5&ZAMANSERILERI=TP.PYUK6&YON=3&SUBMITDEG=Report&GRTYPE=1&EPOSTA=xxx&RESIMPOSTA=%2A%2A%2A
71299 134

将您的字典作为Python2中发布的方式传递:

In [4]: response.request.body
Out[4]: 'cgi=%24ozetweb&DIL=UK&f_enddt=07-10-2016&YON=3&RESIMPOSTA=%2A%2A%2A&wfmultiple_selection=ZAMANSERILERI&ARAVERIGRUP=bie_yymkpyuk.db&GRTYPE=1&SUBMITDEG=Report&f_begdt=07-01-2005&ZAMANSERILERI=TP.PYUK1&ZAMANSERILERI=TP.PYUK2&ZAMANSERILERI=TP.PYUK21&ZAMANSERILERI=TP.PYUK22&ZAMANSERILERI=TP.PYUK3&ZAMANSERILERI=TP.PYUK4&ZAMANSERILERI=TP.PYUK5&ZAMANSERILERI=TP.PYUK6&ONDALIK=5&EPOSTA=xxx'

In [5]: len(response.text)
Out[5]: 71299

同样的字典在Python3中:

In [3]: response.request.body
Out[3]: 'EPOSTA=xxx&ARAVERIGRUP=bie_yymkpyuk.db&DIL=UK&SUBMITDEG=Report&cgi=%24ozetweb&GRTYPE=1&f_enddt=07-10-2016&wfmultiple_selection=ZAMANSERILERI&ONDALIK=5&f_begdt=07-01-2005&RESIMPOSTA=%2A%2A%2A&YON=3&ZAMANSERILERI=TP.PYUK1&ZAMANSERILERI=TP.PYUK2&ZAMANSERILERI=TP.PYUK21&ZAMANSERILERI=TP.PYUK22&ZAMANSERILERI=TP.PYUK3&ZAMANSERILERI=TP.PYUK4&ZAMANSERILERI=TP.PYUK5&ZAMANSERILERI=TP.PYUK6'

In [4]: len(response.text)
Out[4]: 134

在启动另一个ipython2 shell之前运行~$ export PYTHONHASHSEED=1234

In [4]: response.request.body
Out[4]: 'DIL=UK&GRTYPE=1&ARAVERIGRUP=bie_yymkpyuk.db&f_begdt=07-01-2005&RESIMPOSTA=%2A%2A%2A&ONDALIK=5&EPOSTA=xxx&YON=3&SUBMITDEG=Report&wfmultiple_selection=ZAMANSERILERI&cgi=%24ozetweb&ZAMANSERILERI=TP.PYUK1&ZAMANSERILERI=TP.PYUK2&ZAMANSERILERI=TP.PYUK21&ZAMANSERILERI=TP.PYUK22&ZAMANSERILERI=TP.PYUK3&ZAMANSERILERI=TP.PYUK4&ZAMANSERILERI=TP.PYUK5&ZAMANSERILERI=TP.PYUK6&f_enddt=07-10-2016'

In [5]: os.environ["PYTHONHASHSEED"]
Out[5]: '1234'
In [6]: len(response.text)
Out[6]: 134

你可以运行代码多次以达到相同的目的,但是一定要让 ('cgi', '$ozetweb') 首先出现才能使代码正常工作,有时候使用 Python3 时由于键的顺序不确定,cgi 会首先出现从而让代码偶尔能够正常工作。关于哈希主题还有更多内容,请参考 hashing topic

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接