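I'm porting a bash script that uses curl to send payloads to URLs, and I've hit a problem. After logging in to the site with RoboBrowser, POSTing with the page's own form does not work.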
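The steps are:
- Log in at /SubLogin.aspx
- On success, get redirected to /OptionsSummary.aspx
- GET /FindMe.aspx with parameters
- Click the "Phone Lists" button (this should load the "Phone Lists" table containing the item "Work"): POST /FindMe.aspx
- Selecting the "Work" item performs a POST to /PhoneLists.aspx (this should load the list of users named "Work")
I have successfully authenticated and performed GETs with both RoboBrowser and Requests+bs4, but I'm confused about how to POST back to the page itself.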
Using RoboBrowser (liboncall.py):
#!/usr/bin/python
from robobrowser import RoboBrowser
from bs4 import BeautifulSoup as BS

oc_mailbox = '123456'
oc_password_hashed = 'ABCDEFG'
oc_base_uri = 'http://example.com'
auth_uri = oc_base_uri + '/SubLogin.aspx'
find_uri = oc_base_uri + '/FindMe.aspx'
phne_uri = oc_base_uri + '/PhoneLists.aspx'

# Payload for the login POST to /SubLogin.aspx
p_auth_payload = {
    'SubLoginControl:javascriptTest': 'true',
    'SubLoginControl:mailbox': oc_mailbox,
    'SubLoginControl:phoneNumber': '',
    'SubLoginControl:password': oc_password_hashed,
    'SubLoginControl:btnLogOn': 'Logon',
    'SubLoginControl:webLanguage': 'en-US',
    'SubLoginControl:initialLanguage': 'en-US',
    'SubLoginControl:errorCallBackNumber': 'Entered telephone number contains non-dialable characters.',
    'SubLoginControl:cookieMailbox': 'mailbox',
    'SubLoginControl:cookieCallbackNumber': 'callbackNumber',
    'SubLoginControl:serverDomain': ''
}

# Payload for the "Phone Lists" button POST to /FindMe.aspx
p_find_payload = {
    'FindMeControl:enableFindMe': 'on',
    'FindMeControl:MasterDataControl:focusElement': '',
    'FindMeControl:MasterDataControl:masterList:_ctl0:enabled': 'on',
    'FindMeControl:MasterDataControl:masterList:_ctl0:itemGuid': '',
    'FindMeControl:MasterDataControl:hidSelectedScheduleName': '',
    'FindMeControl:MasterDataControl:hidbtnStatus': '',
    'FindMeControl:MasterDataControl:hidScheduleXML': '',
    'FindMeControl:MasterDataControl:tempScheduleXML': '',
    'FindMeControl:MasterDataControl:hidSelectedScheduleGUID': '',
    'FindMeControl:MasterDataControl:hidChangedScheduleList': '',
    'FindMeControl:btnPhoneLists': 'Phone Lists',
    'FindMeControl:enableFindMeHidden': '',
    'FindMeControl:applySet': 'false'
}

# Payload for selecting the "Work" item: POST to /PhoneLists.aspx
p_phne_payload = {
    '__EVENTARGUMENT': '',
    '__EVENTTARGET': 'PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton',
    'PhoneListsControl:MasterDataControl:focusElement': '',
    'PhoneListsControl:MasterDataControl:masterList:_ctl0:itemGuid': '',
    'PhoneListsControl:MasterDataControl:hidSelectedScheduleName': '',
    'PhoneListsControl:MasterDataControl:hidbtnStatus': '',
    'PhoneListsControl:MasterDataControl:hidScheduleXML': '',
    'PhoneListsControl:MasterDataControl:tempScheduleXML': '',
    'PhoneListsControl:MasterDataControl:hidSelectedScheduleGUID': '',
    'PhoneListsControl:MasterDataControl:hidChangedScheduleList': '',
    'PhoneListsControl:applySet': 'false'
}

def auth(mailbox, password):
    # Open the login page, fill in the ASP.NET login form, and submit it.
    browser = RoboBrowser(history=False)
    browser.open(auth_uri)
    signin = browser.get_form(id='aspnetForm')
    signin['SubLoginControl:mailbox'].value = mailbox
    signin['SubLoginControl:password'].value = password
    signin['SubLoginControl:javascriptTest'].value = 'true'
    signin['SubLoginControl:btnLogOn'].value = 'Logon'
    signin['SubLoginControl:webLanguage'].value = 'en-US'
    signin['SubLoginControl:initialLanguage'].value = 'en-US'
    signin['SubLoginControl:errorCallBackNumber'].value = 'Entered+telephone+number+contains+non-dialable+characters.'
    signin['SubLoginControl:cookieMailbox'].value = 'mailbox'
    signin['SubLoginControl:cookieCallbackNumber'].value = 'callbackNumber'
    signin['SubLoginControl:serverDomain'].value = ''
    browser.submit_form(signin)
    # Return the browser so the caller can reuse the authenticated session.
    return browser

Log in to the site and display the URL to verify that we are in:
In [20]: from liboncall import *
In [21]: m = auth(oc_mailbox, oc_password_hashed)
In [22]: m.url
Out[22]: u'http://example.com/OptionsSummary.aspx'
Open "/FindMe.aspx":
In [24]: m.open(find_uri)
In [25]: m.url
Out[25]: u'http://example.com/FindMe.aspx'
Initially, "/FindMe.aspx" loads a form and a button named "Phone Lists" (FindMeControl:btnPhoneLists).
In [26]: m.select('title')
Out[26]: [<title>Find Me</title>]
In [27]: form_find_a = m.get_form(action="FindMe.aspx")
In [28]: for i in form_find_a.keys():
   ....:     print(i)
   ....:
__VIEWSTATE
__EVENTVALIDATION
FindMeControl:enableFindMe
FindMeControl:MasterDataControl:focusElement
FindMeControl:MasterDataControl:masterList:_ctl0:enabled
FindMeControl:MasterDataControl:masterList:_ctl0:itemGuid
FindMeControl:MasterDataControl:btnAdd
FindMeControl:MasterDataControl:btnDelete
FindMeControl:MasterDataControl:btnRename
FindMeControl:MasterDataControl:btnCancel
FindMeControl:MasterDataControl:btnEnter
FindMeControl:MasterDataControl:btnUpdate
FindMeControl:MasterDataControl:hidSelectedScheduleName
FindMeControl:MasterDataControl:hidbtnStatus
FindMeControl:MasterDataControl:hidScheduleXML
FindMeControl:MasterDataControl:tempScheduleXML
FindMeControl:MasterDataControl:hidSelectedScheduleGUID
FindMeControl:MasterDataControl:hidChangedScheduleList
FindMeControl:btnApply
FindMeControl:btnSchedules
FindMeControl:btnPhoneLists
FindMeControl:enableFindMeHidden
FindMeControl:applySet
Remove the unnecessary form fields, fill in the form, and submit it:
In [29]: find_remove = (
'FindMeControl:MasterDataControl:btnAdd',
'FindMeControl:MasterDataControl:btnDelete',
'FindMeControl:MasterDataControl:btnRename',
'FindMeControl:MasterDataControl:btnCancel',
'FindMeControl:MasterDataControl:btnEnter',
'FindMeControl:MasterDataControl:btnUpdate',
'FindMeControl:btnApply',
'FindMeControl:btnSchedules')
In [30]: for i in find_remove:
   ....:     form_find_a.fields.pop(i)
In [31]: form_find_a['FindMeControl:enableFindMe'].value = 'on'
form_find_a['FindMeControl:MasterDataControl:focusElement'].value = ''
form_find_a['FindMeControl:MasterDataControl:masterList:_ctl0:enabled'].value = 'on'
form_find_a['FindMeControl:MasterDataControl:masterList:_ctl0:itemGuid'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidSelectedScheduleName'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidbtnStatus'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidScheduleXML'].value = ''
form_find_a['FindMeControl:MasterDataControl:tempScheduleXML'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidSelectedScheduleGUID'].value = ''
form_find_a['FindMeControl:MasterDataControl:hidChangedScheduleList'].value = ''
form_find_a['FindMeControl:btnPhoneLists'].value = 'Phone Lists'
form_find_a['FindMeControl:enableFindMeHidden'].value = ''
form_find_a['FindMeControl:applySet'].value = 'false'
Out [31]: ...
In [32]: m.submit_form(form_find_a)
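The other buttons are dropped above presumably because a browser sends only the submit button that was actually clicked, together with the non-button fields. A minimal sketch of that filtering on plain data, independent of RoboBrowser; `fields_to_drop` and the sample field list (including the input types) are hypothetical and only for illustration:

```python
def fields_to_drop(fields, clicked):
    """Return the names of submit buttons other than the clicked one.

    `fields` is a list of (name, input_type) pairs.
    """
    return [name for name, input_type in fields
            if input_type == 'submit' and name != clicked]


# Hypothetical subset of the FindMe.aspx form; the input types are assumed.
sample = [
    ('__VIEWSTATE', 'hidden'),
    ('FindMeControl:enableFindMe', 'checkbox'),
    ('FindMeControl:btnApply', 'submit'),
    ('FindMeControl:btnSchedules', 'submit'),
    ('FindMeControl:btnPhoneLists', 'submit'),
]
drop = fields_to_drop(sample, clicked='FindMeControl:btnPhoneLists')
print(drop)
```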
Verify that the page has updated and now has the "Work" list item:
In [33]: m.parsed.find('title')
Out[33]: <title>Phone Lists</title>
In [34]: m.parsed.find('a', id='PhoneListsControl_MasterDataControl_masterList__ctl0_SelectButton')
Out[34]: <a class="linkButtonItem" href="javascript:__doPostBack('PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton','')" id="PhoneListsControl_MasterDataControl_masterList__ctl0_SelectButton" onclick="javascript:onClick();">Work</a>
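The href on that link shows what a click does in a browser: __doPostBack is called with an event target and an event argument. A small sketch of pulling those two values out of such an href with a regular expression; `parse_do_postback` is a hypothetical helper, not part of RoboBrowser:

```python
import re

# Matches javascript:__doPostBack('target','argument')
_POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)','([^']*)'\)")


def parse_do_postback(href):
    """Return (event_target, event_argument), or None if href is not a postback link."""
    match = _POSTBACK_RE.search(href)
    if match is None:
        return None
    return match.group(1), match.group(2)


href = ("javascript:__doPostBack("
        "'PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton','')")
target, argument = parse_do_postback(href)
print(target)
print(repr(argument))
```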
Get the "PhoneLists.aspx" form, remove the unnecessary fields, fill it in, and submit it:
In [35]: form_find_b = m.get_form(action='PhoneLists.aspx')
In [36]: phne_remove = (
'PhoneListsControl:MasterDataControl:btnAdd',
'PhoneListsControl:MasterDataControl:btnDelete',
'PhoneListsControl:MasterDataControl:btnRename',
'PhoneListsControl:MasterDataControl:btnCancel',
'PhoneListsControl:MasterDataControl:btnEnter',
'PhoneListsControl:MasterDataControl:btnUpdate',
'PhoneListsControl:btnApply',
'PhoneListsControl:btnBack')
In [37]: for i in phne_remove:
   ....:     form_find_b.fields.pop(i)
In [38]: form_find_b['PhoneListsControl:MasterDataControl:focusElement'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidChangedScheduleList'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidScheduleXML'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidSelectedScheduleGUID'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidSelectedScheduleName'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:hidbtnStatus'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:masterList:_ctl0:itemGuid'].value = ''
form_find_b['PhoneListsControl:MasterDataControl:tempScheduleXML'].value = ''
form_find_b['PhoneListsControl:applySet'].value = 'false'
In [39]: m.submit_form(form_find_b)
Inspect the response to the POST to see whether the user list loaded. In this case, it did not:
In [40]: m.parsed.findAll('div', id='PhoneListsControl_phoneListMembersText')
Out[40]: [<div class="displayText" id="PhoneListsControl_phoneListMembersText"></div>]
If it had succeeded, the above would return:
<div id="PhoneListsControl_phoneListMembersText" class="displayText" style="top: 315px; left: 281px;"> Work </div>
along with the following in the table (PhoneListsControl_phoneListDetail):
<input name="PhoneListsControl:phoneListDetail:_ctl2:number" type="text" value="95551234567" maxlength="50" id="PhoneListsControl_phoneListDetail__ctl2_number" onkeyup="enableApplyButton('PhoneListsControl_')" style="width:140px;">
...
<input name="PhoneListsControl:phoneListDetail:_ctl3:number" type="text" value="95551236789" maxlength="50" id="PhoneListsControl_phoneListDetail__ctl2_number" onkeyup="enableApplyButton('PhoneListsControl_')" style="width:140px;">
...
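While working on this, I found that RoboBrowser does not include all of the form data that the submit on "PhoneLists.aspx" needs in order to work as expected ('__EVENTTARGET': 'PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton' and __EVENTARGUMENT). Setting those parameters and then calling submit_form(form_find_b) does not produce the expected result either. I wonder whether I could use add_field() from robobrowser.forms.form, but I don't understand how to use it correctly (for example, to add the __EVENTTARGET and __EVENTARGUMENT hidden input fields to the form).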
Or am I overlooking something else, and RoboBrowser/Requests simply does not support this kind of POST?
Does the form need JavaScript to be executed, as mentioned here for mechanize?
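For reference, the standard ASP.NET __doPostBack function writes its two arguments into the hidden __EVENTTARGET and __EVENTARGUMENT inputs and then submits the form, so a JavaScript engine should not be strictly needed if those two values are sent along with the other fields. A sketch of assembling such a payload as a plain dict; `build_postback_payload` and the placeholder __VIEWSTATE and __EVENTVALIDATION values are hypothetical, and whether this particular site accepts the result is untested:

```python
def build_postback_payload(form_fields, event_target, event_argument=''):
    """Copy the form's fields and set the two postback fields on top."""
    payload = dict(form_fields)
    payload['__EVENTTARGET'] = event_target
    payload['__EVENTARGUMENT'] = event_argument
    return payload


# Placeholder values; the real ones come from the freshly loaded page.
form_fields = {
    '__VIEWSTATE': '<value from page>',
    '__EVENTVALIDATION': '<value from page>',
    'PhoneListsControl:applySet': 'false',
}
payload = build_postback_payload(
    form_fields,
    'PhoneListsControl$MasterDataControl$masterList$_ctl0$SelectButton',
)
for key in sorted(payload):
    print(key)
```

A dict like this could then be posted to phne_uri as form data with the same session that performed the login, for example through the `data` argument of a requests session's `post` method.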