Running Selenium Webdriver with a proxy in Python

124

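I am trying to run a Selenium Webdriver script in Python to perform some basic tasks. When I run the robot through the Selenium IDE interface (i.e. simply replaying my actions), it works perfectly. However, when I export the code as a Python script and try to execute it from the command line, the Firefox browser opens but cannot access the starting URL (an error is returned to the command line and the program stops). This happens regardless of which website I try to access.

I have included a very basic piece of code here for demonstration purposes. I think I am not including the proxy section of the code correctly, as the error being returned seems to be generated by the proxy.

Any help would be greatly appreciated.

The code below is simply meant to open www.google.ie and search for the word "selenium". For me it opens a blank Firefox browser and stops.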

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import Select
from selenium.common.exceptions import NoSuchElementException, NoAlertPresentException
import unittest, time, re
from selenium.webdriver.common.proxy import *

class Testrobot2(unittest.TestCase):
    def setUp(self):

        myProxy = "http://149.215.113.110:70"

        proxy = Proxy({
        'proxyType': ProxyType.MANUAL,
        'httpProxy': myProxy,
        'ftpProxy': myProxy,
        'sslProxy': myProxy,
        'noProxy':''})

        self.driver = webdriver.Firefox(proxy=proxy)
        self.driver.implicitly_wait(30)
        self.base_url = "https://www.google.ie/"
        self.verificationErrors = []
        self.accept_next_alert = True

    def test_robot2(self):
        driver = self.driver
        driver.get(self.base_url + "/#gs_rn=17&gs_ri=psy-ab&suggest=p&cp=6&gs_id=ix&xhr=t&q=selenium&es_nrs=true&pf=p&output=search&sclient=psy-ab&oq=seleni&gs_l=&pbx=1&bav=on.2,or.r_qf.&bvm=bv.47883778,d.ZGU&fp=7c0d9024de9ac6ab&biw=592&bih=665")
        driver.find_element_by_id("gbqfq").clear()
        driver.find_element_by_id("gbqfq").send_keys("selenium")

    def is_element_present(self, how, what):
        try: self.driver.find_element(by=how, value=what)
        except NoSuchElementException: return False
        return True

    def is_alert_present(self):
        try: self.driver.switch_to_alert()
        except NoAlertPresentException: return False
        return True

    def close_alert_and_get_its_text(self):
        try:
            alert = self.driver.switch_to_alert()
            alert_text = alert.text
            if self.accept_next_alert:
                alert.accept()
            else:
                alert.dismiss()
            return alert_text
        finally: self.accept_next_alert = True

    def tearDown(self):
        self.driver.quit()
        self.assertEqual([], self.verificationErrors)

if __name__ == "__main__":
    unittest.main()
18 Answers

73

This works for me (similar to @Amey's and @user4642224's code, but a bit shorter):

from selenium import webdriver
from selenium.webdriver.common.proxy import Proxy, ProxyType

prox = Proxy()
prox.proxy_type = ProxyType.MANUAL
prox.http_proxy = "ip_addr:port"
prox.socks_proxy = "ip_addr:port"
prox.ssl_proxy = "ip_addr:port"

capabilities = webdriver.DesiredCapabilities.CHROME
prox.add_to_capabilities(capabilities)

driver = webdriver.Chrome(desired_capabilities=capabilities)

2
This works, thank you. It's strange that the docs say you need to use the remote driver. - Mans
driver = webdriver.Firefox(desired_capabilities=capabilities) Why does this raise TypeError: __init__() got an unexpected keyword argument 'desired_capabilities'? - Rimo
11
This answer didn't work for me; I got the error "Specifying 'socksProxy' requires an integer for 'socksVersion'". - Alex
3
@Alex Depending on the proxy you use, just add prox.socks_version = 5 or prox.socks_version = 4 to resolve the error. - Mew
2
Thank you very much, this worked perfectly for me. I removed prox.socks_proxy = "ip_addr:port" and prox.ssl_proxy = "ip_addr:port", and added prox.https_proxy = "ip_addr:port". - user1859723
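
The socksVersion error mentioned in the comments can be caught before the browser starts. The sketch below is not from the thread: build_manual_proxy is a hypothetical helper that assembles a W3C-style proxy capability dict and refuses a SOCKS proxy without an integer version. The key names follow the W3C WebDriver proxy object; how the dict is handed to a driver depends on the Selenium version and is left out here.

```python
def build_manual_proxy(http=None, ssl=None, socks=None, socks_version=None):
    """Return a W3C-style 'proxy' capability dict (hypothetical helper)."""
    proxy = {"proxyType": "manual"}
    if http:
        proxy["httpProxy"] = http
    if ssl:
        proxy["sslProxy"] = ssl
    if socks:
        # A SOCKS proxy is only accepted together with an integer version.
        if socks_version not in (4, 5):
            raise ValueError("socks_version must be 4 or 5 when socks is set")
        proxy["socksProxy"] = socks
        proxy["socksVersion"] = socks_version
    return proxy


# Placeholder addresses, for illustration only.
print(build_manual_proxy(http="10.0.0.1:8080", socks="10.0.0.1:1080", socks_version=5))
```

Leaving out the keys that are not needed mirrors what user1859723 describes above: only the proxies actually in use are set.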

41

How about something like this?

PROXY = "149.215.113.110:70"

webdriver.DesiredCapabilities.FIREFOX['proxy'] = {
    "httpProxy":PROXY,
    "ftpProxy":PROXY,
    "sslProxy":PROXY,
    "noProxy":None,
    "proxyType":"MANUAL",
    "class":"org.openqa.selenium.Proxy",
    "autodetect":False
}

# you have to use remote, otherwise you'll have to code it yourself in python to 
driver = webdriver.Remote("http://localhost:4444/wd/hub", webdriver.DesiredCapabilities.FIREFOX)

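You can read more about it here.


This answer helped me a lot. If anyone else is trying this with Edge, webdriver.DesiredCapabilities.EDGE['proxy'] has no effect, because Microsoft Edge currently has no setting for configuring a proxy server (to use Edge through a proxy, you must configure the proxy under the Windows network connection settings). - Steve HHH
1
For the full documentation, see: https://github.com/SeleniumHQ/selenium/wiki/DesiredCapabilities#proxy-json-object - LeckieNi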

17

My solution:

def my_proxy(PROXY_HOST, PROXY_PORT):
    fp = webdriver.FirefoxProfile()
    # Direct = 0, Manual = 1, PAC = 2, AUTODETECT = 4, SYSTEM = 5
    print(PROXY_PORT)
    print(PROXY_HOST)
    fp.set_preference("network.proxy.type", 1)
    fp.set_preference("network.proxy.http", PROXY_HOST)
    # The port preference must be an integer, not a string
    fp.set_preference("network.proxy.http_port", int(PROXY_PORT))
    fp.set_preference("general.useragent.override", "whater_useragent")
    fp.update_preferences()
    return webdriver.Firefox(firefox_profile=fp)

Then call it in your code:

my_proxy(PROXY_HOST, PROXY_PORT)

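I ran into a problem with this code because I was passing the port number as a string:

 PROXY_PORT="31280"

This is important:

int("31280")

If you pass a string instead of an integer, your Firefox profile will not have the port set correctly, and connections through the proxy will not work.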
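
To make the integer conversion hard to forget, the host and port can be split and converted in one place. This is a minimal sketch, not part of the answer: split_host_port, firefox_proxy_prefs and firefox_with_proxy are hypothetical names, and the last function assumes a Selenium release in which webdriver.Firefox accepts options= and Options has set_preference.

```python
def split_host_port(address):
    """Split 'host:port' into (host, int_port). Hypothetical helper."""
    host, sep, port = address.rpartition(":")
    if not sep or not host:
        raise ValueError("expected 'host:port', got %r" % address)
    return host, int(port)


def firefox_proxy_prefs(host, port):
    """Return the HTTP proxy preferences from the answer above as a dict."""
    return {
        "network.proxy.type": 1,  # 1 = manual proxy configuration
        "network.proxy.http": host,
        "network.proxy.http_port": int(port),  # always stored as an int
    }


def firefox_with_proxy(address):
    """Start Firefox with the preferences applied through an Options object."""
    # Imported lazily so the two helpers above work without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.firefox.options import Options

    options = Options()
    for name, value in firefox_proxy_prefs(*split_host_port(address)).items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

Calling firefox_with_proxy("54.84.95.51:31280") would then take the place of my_proxy(PROXY_HOST, PROXY_PORT); the address is only a placeholder.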


1
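Does the port need to be converted to an int? That would make the Firefox proxy example on the official page wrong: http://www.seleniumhq.org/docs/04_webdriver_advanced.jsp In their example, PROXYHOST:PROXYPORT is passed as a string. - Pyderman
@Pyderman, you are confusing the Proxy() class with the FirefoxProfile() class. With profile preferences you must pass the IP and port separately and cast the port to int(). With the Proxy() class you just pass a string containing "IP:PORT", and it will surely do the rest for you. - m3nda
Also, firefox_profile has been deprecated; pass in an Options object instead. See @Zyy's answer at https://dev59.com/t2Qn5IYBdhLWcg3wIUFn#70937472 for details. - Arfath Yahiya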

13

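Although this is a fairly old post, others may still benefit from an up-to-date answer, and the original author was very close to a working solution.

First of all, the ftpProxy setting is no longer supported and will raise an error.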

proxy = Proxy({
        'proxyType': ProxyType.MANUAL,
        'httpProxy': myProxy,
        'ftpProxy': myProxy, # this will throw an error
        'sslProxy': myProxy,
        'noProxy':''})

Next, instead of setting the proxy property, you should use Firefox options, like this:

proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': myProxy,
    'sslProxy': myProxy,
    'noProxy': ''})

options = Options()
options.proxy = proxy
driver = webdriver.Firefox(options=options)

In addition, when specifying the proxy, do not include the scheme, especially if you want to use the same proxy for multiple protocols.

myProxy = "149.215.113.110:70"

Altogether it looks like this:

from selenium import webdriver
from selenium.webdriver.common.proxy import *
from selenium.webdriver.firefox.options import Options

myProxy = "149.215.113.110:70"
proxy = Proxy({
    'proxyType': ProxyType.MANUAL,
    'httpProxy': myProxy,
    'sslProxy': myProxy,
    'noProxy': ''})

options = Options()
options.proxy = proxy
driver = webdriver.Firefox(options=options)
driver.get("https://www.google.ie")
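
If the proxy address comes from configuration and may or may not carry a scheme, it can be normalised before it is passed to Proxy. The sketch below is an illustration only: strip_scheme and proxy_fields are hypothetical helpers, and the returned dict deliberately omits 'proxyType', which the caller would still set to ProxyType.MANUAL as in the code above.

```python
def strip_scheme(proxy):
    """Return 'host:port' with any leading 'scheme://' and trailing '/' removed."""
    if "://" in proxy:
        proxy = proxy.split("://", 1)[1]
    return proxy.rstrip("/")


def proxy_fields(proxy):
    """Build the address fields used above, without the unsupported ftpProxy."""
    address = strip_scheme(proxy)
    return {
        "httpProxy": address,
        "sslProxy": address,
        "noProxy": "",
    }


# The question's original value, which included a scheme.
print(proxy_fields("http://149.215.113.110:70"))
```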

What about when authentication is required? - DataMinion

9

Proxy with verification. This is a brand-new Python script based on Mykhail Martsyniuk's sample script.

# Load webdriver
from selenium import webdriver

# Load proxy option
from selenium.webdriver.common.proxy import Proxy, ProxyType

# Configure Proxy Option
prox = Proxy()
prox.proxy_type = ProxyType.MANUAL

# Proxy IP & Port
prox.http_proxy = "0.0.0.0:00000"
prox.socks_proxy = "0.0.0.0:00000"
prox.ssl_proxy = "0.0.0.0:00000"

# Configure capabilities
capabilities = webdriver.DesiredCapabilities.CHROME
prox.add_to_capabilities(capabilities)

# Configure ChromeOptions
driver = webdriver.Chrome(executable_path='/usr/local/share chromedriver',desired_capabilities=capabilities)

# Verify proxy ip
driver.get("http://www.whatsmyip.org/")

8
If anyone is looking for a solution, here is how:

from selenium import webdriver
PROXY = "YOUR_PROXY_ADDRESS_HERE"
webdriver.DesiredCapabilities.FIREFOX['proxy']={
    "httpProxy":PROXY,
    "ftpProxy":PROXY,
    "sslProxy":PROXY,
    "noProxy":None,
    "proxyType":"MANUAL",
    "autodetect":False
}
driver = webdriver.Firefox()
driver.get('http://www.whatsmyip.org/')

8

Try setting up a SOCKS5 proxy. I had the same problem, and using a SOCKS proxy solved it.

def install_proxy(PROXY_HOST,PROXY_PORT):
        fp = webdriver.FirefoxProfile()
        print(PROXY_PORT)
        print(PROXY_HOST)
        fp.set_preference("network.proxy.type", 1)
        fp.set_preference("network.proxy.http",PROXY_HOST)
        fp.set_preference("network.proxy.http_port",int(PROXY_PORT))
        fp.set_preference("network.proxy.https",PROXY_HOST)
        fp.set_preference("network.proxy.https_port",int(PROXY_PORT))
        fp.set_preference("network.proxy.ssl",PROXY_HOST)
        fp.set_preference("network.proxy.ssl_port",int(PROXY_PORT))  
        fp.set_preference("network.proxy.ftp",PROXY_HOST)
        fp.set_preference("network.proxy.ftp_port",int(PROXY_PORT))   
        fp.set_preference("network.proxy.socks",PROXY_HOST)
        fp.set_preference("network.proxy.socks_port",int(PROXY_PORT))   
        fp.set_preference("general.useragent.override","Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.75.14 (KHTML, like Gecko) Version/7.0.3 Safari/7046A194A")
        fp.update_preferences()
        return webdriver.Firefox(firefox_profile=fp)

Then call install_proxy(ip, port) from your program.


How would you modify the code to accept proxy credentials? - nomaam

6

The answers above may be correct, but they do not work with the latest webdriver. Here is my solution to the problem. Short and sweet.


        http_proxy  = "ip_addr:port"
        https_proxy = "ip_addr:port"

        webdriver.DesiredCapabilities.FIREFOX['proxy']={
            "httpProxy":http_proxy,
            "sslProxy":https_proxy,
            "proxyType":"MANUAL"
        }

        driver = webdriver.Firefox()

Or

    http_proxy  = "http://ip:port"
    https_proxy = "https://ip:port"

    proxyDict = {
                    "http"  : http_proxy,
                    "https" : https_proxy,
                }

    driver = webdriver.Firefox(proxy=proxyDict)

5

This worked for me in September 2022: a Selenium proxy with username and password authentication.

import os
import zipfile

from selenium import webdriver

PROXY_HOST = '192.168.3.2'  # rotating proxy or host
PROXY_PORT = 8080 # port
PROXY_USER = 'proxy-user' # username
PROXY_PASS = 'proxy-password' # password

manifest_json = """
{
    "version": "1.0.0",
    "manifest_version": 2,
    "name": "Chrome Proxy",
    "permissions": [
        "proxy",
        "tabs",
        "unlimitedStorage",
        "storage",
        "<all_urls>",
        "webRequest",
        "webRequestBlocking"
    ],
    "background": {
        "scripts": ["background.js"]
    },
    "minimum_chrome_version":"22.0.0"
}
"""

background_js = """
var config = {
        mode: "fixed_servers",
        rules: {
        singleProxy: {
            scheme: "http",
            host: "%s",
            port: parseInt(%s)
        },
        bypassList: ["localhost"]
        }
    };
chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});
function callbackFn(details) {
    return {
        authCredentials: {
            username: "%s",
            password: "%s"
        }
    };
}
chrome.webRequest.onAuthRequired.addListener(
            callbackFn,
            {urls: ["<all_urls>"]},
            ['blocking']
);
""" % (PROXY_HOST, PROXY_PORT, PROXY_USER, PROXY_PASS)

def get_chromedriver(use_proxy=False, user_agent=None):
    path = os.path.dirname(os.path.abspath(__file__))
    chrome_options = webdriver.ChromeOptions()
    if use_proxy:
        pluginfile = 'proxy_auth_plugin.zip'

        with zipfile.ZipFile(pluginfile, 'w') as zp:
            zp.writestr("manifest.json", manifest_json)
            zp.writestr("background.js", background_js)
        chrome_options.add_extension(pluginfile)
    if user_agent:
        chrome_options.add_argument('--user-agent=%s' % user_agent)
    driver = webdriver.Chrome(
        os.path.join(path, 'chromedriver'),
        chrome_options=chrome_options)
    return driver

def main():
    driver = get_chromedriver(use_proxy=True)
    driver.get('https://ifconfig.me/')

if __name__ == '__main__':
    main()

source link
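
The script above fills background.js with plain % formatting, so a password containing a double quote or a backslash would produce invalid JavaScript. One possible variation, which is not part of the original answer, is to render the values with json.dumps. render_background_js is a hypothetical helper, and the sample credentials are placeholders.

```python
import json


def render_background_js(host, port, user, password):
    """Render background.js for the proxy extension (hypothetical helper)."""
    # json.dumps emits valid JavaScript literals, so quotes or backslashes
    # in the credentials cannot break out of the generated strings.
    config = {
        "mode": "fixed_servers",
        "rules": {
            "singleProxy": {"scheme": "http", "host": host, "port": int(port)},
            "bypassList": ["localhost"],
        },
    }
    credentials = {"username": user, "password": password}
    return (
        "var config = %s;\n"
        'chrome.proxy.settings.set({value: config, scope: "regular"}, function() {});\n'
        "chrome.webRequest.onAuthRequired.addListener(\n"
        "    function(details) { return {authCredentials: %s}; },\n"
        '    {urls: ["<all_urls>"]},\n'
        "    ['blocking']\n"
        ");\n"
    ) % (json.dumps(config), json.dumps(credentials))


# Placeholder values, for illustration only.
print(render_background_js("192.168.3.2", 8080, "proxy-user", 'pa"ss'))
```

The returned text could be written into the zip with zp.writestr("background.js", ...) exactly as get_chromedriver does above.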


4

Try it by setting up a FirefoxProfile

from selenium import webdriver
import time


"Define Both ProxyHost and ProxyPort as String"
ProxyHost = "54.84.95.51" 
ProxyPort = "8083"



def ChangeProxy(ProxyHost ,ProxyPort):
    "Define Firefox Profile with you ProxyHost and ProxyPort"
    profile = webdriver.FirefoxProfile()
    profile.set_preference("network.proxy.type", 1)
    profile.set_preference("network.proxy.http", ProxyHost )
    profile.set_preference("network.proxy.http_port", int(ProxyPort))
    profile.update_preferences()
    return webdriver.Firefox(firefox_profile=profile)


def FixProxy():
    "Reset Firefox Profile"
    profile = webdriver.FirefoxProfile()
    profile.set_preference("network.proxy.type", 0)
    return webdriver.Firefox(firefox_profile=profile)


driver = ChangeProxy(ProxyHost ,ProxyPort)
driver.get("http://whatismyipaddress.com")

time.sleep(5)

driver = FixProxy()
driver.get("http://whatismyipaddress.com")

This program was tested on both Windows 8 and Mac OS X. If you are on Mac OS X and have not updated selenium, you may run into selenium.common.exceptions.WebDriverException. If so, upgrade selenium and try again.

pip install -U selenium
