Python - WebDriver and asyncio

Question

python selenium asynchronous webdriver python-asyncio

12

Is it possible to first open a browser for each task and then load the link? This code raises an error.

import asyncio
from selenium import webdriver

async def get_html(url):
    driver = await webdriver.Chrome()
    response = await driver.get(url)

TypeError: object WebDriver can't be used in 'await' expression

- Valek Potapov

3 Answers

20

If you want to use Selenium asynchronously, I suggest using multiple driver instances and an executor like this:

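import asyncio
from concurrent.futures import ThreadPoolExecutor

from selenium import webdriver

# One worker thread per concurrently open browser.
executor = ThreadPoolExecutor(10)


def scraper(url):
    # Blocking work: runs in a worker thread with its own driver instance.
    # Recent Selenium 4 releases locate the driver binary themselves;
    # older versions took a path such as "./chromedriver" here.
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        driver.quit()  # always close the browser, even on errors


async def scrape(url):
    # Hand the blocking call to the pool and await its result.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, scraper, url)


async def main():
    urls = ["https://google.de"] * 2
    # run_in_executor returns futures, not tasks, so gather them explicitly
    # instead of relying on asyncio.all_tasks().
    return await asyncio.gather(*(scrape(url) for url in urls))


pages = asyncio.run(main())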

Note that you can run Selenium in headless mode, so you don't need to spawn a whole GUI just to call a few simple URLs.
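For illustration, headless mode could be set up roughly as follows. This is only a sketch, assuming Selenium 4 and a recent Chrome. The helper names make_headless_chrome and fetch_title are made up for this example and are not part of Selenium.

```python
def make_headless_chrome():
    """Create a Chrome driver that runs without opening a window."""
    # Imported lazily so the module can be loaded without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    # "--headless=new" is for recent Chrome; older builds use plain "--headless".
    options.add_argument("--headless=new")
    return webdriver.Chrome(options=options)


def fetch_title(url, make_driver=make_headless_chrome):
    """Blocking helper: load url, return the page title, always close the browser."""
    driver = make_driver()
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()
```

fetch_title is an ordinary blocking function, so it can be handed to a thread pool or to loop.run_in_executor in place of a function that opens a visible browser. The make_driver parameter exists only so that a different driver factory can be swapped in.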

- throws_exceptions_at_you

3

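Note that this is not a sensible way to use "asyncio". Used here, it is merely an unnecessary wrapper around "ThreadPoolExecutor" that adds nothing. You can use the pool's "map" or "submit" methods directly. - MisterMiyagi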
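A minimal sketch of what using the pool directly might look like, without any event loop. The function name scrape_all and the fetch parameter are hypothetical; fetch stands for any blocking function that takes a URL, for example one that drives a Selenium browser.

```python
from concurrent.futures import ThreadPoolExecutor


def scrape_all(urls, fetch, max_workers=10):
    """Run the blocking fetch(url) for every URL in a thread pool.

    Results come back in the same order as the input URLs.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() schedules one call per URL and yields results in input order;
        # leaving the with-block waits for all workers to finish.
        return list(pool.map(fetch, urls))
```

The asyncio version is mainly worth having when the rest of the program already runs inside an event loop and needs to await the results alongside other coroutines.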

0

There is now a new library for performing asynchronous operations with WebDrivers, called caqui. I have already used it in a personal crawler.

- Douglas Cardoso


- Hieu · Accepted Answer

The issue has been discussed at https://github.com/SeleniumHQ/selenium/issues/3399. If you want asynchronous WebDrivers, two libraries are available: