Selenium and asyncio

Question

python-3.x selenium-webdriver python-asyncio discord.py

3

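What is the best way to make this work? Because Selenium does not play well with asyncio, the bot eventually dies. It works as expected for a while, but then it crashes. I understand what the problem is, especially since Selenium is so heavy, but is there a good workaround?

I get the following error message: Task was destroyed but it is pending!

I have nothing else, no traceback. I have to restart the bot (Discord), and then it works again for a while.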

@bot.command(pass_context=True)
async def weather(ctx, arg):
    if ctx:
        list = ('lf','oahu')
        await bot.say('*Loading webdriver.....*')
        if arg in list:
            if arg == 'lf':
                website = '--'
                name = 'Los Feliz Map'
            if arg == 'oahu':
                website = '--'
                name = 'Oahu Map'
            load_site = webdriver.Chrome()
            # load site on chrome
            load_site.get(website)
            await bot.say('*Loading ' + str(name) + '...*')
            load_site.find_element_by_xpath('//*[@id="main-map"]/div[2]/div[1]/div/a[2]').click()
            load_site.find_element_by_xpath('//*[@id="main-map"]/div[2]/div[2]').click()
            load_site.find_element_by_xpath('//*[@id="main-map"]/div[2]/div[2]/div/form/div[3]/label[6]/div/span').click()
            await asyncio.sleep(2) #sleep to ensure weather loads on site before screenshot
            load_site.save_screenshot('weather.png')
            await bot.say('*Taking screenshot and saving image....*')
            await bot.send_file(ctx.message.channel, 'weather.png')
            load_site.quit()
            print('Succesfully sent image of ' + str(arg) + ' - ' + str(ctx.message.author.name))

I left out the website URLs because the site is private.

- sb2894

4

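I don't think you'll have much success. A blocking API like Selenium doesn't work well inside an asynchronous application. You'd be better off using a purpose-built asynchronous library. arsenic looks like a good option, but you'll have to experiment with it yourself. - Patrick Haugh

Yes, I found arsenic and will definitely have to read up on it. Luckily, arsenic still supports "click" and "screenshot", but from a quick look there doesn't seem to be anything about searching by XPath. I'll take a closer look later tonight, thanks. - sb2894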

3

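You can use run_in_executor to run the blocking code in its own thread without blocking other operations. You can then run all of the Selenium code in that thread and return the result you need. See the example here: https://stackoverflow.com/questions/53587063/using-subprocess-to-avoid-long-running-task-from-disconnecting-discord-py-bot/53597795#53597795 - Benjin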
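The run_in_executor suggestion above could look roughly like the sketch below. It is only an illustration under stated assumptions: `capture_weather_blocking` is a hypothetical stand-in for the Selenium steps in the question (here it just sleeps), and `weather` is a plain coroutine rather than a real discord.py command.

```python
import asyncio
import time


def capture_weather_blocking(area: str) -> str:
    """Hypothetical stand-in for the blocking Selenium work.

    In the real bot this is where webdriver.Chrome(), get(), the clicks,
    save_screenshot() and quit() would go. Nothing here is awaited.
    """
    time.sleep(0.2)  # simulates the time the browser needs
    return f"weather-{area}.png"


async def weather(area: str) -> str:
    loop = asyncio.get_running_loop()
    # Passing None uses the loop's default ThreadPoolExecutor. The coroutine
    # is suspended until the worker thread finishes, so the event loop can
    # keep servicing other tasks in the meantime.
    return await loop.run_in_executor(None, capture_weather_blocking, area)


screenshot_path = asyncio.run(weather("oahu"))
print(screenshot_path)
```

The point of the split is that everything touching the browser stays in one ordinary function, and the coroutine only awaits its result before sending the file.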

You could try Playwright, which supports asynchronous use. - Jacob Lee

Using a browser is usually not a good idea; it is better to use aiohttp and Beautiful Soup for web scraping. - mcdonalds291

3 Answers


- Douglas Cardoso · Answer 1

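I know this is an old topic, but there is now a new library called caqui that can perform both asynchronous and synchronous operations against webdrivers. I have tested it and was able to fetch data from more than 2k links using 3 chromedrivers at the same time.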

- mcdonalds291 · Answer 2

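Short answer: it isn't possible. Even if you manage to get it working, it won't work well.

Long answer: Selenium is a blocking API, so you can't really use it together with an asynchronous script. If you really do want to use a web browser, try looking for an alternative such as https://pypi.org/project/aioselenium/.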
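To make "blocking" concrete, here is a small sketch that does not use Selenium at all: `time.sleep` stands in for a blocking webdriver call, and a hypothetical `heartbeat` task stands in for whatever else the event loop needs to service. Called directly inside a coroutine, the blocking call freezes the heartbeat; handed to a worker thread with `asyncio.to_thread` (Python 3.9+), the heartbeat keeps ticking.

```python
import asyncio
import time


async def count_ticks(run_blocking_directly: bool) -> int:
    """Return how often a background heartbeat ran during a 0.2 s 'browser' call."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    hb = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)  # yield once so the heartbeat gets its first tick

    if run_blocking_directly:
        time.sleep(0.2)  # blocks the whole event loop, like a direct Selenium call
    else:
        await asyncio.to_thread(time.sleep, 0.2)  # loop stays free while the thread sleeps

    hb.cancel()
    return ticks


async def main() -> tuple:
    stalled = await count_ticks(run_blocking_directly=True)
    responsive = await count_ticks(run_blocking_directly=False)
    return stalled, responsive


stalled_ticks, responsive_ticks = asyncio.run(main())
print(stalled_ticks, responsive_ticks)
```

In the stalled case the heartbeat only gets the single tick it received before the blocking call started.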

- Mike Reiche · Answer 3

I use asyncio.to_thread to get this done.


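import asyncio

def my_blocking_selenium_code(number: int):
    pass  # the blocking Selenium calls go here

async def main():
    tasks = []
    for i in range(5):
        # asyncio.to_thread (Python 3.9+) runs the function in a worker thread
        task = asyncio.to_thread(my_blocking_selenium_code, i)
        tasks.append(task)
    await asyncio.gather(*tasks)

# `await` is only valid inside a coroutine, so start the loop explicitly here.
# Inside a bot, you would await the gather from an existing coroutine instead.
asyncio.run(main())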
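As a possible extension of the same idea, the sketch below collects return values from the worker threads and caps how many jobs run at once, since each real job would start a heavy browser. The cap `MAX_BROWSERS`, the helper `fetch_limited`, and the stand-in `fetch_blocking` are all assumptions made for illustration, not part of the answer above.

```python
import asyncio
import time

MAX_BROWSERS = 2  # assumed limit on simultaneous browser sessions


def fetch_blocking(number: int) -> int:
    """Hypothetical stand-in for one blocking Selenium job."""
    time.sleep(0.05)  # simulates browser work
    return number * number


async def fetch_limited(sem: asyncio.Semaphore, number: int) -> int:
    # At most MAX_BROWSERS coroutines get past this point at the same time.
    async with sem:
        return await asyncio.to_thread(fetch_blocking, number)


async def main() -> list:
    sem = asyncio.Semaphore(MAX_BROWSERS)
    # gather returns results in the order the awaitables were passed in,
    # regardless of which thread finishes first.
    return await asyncio.gather(*(fetch_limited(sem, i) for i in range(5)))


results = asyncio.run(main())
print(results)
```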