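As other answers have mentioned, this happens because no user-agent is specified. The default requests user-agent is python-requests, so Google blocks the request: it can tell the request comes from a bot rather than a "real" user visit.
Setting a User-agent makes the request look like a real user visit by adding that information to the HTTP request headers. You can do this by passing custom headers (check what your own user-agent is):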
headers = {
    "User-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}
requests.get("YOUR_URL", headers=headers)
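If you want to confirm which user-agent would actually go out, here is a minimal sketch that builds the request without sending it and inspects the headers. It assumes requests is installed. The variable names are only illustrative, and the URL is the Google search endpoint used later in this answer. Header names are case-insensitive, so "User-Agent" and "User-agent" are treated the same.

```python
import requests

# Same custom header value as in the answer above.
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

# What requests sends when no User-Agent is supplied ("python-requests/<version>").
default_ua = requests.utils.default_user_agent()

# Prepare the request through a Session so the default headers are merged in,
# but do not send it: nothing here touches the network.
session = requests.Session()
prepared = session.prepare_request(
    requests.Request("GET", "https://www.google.com/search", headers=headers)
)
sent_ua = prepared.headers["User-Agent"]

print("default:", default_ua)
print("sent:   ", sent_ua)
```

If the second line still shows python-requests, the headers dict was probably not passed to the call.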
Additionally, to get more accurate results, you can pass URL parameters:
params = {
    "q": "samurai cop, what does katana mean",
    "gl": "in",
    "hl": "en"
}
requests.get("YOUR_URL", params=params)
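To see what those parameters turn into, here is a rough sketch using only the standard library. requests performs a similar encoding when you pass params, so this is only meant to show the resulting query string, not how requests is implemented. The comments on gl and hl reflect their usual meaning in Google search URLs.

```python
from urllib.parse import urlencode

params = {
    "q": "samurai cop, what does katana mean",  # search query
    "gl": "in",  # country to search from
    "hl": "en",  # interface language
}

# Spaces become "+", and the comma is percent-encoded as %2C.
query_string = urlencode(params)
url = f"https://www.google.com/search?{query_string}"
print(url)
# https://www.google.com/search?q=samurai+cop%2C+what+does+katana+mean&gl=in&hl=en
```

Passing a dict is generally safer than concatenating the URL by hand, because special characters in the query get escaped for you.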
Code and full example in the online IDE (the code from another answer will raise an error because the CSS selectors have changed):
from bs4 import BeautifulSoup
import requests
import lxml  # not used directly; imported so a missing 'lxml' parser fails early

headers = {
    "User-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/70.0.3538.102 Safari/537.36 Edge/18.19582"
}

params = {
    "q": "samurai cop what does katana mean",
    "gl": "in",
    "hl": "en"
}

html = requests.get("https://www.google.com/search", headers=headers, params=params)
soup = BeautifulSoup(html.text, 'lxml')

# Each organic result sits in a .tF2Cxc container
for result in soup.select('.tF2Cxc'):
    title = result.select_one('.DKV0Md').text
    link = result.select_one('.yuRUbf a')['href']
    print(f'{title}\n{link}\n')
-------
'''
Samurai Cop - He speaks fluent Japanese - YouTube
https://www.youtube.com/watch?v=paTW3wOyIYw
Samurai Cop - What does "katana" mean? - Quotes.net
https://www.quotes.net/mquote/1060647
Samurai Cop (1991) - Mathew Karedas as Joe Marshall - IMDb
https://www.imdb.com/title/tt0130236/characters/nm0360481
...
'''
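Since the class names above can change at any time, a slightly more defensive variant may save you some debugging. This is only a sketch: extract_results is a hypothetical helper, the sample markup is hand-written stand-in HTML (not real Google output) that reuses the class names from this answer, and it uses the built-in html.parser so it runs without lxml.

```python
from bs4 import BeautifulSoup


def extract_results(html_text):
    """Return (title, link) pairs, skipping blocks where a selector no longer matches."""
    soup = BeautifulSoup(html_text, "html.parser")
    results = []
    for block in soup.select(".tF2Cxc"):
        title_tag = block.select_one(".DKV0Md")
        link_tag = block.select_one(".yuRUbf a")
        # select_one returns None on a miss; skip the block instead of
        # raising AttributeError or TypeError further down.
        if title_tag is None or link_tag is None or not link_tag.get("href"):
            continue
        results.append((title_tag.get_text(strip=True), link_tag["href"]))
    return results


# Hand-written stand-in markup, NOT real Google HTML.
sample_html = """
<div class="tF2Cxc">
  <div class="yuRUbf"><a href="https://www.quotes.net/mquote/1060647">
    <h3 class="DKV0Md">Samurai Cop - What does "katana" mean? - Quotes.net</h3></a></div>
</div>
<div class="tF2Cxc"><div class="changed">layout changed, no title or link here</div></div>
"""

print(extract_results(sample_html))
```

An empty list here is a hint that the selectors are stale or that the request was blocked, which is easier to diagnose than a traceback in the middle of the loop.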
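Alternatively, you can achieve the same thing with the Google Organic Results API from SerpApi. It's a paid API with a free plan.
The difference in your case is that you only need to iterate over structured JSON and pull out the data you want, rather than figuring out why something doesn't work as it should and then maintaining the parser over time.
Code to integrate: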
import os
from serpapi import GoogleSearch

params = {
    "engine": "google",
    "q": "samurai cop what does katana mean",
    "hl": "en",
    "gl": "in",
    "api_key": os.getenv("API_KEY"),  # API key read from the environment
}

search = GoogleSearch(params)
results = search.get_dict()

for result in results["organic_results"]:
    print(result['title'])
    print(result['link'])
    print()
------
'''
Samurai Cop - He speaks fluent Japanese - YouTube
https://www.youtube.com/watch?v=paTW3wOyIYw
Samurai Cop - What does "katana" mean? - Quotes.net
https://www.quotes.net/mquote/1060647
...
'''
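If you would rather not hit a KeyError when a response has no organic results, something like the following may help. It is a sketch only: organic_links is a hypothetical helper, and the sample dict is a hand-written stand-in shaped like the output above, not a real API response. The key names organic_results, title and link are the ones used in the integration code.

```python
def organic_links(results):
    """Yield (title, link) pairs from a SerpApi-style response dict."""
    # .get with a default avoids a KeyError when the key is absent.
    for item in results.get("organic_results", []):
        title = item.get("title")
        link = item.get("link")
        if title and link:
            yield title, link


# Hand-written stand-in data, NOT a real API response.
sample = {
    "organic_results": [
        {
            "title": "Samurai Cop - He speaks fluent Japanese - YouTube",
            "link": "https://www.youtube.com/watch?v=paTW3wOyIYw",
        },
        {"title": "entry without a link"},
    ]
}

pairs = list(organic_links(sample))
print(pairs)
```

With the real client you would pass the dict returned by search.get_dict() instead of sample.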
Disclaimer: I work for SerpApi.