How can I impersonate a search engine crawler with cURL?

Question

5

How can I use cURL to send a request to a website and make that site believe I am a search engine?

- Gabriel Petrovay

Set the user agent with curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'); - naththedeveloper

https://github.com/izniburak/google-bot-curl/blob/master/google-bot.php - user1642018

1 Answer


- Danny Beckett · Accepted Answer

You can set your user agent to Googlebot's (for the exact user agent strings Google uses, see Google's KB):

curl_setopt($ch, CURLOPT_USERAGENT, 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)');
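For context, that one line could sit inside a complete request roughly like the sketch below. The helper name makeGooglebotHandle, the timeout value, and the example.com URL are assumptions made for illustration; only the CURLOPT_USERAGENT line comes from the answer itself.

```php
<?php
// Hypothetical helper (not part of the original answer): builds a cURL handle
// that identifies itself with Googlebot's User-Agent string.
function makeGooglebotHandle(string $url)
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        // The User-Agent string from the answer above.
        CURLOPT_USERAGENT      => 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
        // Return the response body from curl_exec() instead of printing it.
        CURLOPT_RETURNTRANSFER => true,
        // Follow HTTP redirects.
        CURLOPT_FOLLOWLOCATION => true,
        // Assumed value: give up after 10 seconds.
        CURLOPT_TIMEOUT        => 10,
    ]);
    return $ch;
}
```

A call might look like $ch = makeGooglebotHandle('https://example.com/'); followed by $body = curl_exec($ch);. If curl_exec() returns false, curl_error($ch) describes what went wrong.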

But this approach does not always work! Some sites may run a reverse DNS check on any client that claims to be Googlebot.
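To see why the User-Agent header alone can fail, here is a rough sketch of what such a server-side check might look like. It is not any particular site's implementation. The function names isGoogleCrawlerHost and looksLikeRealGooglebot are hypothetical, the accepted domains (googlebot.com and google.com) are an assumption based on Google's published guidance for verifying Googlebot, and the forward lookup handles IPv4 only.

```php
<?php
// Hypothetical helper: true if the hostname ends in a domain that Google
// crawlers are assumed to resolve to (googlebot.com or google.com).
function isGoogleCrawlerHost(string $host): bool
{
    $host = strtolower(rtrim($host, '.'));
    foreach (['.googlebot.com', '.google.com'] as $suffix) {
        if (substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

// Hypothetical check a site might run on the client's IP address.
function looksLikeRealGooglebot(string $ip): bool
{
    // Reverse lookup: gethostbyaddr() returns the IP unchanged on failure
    // and false on malformed input.
    $host = gethostbyaddr($ip);
    if ($host === false || $host === $ip || !isGoogleCrawlerHost($host)) {
        return false;
    }
    // Forward lookup: the hostname must resolve back to the same IP
    // (gethostbynamel() returns IPv4 addresses only, or false).
    $forward = gethostbynamel($host);
    return $forward !== false && in_array($ip, $forward, true);
}
```

A site could call looksLikeRealGooglebot($_SERVER['REMOTE_ADDR']) whenever the User-Agent mentions Googlebot. A spoofed request fails here because the check depends on the client's IP address, which the User-Agent header does not change.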