How can I get a website's server information with Python requests?

Question

4

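I want to build a web crawler that counts which server software is most popular among Bulgarian websites, such as Apache, nginx, and so on. Here is what I came up with: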

import requests
r = requests.get('http://start.bg')
print(r.headers)

This returns the following:

{'Debug': 'unk', 
'Content-Type': 'text/html; charset=utf-8', 
'X-Powered-By': 'PHP/5.3.3', 
'Content-Length': '29761', 
'Connection': 'close', 
'Set-Cookie': 'fbnr=1; expires=Sat, 13-Feb-2016 22:00:01 GMT; path=/; domain=.start.bg', 
'Date': 'Sat, 13 Feb 2016 13:43:50 GMT', 
'Vary': 'Accept-Encoding', 
'Server': 'Apache/2.2.15 (CentOS)', 
'Content-Encoding': 'gzip'}

You can easily see that it runs on Apache/2.2.15, and you can get this result simply with r.headers['Server']. I tried several Bulgarian websites and they all had a Server key.

However, when I request the headers of a more sophisticated website, such as www.teslamotors.com, I get the following:

{'Content-Type': 'text/html; charset=utf-8', 
'X-Cache-Hits': '9', 
'Cache-Control': 'max-age=0, no-cache, no-store', 
'X-Content-Type-Options': 'nosniff', 
'Connection': 'keep-alive', 
'X-Varnish-Server': 'sjc04p1wwwvr11.sjc05.teslamotors.com', 
'Content-Language': 'en', 
'Pragma': 'no-cache', 
'Last-Modified': 'Sat, 13 Feb 2016 13:07:50 GMT', 
'X-Server': 'web03a', 
'Expires': 'Sat, 13 Feb 2016 13:37:55 GMT', 
'Content-Length': '10290', 
'Date': 'Sat, 13 Feb 2016 13:37:55 GMT', 
'Vary': 'Accept-Encoding', 
'ETag': '"1455368870-1"', 
'X-Frame-Options': 'SAMEORIGIN', 
'Accept-Ranges': 'bytes', 
'Content-Encoding': 'gzip'}

As you can see, there isn't any ['Server'] key in this dictionary (there are X-Server and X-Varnish-Server, but I'm not sure what they mean, and their values aren't server names like Apache).

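So I'm thinking there may be some other request I could send that would yield the server information I want, or perhaps they run their own specific server software (which seems plausible for Facebook). I also tried other .com websites, such as https://spotify.com, and it does have a ['Server'] key.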

So is there a way to find out what server software Facebook and Tesla Motors use?

- Boyan Kushlev

1

A web server may or may not return a Server header. Don't rely on it. See this question: https://dev59.com/5G445IYBdhLWcg3wucio - Selcuk

OK, got it. :) - Boyan Kushlev
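
Since the header may be absent, a crawler should look it up without assuming the key exists. The following is a minimal sketch, not the asker's code: `server_header` is a hypothetical helper name, and the two dictionaries are trimmed copies of the header dumps shown in the question.

```python
def server_header(headers):
    """Return the Server header value, or None if the site does not send one."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == "server":
            return value
    return None


# Trimmed copies of the two responses quoted in the question.
start_bg = {"Server": "Apache/2.2.15 (CentOS)", "X-Powered-By": "PHP/5.3.3"}
tesla = {"X-Server": "web03a",
         "X-Varnish-Server": "sjc04p1wwwvr11.sjc05.teslamotors.com"}

print(server_header(start_bg))  # Apache/2.2.15 (CentOS)
print(server_header(tesla))     # None
```

With a real requests response, r.headers.get('Server') gives the same behaviour, because r.headers is already a case-insensitive mapping and .get returns None for a missing key instead of raising KeyError.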

1 Answer

- sorin · Accepted Answer

This has nothing to do with Python. For security reasons, most well-configured web servers do not return information in the "Server" HTTP header.

No sane developer would want to let you know that they are running an unpatched version of product xxx.
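
For the statistics the question is after, one possible approach is to count sites that hide the header as their own category rather than dropping them. This is only a sketch under assumptions: `product_name` and `tally` are hypothetical helpers, the "undisclosed" label is an arbitrary choice, and apart from the Apache value taken from the question the sample values are made up for illustration.

```python
from collections import Counter


def product_name(server_value):
    """Reduce a Server header value to a bare product name.

    'Apache/2.2.15 (CentOS)' becomes 'Apache'; a missing or empty
    value becomes 'undisclosed'.
    """
    tokens = (server_value or "").split()
    if not tokens:
        return "undisclosed"
    # The first token is usually 'product/version'; keep only the product.
    return tokens[0].split("/")[0]


def tally(server_values):
    """Count how often each product name occurs."""
    return Counter(product_name(value) for value in server_values)


# Illustrative input: one value from the question, the rest invented.
observed = ["Apache/2.2.15 (CentOS)", "nginx", None, "nginx/1.4.6 (Ubuntu)"]
counts = tally(observed)
print(counts.most_common())
```

Keep in mind that whatever the header reports can be altered or removed by the site operator, and may describe a proxy or cache in front of the real server, so the resulting counts are an estimate at best.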