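Using Python, how can I check whether a website is up? From what I have read, I need to send an "HTTP HEAD" request and look for the status code "200 OK", but how do I do that?
Thanks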
You can try doing this with getcode() from urllib.
import urllib.request
print(urllib.request.urlopen("https://www.stackoverflow.com").getcode())
200
For Python 2, use:
import urllib
print urllib.urlopen("http://www.stackoverflow.com").getcode()
200
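Since the question asks about HEAD specifically, here is a minimal Python 3 sketch that sends a HEAD request with urllib instead of the default GET. The function name head_status and the 5-second default timeout are assumptions made for illustration. Note that urlopen follows redirects, and depending on the Python version the redirected request may be sent as GET.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=5.0):
    """Return the HTTP status code for a HEAD request, or None if unreachable.

    head_status is a hypothetical helper name, not part of urllib.
    """
    # method="HEAD" asks the server for the response headers only.
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        return e.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar problems.
        return None
```

Calling head_status("https://www.stackoverflow.com") should return 200 while the site is up. A None result means no HTTP response was received at all, which is different from an error status such as 404.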
Does urlopen.getcode fetch the entire page or not? - OscarRyz
req = urllib.request.Request(url, headers = headers) resp = urllib.request.urlopen(req) - james-see

I think the easiest way to do it is with the Requests module.
import requests

def url_ok(url):
    r = requests.head(url)
    return r.status_code == 200
This does not work for url = "http://foo.example.org/". I expected a 404 status code, but the program crashed instead. - Jonas Stein

import httplib
conn = httplib.HTTPConnection("www.python.org")
conn.request("HEAD", "/")
r1 = conn.getresponse()
print r1.status, r1.reason
prints
200 OK
Of course, that is only the case if www.python.org is up.
from urllib.request import Request, urlopen
from urllib.error import URLError, HTTPError

req = Request("http://stackoverflow.com")
try:
    response = urlopen(req)
except HTTPError as e:
    print('The server couldn\'t fulfill the request.')
    print('Error code: ', e.code)
except URLError as e:
    print('We failed to reach a server.')
    print('Reason: ', e.reason)
else:
    print('Website is working fine')
Works on Python 3.
# Python 2 code: httplib and StandardError do not exist in Python 3.
import httplib
import socket
import re

def is_website_online(host):
    """ This function checks to see if a host name has a DNS entry by checking
        for socket info. If the website gets something in return,
        we know it's available to DNS.
    """
    try:
        socket.gethostbyname(host)
    except socket.gaierror:
        return False
    else:
        return True

def is_page_available(host, path="/"):
    """ This function retrieves the status code of a website by requesting
        HEAD data from the host. This means that it only requests the headers.
        If the host cannot be reached or something else goes wrong, it returns
        False.
    """
    try:
        conn = httplib.HTTPConnection(host)
        conn.request("HEAD", path)
        # Treat any 2xx or 3xx status as available, anything else as not.
        return bool(re.match(r"^[23]\d\d$", str(conn.getresponse().status)))
    except StandardError:
        return False
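The httplib module was renamed http.client in Python 3, and StandardError was removed. Below is a rough Python 3 sketch of the same two checks. The name host_resolves and the port and timeout parameters are assumptions added for illustration and are not part of the answer above.

```python
import http.client
import socket


def host_resolves(host):
    """Return True if the host name can be resolved to an address."""
    try:
        socket.gethostbyname(host)
    except socket.gaierror:
        return False
    return True


def is_page_available(host, path="/", port=80, timeout=5.0):
    """Return True if a HEAD request gets a 2xx or 3xx status, else False."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        status = conn.getresponse().status
    except (http.client.HTTPException, OSError):
        # Covers DNS failures, refused connections, timeouts, bad responses.
        return False
    finally:
        conn.close()
    return 200 <= status < 400
```

For an HTTPS site, http.client.HTTPSConnection with port 443 would be the counterpart of HTTPConnection here.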
import requests

URL = "https://api.github.com"

try:
    response = requests.head(URL)
except Exception as e:
    print(f"NOT OK: {str(e)}")
else:
    if response.status_code == 200:
        print("OK")
    else:
        print(f"NOT OK: HTTP response code {response.status_code}")
You can use the requests library to check whether a website is up, i.e. whether its status code is 200.
import requests
url = "https://www.google.com"
page = requests.get(url)
print (page.status_code)
>> 200
import urllib2
import socket

def check_url(url, timeout=5):
    try:
        return urllib2.urlopen(url, timeout=timeout).getcode() == 200
    except urllib2.URLError as e:
        return False
    except socket.timeout as e:
        # Return False on a timeout instead of printing it.
        return False

print check_url("http://google.fr")  # True
print check_url("http://notexist.kc")  # False
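The HTTPConnection object from the httplib module in the standard library will probably do the trick for you. By the way, if you start doing anything advanced with HTTP in Python, be sure to check out httplib2; it is a great library.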
requests is my first choice, but it needs to be done like this:
import requests

url = "https://www.google.com"  # example URL
try:
    requests.get(url)
except requests.exceptions.ConnectionError:
    print(f"URL {url} not reachable")