How do I read image data from a URL in Python?

Question

308

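What I'm trying to do is fairly simple when working with a local file, but the problem comes when I try to do it with a remote URL.

Basically, I'm trying to create a PIL image object from a file pulled from a URL. Sure, I could always just fetch the URL and store it in a temporary file, then open it as an image object, but that feels really inefficient.

Here's what I have:

Image.open(urlopen(url))

It fails with a complaint that seek() isn't available, so then I tried this:

Image.open(urlopen(url).read())

But that didn't work either. Is there a better way to do this, or is writing to a temporary file the accepted way of doing this sort of thing?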

- Daniel Quinn

1

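See also: How to save an image locally using Python whose URL address I already know? - Martin Thoma

There may be a problem where the request cannot fetch the image from the URL. Try the same test with another URL (for testing purposes only). - Aashish Chaubey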

15 Answers

- Mohamed TOUATI · Answer 1

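from PIL import Image
import numpy as np
import requests

url = "https://previews.123rf.com/images/darrenwhi/darrenwhi1310/darrenwhi131000024/24022179-photo-of-many-cars-with-one-a-different-color.jpg"
# stream=True lets us hand the response body to PIL as a file-like object (.raw)
image = Image.open(requests.get(url, stream=True).raw)
# image = image.resize((420, 250))  # optional resize

image_array = np.array(image)  # pixel data as a NumPy array
image  # as the last line of a notebook cell, this displays the image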

- Mohamed TOUATI · Answer 2

Use urllib.request.urlretrieve() and PIL.Image.open() to download and read the image data:

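import requests
import urllib.request
import PIL.Image  # "import PIL" alone does not reliably expose PIL.Image

# Download the image to a local file, then open that file
urllib.request.urlretrieve("https://i.imgur.com/ExdKOOz.png", "sample.png")
img = PIL.Image.open("sample.png")
img.show()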
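If a file on disk is acceptable but a fixed name such as sample.png in the working directory is not, the download can go to a temporary file instead. This is a minimal sketch using only the standard library; `download_to_temp` and its `suffix` parameter are hypothetical names chosen for illustration, not part of the answer above.

```python
import os
import tempfile
from urllib.request import urlretrieve


def download_to_temp(url, suffix=".img"):
    """Download `url` into a new temporary file and return the file's path."""
    # mkstemp creates the file and returns an open descriptor plus the path
    fd, path = tempfile.mkstemp(suffix=suffix)
    os.close(fd)  # urlretrieve reopens the file by name
    urlretrieve(url, path)
    return path
```

The returned path can be passed to PIL.Image.open(path). Pillow reads pixel data lazily, so call img.load() before deleting the file with os.remove(path).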

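Alternatively, call requests.get(url), where url is the address of the file to download with a GET request. Call io.BytesIO(obj), where obj is the response content, to wrap the raw data in an in-memory bytes buffer. To load the image data, call PIL.Image.open(bytes_obj), where bytes_obj is that buffer.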

import io
import requests
import PIL.Image

response = requests.get("https://i.imgur.com/ExdKOOz.png")
image_bytes = io.BytesIO(response.content)  # seekable in-memory file
img = PIL.Image.open(image_bytes)
img.show()
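The same idea works with urlopen from the question. Image.open expects a file name or a file-like object, so passing the bytes returned by read() directly does not work, while wrapping them in io.BytesIO provides an object with seek(). The following is a sketch under that assumption; `read_url_into_buffer` and `open_image_from_url` are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import io
from urllib.request import urlopen


def read_url_into_buffer(url, timeout=10):
    """Download `url` and return its body as a seekable in-memory file."""
    with urlopen(url, timeout=timeout) as resp:
        return io.BytesIO(resp.read())


def open_image_from_url(url):
    """Open the downloaded bytes with Pillow without writing to disk."""
    from PIL import Image  # local import: only this function needs Pillow

    return Image.open(read_url_into_buffer(url))
```

Calling open_image_from_url(url) then returns a PIL image object, with no temporary file involved.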

- AdithyaM · Answer 3

Get the image directly as a NumPy array, without using PIL:

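import io
import requests
import matplotlib.pyplot as plt

url = "IMAGE-URL-GOES-HERE"  # placeholder: replace with the image URL
response = requests.get(url).content
img = plt.imread(io.BytesIO(response), format='JPG')  # NumPy array
plt.imshow(img)
plt.show()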

- Anthony Mooz · Answer 4

For Python 3 with OpenCV:

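import cv2
import numpy as np
from urllib.request import urlopen

image_url = "IMAGE-URL-GOES-HERE"
resp = urlopen(image_url)
# Put the downloaded bytes into a 1-D uint8 array, then decode it as an image
image = np.asarray(bytearray(resp.read()), dtype="uint8")
image = cv2.imdecode(image, cv2.IMREAD_COLOR) # The image object

# Optional: For testing & viewing the image
cv2.imshow('image',image)
cv2.waitKey(0)  # keep the window open until a key is pressed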

For Python 3 with OpenCV in Google Colab/Jupyter Notebook:

import cv2
import numpy as np
from google.colab.patches import cv2_imshow
from urllib.request import urlopen

image_url = "IMAGE-URL-GOES-HERE"
resp = urlopen(image_url)
# Put the downloaded bytes into a 1-D uint8 array, then decode it as an image
image = np.asarray(bytearray(resp.read()), dtype="uint8")
image = cv2.imdecode(image, cv2.IMREAD_COLOR) # The image object

# Optional: For testing & viewing the image
cv2_imshow(image)

- Ajeet Verma · Answer 5

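The solutions above may work, but they miss one point that I want to highlight: when we fetch an image URL in order to read it, we may not always get the actual image content unless we pass headers with the GET request.

Example: request without headers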

import requests
url = "https://www.roaringcreationsfilms.com/rcsfilms-media/chankya-quotes-in-hindi-32.jpg"
data = requests.get(url).content

If we inspect the data:

print(data)
b'<head><title>Not Acceptable!</title></head><body><h1>Not Acceptable!</h1><p>An 
appropriate representation of the requested resource could not be found on this server.
This error was generated by Mod_Security.</p></body></html>'

As you can see, we did not actually get the image content.

Request with headers

import requests
url = "https://www.roaringcreationsfilms.com/rcsfilms-media/chankya-quotes-in-hindi-32.jpg"
headers = {"User-Agent": "PostmanRuntime/7.31.1"}
data = requests.get(url, headers=headers).content

And if we inspect the data now:

print(data)
b'\xff\xd8\xff\xe0\x00\x10JFIF\x00\x01\x01\x00\x00\x01\x00\x01\x00\x00\xff\xdb\x00C\x00\t\x06\x06\........\xfb\x04El\xb3\xa8L\xbc\xa12\xc6<\xc4\x891\xf2L|\xf7\x9eV\x18\xc5\xd8\x8f\x02\xca\xdc\xb1c+-\x96\'\x86\xcb,l\xb12;\x16\xd4j\xfd/\xde\xbf\xff\xd9'

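Now we get the actual content of the image.

Note that different URLs may need different combinations of headers (for example "User-Agent", "Accept", "Accept-Encoding") to return the data, and some may need no headers at all. Still, it is good practice to send at least a "User-Agent" header with every request.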
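The check done by eye above (HTML error page versus bytes starting with \xff\xd8\xff) can be automated before the data is handed to an image library. The sketch below uses only the standard library; `sniff_image_format`, `fetch_image_bytes`, the short signature table, and the default "Mozilla/5.0" User-Agent value are all assumptions made for illustration, and the signature list is deliberately not exhaustive.

```python
from urllib.request import Request, urlopen

# Leading bytes of a few common image formats (not an exhaustive list)
IMAGE_SIGNATURES = {
    b"\xff\xd8\xff": "jpeg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_format(data):
    """Return a format name if `data` starts with a known signature, else None."""
    for signature, name in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def fetch_image_bytes(url, user_agent="Mozilla/5.0", timeout=10):
    """Fetch `url` with a User-Agent header and reject non-image responses."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=timeout) as resp:
        data = resp.read()
    if sniff_image_format(data) is None:
        # Show the first bytes so an HTML error page is easy to recognise
        raise ValueError(f"response does not look like an image: {data[:40]!r}")
    return data
```

With this in place, a response like the "Not Acceptable!" page raises a ValueError immediately instead of failing later inside the image decoder.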