Difference between the OpenCV BGR2GRAY and Pillow convert functions

Question

Tags: c++, opencv, image-processing, python-imaging-library

3

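I am trying to OCR an image that contains both digits and characters, using the Tesseract library with OpenCV and C++. Before calling Tesseract, I convert the image to grayscale with OpenCV.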

cvtColor(roiImg,roiImg,CV_BGR2GRAY);

This is the grayscale image I obtained: [image: Gray scale image i received with python]

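The OCR result for this image is not 100% accurate.

I then tested the same image with Python's Pillow library. The original image was converted to grayscale with the following call.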

gray = image.convert('L')

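This is the grayscale image I obtained with the Pillow library: [image: Gray Scale image i received with pillow library]

This second grayscale image gave a 100% accurate OCR result.

While searching the internet, I found it mentioned that OpenCV's BGR2GRAY and Pillow's img.convert both use the same luminance transform.

So why do the two produce different OCR results?

Thanks in advance for your help!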

- M.Wijethunge

4

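OpenCV uses BGR channel order by default, but you are converting RGB to grayscale (CV_RGB2GRAY). That should be CV_BGR2GRAY instead. - frogatto

@Hi I'm Frogatto You are right, I have edited the code. Do you have any idea why this happens? - M.Wijethunge

Do you still see the same problem after applying @Hi I'm Frogatto's suggestion? - NAmorim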

2

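It looks like both libraries use the same formula to convert RGB to grayscale: http://docs.opencv.org/3.2.0/de/d25/imgproc_color_conversions.html#color_convert_rgb_gray; https://pillow.readthedocs.io/en/4.0.x/reference/Image.html#PIL.Image.Image.convert. The difference may be caused by rounding. You can check this by converting a `16x16` image with values from 0 to 255 and comparing the results. - Catree

@M.Mahawatta To me the two grayscale images look very similar. I suspect the difference arises because the OpenCV conversion uses integers (possibly 16-bit integers, for performance?) while the Pillow conversion uses floating point. How do you perform the OCR step? For Pillow, do you convert the image to gray and save it, and then load that image with OpenCV + Tesseract in C++? Also, your OpenCV code still contains a typo (it should be BGR to gray, not GBR to gray). - Catree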


1 Answer


- Catree · Accepted Answer

Pillow can only read 3x8-bit pixels for color images.

Here is a quick test to see how the two libraries round values:

OpenCV code:

#include <iostream>
#include <opencv2/core.hpp>
#include <opencv2/imgproc.hpp>

int main() {
    // Two test pixels in a 2x1 image, stored in BGR order
    cv::Mat img(2, 1, CV_8UC3), img_gray;
    img.at<cv::Vec3b>(0, 0) = cv::Vec3b(248, 249, 249); //BGR
    img.at<cv::Vec3b>(1, 0) = cv::Vec3b(249, 248, 248); //BGR

    cv::cvtColor(img, img_gray, cv::COLOR_BGR2GRAY);
    std::cout << "img:\n" << img << std::endl;
    std::cout << "img_gray:\n" << img_gray << std::endl;

    // Same weighted sum computed in floating point, for reference
    float val1 = 249*0.299f + 249*0.587f + 248*0.114f; //RGB
    float val2 = 248*0.299f + 248*0.587f + 249*0.114f; //RGB
    std::cout << "val1=" << val1 << std::endl;
    std::cout << "val2=" << val2 << std::endl;
    return 0;
}

Output:

img:
[248, 249, 249;
 249, 248, 248]
img_gray:
[249;
 248]
val1=248.886
val2=248.114
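The grayscale values printed above agree with val1 and val2 rounded to the nearest integer. A minimal sketch of that arithmetic in plain Python, assuming the 0.299/0.587/0.114 weights used in the test; `luma_nearest` is a hypothetical helper for illustration and is not how OpenCV is implemented internally:

```python
def luma_nearest(r, g, b):
    """Weighted sum in integer thousandths, rounded to the nearest integer.

    Assumption: weights 0.299/0.587/0.114, as in the test above.
    """
    return (299 * r + 587 * g + 114 * b + 500) // 1000


# The two test pixels, given as (R, G, B)
print(luma_nearest(249, 249, 248))  # 249
print(luma_nearest(248, 248, 249))  # 248
```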

Python code:

import numpy as np
from PIL import Image

# Same two test pixels as in the OpenCV code, here in RGB order
rgbArray = np.zeros((2,1,3), 'uint8')
rgbArray[0,0,0] = 249 #R
rgbArray[0,0,1] = 249 #G
rgbArray[0,0,2] = 248 #B
rgbArray[1,0,0] = 248 #R
rgbArray[1,0,1] = 248 #G
rgbArray[1,0,2] = 249 #B

img = Image.fromarray(rgbArray)
imgGray = img.convert('L')

print("rgbArray:\n", rgbArray)
print("imgGray:\n", np.asarray(imgGray))
print("np.asarray(imgGray).dtype: ", np.asarray(imgGray).dtype)

Output:

rgbArray:
 [[[249 249 248]]

 [[248 248 249]]]
imgGray:
 [[248]
 [248]]
np.asarray(imgGray).dtype:  uint8
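The Pillow values shown here are consistent with the fractional part of the weighted sum being dropped rather than rounded. The sketch below illustrates that reading in plain Python. It assumes the 0.299/0.587/0.114 weights and truncation; `luma_floor` is a hypothetical helper, not taken from Pillow's source, and other Pillow versions may round differently.

```python
def luma_floor(r, g, b):
    """Weighted sum in integer thousandths with the fractional part dropped.

    Assumption: weights 0.299/0.587/0.114 and truncation, chosen to match
    the output shown in this answer.
    """
    return (299 * r + 587 * g + 114 * b) // 1000


# The two test pixels, given as (R, G, B)
pixels = [(249, 249, 248), (248, 248, 249)]
print([luma_floor(*p) for p in pixels])  # [248, 248]

# The weights sum to 1000, so neutral gray pixels (R == G == B) come out
# unchanged; only tinted pixels can land one level lower than with rounding.
print(all(luma_floor(v, v, v) == v for v in range(256)))  # True
```

Under this reading, a pure gray ramp would not expose the difference between the two libraries; a test image needs pixels whose channels differ, like the two used here.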