Tesseract OCR has trouble recognizing numbers

Question

python opencv image-processing tesseract python-tesseract

4

I am trying to detect some numbers with Tesseract in Python. Below are my starting image and the result I can get from it. Here is the code I use to get there.

import pytesseract
import cv2
import numpy as np
pytesseract.pytesseract.tesseract_cmd = "C:\\Users\\choll\\AppData\\Local\\Programs\\Tesseract-OCR\\tesseract.exe"

image = cv2.imread(r'64normalwart.png')
lower = np.array([254, 254, 254])
upper = np.array([255, 255, 255])
image = cv2.inRange(image, lower, upper)
image = cv2.bitwise_not(image)
#Uses a language that should work with minecraft text, I have tried with and without, no luck 
text = pytesseract.image_to_string(image, lang='mc')
print(text)
cv2.imwrite("Wartthreshnew.jpg", image)
cv2.imshow("Image", image)
cv2.waitKey(0)

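I end up with a clean black-and-white image of the digits, which looks pretty good, but Tesseract still cannot recognize them. I also noticed that the digits are quite jagged, and I don't know how to fix that. Does anyone have a recommendation? How can I get Tesseract to recognize these numbers?

Starting image

Processed image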

- C Holley

1

You could try cv2.blur() to smooth the rough edges of the digits. It will make the whole image blurrier, but Tesseract may find the digits easier to recognize. - sj95126

Thanks for the suggestion. The image may simply be too small, but the digits still aren't detected. - C Holley

Try adding a config with psm 6 or 7, like this: pytesseract.image_to_string(img, config='--psm 6') - Norbert Tiborcz

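Good idea. The solution I found was to use --psm 8, which treats the image as a single word, and to restrict the output to digits only. https://dev59.com/FlcP5IYBdhLWcg3wTIMt is a useful resource for anyone who comes across this question in the future. - C Holley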

3 Answers

0

Use pytesseract.image_to_string(img, config='--psm 8'), or try different configurations to see whether the image can be recognized. A useful reference is "Pytesseract OCR multiple config options".
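
One way to "try different configurations" is to loop over a few page segmentation modes and compare the output. The following is a minimal sketch under stated assumptions: the helper names psm_config and try_psm_modes and the default list of modes are made up for this example, and pytesseract is imported inside the function so that the config helper can be used without Tesseract installed.

```python
def psm_config(psm, extra=""):
    """Build a Tesseract config string for one page segmentation mode."""
    return f"--psm {psm} {extra}".strip()


def try_psm_modes(image, modes=(6, 7, 8, 13)):
    """Run OCR once per mode and return {psm: recognized text}."""
    import pytesseract  # imported lazily; needs a working Tesseract install

    results = {}
    for psm in modes:
        text = pytesseract.image_to_string(image, config=psm_config(psm))
        results[psm] = text.strip()
    return results
```

Calling try_psm_modes(image) on the thresholded image and printing the dictionary shows at a glance which mode, if any, returns the expected digits.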

- C Holley

0

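I think Tesseract blacklists digits by default, so I tried adding the characters I wanted with tessedit_char_whitelist, but that did not work. I then tried removing the digits from the blacklist with this setting instead: tessedit_char_unblacklist='0123456789'.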

pytesseract.image_to_string(img, lang='eng', config='--psm 6 --oem 3 -c tessedit_char_unblacklist=0123456789')
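
For comparison, here is a small sketch of the digits-only approach mentioned in the question's comments (--psm 8 combined with a character restriction). The helper names digits_config and keep_digits are hypothetical. tessedit_char_whitelist is a real Tesseract variable, but it is not honored the same way by every Tesseract version and engine, so the example also filters the returned text as a fallback.

```python
import re

DIGITS = "0123456789"


def digits_config(psm=8):
    """Config string that asks Tesseract to consider only digit characters."""
    return f"--psm {psm} -c tessedit_char_whitelist={DIGITS}"


def keep_digits(text):
    """Drop everything except ASCII digits from the OCR output."""
    return re.sub(r"[^0-9]", "", text)
```

A typical call would be keep_digits(pytesseract.image_to_string(img, config=digits_config())). The post-filter only removes stray characters; it cannot repair a digit that was misread as a letter.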

- Night Coder

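Remember that Stack Overflow isn't just intended to solve the immediate problem, but also to help future readers find solutions to similar problems, which requires understanding the underlying code. This is especially important for members of our community who are beginners and not familiar with the syntax. Given that, can you edit your answer to include an explanation of what you're doing and why you believe it is the best approach? - undefined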

- Esraa Abdelmaksoud · Accepted Answer

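The problem you are facing is related to the page segmentation mode (PSM). Tesseract segments an image differently depending on the mode. When you do not specify a PSM, it uses mode 3, fully automatic page segmentation, which may not suit your case. I just tried your image, and PSM 6 read it perfectly.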

df = pytesseract.image_to_string(np.array(image),lang='eng', config='--psm 6')

These are all the PSMs currently available:

  0    Orientation and script detection (OSD) only.
  1    Automatic page segmentation with OSD.
  2    Automatic page segmentation, but no OSD, or OCR.
  3    Fully automatic page segmentation, but no OSD. (Default)
  4    Assume a single column of text of variable sizes.
  5    Assume a single uniform block of vertically aligned text.
  6    Assume a single uniform block of text.
  7    Treat the image as a single text line.
  8    Treat the image as a single word.
  9    Treat the image as a single word in a circle.
 10    Treat the image as a single character.
 11    Sparse text. Find as much text as possible in no particular order.
 12    Sparse text with OSD.
 13    Raw line. Treat the image as a single text line,
            bypassing hacks that are Tesseract-specific.