Tesseract OCR gives really bad output even with typed text

Question

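I have been trying to use Tesseract OCR to extract some numbers from pre-cropped images, and it is not working well at all even though the images are fairly clean. I have looked around for solutions, but all the other questions I have seen here deal with cropping problems or skewed text.

Here is a sample of my code, which tries to read the image and print the result to the command line.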

    import cv2
    import pytesseract

    # im (the BGR image) and im_path (its file name) are defined earlier

    # Convert the image to greyscale for OCR
    im_g = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)

    # Create an inverted binary image (Otsu threshold) to simplify things
    im_t = cv2.threshold(im_g, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)[1]

    # Define the kernel used for dilation
    rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (20, 20))

    # Apply dilation to the threshold image
    im_d = cv2.dilate(im_t, rect_kernel, iterations=1)

    # Find contours
    contours = cv2.findContours(im_t, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[0]

    for cnt in contours:
        x, y, w, h = cv2.boundingRect(cnt)

        # Crop the original image to the bounding box of the contour
        im_c = im[y:y+h, x:x+w]

        speed = pytesseract.image_to_string(im_c)
        print(im_path + " : " + speed)

Here is an example image.

Its output is:

    frame10008.jpg : VAeVAs}

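I got a slight improvement on some images by adding the following config to the Tesseract function that converts the image to a string:

    config="--psm 7"

Without the new config, it detected nothing in this image. Now it outputs:

    frame100.jpg : | U |

Any idea what I am doing wrong? Is there a different approach I could take to this problem? I am open to not using Tesseract.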

- nabudahab

2 Answers

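I tried inverting the foreground and background pixel values and running OCR on the image with the image_to_data function, and got the expected result: 7576

    from PIL import Image
    import pytesseract

    # gray_image: the greyscale image as a NumPy array (uint8)
    # Invert foreground and background pixel values
    gray_image = 255 - gray_image
    # Convert the OpenCV image to the PIL image format
    gray_pil = Image.fromarray(gray_image)

    # OCR the image as a single line of text
    config = '-l eng --oem 1 --psm 7'
    text = pytesseract.image_to_data(gray_pil, config=config, output_type='dict')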
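
With output_type='dict', image_to_data returns a dictionary of parallel lists rather than a plain string, so the recognised text still has to be pulled out of it. A minimal sketch of one way to do that follows; the helper name words_from_data, the min_conf parameter, and the sample dictionary are illustrative assumptions, not part of pytesseract or of the answer above.

```python
def words_from_data(data, min_conf=0.0):
    """Return the recognised words from an image_to_data dict.

    Skips empty entries and entries whose confidence is below min_conf.
    """
    words = []
    for text, conf in zip(data["text"], data["conf"]):
        # conf can be an int, a float, or a numeric string depending on the
        # pytesseract version; -1 marks rows (page, block, line) with no word.
        if text.strip() and float(conf) >= min_conf:
            words.append(text.strip())
    return words


# Made-up dictionary shaped like an image_to_data result (illustration only)
sample = {"text": ["", "", "", "", "7576"],
          "conf": ["-1", "-1", "-1", "-1", "96"]}
print(" ".join(words_from_data(sample)))  # 7576
```

Raising min_conf is one way to drop low-confidence garbage such as stray punctuation, at the cost of sometimes discarding a correct reading.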

- flamelite

Content provided by Stack Overflow.

- nabudahab · Accepted Answer

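I found a decent workaround. First, I made the image larger. Giving Tesseract more room to work with helped it a lot. Second, to get rid of non-digit output, I used the following config in the function that converts the image to a string:

    config = "--psm 7 outputbase digits"

That line now looks like this:

    speed = pytesseract.image_to_string(im_c, config = "--psm 7 outputbase digits")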

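The returned data is still far from perfect, but the success rate is high enough that I should be able to clean out the junk data and interpolate where Tesseract returns no digits.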
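
The cleanup and interpolation could look roughly like the sketch below. It assumes one OCR string per frame, in frame order, and uses simple linear interpolation between the nearest valid readings. The function names and the sample strings are hypothetical, and stripping every non-digit character is a deliberately crude way to clean a reading.

```python
import re


def parse_reading(raw):
    """Return the digits in an OCR string as an int, or None if there are none."""
    digits = re.sub(r"\D", "", raw)
    return int(digits) if digits else None


def fill_gaps(values):
    """Linearly interpolate None entries between known neighbours.

    Gaps at the start or end copy the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        return list(values)
    out = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = values[right]
        elif right is None:
            out[i] = values[left]
        else:
            t = (i - left) / (right - left)
            out[i] = values[left] + t * (values[right] - values[left])
    return out


# Hypothetical per-frame OCR strings, in frame order
raw_readings = ["75\n", "| U |", "77\n", ""]
speeds = [parse_reading(r) for r in raw_readings]
print(fill_gaps(speeds))  # [75, 76.0, 77, 77]
```

Readings that parse to a number but are clearly implausible (for example, a sudden jump between consecutive frames) would need an extra filter before interpolation; that is left out here.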