Image processing - rotation and optical character recognition

Question


c++ image image-processing captcha

4

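Good morning everyone,

The topic I would like to discuss today is "image processing in C++".

So far I have managed to filter all the noisy parts out of the picture and convert it to black and white.

But now I have two questions.

First question:
Below is a screenshot of the image. What is the best way to find out how to rotate the text? Ideally the text would end up horizontal. Does anyone have a good link or example?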

enter image description here

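Second question:
What should I do next? Do you think I should send the image to an "optical character recognizer" (a), or should I filter out each letter (b)?

If the answer is (a), what is the smallest OCR library? All the libraries I have found so far are overpowered and difficult to integrate into an existing project (e.g. gocr or tesseract).

If the answer is (b), what is the best way to save each letter as its own image? Should I search for a white pixel, then walk from pixel to pixel and save the coordinates in a 2D array? And what about the letter "i" ;)

Thanks to everyone who helps me find my way!
Sorry for the weird English above; I am still a language newbie :)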

- Robert Weindl

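Splitting the image into pieces that each contain a single character is a non-trivial task and is itself part of OCR, so you should send the whole image. For an example of a non-trivial case, see http://www.markboulton.co.uk/images/uploads/2.gif. - Vlad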

1

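Are you seriously asking for help with programmatically breaking a CAPTCHA? That is, one of those challenges specifically designed to be hard for OCR systems to read? - Spence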

No, I am not asking how to break a CAPTCHA. My main question is how to work out the rotation needed to make the text in the image horizontal. - Robert Weindl

@Spence Usually yes, but not in this example. - Dr. belisarius

3 Answers

1

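For your first question:

First, remove any "specks" of noisy white pixels that are not part of the letter sequence. Apply a gentle low-pass filter (pixel color = average of the surrounding pixels), then clamp the pixel values to pure black or pure white. This should get rid of the little "dot" under the "a" character in your image, as well as any other specks.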
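A minimal sketch of that despeckling step, assuming an 8-bit grayscale image held in a small hypothetical `Gray` struct (not a type from any library). The filter here is a 3x3 box average that includes the center pixel, and the threshold of 128 is an arbitrary illustrative choice.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: 8-bit grayscale, row-major,
// 0 = black, 255 = white.
struct Gray {
    int w, h;
    std::vector<std::uint8_t> px;
    Gray(int w_, int h_) : w(w_), h(h_), px(static_cast<std::size_t>(w_) * h_, 0) {}
    std::uint8_t& at(int x, int y) { return px[static_cast<std::size_t>(y) * w + x]; }
    std::uint8_t at(int x, int y) const { return px[static_cast<std::size_t>(y) * w + x]; }
};

// Low-pass filter (3x3 box average) followed by clamping to pure black/white.
Gray despeckle(const Gray& in, int threshold = 128) {
    assert(in.w > 0 && in.h > 0);
    Gray out(in.w, in.h);
    for (int y = 0; y < in.h; ++y) {
        for (int x = 0; x < in.w; ++x) {
            int sum = 0, n = 0;
            // Average over the neighbourhood, skipping pixels outside the image.
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int xx = x + dx, yy = y + dy;
                    if (xx < 0 || yy < 0 || xx >= in.w || yy >= in.h) continue;
                    sum += in.at(xx, yy);
                    ++n;
                }
            }
            out.at(x, y) = (sum / n >= threshold) ? 255 : 0;
        }
    }
    return out;
}
```

With these settings an isolated white pixel averages to about 28 and is turned black, while the interior of a letter stroke stays white. Strokes that are only one pixel wide would also be erased, so the threshold may need lowering for thin fonts.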

Now search for the following pixels:

xMin = white pixel with the lowest  x value (white pixel closest to the left edge)
xMax = white pixel with the largest x value (white pixel closest to the right edge)
yMin = white pixel with the lowest  y value (white pixel closest to the top edge)
yMax = white pixel with the largest y value (white pixel closest to the bottom edge)

With these four pixel values, form a bounding box: Rect(xMin, yMin, xMax, yMax);
compute the area of the bounding box and find its center.

Using the center of the bounding box as the pivot, rotate the image by N degrees. (You can pick N: 1 degree would be an OK value.)

Repeat the process of finding xMin, xMax, yMin, yMax and recompute the area.

Continue rotating by N degrees until you've rotated K degrees. Also rotate by -N degrees until you've rotated by -K degrees (where K is the max rotation, say 30 degrees). At each step, recompute the area of the bounding box.

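The rotation that produces the bounding box with the smallest area is most likely the one that aligns the letters parallel to the bottom edge (horizontally).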
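The search described above can be sketched roughly as follows. This is an illustration under a few assumptions: the white pixels have already been collected into a list of coordinates (the `Point` type and both function names are hypothetical), and the coordinates are rotated instead of resampling the whole image. The bounding-box area does not depend on the pivot, so rotating about the origin gives the same areas as rotating about the box center.

```cpp
#include <cassert>
#include <cmath>
#include <limits>
#include <vector>

struct Point { double x, y; };  // coordinates of one white pixel

// Area of the axis-aligned bounding box of pts after rotating them by deg degrees.
double bboxAreaAfterRotation(const std::vector<Point>& pts, double deg) {
    const double rad = deg * 3.14159265358979323846 / 180.0;
    const double c = std::cos(rad), s = std::sin(rad);
    double xMin = std::numeric_limits<double>::infinity(), xMax = -xMin;
    double yMin = xMin, yMax = -xMin;
    for (const Point& p : pts) {
        const double rx = c * p.x - s * p.y;
        const double ry = s * p.x + c * p.y;
        if (rx < xMin) xMin = rx;
        if (rx > xMax) xMax = rx;
        if (ry < yMin) yMin = ry;
        if (ry > yMax) yMax = ry;
    }
    return (xMax - xMin) * (yMax - yMin);
}

// Try every angle from -maxDeg to +maxDeg in steps of stepDeg (N and K in the
// text) and return the angle whose bounding box has the smallest area.
double estimateDeskewAngle(const std::vector<Point>& pts,
                           double stepDeg = 1.0, double maxDeg = 30.0) {
    assert(!pts.empty() && stepDeg > 0.0 && maxDeg >= 0.0);
    double bestDeg = 0.0;
    double bestArea = bboxAreaAfterRotation(pts, 0.0);
    const int steps = static_cast<int>(maxDeg / stepDeg);
    for (int i = -steps; i <= steps; ++i) {
        const double deg = i * stepDeg;
        const double area = bboxAreaAfterRotation(pts, deg);
        if (area < bestArea) {
            bestArea = area;
            bestDeg = deg;
        }
    }
    return bestDeg;
}
```

The returned value is the rotation to apply to the points, in the same coordinate convention they were given in. With image coordinates, where y grows downward, the visual direction of a positive angle is flipped.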

- selbie

0

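You can measure the height of each white pixel from the bottom and work out how much the text is slanted. It is a very simple approach, but it worked well when I tried it.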
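One possible reading of this idea, sketched under stated assumptions: for every column, take the lowest white pixel, record its height above the bottom edge, and fit a straight line through those heights with least squares. The `Gray` struct and the function name are hypothetical, and the answer does not say which fitting method was used.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal image type: 8-bit grayscale, row-major,
// 0 = black, 255 = white.
struct Gray {
    int w, h;
    std::vector<std::uint8_t> px;
    Gray(int w_, int h_) : w(w_), h(h_), px(static_cast<std::size_t>(w_) * h_, 0) {}
    std::uint8_t& at(int x, int y) { return px[static_cast<std::size_t>(y) * w + x]; }
    std::uint8_t at(int x, int y) const { return px[static_cast<std::size_t>(y) * w + x]; }
};

// Skew angle in degrees, estimated from the bottom profile of the white pixels.
// A positive result means the text rises toward the right.
double skewFromBottomProfile(const Gray& img) {
    double n = 0, sx = 0, sy = 0, sxx = 0, sxy = 0;
    for (int x = 0; x < img.w; ++x) {
        // Scan upward from the bottom edge to the first white pixel.
        for (int y = img.h - 1; y >= 0; --y) {
            if (img.at(x, y) != 0) {
                const double height = img.h - 1 - y;
                n += 1;
                sx += x;
                sy += height;
                sxx += static_cast<double>(x) * x;
                sxy += x * height;
                break;
            }
        }
    }
    assert(n >= 2);  // need at least two columns containing white pixels
    // Ordinary least-squares slope of height against x.
    const double slope = (n * sxy - sx * sy) / (n * sxx - sx * sx);
    return std::atan(slope) * 180.0 / 3.14159265358979323846;
}
```

Descenders (g, p, y) and leftover noise pull the fitted line away from the true baseline, so this is likely to work best after despeckling.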

- Zitrax


- Dr. belisarius · Accepted Answer

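Your first question is usually called "deskewing". You can google for it (there are plenty of references). This article is a nice paper showing how to do deskewing. You could also try principal component analysis on the image (though the results are not as good as with the approach in that paper).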
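For the principal component analysis variant, a minimal sketch might look like this. It assumes the white pixels have been collected into a list of coordinates (the `Point` type and function name are hypothetical) and uses the closed-form orientation of the dominant eigenvector of the 2x2 covariance matrix, so no linear algebra library is needed.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

struct Point { double x, y; };  // coordinates of one white pixel

// Angle in degrees of the direction along which the points spread the most.
double principalAxisAngleDeg(const std::vector<Point>& pts) {
    assert(pts.size() >= 2);
    const double n = static_cast<double>(pts.size());

    // Centroid of the white pixels.
    double mx = 0.0, my = 0.0;
    for (const Point& p : pts) { mx += p.x; my += p.y; }
    mx /= n;
    my /= n;

    // Unnormalised covariance terms; the common 1/n factor cancels in atan2.
    double cxx = 0.0, cyy = 0.0, cxy = 0.0;
    for (const Point& p : pts) {
        const double dx = p.x - mx, dy = p.y - my;
        cxx += dx * dx;
        cyy += dy * dy;
        cxy += dx * dy;
    }

    // Orientation of the dominant eigenvector of [[cxx, cxy], [cxy, cyy]].
    const double rad = 0.5 * std::atan2(2.0 * cxy, cxx - cyy);
    return rad * 180.0 / 3.14159265358979323846;
}
```

Rotating the image by the negative of the returned angle should bring the dominant axis to horizontal. The estimate is only meaningful when the text is clearly wider than it is tall, for example a single line of characters.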