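I am currently working on an OCR project that reads text from labels (see the example images below). I am running into problems with image skew and need help deskewing the images so that the text is horizontal rather than slanted. My current approach scores candidate angles within a given range (code included below), but it is inconsistent: sometimes it overcorrects the skew, and sometimes it fails to detect the skew at all and leaves the image uncorrected. Note that before skew correction I rotate every image by 270 degrees so the text is upright, and then pass the image through the code below. The image passed into the function is already a binary image.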
Code:

    import cv2
    import numpy as np
    # Assumption: `inter` refers to scipy.ndimage, which provides rotate()
    from scipy import ndimage as inter


    def findScore(img, angle):
        """
        Generates a score for the binary image received, dependent on the given angle.
        Vars:
        - img <- numpy array of the label
        - angle <- candidate angle by which the image is rotated
        Returns:
        - histogram of the image
        - score of the candidate angle
        """
        data = inter.rotate(img, angle, reshape=False, order=0)
        # Row sums (horizontal projection profile); summing as float keeps the
        # differences below from wrapping around for unsigned input
        hist = np.sum(data, axis=1, dtype=np.float64)
        # Sharp transitions between text rows and gaps give a high score
        score = np.sum((hist[1:] - hist[:-1]) ** 2)
        return hist, score


    def skewCorrect(img):
        """
        Takes in a numpy array, determines the skew angle of the text, then corrects the skew and returns the corrected image.
        Vars:
        - img <- numpy array of the label
        Returns:
        - Corrected image as a numpy array
        """
        # Scales the image down to 75% before determining the skew angle
        img = cv2.resize(img, (0, 0), fx=0.75, fy=0.75)
        delta = 1
        limit = 45
        angles = np.arange(-limit, limit + delta, delta)
        scores = []
        for angle in angles:
            hist, score = findScore(img, angle)
            scores.append(score)
        bestScore = max(scores)
        bestAngle = angles[scores.index(bestScore)]
        # Note: the rotation is applied to the downscaled image, which is what gets returned
        rotated = inter.rotate(img, bestAngle, reshape=False, order=0)
        print("[INFO] angle: {:.3f}".format(bestAngle))
        #cv2.imshow("Original", img)
        #cv2.imshow("Rotated", rotated)
        #cv2.waitKey(0)
        #Return img
        return rotated
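
To make the behaviour easier to reproduce without the label images, here is a minimal sketch that applies the same squared-difference row-profile score to a synthetic image with a known skew. The helper names (`make_bars`, `profile_score`, `estimate_angle`), the image size, and the 7-degree skew are all hypothetical and chosen only for illustration; a real label is much noisier than these clean bars.

```python
import numpy as np
from scipy import ndimage


def make_bars(size=200, period=20, thickness=6, margin=20):
    """Synthetic binary 'text lines': horizontal white bars on a black background."""
    img = np.zeros((size, size), dtype=np.uint8)
    for top in range(margin, size - margin, period):
        img[top:top + thickness, margin:size - margin] = 1
    return img


def profile_score(img, angle):
    """Squared-difference score of the row sums after rotating by `angle` degrees."""
    data = ndimage.rotate(img, angle, reshape=False, order=0)
    hist = np.sum(data, axis=1, dtype=np.int64)  # signed, so differences cannot wrap
    return int(np.sum((hist[1:] - hist[:-1]) ** 2))


def estimate_angle(img, limit=15, delta=1):
    """Return the candidate angle (in degrees) with the highest score."""
    angles = np.arange(-limit, limit + delta, delta)
    scores = [profile_score(img, a) for a in angles]
    return int(angles[int(np.argmax(scores))])


known_skew = 7  # degrees; arbitrary value for the synthetic test image
skewed = ndimage.rotate(make_bars(), known_skew, reshape=False, order=0)
best = estimate_angle(skewed)
print("known skew:", known_skew, "estimated correction:", best)
```

On a clean image like this, the estimated correction should land close to the negative of the known skew. Comparing that against what happens on the real labels may help narrow down whether the inconsistency comes from the scoring itself or from the input images.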
Example images of the label before and after correction:
Before ->
After
![](https://imgur.com/CO32WLn.png)
![](https://imgur.com/XRaJ9Bz.png)
If anyone can help me solve this problem, it would be much appreciated.