How to crop the white regions out of an image and produce passport photos with OpenCV

11

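I have many images that need to be cropped to a perfect passport-photo size. There are thousands of them, and they need to be cropped and straightened automatically, like this. If an image is too blurry to crop, it should be copied to a rejected folder instead. I tried a Haar cascade, but that only gives me the face. I need the face together with the cropped background. Can anyone tell me how to code this in OpenCV or any other software?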

            gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
            faceCascade = cv2.CascadeClassifier(
                cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
            faces = faceCascade.detectMultiScale(
                gray,
                scaleFactor=1.3,
                minNeighbors=3,
                minSize=(30, 30)
            )
            if(len(faces) == 1):
                for (x, y, w, h) in faces:
                    if(x-w < 100 and y-h < 100):
                        ystart = int(y-y*int(y1)/100)
                        xstart = int(x-x*int(x1)/100)
                        yend = int(h+h*int(y1)/100)
                        xend = int(w+w*int(y2)/100)
                        roi_color = img[ystart:y + yend, xstart:x + xend]
                        cv2.imwrite(path, roi_color)

                    else:
                        rejectedCount += 1
                        cv2.imwrite(path, img)

Before

enter image description here enter image description here enter image description here

After

enter image description here enter image description here enter image description here

5 Answers

7
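I would approach your problem with the following steps:
  1. First, we need to get the points we are interested in
  2. Know the pixel size of a normal passport photo

How to get the points of interest

There are several ways:

  1. You can use the Windows Paint application
  2. To be more programmatic, we can use cv2. I will show you how to do it with cv2.

Note that this will not produce a high-resolution image; you will have to adjust the code yourself.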

# imports 
import numpy as np
import cv2

width = height = 600 # normal passport photo size in pixels

# global list that is updated each time we click on the image
pt1 = []
# destination corners as (x, y): top-left, top-right, bottom-left, bottom-right
pt2 = np.float32([[0, 0], [width, 0], [0, height], [width, height]])
def mouseEvent(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        global pt1
        if len(pt1) == 4:
            pt1 = []
        else:
            pt1.append([x, y])

while 1:
    image = cv2.imread("img.jpg", cv2.IMREAD_UNCHANGED)
    cv2.imshow("Original Image", image)
    cv2.setMouseCallback("Original Image", mouseEvent)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    if len(pt1) == 4:
        break

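We then use two cv2 functions, getPerspectiveTransform and warpPerspective. getPerspectiveTransform() takes two sets of points, pt1 and pt2. We then call warpPerspective() with three positional arguments: the image, the matrix, and the output size: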

image = cv2.imread("img.jpg", 0)
matrix = cv2.getPerspectiveTransform(np.float32(pt1), pt2)
# dsize is (width, height); image.shape is (rows, cols) and would give the wrong output size
image = cv2.warpPerspective(image, matrix, (width, height))
cv2.imshow("Warp Perspective", image)
cv2.waitKey(0)
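
To make the role of the matrix a little more concrete, here is a minimal NumPy-only sketch of the kind of four-point linear system that a perspective transform is derived from. It is an illustration under assumptions, not OpenCV's implementation: the helper names homography_from_points and apply_homography and the clicked coordinates are hypothetical.

```python
import numpy as np

def homography_from_points(src, dst):
    """Solve for the 3x3 matrix H (with H[2, 2] fixed to 1) that maps src to dst."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h0*x + h1*y + h2) / (h6*x + h7*y + 1), rearranged to be linear in h
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rhs.append(u)
        # v = (h3*x + h4*y + h5) / (h6*x + h7*y + 1)
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.append(v)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, pts):
    """Map (x, y) points through H, including the homogeneous divide."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homog @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

# hypothetical clicked corners (a, b, c, d) and the 600x600 target corners
clicked = [[12, 8], [580, 30], [25, 610], [570, 590]]
target = [[0, 0], [600, 0], [0, 600], [600, 600]]
H = homography_from_points(clicked, target)
print(np.round(apply_homography(H, clicked), 3))
```

Each clicked corner should land on its target corner, which is why the order of the four clicks has to match the order of the points in pt2.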

I know this is not a great explanation, but you should get the idea. The whole program looks like this:


import numpy as np
import cv2

width = height = 600 # normal passport photo size in pixels
pt1 = []
# destination corners as (x, y): top-left, top-right, bottom-left, bottom-right
pt2 = np.float32([[0, 0], [width, 0], [0, height], [width, height]])
def mouseEvent(event, x, y, flags, param):
    if event == cv2.EVENT_LBUTTONDOWN:
        global pt1
        if len(pt1) == 4:
            pt1 = []
        else:
            pt1.append([x, y])
while 1:
    image = cv2.imread("img.jpg", cv2.IMREAD_UNCHANGED)
    cv2.imshow("Original Image", image)
    cv2.setMouseCallback("Original Image", mouseEvent)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    if len(pt1) == 4:
        break

image = cv2.imread("img.jpg", 0)
matrix = cv2.getPerspectiveTransform(np.float32(pt1), pt2)
# dsize is (width, height); image.shape is (rows, cols) and would give the wrong output size
image = cv2.warpPerspective(image, matrix, (width, height))
cv2.imshow("Warp Perspective", image)
cv2.waitKey(0)
  1. When you run the code above, an image is displayed.
  2. To use the program, click the four points A to D in order. For example, if this is your image:
------------------
| (a)          (b)|
|                 |
|                 |
|                 |
|                 |
|                 |
| (c)          (d)|
-------------------

where a, b, c and d are the points of interest on the image that you want to crop

Demo

enter image description here

Click point 1, then 2, then 3, and finally 4 to get the result above.
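
Because the result depends on the click order, one option is to sort the four points before computing the transform. The sketch below is a common heuristic, not part of the original answer; order_corners is a hypothetical helper, and it assumes the photo is rotated by clearly less than 45 degrees, so that the top-left corner has the smallest x + y and the bottom-right the largest.

```python
import numpy as np

def order_corners(pts):
    """Return the 4 points as top-left, top-right, bottom-left, bottom-right."""
    pts = np.asarray(pts, dtype=np.float32)
    s = pts.sum(axis=1)          # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 1] - pts[:, 0]    # y - x: smallest at top-right, largest at bottom-left
    return np.array(
        [pts[np.argmin(s)], pts[np.argmin(d)], pts[np.argmax(d)], pts[np.argmax(s)]],
        dtype=np.float32,
    )

# hypothetical clicks made in an arbitrary order
clicks = [[90, 10], [10, 12], [95, 100], [5, 95]]
print(order_corners(clicks))
```

The returned order matches the layout a, b, c, d in the diagram, so the array can be passed in place of np.float32(pt1).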


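I have more than 10,000 images to crop. I don't think cropping them manually with the mouse is feasible. Is there a way to detect the points automatically? - Harshith J Poojary
1
You can use AI, or cascadeClassifiers, the latter being the better approach. - crispengari
2
"Use AI" is a meaningless answer (equivalent to "use magic"), and cascade classifiers are entirely unsuited to picking out those corners, because the corners are rotated. Cascade classifiers are known to fail when objects are rotated. - Christoph Rackwitz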

5

Here is one way to extract the photo in Python/OpenCV, by relying on the black lines around the image.

Input:

enter image description here

 - Read the input
 - Pad the image with white so that the lines can be extended until intersection
 - Threshold on black to extract the lines
 - Apply morphology close to try to connect the lines somewhat
 - Get the contours and filter on area drawing the contours on a black background
 - Apply morphology close again to fill the line centers
 - Skeletonize to thin the lines
 - Get the Hough lines and draw them as white on a black background
 - Floodfill the center of the rectangle of lines to fill with mid-gray. Then convert that image to binary so that the gray becomes white and all else is black.
 - Get the coordinates of all non-black pixels and then from the coordinates get the rotated rectangle.
 - Use the angle and center of the rotated rectangle to unrotate both the padded image and this mask image via an Affine warp
 - (Alternately, get the four corners of the rotated rectangle from the mask and then project that to the padded input domain using the affine matrix)
 - Get the coordinates of all non-black pixels in the unrotated mask and compute its rotated rectangle.
 - Get the bounding box of the (un-)rotated rectangle 
 - Use those bounds to crop the padded image
 - Save the results

import cv2
import numpy as np
import math
from skimage.morphology import skeletonize

# read image
img = cv2.imread('passport.jpg')
ht, wd = img.shape[:2]

# pad image with white by 20% on all sides
padpct = 20
xpad = int(wd*padpct/100)
ypad = int(ht*padpct/100)
imgpad = cv2.copyMakeBorder(img, ypad, ypad, xpad, xpad, borderType=cv2.BORDER_CONSTANT, value=(255,255,255))
ht2, wd2 = imgpad.shape[:2]

# threshold on black
low = (0,0,0)
high = (20,20,20)

# threshold
thresh = cv2.inRange(imgpad, low, high)

# apply morphology to connect the white lines
kernel = np.ones((5,5), np.uint8)
morph = cv2.morphologyEx(thresh, cv2.MORPH_CLOSE, kernel)

# get contours
contours = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
contours = contours[0] if len(contours) == 2 else contours[1]

# filter on area
mask = np.zeros((ht2,wd2), dtype=np.uint8)
for cntr in contours:
    area = cv2.contourArea(cntr)
    if area > 20:
        cv2.drawContours(mask, [cntr], 0, 255, 1)

# apply morphology to connect the white lines and divide by 255 to make image in range 0 to 1
kernel = np.ones((5,5), np.uint8)
bmask = cv2.morphologyEx(mask, cv2.MORPH_CLOSE, kernel)/255

# apply thinning (skeletonizing)
skeleton = skeletonize(bmask)
skeleton = (255*skeleton).clip(0,255).astype(np.uint8)

# get hough lines
line_img = np.zeros_like(imgpad, dtype=np.uint8)
lines = cv2.HoughLines(skeleton, 1, math.pi/180.0, 90, np.array([]), 0, 0)
# HoughLines returns None when no line reaches the vote threshold
if lines is None:
    raise SystemExit("no Hough lines found")
for rho, theta in lines[:, 0]:
    # separate names so the loop does not overwrite its own variables
    cos_t = math.cos(theta)
    sin_t = math.sin(theta)
    x0, y0 = cos_t*rho, sin_t*rho
    pt1 = ( int(x0+1000*(-sin_t)), int(y0+1000*(cos_t)) )
    pt2 = ( int(x0-1000*(-sin_t)), int(y0-1000*(cos_t)) )
    cv2.line(line_img, pt1, pt2, (255, 255, 255), 1)

# floodfill with mid-gray (128)
xcent = int(wd2/2)
ycent = int(ht2/2)
ffmask = np.zeros((ht2+2, wd2+2), np.uint8)
mask2 = line_img.copy()
mask2 = cv2.floodFill(mask2, ffmask, (xcent,ycent), (128,128,128))[1]

# convert mask2 to binary
mask2[mask2 != 128] = 0
mask2[mask2 == 128] = 255
mask2 = mask2[:,:,0]

# get coordinates of all non-zero pixels
# NOTE: must transpose since numpy coords are y,x and opencv uses x,y
coords = np.column_stack(np.where(mask2.transpose() > 0))

# get rotated rectangle from coords
rotrect = cv2.minAreaRect(coords)
(center), (width,height), angle = rotrect
# from https://www.pyimagesearch.com/2017/02/20/text-skew-correction-opencv-python/
# the `cv2.minAreaRect` function returns values in the
# range [-90, 0); as the rectangle rotates clockwise the
# returned angle trends to 0 -- in this special case we
# need to add 90 degrees to the angle
if angle < -45:
    angle = -(90 + angle)
 
# otherwise, just take the inverse of the angle to make
# it positive
else:
    angle = -angle

# compute correction rotation
rotation = -angle - 90

# compute rotation affine matrix
M = cv2.getRotationMatrix2D(center, rotation, scale=1.0)
    
# unrotate imgpad and mask2 using affine warp
rot_img = cv2.warpAffine(imgpad, M, (wd2, ht2), flags=cv2.INTER_CUBIC, borderValue=(0,0,0))
rot_mask2= cv2.warpAffine(mask2, M, (wd2, ht2), flags=cv2.INTER_CUBIC, borderValue=(0,0,0))

# get coordinates of all non-zero pixels
# NOTE: must transpose since numpy coords are y,x and opencv uses x,y
coords2 = np.column_stack(np.where(rot_mask2.transpose() > 0))

# get bounding box
x,y,w,h = cv2.boundingRect(coords2)
print(x,y,w,h)

# crop rot_img
result = rot_img[y:y+h, x:x+w]

# save resulting images
cv2.imwrite('passport_pad.jpg',imgpad)
cv2.imwrite('passport_thresh.jpg',thresh)
cv2.imwrite('passport_morph.jpg',morph)
cv2.imwrite('passport_mask.jpg',mask)
cv2.imwrite('passport_skeleton.jpg',skeleton)
cv2.imwrite('passport_line_img.jpg',line_img)
cv2.imwrite('passport_mask2.jpg',mask2)
cv2.imwrite('passport_rot_img.jpg',rot_img)
cv2.imwrite('passport_rot_mask2.jpg',rot_mask2)
cv2.imwrite('passport_result.jpg',result)

# show thresh and result    
cv2.imshow("imgpad", imgpad)
cv2.imshow("thresh", thresh)
cv2.imshow("morph", morph)
cv2.imshow("mask", mask)
cv2.imshow("skeleton", skeleton)
cv2.imshow("line_img", line_img)
cv2.imshow("mask2", mask2)
cv2.imshow("rot_img", rot_img)
cv2.imshow("rot_mask2", rot_mask2)
cv2.imshow("result", result)
cv2.waitKey(0)
cv2.destroyAllWindows()

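Padded image:

enter image description here

Threshold image:

enter image description here

Morphology-cleaned image:

enter image description here

Mask 1 image:

enter image description here

Skeleton image:

enter image description here

(Hough) line image:

enter image description here

Floodfilled line image (mask 2):

enter image description here

Unrotated padded image:

enter image description here

Unrotated mask 2 image:

enter image description here

Cropped image:

enter image description here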


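Thank you for your answer. I tried this code, but it only works for this one image; it does not work for the others. I have added two more images to the question, please take a look. Please tell me what I need to change in the code. - Harshith J Poojary
2
It is going to be hard to make this work for all the images. Across the images, the black lines are not distinct or dark enough. One image also has extra black lines. In addition, the images are warped so that the black lines are not straight, so Hough will detect multiple lines per side rather than a single line. - fmw42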
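
One possible way to cope with several Hough lines per side is to group near-duplicate (rho, theta) pairs and keep one averaged line per group. This is only a rough sketch and does not fix the other problems listed above. merge_hough_lines and its tolerance values are hypothetical, and the wrap-around between theta near 0 and theta near pi is deliberately ignored.

```python
import numpy as np

def merge_hough_lines(lines, rho_tol=20.0, theta_tol=np.deg2rad(5)):
    """Greedily group (rho, theta) pairs that lie close to a group's first member."""
    groups = []
    for rho, theta in lines:
        for group in groups:
            rho0, theta0 = group[0]
            if abs(rho - rho0) < rho_tol and abs(theta - theta0) < theta_tol:
                group.append((rho, theta))
                break
        else:
            # no existing group is close enough, so start a new one
            groups.append([(rho, theta)])
    # one averaged line per group
    return [tuple(np.mean(group, axis=0)) for group in groups]

# hypothetical detections: two lines on one side, two on another
detected = [(100, 0.01), (102, 0.0), (300, 1.57), (301, 1.58)]
merged = merge_hough_lines(detected)
print(merged)
```

With the array returned by cv2.HoughLines, the input would be lines[:, 0], since each row holds one (rho, theta) pair.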

4
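If all the photos have a thin white-and-black border around them, you can:
  1. Threshold the image
  2. Get all the contours and
  3. Select those contours that:
    • have the right gradient
    • are large enough
    • reduce to 4 corners when passed through approxPolyDP
  4. Get the oriented bounding box
  5. Construct an affine transformation
  6. Apply the affine transformation
If the photos were not scanned but photographed at an angle (not straight from above), you need a perspective transformation computed from the corner points themselves instead.
If the photos are not flat but warped, that is an entirely different problem.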
import numpy as np
import cv2 as cv

im = cv.imread("Zh8QV.jpg")
gray = cv.cvtColor(im, cv.COLOR_BGR2GRAY)

gray = 255 - gray # invert so findContours' implicit black border doesn't bother us

height, width = gray.shape
minarea = (height * width) * 0.20

# (th_level, thresholded) = cv.threshold(gray, thresh=128, maxval=255, type=cv.THRESH_OTSU)

# threshold relative to estimated brightness of "white"
th_level = 255 - (255 - np.median(gray)) * 0.98
(th_level, thresholded) = cv.threshold(gray, thresh=th_level, maxval=255, type=cv.THRESH_BINARY)

(contours, hierarchy) = cv.findContours(thresholded, mode=cv.RETR_LIST, method=cv.CHAIN_APPROX_SIMPLE)

# black-to-white contours have negative area...
#areas = sorted([cv.contourArea(c, oriented=True) for c in contours])

large_areas = [ c for c in contours if cv.contourArea(c, oriented=True) <= -minarea ]

quads = [
    c for c in large_areas
    if len(cv.approxPolyDP(c, epsilon=0.02 * cv.arcLength(c, True), closed=True)) == 4
]

# if there is no quad, or multiple, that's an error (for this example)
assert len(quads) == 1, quads
[quad] = quads

bbox = cv.minAreaRect(quad)
(bcenter, bsize, bangle) = bbox
bcenter = np.array(bcenter)
bsize = np.array(bsize)

# keep orientation upright, fix up bbox size
(rot90, bangle) = divmod(bangle + 45, 90)
bangle -= 45
if rot90 % 2 != 0:
    bsize = bsize[::-1]

# construct affine transformation
M1 = np.eye(3)
M1[0:2,2] = -bcenter

R = np.eye(3)
R[0:2] = cv.getRotationMatrix2D(center=(0,0), angle=bangle, scale=1.0)

M2 = np.eye(3)
M2[0:2,2] = +bsize * 0.5

M = M2 @ R @ M1

bwidth, bheight = np.ceil(bsize)
dsize = (int(bwidth), int(bheight))

output = cv.warpAffine(im, M[0:2], dsize=dsize, flags=cv.INTER_CUBIC)

cv.imshow("output", output)
cv.waitKey(-1)
cv.destroyWindow("output")

input output
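
As a sanity check of the M2 @ R @ M1 composition, the following NumPy-only sketch rebuilds the same three matrices by hand and confirms that the box center ends up in the middle of the output. box_to_upright is a hypothetical helper, and the rotation block is written in the convention I assume cv.getRotationMatrix2D uses for a rotation about the origin with scale 1.

```python
import numpy as np

def box_to_upright(center, size, angle_deg):
    """3x3 matrix: move the box center to the origin, rotate, then shift to size/2."""
    a = np.deg2rad(angle_deg)
    M1 = np.eye(3)
    M1[0:2, 2] = -np.asarray(center, dtype=float)      # translate center to origin
    R = np.eye(3)
    R[0:2, 0:2] = [[np.cos(a), np.sin(a)],             # assumed OpenCV-style rotation
                   [-np.sin(a), np.cos(a)]]
    M2 = np.eye(3)
    M2[0:2, 2] = 0.5 * np.asarray(size, dtype=float)   # origin to output center
    return M2 @ R @ M1

# hypothetical box: center (320, 240), size 200x300, tilted by 12.5 degrees
M = box_to_upright((320, 240), (200, 300), 12.5)
print(M @ np.array([320, 240, 1.0]))
```

The first two rows, M[0:2], are what would be passed to cv.warpAffine, as in the code above.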


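Bro, the code does not show the output image https://imgur.com/zzQ731c - Harshith J Poojary
When I put the whole code in a try-catch block, it gives me an error. Can you fix the code? - Harshith J Poojary
I used the same image that you used. - Harshith J Poojary
Thanks for pointing that out. I have fixed it and updated the answer. - Christoph Rackwitz
Let us continue this discussion in chat. - Harshith J Poojary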

3
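I would do the following 3 steps (sorry, I won't write the code for you; if you need help with a particular stage, I'm happy to provide it):
  1. Use the Hough transform to detect the 4 strongest lines in the picture.

  2. Compute the coordinates of the intersection points of these 4 lines.

  3. Apply a perspective transform.

You will then get the cropped image you need.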
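
For step 2, a minimal sketch of intersecting two lines given in the (rho, theta) form that the Hough transform produces, where each line satisfies x*cos(theta) + y*sin(theta) = rho. intersect_hough is a hypothetical helper, and the sample values are made up for illustration.

```python
import numpy as np

def intersect_hough(line_a, line_b, eps=1e-9):
    """Return the (x, y) intersection of two (rho, theta) lines, or None if parallel."""
    (rho1, theta1), (rho2, theta2) = line_a, line_b
    A = np.array([[np.cos(theta1), np.sin(theta1)],
                  [np.cos(theta2), np.sin(theta2)]])
    # the determinant equals sin(theta2 - theta1), which is 0 for parallel lines
    if abs(np.linalg.det(A)) < eps:
        return None
    x, y = np.linalg.solve(A, np.array([rho1, rho2], dtype=float))
    return float(x), float(y)

# a vertical line x = 10 and a horizontal line y = 20
corner = intersect_hough((10, 0.0), (20, np.pi / 2))
print(corner)
```

Intersecting each roughly vertical line with each roughly horizontal line gives the four corners needed for the perspective transform in step 3.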


2

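Concept

  1. Process each image to enhance the edges of the photo.

  2. Get the 4 corners of the photo in each processed image by first finding the contour with the largest area, taking its convex hull, and approximating the hull until only 4 points remain.

  3. Warp each image according to the 4 detected corners.

Code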

import cv2
import numpy as np

def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_blur = cv2.GaussianBlur(img_gray, (1, 1), 1)
    img_canny = cv2.Canny(img_blur, 350, 150)
    kernel = np.ones((3, 3))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=2)
    return cv2.erode(img_dilate, kernel, iterations=1)

def get_pts(img):
    contours, _ = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    cnt = max(contours, key=cv2.contourArea)
    peri = cv2.arcLength(cnt, True)
    return cv2.approxPolyDP(cv2.convexHull(cnt), 0.04 * peri, True)

files = ["1.jpg", "2.jpg", "3.jpg"]
width, height = 350, 450
pts2 = np.float32([[width, 0], [0, 0], [width, height], [0, height]])

for file in files:
    img = cv2.imread(file)
    pts1 = get_pts(process(img)).squeeze()
    pts1 = np.float32(pts1[np.lexsort(pts1.T)])
    matrix = cv2.getPerspectiveTransform(pts1, pts2)
    out = cv2.warpPerspective(img, matrix, (width, height))[5:-5, 5:-5]
    cv2.imshow(file, out)

cv2.waitKey(0)
cv2.destroyAllWindows()

Output

I put the outputs side by side to fit in one image:

enter image description here

Explanation

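  1. Import the necessary libraries:
import cv2
import numpy as np
  2. Define a function, process(), that takes a BGR image array and returns the image processed with the Canny edge detector, so that the edges of each photo can be detected more accurately later. The values used in the function can be tuned to suit other images if needed:
def process(img):
    img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    img_blur = cv2.GaussianBlur(img_gray, (1, 1), 1)
    img_canny = cv2.Canny(img_blur, 350, 150)
    kernel = np.ones((3, 3))
    img_dilate = cv2.dilate(img_canny, kernel, iterations=2)
    return cv2.erode(img_dilate, kernel, iterations=1)
  3. Define a function, get_pts(), that takes a processed image and returns 4 points of the convex hull of the contour with the largest area. To get 4 points from the convex hull, we use the cv2.approxPolyDP() method:
def get_pts(img):
    contours, _ = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
    cnt = max(contours, key=cv2.contourArea)
    peri = cv2.arcLength(cnt, True)
    return cv2.approxPolyDP(cv2.convexHull(cnt), 0.04 * peri, True)
  4. Define a list, files, containing the names of the files you want to extract photos from, along with the dimensions you want for the resulting images, width and height:
files = ["1.jpg", "2.jpg", "3.jpg"]
width, height = 350, 450
  5. Using the dimensions defined above, define a matrix of the 4 coordinates that the detected points will be mapped to:
pts2 = np.float32([[width, 0], [0, 0], [width, height], [0, height]])
  6. Loop through the filenames, read each file into a BGR image array, get the 4 points of the photo in the image, use the cv2.getPerspectiveTransform() method to get the solution matrix for the transformation, and finally use the cv2.warpPerspective() method to warp the photo part of the image with the solution matrix:
for file in files:
    img = cv2.imread(file)
    pts1 = get_pts(process(img)).squeeze()
    pts1 = np.float32(pts1[np.lexsort(pts1.T)])
    matrix = cv2.getPerspectiveTransform(pts1, pts2)
    out = cv2.warpPerspective(img, matrix, (width, height))[5:-5, 5:-5]
    cv2.imshow(file, out)
  7. Finally, add a delay, then destroy all the windows:
cv2.waitKey(0)
cv2.destroyAllWindows()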
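
For thousands of files, the loop over files could be wrapped in a small batch driver that also copies failures to a rejected folder, as the question asks. The sketch below uses only the standard library and is an assumption about how one might organize it: process_folder and the crop_fn callback are hypothetical names, and crop_fn stands for whatever cropping routine is chosen (it should write the output file and return True, or return False or raise when the photo cannot be cropped).

```python
import shutil
from pathlib import Path

def process_folder(src_dir, out_dir, reject_dir, crop_fn, pattern="*.jpg"):
    """Run crop_fn(src_path, dst_path) on every match; copy failures to reject_dir."""
    src_dir, out_dir, reject_dir = Path(src_dir), Path(out_dir), Path(reject_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    reject_dir.mkdir(parents=True, exist_ok=True)
    done, rejected = [], []
    for path in sorted(src_dir.glob(pattern)):
        try:
            ok = crop_fn(path, out_dir / path.name)
        except Exception:
            # treat any error raised by the cropping routine as a rejection
            ok = False
        if ok:
            done.append(path.name)
        else:
            shutil.copy2(path, reject_dir / path.name)
            rejected.append(path.name)
    return done, rejected
```

A crop_fn built from the code above would read the file with cv2.imread, warp it, save it with cv2.imwrite, and return False when, for example, get_pts() does not yield exactly 4 points.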
