Image stitching problem using Python and OpenCV

Question

python opencv image-processing computer-vision image-stitching

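I got the following output after stitching the result of 24 stitched images to the 25th image. Before that, the stitching was good.

Does anyone know why or when stitching output looks like this? What are the possible reasons for such output?

The stitching code follows the standard stitching steps: finding keypoints and descriptors, matching points, computing the homography, and then warping the images. But I do not understand why this output appears.

The core part of the stitching is as follows: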

import copy

import cv2
import numpy as np

# util is the asker's own helper module (not shown here)
detector = cv2.SIFT_create(400)
# find the keypoints and descriptors with SIFT
gray1 = cv2.cvtColor(image1,cv2.COLOR_BGR2GRAY)
ret1, mask1 = cv2.threshold(gray1,1,255,cv2.THRESH_BINARY)
kp1, descriptors1 = detector.detectAndCompute(gray1,mask1)

gray2 = cv2.cvtColor(image2,cv2.COLOR_BGR2GRAY)
ret2, mask2 = cv2.threshold(gray2,1,255,cv2.THRESH_BINARY)
kp2, descriptors2 = detector.detectAndCompute(gray2,mask2)

# outImage is the destination image (None lets OpenCV allocate it); the drawing flag goes in flags
keypoints1Im = cv2.drawKeypoints(image1, kp1, None, color=(0,0,255), flags=cv2.DRAW_MATCHES_FLAGS_DEFAULT)
keypoints2Im = cv2.drawKeypoints(image2, kp2, None, color=(0,0,255), flags=cv2.DRAW_MATCHES_FLAGS_DEFAULT)

# BFMatcher with default params
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(descriptors2,descriptors1, k=2)

# Apply ratio test
good = []
for m, n in matches:
    if m.distance < 0.75 * n.distance:
        good.append(m)

print (str(len(good)) + " Matches were Found")

if len(good) <= 10:
    return image1

matches = copy.copy(good)

matchDrawing = util.drawMatches(gray2,kp2,gray1,kp1,matches)

#Aligning the images
src_pts = np.float32([ kp2[m.queryIdx].pt for m in matches ]).reshape(-1,1,2)
dst_pts = np.float32([ kp1[m.trainIdx].pt for m in matches ]).reshape(-1,1,2)


H = cv2.findHomography(src_pts, dst_pts, cv2.RANSAC,5.0)[0]

h1,w1 = image1.shape[:2]
h2,w2 = image2.shape[:2]
pts1 = np.float32([[0,0],[0,h1],[w1,h1],[w1,0]]).reshape(-1,1,2)
pts2 = np.float32([[0,0],[0,h2],[w2,h2],[w2,0]]).reshape(-1,1,2)
pts2_ = cv2.perspectiveTransform(pts2, H)
pts = np.concatenate((pts1, pts2_), axis=0)
# print("pts:", pts)
[xmin, ymin] = np.int32(pts.min(axis=0).ravel() - 0.5)
[xmax, ymax] = np.int32(pts.max(axis=0).ravel() + 0.5)
t = [-xmin,-ymin]
Ht = np.array([[1,0,t[0]],[0,1,t[1]],[0,0,1]]) # translate

result = cv2.warpPerspective(image2, Ht.dot(H), (xmax-xmin, ymax-ymin))

resizedB = np.zeros((result.shape[0], result.shape[1], 3), np.uint8)

resizedB[t[1]:t[1]+h1,t[0]:w1+t[0]] = image1
# Now create a mask of logo and create its inverse mask also
img2gray = cv2.cvtColor(result,cv2.COLOR_BGR2GRAY)
ret, mask = cv2.threshold(img2gray, 0, 255, cv2.THRESH_BINARY)

kernel = np.ones((5,5),np.uint8)
k1 = (kernel == 1).astype('uint8')
mask = cv2.erode(mask, k1, borderType=cv2.BORDER_CONSTANT)

mask_inv = cv2.bitwise_not(mask)

difference = cv2.bitwise_or(resizedB, resizedB, mask=mask_inv)

result2 = cv2.bitwise_and(result, result, mask=mask)

result = cv2.add(result2, difference)

Edit:

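This image shows the match drawing of the 24 stitched images before the 25th image is stitched to the result:

The match drawing before that one:

I have 97 images in total to stitch. If I stitch images 24 and 25 separately, they stitch properly. If I start stitching from the 23rd image onwards, the stitching is also good, but the problem appears when I start stitching from the 1st image. I am not able to understand the problem.

Result after stitching the 23rd image:

Result after stitching the 24th image:

The result after stitching the 25th image does not look like the results above.

A strange observation: if I stitch images 23, 24, and 25 separately with the same code, they stitch fine. If I stitch from the 23rd image through the 97th, they also stitch fine. But if I start stitching from the 1st image, it breaks when the 25th image is stitched. I do not understand why this happens.

I have tried different keypoint detection and extraction methods, matching methods, homography computations, and warping code, but none of these combinations worked. Something is missing or wrong in the way the steps are combined. I am unable to figure out what.

Sorry for the long question. As I am completely new to this, I am not able to explain and understand things properly. Thanks for your help and guidance.

Result of stitching images 23, 24, and 25 separately with the same code:

With different code (which produces black lines between the stitched images), if I stitch all 97 images, the 25th image shifts upward and gets stitched as shown below (top-right corner):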

- ganesh

Possibly the perspective warp of the last image is off, perhaps because of inaccurate matched points.

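You are accumulating errors. The stitching gets progressively worse. Look at the right end of your stitched chain: it does not look like a proper top-down view, it is heavily distorted. In general, image stitching is far more complex than any of the so-called "tutorials" that tell everyone to do this, and this approach always fails eventually, just like now. OpenCV has an entire stitching module; you should use it, or consider taking a course, or read books and other publications.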

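I would rather stitch adjacent image pairs than stitch repeatedly onto the growing result. But with this many images you may still run into excessive distortion.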

I would use pairwise stitching. Have a look at these papers from the LFB institute: https://www.lfb.rwth-aachen.de/bibtexupload/pdf/BEH11g.pdf https://www.lfb.rwth-aachen.de/bibtexupload/pdf/BEH11a.pdf https://www.lfb.rwth-aachen.de/bibtexupload/pdf/BEH10g.pdf and https://www.lfb.rwth-aachen.de/files/publications/2010/BEH10a.pdf If you do not need high-speed performance, you could try bundle adjustment, but still compute the features on the individual images, not on the mosaic.

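You have already received a comment about doing pairwise matching. Here is another reason to use pairwise matching rather than matching against the current stitched result: the current result grows with every image you add, so every subsequent matching step becomes more expensive. You have made an O(n^2) procedure, whereas pairwise matching is an O(n) problem.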
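
To illustrate the pairwise approach suggested in the comments, here is a minimal sketch of how pairwise homographies could be chained so that every image is placed relative to the first one. The function name `chain_to_reference` and the convention that `pairwise[i]` maps image i+1 into image i are assumptions made for this example; the pairwise matrices themselves would come from matching each image only against its neighbour (for instance with `cv2.findHomography`, as in the question's code).

```python
import numpy as np


def chain_to_reference(pairwise):
    """Compose pairwise homographies into homographies to image 0.

    Assumed convention (hypothetical, for illustration): pairwise[i] is a
    3x3 matrix mapping coordinates of image i+1 into image i.
    Returns a list whose entry k maps coordinates of image k into image 0.
    """
    to_ref = [np.eye(3)]
    for H in pairwise:
        # Image k+1 -> image k -> ... -> image 0
        H_acc = to_ref[-1] @ np.asarray(H, dtype=np.float64)
        # Normalise so the bottom-right entry stays 1 (assumes it is non-zero)
        to_ref.append(H_acc / H_acc[2, 2])
    return to_ref


# Example: three identical 10-pixel shifts along x add up to 30 pixels
shift = np.array([[1, 0, 10], [0, 1, 0], [0, 0, 1]], dtype=np.float64)
chained = chain_to_reference([shift, shift, shift])
```

Features are then computed once per input image rather than on the ever-growing mosaic. Errors in the chained matrices can still build up over a long sequence, which is where the bundle adjustment mentioned above would come in.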


1 Answer


- Rahul Kedia · Accepted Answer

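Firstly, I was unable to recreate your problem and solve it, as the images were too large for my system to handle. However, I faced the same problem in my panorama stitching project, so I am sharing the cause of the problem and my approach to solving it. I hope this helps you too.

When I stitched 4 images together the way you do, my problem looked like this.

You can see that the fourth image is warped a lot, which should not happen. The same thing is happening to you, only to a more severe degree.

Now, here is the output of stitching 8 images after preprocessing them.

After preprocessing the input images, I was able to stitch 8 images together perfectly, without any distortion.

To understand the exact reason behind this distortion, watch this video by Joseph Redmon, from 50:26 to 1:07:23.

As suggested in the video, we first need to project the images onto a cylinder, unroll them, and then stitch the unrolled images together.

Below are the initial input image (left) and the image after projection onto a cylinder and unrolling (right).

For your problem, since you are using satellite images, I guess a spherical projection would suit better than a cylindrical one, but you will need to try it.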
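
As a rough sketch of what the coordinate mapping for a spherical projection might look like: the function below is not taken from the answer or the video. It uses the commonly cited inverse spherical mapping, and the name `spherical_to_planar` and its parameters are chosen here for illustration only. It assumes `f` is the focal length in pixels and `center` is the image center.

```python
import numpy as np


def spherical_to_planar(x, y, center, f):
    """Map pixel coordinates of a spherically unrolled image back to the
    planar source image (hypothetical helper, for illustration)."""
    theta = (x - center[0]) / f  # horizontal angle (longitude)
    phi = (y - center[1]) / f    # vertical angle (latitude)
    xs = f * np.tan(theta) + center[0]
    ys = f * np.tan(phi) / np.cos(theta) + center[1]
    return xs, ys


# The image center maps to itself
cx, cy = spherical_to_planar(np.array([100.0]), np.array([50.0]), (100, 50), 100.0)
```

Compared with a cylindrical mapping, the vertical coordinate is also treated as an angle, so the image is compressed towards the top and bottom as well as towards the sides.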

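Here is the code I used for projecting an image onto a cylinder and unrolling it, for reference. The math behind it is the same as given in the video.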
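
For reference, the mapping that `Convert_xy` computes in the code below can be written out as follows. The symbols are chosen here for illustration: $(x', y')$ is a pixel in the unrolled cylindrical image, $(x, y)$ is the corresponding position in the initial image, $(c_x, c_y)$ is the image center, and $f$ is the focal length in pixels.

```latex
\theta = \frac{x' - c_x}{f}, \qquad
x = f \tan\theta + c_x, \qquad
y = \frac{y' - c_y}{\cos\theta} + c_y
```

This is the inverse direction (output pixel to input position), which is why the code can loop over output pixels and interpolate in the input image.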


import numpy as np


def Convert_xy(x, y, center, f):
    # Map coordinates of the unrolled (cylindrical) image back to
    # the corresponding coordinates in the initial (planar) image
    theta = (x - center[0]) / f
    xt = (f * np.tan(theta)) + center[0]
    yt = ((y - center[1]) / np.cos(theta)) + center[1]

    return xt, yt


def ProjectOntoCylinder(InitialImage, f=1100):
    # f: focal length in pixels (1100 field; 1000 Sun; 1500 Rainier; 1050 Helens)
    # InitialImage is expected to be a 3-channel image
    h, w = InitialImage.shape[:2]
    center = [w // 2, h // 2]

    # Creating a blank transformed image
    TransformedImage = np.zeros(InitialImage.shape, dtype=np.uint8)

    # Storing all coordinates of the transformed image in 2 arrays (x and y coordinates)
    grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h), indexing="ij")
    ti_x = grid_x.ravel()
    ti_y = grid_y.ravel()

    # Finding corresponding coordinates of the transformed image in the initial image
    ii_x, ii_y = Convert_xy(ti_x, ti_y, center, f)

    # Flooring the coordinate values to get the top-left pixel of each sample
    ii_tl_x = np.floor(ii_x).astype(int)
    ii_tl_y = np.floor(ii_y).astype(int)

    # Finding transformed image points whose corresponding
    # initial image points lie inside the initial image
    GoodIndices = (ii_tl_x >= 0) & (ii_tl_x <= (w - 2)) & \
                  (ii_tl_y >= 0) & (ii_tl_y <= (h - 2))

    # Removing all the outside points from everywhere
    ti_x = ti_x[GoodIndices]
    ti_y = ti_y[GoodIndices]

    ii_x = ii_x[GoodIndices]
    ii_y = ii_y[GoodIndices]

    ii_tl_x = ii_tl_x[GoodIndices]
    ii_tl_y = ii_tl_y[GoodIndices]

    # Bilinear interpolation
    dx = ii_x - ii_tl_x
    dy = ii_y - ii_tl_y

    weight_tl = (1.0 - dx) * (1.0 - dy)
    weight_tr = (dx)       * (1.0 - dy)
    weight_bl = (1.0 - dx) * (dy)
    weight_br = (dx)       * (dy)

    TransformedImage[ti_y, ti_x, :] = ( weight_tl[:, None] * InitialImage[ii_tl_y,     ii_tl_x,     :] ) + \
                                      ( weight_tr[:, None] * InitialImage[ii_tl_y,     ii_tl_x + 1, :] ) + \
                                      ( weight_bl[:, None] * InitialImage[ii_tl_y + 1, ii_tl_x,     :] ) + \
                                      ( weight_br[:, None] * InitialImage[ii_tl_y + 1, ii_tl_x + 1, :] )

    # Getting x coordinate to remove black region from right and left in the transformed image
    min_x = ti_x.min()

    # Cropping out the black region from both sides (using symmetry);
    # w - min_x also works when min_x is 0, unlike a -min_x slice
    TransformedImage = TransformedImage[:, min_x : w - min_x, :]

    return TransformedImage, ti_x - min_x, ti_y

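You only need to call the function ProjectOntoCylinder and pass it an image to get the resulting image and the coordinates of the white pixels in the mask image. Use the code below to call this function and get the mask image.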

# Applying Cylindrical projection on Image
Image_Cyl, mask_x, mask_y = ProjectOntoCylinder(Image)

# Getting Image Mask
Image_Mask = np.zeros(Image_Cyl.shape, dtype=np.uint8)
Image_Mask[mask_y, mask_x, :] = 255
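
One way such a mask might be used afterwards is to decide, pixel by pixel, which image contributes to the stitched result. The sketch below is an assumption about usage, not part of the answer's project: the helper name `composite_with_mask` is made up for this example, and it assumes the base image, the warped image, and the mask have already been brought to the same shape (with the mask warped by the same transform as its image).

```python
import numpy as np


def composite_with_mask(base, warped, mask):
    """Copy pixels of `warped` over `base` wherever `mask` is non-zero
    (hypothetical helper; all three arrays must share one shape)."""
    return np.where(mask > 0, warped, base)


# Tiny example: only the top-left pixel is marked as valid in the mask
base = np.zeros((2, 2, 3), dtype=np.uint8)
warped = np.full((2, 2, 3), 9, dtype=np.uint8)
mask = np.zeros((2, 2, 3), dtype=np.uint8)
mask[0, 0, :] = 255
combined = composite_with_mask(base, warped, mask)
```

Unlike thresholding the warped image itself, an explicit mask keeps genuinely black pixels of the photo from being treated as empty background.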

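Here are the links to my project and its detailed documentation, for reference:

Part 1: Source Code, Documentation
Part 2: Source Code, Documentation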