How can I merge cv2 rectangular bounding boxes into a polygon? (Not by overlap/threshold)

Question


python opencv


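I have several rectangular bounding boxes that belong to the same object (parts of a newspaper article), as shown in the first image. I am trying to find a way to merge them into a single polygonal bounding box for the whole article, as in the second image.

I have seen many solutions based on merging overlapping bounding boxes, but I don't care whether the boxes overlap, because I already know they belong to the same article. In some cases the headline is quite far away (for example, above a picture), so padding-based solutions don't work either.

I feel like there should be a cv2 function that does this, but if there is one, I am overlooking it. Any suggestions would be much appreciated.

Code to create the two images: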

import cv2
import numpy as np

# Individual bounding boxes

image_0 = cv2.imread('63976500-anderson-herald-bulletin-Jun-18-1968-p-64.jpg')
# Black box, to reproduce: image_0 = np.zeros((5000, 6000, 3), dtype = "uint8")

bbox_list = [[195, 3455, 633, 4213], [658, 3427, 1094, 4222], [1120, 3435, 1553, 4473], [295, 3421, 531, 3451], [201, 3313, 1548, 3409]]

for bbox in bbox_list:
    image_0 = cv2.rectangle(image_0, (bbox[0], bbox[1]), (bbox[2], bbox[3]), (0,255,0), 10)

cv2.imwrite("original_bboxes.jpg", image_0)


# Grouped bounding box

image_1 = cv2.imread('63976500-anderson-herald-bulletin-Jun-18-1968-p-64.jpg')
# Black box, to reproduce: image_1 = np.zeros((5000, 6000, 3), dtype = "uint8")

coordinates = np.array([[195,3313],[195,4222],[1120,4222],[1120,4473],[1553,4473],[1553,3313]], np.int32)

image_1 = cv2.polylines(image_1, [coordinates], True, (0,255,0), 10)

cv2.imwrite("grouped_bboxes.jpg", image_1)

- emilys

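The easiest thing to work with is a mask, plus a morphological "closing" operation. If you insist on working with contours, OpenCV has no "vector graphics" routines equivalent to the morphological operations in pixel space. - Christoph Rackwitz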


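Use a Delaunay triangulation to build a triangle mesh from the corners. Then remove the interior vertices. Use the distance to the convex hull to obtain the concavities. - Micka

Can you provide the original image? - Costantino Grana

@CostantinoGrana The original image is too large to upload, but you can find it here: https://drive.google.com/file/d/1mNrNoxwOGlsnKGtdfJTMawkeS8pLZQZG/view - emilys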

1 Answer


- Costantino Grana · Accepted Answer

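You can draw the convex hull of the contour points (this one is drawn by hand):
(image of the incorrect convex hull)
Then keep only the external contour and try a polygonal approximation. I must admit that I cannot think of a smarter way to get only vertical and horizontal lines.

As Christoph Rackwitz observed, I was wrong. The convex hull does not work. Maybe alpha shapes could do the trick, but I am not sure.

Another approach could be to extract the equations of all the line segments that define the bounding boxes, and then, for every corner point, compute the segment connecting it to the nearest line segment. If that segment is part of the same bounding box, or if the point lies outside the bounding box, remove the segment. This was harder than I expected, because my Python OpenCV proficiency is not good enough. Nevertheless, you can get nearly what you want in the question.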

import cv2  # "from cv2 import cv2" fails on recent opencv-python releases
import numpy as np

image_0 = cv2.imread('63976500-anderson-herald-bulletin-Jun-18-1968-p-64.jpg')
bwimage = np.zeros((image_0.shape[0], image_0.shape[1]), dtype=np.uint8)

bbox_list = [[195, 3455, 633, 4213], [658, 3427, 1094, 4222], [1120, 3435, 1553, 4473], [295, 3421, 531, 3451],
             [201, 3313, 1548, 3409]]

for bbox in bbox_list:
    bwimage = cv2.rectangle(bwimage, (bbox[0], bbox[1]), (bbox[2], bbox[3]), 255, 1)

#cv2.imwrite("original_bboxes.png", image_0)

# create list of corners with bbox index
corners = []
for i, bbox in enumerate(bbox_list):
    corners.append((bbox[0], bbox[1], i))
    corners.append((bbox[0], bbox[3], i))
    corners.append((bbox[2], bbox[1], i))
    corners.append((bbox[2], bbox[3], i))

# for each corner find nearest border
for c in corners:
    min_dist = float('inf')
    min_dist_i = None
    min_dist_type = None
    for i, bb in enumerate(bbox_list):
        for side in range(4):
            thisdim = side % 2
            otherdim = 1 - thisdim
            dist = abs(c[thisdim] - bb[side])
            if dist == 0 and c[2] == i:
                pass
            elif min_dist > dist and bb[otherdim] < c[otherdim] < bb[otherdim + 2]:
                min_dist = dist
                min_dist_i = i
                min_dist_type = side

    if min_dist_i is not None:
        bb = bbox_list[min_dist_i]
        print(f"Corner ({c[0]}, {c[1]}) nearest BB: {min_dist_i} [({bb[0]}, {bb[1]})->({bb[2]}, {bb[3]})]")
        if min_dist_type % 2 == 0:
            dest = (bb[min_dist_type], c[1])
        else:
            dest = (c[0], bb[min_dist_type])
        bwimage = cv2.line(bwimage, (c[0], c[1]), dest, 255, 1)

contours, _ = cv2.findContours(image=bwimage, mode=cv2.RETR_EXTERNAL, method=cv2.CHAIN_APPROX_NONE)
image_0 = cv2.drawContours(image_0, contours, -1, (0, 255, 0), 1)

cv2.imwrite("result.png", image_0)

This is the result: