Extract bounding boxes and save them as images

Question

python image opencv image-processing bounding-box

35

Suppose you have the following image:

Example:

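Now I want to extract each individual letter as a separate image. So far, I have recovered the contours and drawn a bounding box around each one, for example around the character a: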

Bounding box for the character 'a'

Next, I want to extract each box (here, the letter a) and save it to its own image file.

The expected result looks like this:

Result

Here is my code so far:

import numpy as np
import cv2

im = cv2.imread('abcd.png')
im[im == 255] = 1
im[im == 0] = 255
im[im == 1] = 0
im2 = cv2.cvtColor(im,cv2.COLOR_BGR2GRAY)
ret,thresh = cv2.threshold(im2,127,255,0)
contours, hierarchy = cv2.findContours(thresh,cv2.RETR_TREE,cv2.CHAIN_APPROX_SIMPLE)

for i in range(0, len(contours)):
    if (i % 2 == 0):
       cnt = contours[i]
       #mask = np.zeros(im2.shape,np.uint8)
       #cv2.drawContours(mask,[cnt],0,255,-1)
       x,y,w,h = cv2.boundingRect(cnt)
       cv2.rectangle(im,(x,y),(x+w,y+h),(0,255,0),2)
       cv2.imshow('Features', im)
       cv2.imwrite(str(i)+'.png', im)

cv2.destroyAllWindows()

Thanks in advance.

- Edgar Andrés Margffoy Tuay

3 Answers

4

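Here is one approach:

Convert the image to grayscale
Apply Otsu's threshold to obtain a binary image
Find the contours
Iterate over the contours and extract each ROI using NumPy slicing

After finding the contours, we use cv2.boundingRect() to get the bounding rectangle coordinates of each letter.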

x,y,w,h = cv2.boundingRect(c)

To extract the ROI, we use NumPy slicing. Rows (y) come first, then columns (x).

ROI = image[y:y+h, x:x+w]

Since we have the bounding rectangle coordinates, we can also draw a green bounding box on a copy of the image.

cv2.rectangle(copy,(x,y),(x+w,y+h),(36,255,12),2)

Here are the detected letters

Here is the ROI saved for each letter

import cv2

image = cv2.imread('1.png')
copy = image.copy()
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
thresh = cv2.threshold(gray,0,255,cv2.THRESH_OTSU + cv2.THRESH_BINARY)[1]

cnts = cv2.findContours(thresh, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
cnts = cnts[0] if len(cnts) == 2 else cnts[1]

ROI_number = 0
for c in cnts:
    x,y,w,h = cv2.boundingRect(c)
    ROI = image[y:y+h, x:x+w]
    cv2.imwrite('ROI_{}.png'.format(ROI_number), ROI)
    cv2.rectangle(copy,(x,y),(x+w,y+h),(36,255,12),2)
    ROI_number += 1

cv2.imshow('thresh', thresh)
cv2.imshow('copy', copy)
cv2.waitKey()

- nathancy

1

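How can I apply this approach to extract word images instead of letter images, @nathancy? - Raj

@Raj, it's the same process: apply image processing until you obtain a binary image, and then you can use this example. It works with anything, such as shapes, objects, word clusters, or blobs, as long as the foreground object you want to extract differs from the background. In image processing, we usually want the desired objects in white and the background in black. - nathancy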

0

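import cv2

def bounding_box_img(img, bbox):
    # bbox is (x_min, y_min, x_max, y_max) in pixel coordinates
    x_min, y_min, x_max, y_max = bbox
    # NumPy indexes rows (y) first, then columns (x)
    bbox_obj = img[y_min:y_max, x_min:x_max]
    return bbox_obj

img = cv2.imread("image.jpg")
# bbox must be defined beforehand, e.g. read from an annotation file
cropped_img = bounding_box_img(img, bbox)
# cv2.imshow needs a window name, and waitKey keeps the window open
cv2.imshow("cropped", cropped_img)
cv2.waitKey()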

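This returns the cropped image (the contents of the bounding box).

In this approach, the bounding box coordinates follow the Pascal VOC annotation format, (x_min, y_min, x_max, y_max).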
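Since cv2.boundingRect returns (x, y, w, h) rather than corner coordinates, a small conversion may be needed before using a corner-based crop. The following is a minimal sketch: the helper names rect_to_voc and crop_voc are hypothetical, the image is a synthetic array, and x_max and y_max are treated as exclusive slice bounds, which is an assumption of this sketch.

```python
import numpy as np

def rect_to_voc(x, y, w, h):
    """Convert an OpenCV-style (x, y, w, h) rectangle to (x_min, y_min, x_max, y_max)."""
    return x, y, x + w, y + h

def crop_voc(img, bbox):
    """Crop img to a corner-based box; rows (y) are indexed before columns (x)."""
    x_min, y_min, x_max, y_max = bbox
    return img[y_min:y_max, x_min:x_max]

# Synthetic 100x200 BGR image standing in for a real photo.
img = np.zeros((100, 200, 3), np.uint8)

bbox = rect_to_voc(30, 10, 50, 20)
crop = crop_voc(img, bbox)
print(bbox)        # (30, 10, 80, 30)
print(crop.shape)  # (20, 50, 3)
```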

- livan3li

- Andrey Kamaev · Accepted Answer

47

The following gives you a single letter.

letter = im[y:y+h,x:x+w]

- Andrey Kamaev

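When I sliced the array, the indices came out wrong, i.e. the letter 'a' was shifted, so I only got the top-right corner, and for the other letters I got the following errors: libpng warning: Image height is zero in IHDR; libpng error: Invalid IHDR data - Edgar Andrés Margffoy Tuay

I found the problem: the dimensions were swapped. It should be im[y:y+h, x:x+w]. - Edgar Andrés Margffoy Tuay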
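A small NumPy-only illustration of why the swapped order fails: image arrays are indexed as [row, column], i.e. [y, x], so slicing with x first can run past the image height and yield an empty array, which would be consistent with the zero-height libpng error above. The image size and box values here are made up for the demonstration.

```python
import numpy as np

# A wide, short image: 50 rows (height) by 300 columns (width).
im = np.zeros((50, 300, 3), np.uint8)

# A hypothetical bounding box as returned by cv2.boundingRect.
x, y, w, h = 120, 10, 30, 25

correct = im[y:y + h, x:x + w]   # rows first, then columns
swapped = im[x:x + w, y:y + h]   # x used as a row index: rows 120..149 do not exist

print(correct.shape)  # (25, 30, 3)
print(swapped.shape)  # (0, 25, 3)
```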

How could this solution be modified to draw the green bounding boxes on the original image? - DeaconDesperado

@Andfoy, I need help with this post...http://stackoverflow.com/questions/43097703/how-to-find-yellow-box-coordinate-of-an-image....Can you help me? - Sudip Das