Connect the nearest points in a segment and label the segment

I am using OpenCV and skimage for document analysis of datasheets. I am trying to segment out the shaded regions separately. So far I can segment the parts and the numbers as separate clusters. I segment the parts with the felzenszwalb() function from skimage:
import matplotlib.pyplot as plt
import numpy as np     
from skimage.segmentation import felzenszwalb
from skimage.io import imread

img = imread('test.jpg')

segments_fz = felzenszwalb(img, scale=100, sigma=0.2, min_size=50)

print("Felzenszwalb number of segments {}".format(len(np.unique(segments_fz))))

plt.imshow(segments_fz)
plt.tight_layout()
plt.show()

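But I am not able to connect them. Any ideas for systematically connecting and labelling each segment with its part and part number would be a great help. Thanks in advance for your time. If I have missed anything, or over- or under-emphasised a specific point, let me know in the comments.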


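@JeruLuke I used Felzenszwalb's segmentation method from scikit-image. I have added sample code. - Nithin Varghese
Do you want to connect the numbers only to the arrows on the segmented lines, or do you want to separate the regions the arrows point to from the numbers? - dhanushka
Your script is short enough to include directly in your question. That way we can see it without an extra click or the annoying Dropbox banner. It also keeps the question complete, so it does not become incomplete when the link dies. The segmentation result would be clearer with a suitable colour map, i.e. one that gives adjacent integers very different colours. - Cris Luengo
So, roughly, you want to identify the numbers, then follow the nearest line to an object and group them together? - Richard
Do you have a larger library of images? - Richard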
1 Answer


Preliminaries

Some preliminary code:

%matplotlib inline
%load_ext Cython
import numpy as np
import cv2
from matplotlib import pyplot as plt
import skimage as sk
import skimage.morphology as skm
import itertools

def ShowImage(title,img,ctype):
  plt.figure(figsize=(20, 20))
  if ctype=='bgr':
    b,g,r = cv2.split(img)       # get b,g,r
    rgb_img = cv2.merge([r,g,b])     # switch it to rgb
    plt.imshow(rgb_img)
  elif ctype=='hsv':
    rgb = cv2.cvtColor(img,cv2.COLOR_HSV2RGB)
    plt.imshow(rgb)
  elif ctype=='gray':
    plt.imshow(img,cmap='gray')
  elif ctype=='rgb':
    plt.imshow(img)
  else:
    raise Exception("Unknown colour type")
  plt.axis('off')
  plt.title(title)
  plt.show()

For reference, this is your original image:

#Read in image
img         = cv2.imread('part.jpg')
ShowImage('Original',img,'bgr')

Original Image

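Identifying Numbers

To simplify things, we need to classify pixels as either on or off. We can do so with thresholding. Since our image contains two clear classes of pixels (black and white), we can use Otsu's method. We invert the colour scheme because the libraries we are using treat black pixels as boring and white pixels as interesting.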

#Convert image to grayscale
gray = cv2.cvtColor(img,cv2.COLOR_BGR2GRAY)

#Apply Otsu's method to eliminate pixels of intermediate colour
ret, thresh = cv2.threshold(gray,0,255,cv2.THRESH_BINARY_INV+cv2.THRESH_OTSU)

ShowImage('Applying Otsu',thresh,'gray')

#Verify that pixels are either black or white and nothing in between
np.unique(thresh)

Otsu transformed
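
To make the thresholding step less of a black box, here is a rough NumPy-only sketch of how an Otsu-style threshold can be chosen: it picks the grey level that maximises the between-class variance of the histogram. This is an illustration, not OpenCV's implementation; `otsu_threshold` and the synthetic `page` array are names invented for this example.

```python
import numpy as np

def otsu_threshold(gray):
    """Illustrative Otsu threshold for a uint8 image (hypothetical helper)."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)                  # weight of the class "<= t"
    w1 = 1.0 - w0                         # weight of the class "> t"
    cum_mean = np.cumsum(prob * levels)   # cumulative first moment
    total_mean = cum_mean[-1]
    # Between-class variance for every candidate threshold t;
    # thresholds that leave one class empty keep a score of 0.
    denom = w0 * w1
    between = np.zeros(256)
    valid = denom > 1e-12
    between[valid] = (total_mean * w0[valid] - cum_mean[valid]) ** 2 / denom[valid]
    return int(np.argmax(between))

# Synthetic "scan": dark ink (20) on light paper (230)
page = np.full((8, 8), 230, dtype=np.uint8)
page[2:6, 3] = 20

t = otsu_threshold(page)
# Inverted binary image: ink becomes 255, paper becomes 0
binary_inv = np.where(page > t, 0, 255).astype(np.uint8)
print(t, np.unique(binary_inv))
```

Any threshold between the two grey levels separates ink from paper here; on a real scan the histogram is noisier, which is why the choice is left to the algorithm.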

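Our strategy will be to locate the numbers, then follow the line(s) near them to the parts, and then label those parts. Since, conveniently, all Arabic numerals are formed from contiguous pixels, we can start by finding the connected components.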
ret, components = cv2.connectedComponents(thresh)
#Each component is a different colour
ShowImage('Connected Components', components, 'rgb')

Connected components
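
For readers who have not met connected-component labelling before, the idea can be sketched in plain Python as a flood fill over 8-connected neighbours. This is only an illustration of the concept; `label_components` and the tiny `grid` are invented for the example and are not what OpenCV runs internally.

```python
from collections import deque

def label_components(grid):
    """Label 8-connected groups of non-zero cells; 0 stays background."""
    h, w = len(grid), len(grid[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for sy in range(h):
        for sx in range(w):
            if grid[sy][sx] == 0 or labels[sy][sx] != 0:
                continue                      # background or already labelled
            count += 1                        # start a new component
            labels[sy][sx] = count
            queue = deque([(sy, sx)])
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and grid[ny][nx] and not labels[ny][nx]):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return count, labels

grid = [
    [1, 1, 0, 0, 0],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0],
]
count, labels = label_components(grid)
for row in labels:
    print(row)
```

Note that this sketch returns the number of foreground components, whereas the first value returned by cv2.connectedComponents counts the background label 0 as well.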

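We can filter the connected components by their dimensions to find the numbers. Note that this is not a very robust method. A better option would be to use character recognition, but that is left as an exercise to the reader :-)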
class Box:
    def __init__(self,x0,x1,y0,y1):
        self.x0, self.x1, self.y0, self.y1 = x0,x1,y0,y1
    def overlaps(self,box2,tol):
        if self.x0 is None or box2.x0 is None:
            return False
        return not (self.x1+tol<=box2.x0 or self.x0-tol>=box2.x1 or self.y1+tol<=box2.y0 or self.y0-tol>=box2.y1)
    def merge(self,box2):
        self.x0 = min(self.x0,box2.x0)
        self.x1 = max(self.x1,box2.x1)
        self.y0 = min(self.y0,box2.y0)
        self.y1 = max(self.y1,box2.y1)
        box2.x0 = None #Used to mark `box2` as being no longer valid. It can be removed later
    def dist(self,x,y):
        #Get center point
        ax = (self.x0+self.x1)/2
        ay = (self.y0+self.y1)/2
        #Get distance to center point
        return np.sqrt((ax-x)**2+(ay-y)**2)
    def good(self):
        return not (self.x0 is None)

def ExtractComponent(original_image, component_matrix, component_number):
    """Extracts a component from a ConnectedComponents matrix"""
    #Create a true-false matrix indicating if a pixel is part of a particular component
    is_component = component_matrix==component_number
    #Find the coordinates of those pixels
    coords = np.argwhere(is_component)

    # Bounding box of non-black pixels.
    y0, x0 = coords.min(axis=0)
    y1, x1 = coords.max(axis=0) + 1   # slices are exclusive at the top

    # Get the contents of the bounding box.
    return x0,x1,y0,y1,original_image[y0:y1, x0:x1]

numbers_img = thresh.copy() #This is used purely to show that we can identify numbers
numbers = []
for component in range(1, components.max()+1): #Label 0 is the background; include the last label
    tx0,tx1,ty0,ty1,this_component = ExtractComponent(thresh, components, component)
    #ShowImage('Component #{0}'.format(component), this_component, 'gray')
    cheight, cwidth = this_component.shape
    #print(cwidth,cheight) #Enable this to see dimensions
    #Identify numbers based on aspect ratio
    if (abs(cwidth-14)<3 or abs(cwidth-7)<3) and abs(cheight-24)<3:
        numbers_img[ty0:ty1,tx0:tx1] = 128
        numbers.append(Box(tx0,tx1,ty0,ty1))
ShowImage('Numbers', numbers_img, 'gray')

Numbers with Separated Boxes

We now join the numbers into contiguous blocks by slightly expanding their bounding boxes and looking for overlaps.
#This is kind of a silly way to do this, but it will work fine for small quantities (hundreds)
merged=True                                       #If true, then a merge happened this round
while merged:                                     #Continue until there are no more mergers
    merged=False                                  #Reset merge indicator
    for a,b in itertools.combinations(numbers,2): #Consider all pairs of numbers
        if a.overlaps(b,10):                      #If this pair overlaps
            a.merge(b)                            #Merge it
            merged=True                           #Make a note that we've merged
numbers = [x for x in numbers if x.good()]        #Eliminate those boxes that were gobbled by the mergers

#This is used purely to show that we can identify numbers
numbers_img = thresh.copy() 
for n in numbers:
    numbers_img[n.y0:n.y1,n.x0:n.x1] = 128
    thresh[n.y0:n.y1,n.x0:n.x1] = 0 #Drop numbers from thresholded image
ShowImage('Numbers', numbers_img, 'gray')

Numbers connected
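
To see the effect of the tolerance in the merge step, here is a small standalone sketch that applies the same overlap test to plain tuples. The coordinates are made up, and `overlaps` and `merge_boxes` are hypothetical helpers that mirror the logic of the `Box` class rather than replace it.

```python
import itertools

def overlaps(a, b, tol):
    """a and b are (x0, x1, y0, y1); True if they come within tol of each other."""
    return not (a[1] + tol <= b[0] or a[0] - tol >= b[1] or
                a[3] + tol <= b[2] or a[2] - tol >= b[3])

def merge_boxes(boxes, tol):
    """Repeatedly merge overlapping boxes until nothing changes."""
    boxes = list(boxes)
    merged = True
    while merged:
        merged = False
        for i, j in itertools.combinations(range(len(boxes)), 2):
            if boxes[i] is None or boxes[j] is None:
                continue                      # already swallowed by a merge
            a, b = boxes[i], boxes[j]
            if overlaps(a, b, tol):
                boxes[i] = (min(a[0], b[0]), max(a[1], b[1]),
                            min(a[2], b[2]), max(a[3], b[3]))
                boxes[j] = None
                merged = True
    return [b for b in boxes if b is not None]

digits = [
    (100, 107, 50, 74),   # a narrow digit
    (112, 126, 50, 74),   # a second digit 5 px to its right
    (300, 314, 50, 74),   # a digit belonging to a different label
]
result = merge_boxes(digits, tol=10)
print(result)
```

With a tolerance of 10 pixels the two neighbouring digits fuse into one box, while the distant digit is left alone. A tolerance that is too large would start fusing separate part numbers, so it needs to be tuned to the drawing.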

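Okay, so now we have identified the numbers! We will use these later to identify the parts.

Identifying Arrows

Next, we need to figure out which parts the numbers point to. To do so, we want to detect lines. The Hough transform is good for this. To reduce the number of false positives, we skeletonize the data, converting it into a representation that is at most one pixel wide.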

skel = sk.img_as_ubyte(skm.skeletonize(thresh>0))
ShowImage('Skeleton', skel, 'gray')

Skeleton

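Now we perform the Hough transform. We are looking for one that identifies all of the lines going from the numbers to the parts. Getting this right may take some fiddling with the parameters.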
lines = cv2.HoughLinesP(
    skel,
    1,           #Resolution of r in pixels
    np.pi / 180, #Resolution of theta in radians
    30,          #Minimum number of intersections to detect a line
    None,
    80,          #Min line length
    10           #Max line gap
)
lines = [x[0] for x in lines]

line_img = thresh.copy()
line_img = cv2.cvtColor(line_img, cv2.COLOR_GRAY2BGR)
for l in lines:
    color = tuple(map(int, np.random.randint(low=0, high=255, size=3)))
    cv2.line(line_img, (l[0], l[1]), (l[2], l[3]), color, 3, cv2.LINE_AA)
ShowImage('Lines', line_img, 'bgr')

Lines Identified

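We now want to find the line or lines closest to each number and retain only those. We are essentially filtering out all of the lines that are not arrows. To do so, we compare the end points of each line with the centre point of each number box.
comp_labels = np.zeros(img.shape[0:2], dtype=np.uint8)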

for n_idx,n in enumerate(numbers):
    distvals = []
    for i,l in enumerate(lines):
        #Distances from each point of line to midpoint of rectangle
        dists    = [n.dist(l[0],l[1]),n.dist(l[2],l[3])] 
        #Minimum distance and the end point (0 or 1) of the line associated with that point
        #Tuples of (Line Number, Line Point, Dist to Line Point) are produced
        distvals.append( (i,np.argmin(dists),np.min(dists)) )
    #Sort by distance between the number box and the line
    distvals = sorted(distvals, key=lambda x: x[2])
    #Include nearby lines, not just the closest one. This accounts for forking.
    distvals = [x for x in distvals if x[2]<1.5*distvals[0][2]]

    #Draw a white rectangle where the number box was
    cv2.rectangle(comp_labels, (n.x0,n.y0), (n.x1,n.y1), 1, cv2.FILLED)

    #Draw white lines where the arrows are
    for dv in distvals:
        l = lines[dv[0]]
        lp = (l[0],l[1]) if dv[1]==0 else (l[2],l[3])
        cv2.line(comp_labels, (l[0], l[1]), (l[2], l[3]), 1, 3, cv2.LINE_AA)
        cv2.line(comp_labels, (lp[0], lp[1]), ((n.x0+n.x1)//2, (n.y0+n.y1)//2), 1, 3, cv2.LINE_AA)
ShowImage('Lines', comp_labels, 'gray')

The Arrows
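
The line-selection rule can be tried in isolation on a few made-up coordinates. In the sketch below, `nearby_lines` is a hypothetical helper: it scores each line by its nearer end point and keeps every line within 1.5 times the best distance. It uses `<=` rather than the strict `<` above, so the closest line is kept even when its distance is exactly zero.

```python
import math

def nearby_lines(center, lines, factor=1.5):
    """Return (line index, nearer end point, distance) for lines close to center."""
    cx, cy = center
    scored = []
    for idx, (x0, y0, x1, y1) in enumerate(lines):
        d0 = math.hypot(x0 - cx, y0 - cy)     # distance to the first end point
        d1 = math.hypot(x1 - cx, y1 - cy)     # distance to the second end point
        end = 0 if d0 <= d1 else 1
        scored.append((idx, end, min(d0, d1)))
    scored.sort(key=lambda s: s[2])           # closest line first
    cutoff = factor * scored[0][2]
    return [s for s in scored if s[2] <= cutoff]

number_center = (10, 10)
lines = [
    (14, 10, 100, 10),     # arrow leaving the number to the right
    (10, 15, 10, 90),      # second arrow (a fork) leaving downwards
    (200, 200, 300, 300),  # unrelated line elsewhere in the drawing
]
kept = nearby_lines(number_center, lines)
print(kept)
```

Both arrows next to the number survive and the unrelated line is discarded, which is how the filter copes with labels whose arrows fork.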

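Finding Parts

This part is hard! We now want to segment the parts in the image. If there were some way to disconnect the lines joining subparts, this would be easy. Unfortunately, the lines joining subparts have the same width as many of the lines that make up the parts.

To work around this, we could use a lot of logic. It would be painful and error-prone.

Alternatively, we can assume you have an expert in the loop. This expert's sole job is to cut the lines connecting the subparts. This should be both easy and fast for them. Labelling everything would be slow and frustrating for a human, but is fast for a computer. Separating things is easy for a human, but hard for a computer. So we let each do what it does best.

In this case, you could probably train someone to do this job in a few minutes, so a true "expert" is not really necessary. A mildly competent human will do.

If you pursue this, you will need to write the expert-in-the-loop tool. To do so, save the skeleton images, have your expert modify them, and then read the modified skeletons back in. Like so.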

#Save the image, or display it on a GUI
#cv2.imwrite("/z/skel.png", skel);
#EXPERT DOES THEIR THING HERE
#Read the expert-mediated image back in
skelhuman = cv2.imread('/z/skel.png')
#Convert back to the form we need
skelhuman = cv2.cvtColor(skelhuman,cv2.COLOR_BGR2GRAY)
ret, skelhuman = cv2.threshold(skelhuman,0,255,cv2.THRESH_OTSU)
ShowImage('SkelHuman', skelhuman, 'gray')

Skeleton with human modification

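Now that we have the parts separated, we will eliminate as much of the arrows as possible. We have already extracted these above, so we can add them back in later if we need to.
To eliminate the arrows, we find all of the lines that terminate somewhere other than at another line. That is, we locate pixels that have only one neighbouring pixel. We then delete that pixel and look at its neighbour. Doing this iteratively eliminates the arrows. Since I do not know another term for it, I will call this a "Fuse Transform". Since this requires manipulating individual pixels, which would be very slow in Python, we write the transform in Cython.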
%%cython -a --cplus
import cython

from libcpp.queue cimport queue
import numpy as np
cimport numpy as np

@cython.boundscheck(False)
@cython.wraparound(False)
@cython.nonecheck(False)
@cython.cdivision(True) 
cpdef void FuseTransform(unsigned char [:, :] image):
    # set the variable extension types
    cdef int c, n, x, y, nx, ny, width, height, neighbours, neighbour
    cdef queue[int] q

    # grab the image dimensions
    height = image.shape[0]
    width  = image.shape[1]

    cdef int dx[8]
    cdef int dy[8]

    #Offsets to neighbouring cells
    dx[:] = [-1,-1,0,1,1,1,0,-1]
    dy[:] = [0,-1,-1,-1,0,1,1,1]

    #Find seed cells: those with only one neighbour
    for y in range(1, height-1):
        for x in range(1, width-1):
            if image[y,x]==0: #Seed cells cannot be blank cells
                continue
            neighbours = 0
            for n in range(0,8):   #Looks at all neighbours
                nx = x+dx[n]
                ny = y+dy[n]
                if image[ny,nx]>0: #This neighbour has a value
                    neighbours += 1
            if neighbours==1:      #Was there only one neighbour?
                q.push(y*width+x)  #If so, this is a seed cell

    #Starting with the seed cells, gobble up the lines
    while not q.empty():
        c = q.front()
        q.pop()
        y = c//width         #Convert flat index into 2D x-y index
        x = c%width
        image[y,x] = 0       #Gobble up this part of the fuse
        neighbour  = -1      #No neighbours yet
        for n in range(0,8): #Look at all neighbours
            nx = x+dx[n]     #Find coordinates of neighbour cells
            ny = y+dy[n]
            #If the neighbour would be off the side of the matrix, ignore it
            if nx<0 or ny<0 or nx==width or ny==height:
                continue
            if image[ny,nx]>0:      #Is the neighbouring cell active?
                if neighbour!=-1:   #If we've already found an active neighbour
                    neighbour=-1    #Then pretend we found no neighbours
                    break           #And stop looking. This is the end of the fuse.
                else:               #Otherwise, make a note of the neighbour's index.
                    neighbour = ny*width+nx
        if neighbour!=-1:           #If there was only one neighbour
            q.push(neighbour)       #Continue burning the fuse

Back in standard Python:

#Apply the Fuse Transform
skh_dilated=skelhuman.copy()
FuseTransform(skh_dilated)
ShowImage('Fuse Transform', skh_dilated, 'gray')

Fuse Transformed
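
If compiling the Cython cell is not an option, the same idea can be sketched in plain Python and NumPy. This is a slower, illustrative equivalent, not the code used to produce the image above: `fuse_transform_py` is a hypothetical name, and unlike the Cython version it also seeds from pixels on the image border.

```python
from collections import deque
import numpy as np

# (dx, dy) offsets to the eight neighbouring cells
NEIGHBOURS = [(-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1), (0, 1), (-1, 1)]

def fuse_transform_py(image):
    """Erase, in place, one-pixel-wide lines that end in a free tip."""
    height, width = image.shape

    def active_neighbours(y, x):
        found = []
        for dx, dy in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width and image[ny, nx] > 0:
                found.append((ny, nx))
        return found

    # Seed cells: set pixels with exactly one set neighbour (line tips)
    queue = deque()
    for y, x in np.argwhere(image > 0):
        if len(active_neighbours(y, x)) == 1:
            queue.append((y, x))

    # Burn each fuse until it reaches a junction or runs out
    while queue:
        y, x = queue.popleft()
        image[y, x] = 0
        remaining = active_neighbours(y, x)
        if len(remaining) == 1:
            queue.append(remaining[0])
    return image

# A closed square outline (a "part") with a straight tail (an "arrow")
demo = np.zeros((7, 12), dtype=np.uint8)
demo[1, 1:6] = 255
demo[5, 1:6] = 255
demo[1:6, 1] = 255
demo[1:6, 5] = 255
demo[3, 6:11] = 255
fuse_transform_py(demo)
print((demo > 0).astype(int))
```

The tail burns away pixel by pixel until it reaches the outline, where more than one neighbour remains and the fuse stops, so the closed shape is left intact.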

Now that we have eliminated all of the arrows and lines connecting the parts, we dilate the remaining pixels a lot.

kernel = np.ones((3,3),np.uint8)
dilated  = cv2.dilate(skh_dilated, kernel, iterations=6)
ShowImage('Dilation', dilated, 'gray')

Dilated parts

Putting It All Together

And overlay the labels and arrows we segmented out earlier...

comp_labels_dilated  = cv2.dilate(comp_labels, kernel, iterations=5)
labels_combined = np.uint8(np.logical_or(comp_labels_dilated,dilated))
ShowImage('Comp Labels', labels_combined, 'gray')

Combined arrows and parts

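Finally, we take the merged number boxes, component arrows, and parts, and colour each of them using nice colours from Color Brewer. We then overlay this on the original image to obtain the desired highlighting.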
ret, labels = cv2.connectedComponents(labels_combined)
colormask = np.zeros(img.shape, dtype=np.uint8)
#Colors from Color Brewer
colors = [(228,26,28),(55,126,184),(77,175,74),(152,78,163),(255,127,0),(255,255,51),(166,86,40),(247,129,191),(153,153,153)]
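for l in range(labels.max()+1): #Include the last label
    if l==0: #Background component
        colormask[labels==0] = (255,255,255)
    else:
        colormask[labels==l] = colors[l%len(colors)] #Cycle through the palette if there are more labels than colours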
ShowImage('Comp Labels', colormask, 'bgr')
blended = cv2.addWeighted(img,0.7,colormask,0.3,0)
ShowImage('Blended', blended, 'bgr')

Colored parts

Final Image

Final image
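
For intuition about the blending step, the weighted sum computed by cv2.addWeighted can be approximated in NumPy as `alpha*img + beta*mask`, rounded and clipped to the 0 to 255 range. The sketch below is only an approximation for illustration; `blend` and the flat test arrays are invented for this example.

```python
import numpy as np

def blend(img, mask, alpha=0.7, beta=0.3):
    """Weighted sum of two uint8 images, rounded and saturated to uint8."""
    out = alpha * img.astype(np.float64) + beta * mask.astype(np.float64)
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)

# Light-grey "paper" overlaid with the first Color Brewer colour
paper = np.full((2, 2, 3), 200, dtype=np.uint8)
mask = np.zeros_like(paper)
mask[:] = (228, 26, 28)

out = blend(paper, mask)
print(out[0, 0])
```

With weights of 0.7 and 0.3 the drawing stays clearly legible and the colour reads as a tint; raising the second weight makes the highlight stronger at the cost of contrast in the line work.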

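So, to recap, we identified the numbers, the arrows, and the parts. In some cases we were able to separate them automatically. In other cases we used an expert in the loop. Where we needed to manipulate pixels individually, we used Cython for speed.

Of course, the danger with this sort of thing is that some other image will break the (many) assumptions I have made here. But that is a risk you take when you try to use a single image to present a problem.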


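fatal error: ios: No such file or directory. I got the above error when compiling the Cython code with cythonize -a -i fuse.pyx. The error points at #include "ios", so that header may be missing or not on the include path. - Nithin Varghese
@NithinVarghese: You may want to ask a separate question about that, since it is a Cython problem rather than an image-processing one. (That is to say, my knowledge is not enough to help you with this compilation problem.) - Richard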
