将一组x、y点的匹配集合与另一组进行比对,后者可能存在缩放、旋转、移动和缺失元素。

17

(我为什么要这样做?请见下面的解释)

考虑如下所示的两组点,AB

enter image description here

看起来可能不像,但是集合A“隐藏”在集合B中。由于B中的点是相对于A进行缩放、旋转和平移的,因此它们不容易被发现。A中存在的一些点在B中不存在,而B中包含许多不在A中的点,这使得情况更加糟糕。

我需要找到适当的缩放、旋转和平移值,以便将B集合与A集合匹配。在上面的情况中,正确的值为:

scale = 0.14, rot_angle = 0.0, x_transl = 35.0, y_transl = 2.0

这将产生(足够好的)匹配

enter image description here

(在红色区域中,仅显示匹配的B点;这些点位于右侧第一张图中的扇形区域1000<x<2000,y~2000中)。由于自由度很多(DoF:缩放+旋转+2D平移),我意识到可能存在不匹配的可能性,但是这些点的坐标并不是随机的(尽管看起来像),因此发生这种情况的概率非常小。

我编写的代码(见下文)使用蛮力法循环遍历每个预定义范围中所有可能的DoF值。代码的核心基于将A中的每个点与B中的任何点的距离最小化。

该代码可以工作(实际上生成了上述解决方案),但由于解决方案数量(即每个DoF的接受值的组合)随着范围的增大而增加,因此它可能会变得不可接受地缓慢(还会占用系统中的所有RAM)。

如何提高代码的性能?我可以接受任何解决方案,包括使用numpy和/或scipy。也许可以尝试类似Basing-Hopping这样的方法来搜索最佳匹配(或相对接近的匹配),而不是目前采用的暴力方法?

import numpy as np
from scipy.spatial import distance
import math


def scalePoints(B_center, delta_x, delta_y, scale):
    """
    Scales xy points.

    http://codereview.stackexchange.com/q/159183/35351
    """
    x_scale = B_center[0] - scale * delta_x
    y_scale = B_center[1] - scale * delta_y

    return x_scale, y_scale


def rotatePoints(center, x, y, angle):
    """
    Rotates points in 'xy' around 'center'. Angle is in degrees.
    Rotation is counter-clockwise

    https://dev59.com/2mIj5IYBdhLWcg3wpmsH#20024348
    """
    angle = math.radians(angle)
    xy_rot = x - center[0], y - center[1]
    xy_rot = (xy_rot[0] * math.cos(angle) - xy_rot[1] * math.sin(angle),
              xy_rot[0] * math.sin(angle) + xy_rot[1] * math.cos(angle))
    xy_rot = xy_rot[0] + center[0], xy_rot[1] + center[1]

    return xy_rot


def distancePoints(set_A, x_transl, y_transl):
    """
    Find the sum of the minimum distance of points in set_A to points in set_B.
    """
    d = distance.cdist(set_A, zip(*[x_transl, y_transl]), 'euclidean')
    # Sum of all minimal distances.
    d_sum = np.sum(np.min(d, axis=1))

    return d_sum


def match_frames(
        set_A, B_center, delta_x, delta_y, tol, sc_min, sc_max, sc_step,
        ang_min, ang_max, ang_step, xmin, xmax, xstep, ymin, ymax, ystep):
    """
    Process all possible solutions in the defined ranges.
    """
    N = len(set_A)
    # Ranges
    sc_range = np.arange(sc_min, sc_max, sc_step)
    ang_range = np.arange(ang_min, ang_max, ang_step)
    x_range = np.arange(xmin, xmax, xstep)
    y_range = np.arange(ymin, ymax, ystep)
    print("Total solutions: {:.2e}".format(
          np.prod([len(_) for _ in [sc_range, ang_range, x_range, y_range]])))

    d_sum, params_all = [], []
    for scale in sc_range:
        # Scaled points.
        x_scale, y_scale = scalePoints(B_center, delta_x, delta_y, scale)
        for ang in ang_range:
            # Rotated points.
            xy_rot = rotatePoints(B_center, x_scale, y_scale, ang)
            # x translation
            for x_tr in x_range:
                x_transl = xy_rot[0] + x_tr
                # y translation
                for y_tr in y_range:
                    y_transl = xy_rot[1] + y_tr

                    # Find minimum distance sum.
                    d_sum.append(distancePoints(set_A, x_transl, y_transl))

                    # Store solutions.
                    params_all.append([scale, ang, x_tr, y_tr])

                    # Condition to break out if given tolerance for match
                    # is achieved.
                    if d_sum[-1] <= tol * N:
                        print("Match found:", scale, ang, x_tr, y_tr)
                        i_min = d_sum.index(min(d_sum))
                        return i_min, params_all

        # Print best solution found so far.
        i_min = d_sum.index(min(d_sum))
        print("d_sum_min = {:.2f}".format(d_sum[i_min]))

    return i_min, params_all


# Data.
set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365],
         [1964.91, 1994.565], [1894.41, 1957.065]]
set_B = [
    [2689.28, 3507.04, 2895.67, 1051.3, 1929.49, 1035.97, 752.44, 130.62,
     620.06, 2769.06, 1580.77, 281.76, 224.54, 3848.3],
    [2061.19, 3700.27, 2131.2, 1837.3, 2017.52, 80.96, 3524.61, 3821.22,
     3711.53, 1812.12, 1868.33, 3865.77, 3273.77, 2100.71]]

# This is necessary to apply the scaling.
x, y = np.asarray(set_B)
# Center of B points, defined as the center of the minimal rectangle that
# contains all points.
B_center = [(min(x) + max(x)) * .5, (min(y) + max(y)) * .5]
# Difference between the center coordinates and the xy points.
delta_x, delta_y = B_center[0] - x, B_center[1] - y

# Tolerance in pixels for match.
tol = 1.
# Ranges for each DoF.
sc_min, sc_max, sc_step = .01, .2, .01
ang_min, ang_max, ang_step = -30., 30., 1.
xmin, xmax, xstep = -150., 150., 1.
ymin, ymax, ystep = -150., 150., 1.

# Find proper scaling + rotation + translation for set_B.
i_min, params_all = match_frames(
    set_A, B_center, delta_x, delta_y, tol, sc_min, sc_max, sc_step,
    ang_min, ang_max, ang_step, xmin, xmax, xstep, ymin, ymax, ystep)

# Best match found
print(params_all[i_min])

我为什么要这样做?

当天文学家观测星场时,还需要观测所谓的“标准星场”。这是为了能够将观测到的星体的“仪器星等”(一种对亮度的对数测量)转换为通用的普遍尺度,因为这些星等取决于所使用的光学系统(望远镜+CCD阵列)。在此示例中,可以看到标准场位于左下方,观测场位于右侧。

enter image description here

请注意,在上面使用的集合A中的点是标准场中标记的星号,而集合B是在观测场中检测到的星星(在上面用红色标记)。
将观测场中与标准场中的星星相对应的星星识别过程仍然是通过肉眼完成的,即使是今天也是如此。这是由于任务的复杂性。
在上面的观测图像中,存在相当多的缩放,但没有旋转和很少的平移。这构成了一个相当有利的情况;它可能会更糟。我正在尝试开发一个简单的算法,以避免逐个手动识别观测场中的星星作为标准场中的星星。

litepresence提出的解决方案

这是我根据litepresence的答案编写的脚本。

import itertools
import numpy as np
import matplotlib.pyplot as plt


def getTriangles(set_X, X_combs):
    """
    Inefficient way of obtaining the lengths of each triangle's side.
    Normalized so that the minimum length is 1.
    """
    triang = []
    for p0, p1, p2 in X_combs:
        d1 = np.sqrt((set_X[p0][0] - set_X[p1][0]) ** 2 +
                     (set_X[p0][1] - set_X[p1][1]) ** 2)
        d2 = np.sqrt((set_X[p0][0] - set_X[p2][0]) ** 2 +
                     (set_X[p0][1] - set_X[p2][1]) ** 2)
        d3 = np.sqrt((set_X[p1][0] - set_X[p2][0]) ** 2 +
                     (set_X[p1][1] - set_X[p2][1]) ** 2)
        d_min = min(d1, d2, d3)
        d_unsort = [d1 / d_min, d2 / d_min, d3 / d_min]
        triang.append(sorted(d_unsort))

    return triang


def sumTriangles(A_triang, B_triang):
    """
    For each normalized triangle in A, compare with each normalized triangle
    in B. find the differences between their sides, sum their absolute values,
    and select the two triangles with the smallest sum of absolute differences.
    """
    tr_sum, tr_idx = [], []
    for i, A_tr in enumerate(A_triang):
        for j, B_tr in enumerate(B_triang):
            # Absolute value of lengths differences.
            tr_diff = abs(np.array(A_tr) - np.array(B_tr))
            # Sum the differences
            tr_sum.append(sum(tr_diff))
            tr_idx.append([i, j])

    # Index of the triangles in A and B with the smallest sum of absolute
    # length differences.
    tr_idx_min = tr_idx[tr_sum.index(min(tr_sum))]
    A_idx, B_idx = tr_idx_min[0], tr_idx_min[1]
    print("Smallest difference: {}".format(min(tr_sum)))

    return A_idx, B_idx


# Data.
set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365],
         [1964.91, 1994.565], [1894.41, 1957.065]]
set_B = [
    [2689.28, 3507.04, 2895.67, 1051.3, 1929.49, 1035.97, 752.44, 130.62,
     620.06, 2769.06, 1580.77, 281.76, 224.54, 3848.3],
    [2061.19, 3700.27, 2131.2, 1837.3, 2017.52, 80.96, 3524.61, 3821.22,
     3711.53, 1812.12, 1868.33, 3865.77, 3273.77, 2100.71]]
set_B = zip(*set_B)

# All possible triangles.
A_combs = list(itertools.combinations(range(len(set_A)), 3))
B_combs = list(itertools.combinations(range(len(set_B)), 3))

# Obtain normalized triangles.
A_triang, B_triang = getTriangles(set_A, A_combs), getTriangles(set_B, B_combs)

# Index of the A and B triangles with the smallest difference.
A_idx, B_idx = sumTriangles(A_triang, B_triang)

# Indexes of points in A and B of the best match triangles.
A_idx_pts, B_idx_pts = A_combs[A_idx], B_combs[B_idx]
print 'triangle A %s matches triangle B %s' % (A_idx_pts, B_idx_pts)

# Matched points in A and B.
print "A:", [set_A[_] for _ in A_idx_pts]
print "B:", [set_B[_] for _ in B_idx_pts]

# Plot
A_pts = zip(*[set_A[_] for _ in A_idx_pts])
B_pts = zip(*[set_B[_] for _ in B_idx_pts])
plt.scatter(*A_pts, s=10, c='k')
plt.scatter(*B_pts, s=10, c='r')
plt.show()

这个方法几乎是即时的,能够产生正确的匹配:

Smallest difference: 0.0314154749597
triangle A (0, 1, 4) matches triangle B (3, 4, 10)
A: [[2015.81, 1981.665], [1967.31, 1960.865], [1894.41, 1957.065]]
B: [(1051.3, 1837.3), (1929.49, 2017.52), (1580.77, 1868.33)]

enter image description here


1
你尝试过使用https://en.m.wikipedia.org/wiki/Iterative_closest_point或RANSAC吗? - visoft
我会看一下,谢谢。 - Gabriel
1
@Gabriel 首先,感谢您提出如此深入研究且有趣的问题!我认为优化(最小化从 AB 的点之间的距离)是一个合理的方法,但确实,通过预定义的三个 DOF 值进行暴力迭代并不太好。也许可以看一下 适当的 优化库?http://esa.github.io/pygmo/quickstart.html - Aleksander Lidtke
谢谢@AleksanderLidtke。我看了一下PyGMO,虽然它看起来是一个非常有趣的项目,但我倾向于远离那些不能通过pipconda或最多git clone安装的软件包。此外,该软件包似乎利用了多个CPU,而我正在寻求代码级别的性能提升。 - Gabriel
我刚刚用实际的 CCD 星表进行了测试,A有35个点,B有28个点,你的脚本花费了116.12756299972534秒才找出最佳匹配。我有数百个星表,这太可怕了!你有什么办法来提高速度吗? - John Chen
显示剩余2条评论
2个回答

5

1) 我会给所有点打标签,并从每个集合中找到所有可能的三个点的组合来解决这个问题。

# normalize B data to same format as A
set_Bx, set_By = (set_B)
set_B = []
for i in range(len(set_Bx)):
    set_B.append([set_Bx[i],set_By[i]])
'''
set_B = [[2689.28, 2061.19], [3507.04, 3700.27], [2895.67, 2131.2], 
[1051.3, 1837.3], [1929.49, 2017.52], [1035.97, 80.96], [752.44, 
3524.61], [130.62, 3821.22], [620.06, 3711.53], [2769.06, 1812.12], 
[1580.77, 1868.33], [281.76, 3865.77], [224.54, 3273.77], [3848.3, 
2100.71]]
'''

list(itertools.combinations(range(len(set_A)), 3))
list(itertools.combinations(range(len(set_B)), 3))

如何在Python中生成列表的所有排列组合

2) 对于每个3点组,计算相应三角形的边长;然后对A组和B组重复此过程。

dist = sqrt( (x2 - x1)**2 + (y2 - y1)**2 )

如何计算两点间的距离?

3) 然后对于每个三角形,将其边长比例缩小,使得每个三角形的最短边都等于1;其他边按比例缩小。

对于两个相似的三角形:

两个三角形的周长之比与它们对应的边长之比相同。它们对应的中线和高也满足这个比例关系。

http://www.mathopenref.com/similartrianglesparts.html

4) 对于A组中的每个三角形,与B组中的每个三角形进行逐元素的减法运算。然后求和并找出总和最小的A组和B组中的三角形。

list(numpy.array(list1)-numpy.array(list2))

Python中的两个列表相减

5) 给出匹配的三角形; 在CPU / RAM方面,找到适当的缩放,平移和旋转应该是相对容易的。

enter image description here

ETA1:草稿脚本

ETA2:修补了在评论中讨论的错误:使用sum(abs())而不是abs(sum())。现在它能正常工作,并且速度也很快!

'''
known correct solution

A = [[1894.41, 1957.065],[1967.31, 1960.865],[2015.81, 1981.665]]
B = [[1051.3, 1837.3],[1580.77, 1868.33],[1929.49, 2017.52]]

'''
import numpy as np
import itertools
import math
import operator

set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365],
        [1964.91, 1994.565], [1894.41, 1957.065]]
set_B = [[2689.28, 3507.04, 2895.67, 1051.3, 1929.49, 1035.97, 752.44, 130.62,
        620.06, 2769.06, 1580.77, 281.76, 224.54, 3848.3],
        [2061.19, 3700.27, 2131.2, 1837.3, 2017.52, 80.96, 3524.61, 3821.22,
        3711.53, 1812.12, 1868.33, 3865.77, 3273.77, 2100.71]]

# normalize set B data to set A format
set_Bx, set_By = (set_B)
set_B = []
for i in range(len(set_Bx)):
    set_B.append([set_Bx[i],set_By[i]])

'''
set_B = [[2689.28, 2061.19], [3507.04, 3700.27], [2895.67, 2131.2], 
[1051.3, 1837.3], [1929.49, 2017.52], [1035.97, 80.96], [752.44, 3524.61], 
[130.62, 3821.22], [620.06, 3711.53], [2769.06, 1812.12], [1580.77, 1868.33], 
[281.76, 3865.77], [224.54, 3273.77], [3848.3, 2100.71]]
'''

print set_A
print set_B
print len(set_A)
print len(set_B)

set_A_tri = list(itertools.combinations(range(len(set_A)), 3))
set_B_tri = list(itertools.combinations(range(len(set_B)), 3))

print set_A_tri
print set_B_tri
print len(set_A_tri)
print len(set_B_tri)

'''
set_A = [[2015.81, 1981.665], [1967.31, 1960.865], [1962.91, 1951.365], 
[1964.91, 1994.565], [1894.41, 1957.065]]

set_A_tri = [(0, 1, 2), (0, 1, 3), (0, 1, 4), (0, 2, 3), (0, 2, 4), (0, 3, 4), 
(1, 2, 3), (1, 2, 4), (1, 3, 4), (2, 3, 4)]
'''

def distance(x1,y1,x2,y2):
    return math.sqrt((x2 - x1)**2 + (y2 - y1)**2 )

def tri_sides(set_x, set_x_tri):

    triangles = []
    for i in range(len(set_x_tri)):

        point1 = set_x_tri[i][0]
        point2 = set_x_tri[i][1]
        point3 = set_x_tri[i][2]

        point1x, point1y = set_x[point1][0], set_x[point1][1]
        point2x, point2y = set_x[point2][0], set_x[point2][1]
        point3x, point3y = set_x[point3][0], set_x[point3][1] 

        len1 = distance(point1x,point1y,point2x,point2y)
        len2 = distance(point1x,point1y,point3x,point3y)
        len3 = distance(point2x,point2y,point3x,point3y)

        min_side = min(len1,len2,len3)
        len1/=min_side
        len2/=min_side
        len3/=min_side
        t=[len1,len2,len3]
        t.sort()
        triangles.append(t)

    return triangles

A_triangles = tri_sides(set_A, set_A_tri)
B_triangles = tri_sides(set_B, set_B_tri)

print A_triangles
'''
[[1.0, 5.0405616860744304, 5.822935502560814], 
[1.0, 1.5542012854321234, 1.5619803879976761], 
[1.0, 1.3832883678507584, 2.347214708755337], 
[1.0, 1.2141910838179129, 1.4096730529373076], 
[1.0, 1.1275138587537166, 2.0318412465223665], 
[1.0, 1.5207417600732074, 2.3589630093994876], 
[1.0, 3.2270326342163584, 4.13069930678442], 
[1.0, 6.565440477766354, 6.972550347780966], 
[1.0, 2.1606693015281997, 2.3635387983160885], 
[1.0, 1.589425903498476, 1.846471085870448]]
'''
print B_triangles

def list_subtract(list1,list2):

    return np.absolute(np.array(list1)-np.array(list2))

sums = []
threshold = 1
for i in range(len(A_triangles)):
    for j in range(len(B_triangles)):
        k = sum(list_subtract(A_triangles[i], B_triangles[j]))
        if k < threshold:
            sums.append([i,j,k])
# sort by smallest sum
sums = sorted(sums, key=operator.itemgetter(2))

print sums
print 'winner %s' % sums[0]
print sums[0][0]
print sums[0][1]
match_A = set_A_tri[sums[0][0]]
match_B = set_B_tri[sums[0][1]]
print 'triangle A %s matches triangle B %s' % (match_A, match_B)

match_A_pts = []
match_B_pts = []
for i in range(3):
    match_A_pts.append(set_A[match_A[i]])
    match_B_pts.append(set_B[match_B[i]])

print 'triangle A has points %s' % match_A_pts
print 'triangle B has points %s' % match_B_pts

'''
winner [2, 204, 0.031415474959738399]
2
204
triangle A (0, 1, 4) matches triangle B (3, 4, 10)
triangle A has points [[2015.81, 1981.665], [1967.31, 1960.865], [1894.41, 1957.065]]
triangle B has points [[1051.3, 1837.3], [1929.49, 2017.52], [1580.77, 1868.33]]
'''

1
@litepresence 当 AB 包含不同的点时怎么办? - Aleksander Lidtke
即使A、B包含不同的点,您仍然可以找到两个集合中最相似的三角形。我看不出有什么问题? - litepresence
1
@Gabriel 我更新了完整的脚本... 但是它有一个索引错误。你应该能够掌握要点、进行速度测试,并希望找到解决索引问题的补丁。这真是一个很酷的问题。 - litepresence
1
@litepresence 我刚刚完成了自己的脚本,实现你的答案。据我所见,我的脚本似乎无法正常工作(产生了错误的匹配),所以我现在会检查你的脚本,看看是否我的脚本只是有些bug。 - Gabriel
1
@litepresence 撤回我之前说的话,你的方法确实可行!问题在于,在你的脚本中我删除了t.sort()这行,认为它不重要,这是完全错误的。当考虑到该行与我上面提到的修复不正确的abs(sum())时,就会得出正确的解决方案。我将编辑我在问题中发布的你的方法的脚本。非常感谢! - Gabriel
显示剩余5条评论

1

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接