在三维空间中生成最小最近邻距离的随机点

Question

在三维空间中生成最小最近邻距离的随机点

pythonnumpyrandom

4

类似问题：
生成N个随机点，使它们之间的预定义距离为delta

在R中选择N个最远的点

但它们要么是用matlab编写的，要么不能完全满足所需任务。

我必须在长度为10 Angstrom的立方体内创建N个点，使任意两个点之间的距离大于delta。

例如：假设我在x、y、z轴上有一个长度为10 Angstrom的盒子。
我想在此框内放置200个随机点，以使任意两点之间的最小距离大于3 Angstrom。

尝试：

#!python
# -*- coding: utf-8 -*-#
import numpy as np
np.random.seed(100)
np.set_printoptions(2)

box_length = 10
d = box_length
threshold = 6
num_points = 5
x1, y1, z1 = np.random.random(num_points)* box_length, np.random.random(num_points)* box_length, np.random.random(num_points)* box_length
x2, y2, z2 = np.random.random(num_points)* box_length, np.random.random(num_points)* box_length, np.random.random(num_points)* box_length

# print(len(x1))

# just for checking make ponts integers
for i in range(len(x1)):
    x1[i] = int(x1[i])
    x2[i] = int(x2[i])
    y1[i] = int(y1[i])
    y2[i] = int(y2[i])
    z1[i] = int(z1[i])
    z2[i] = int(z2[i])


print(x1)
print(y1)
print(z1)
print("\n")

pt1_lst = []
pt2_lst = []
for i in range(len(x1)):
    a, b, c    = x1[i], y1[i], z1[i]
    a2, b2, c2 = x2[i], y2[i], z2[i]
    dist2      = (a-a2)**2 + (b-b2)**2 + (c-c2)**2

    print("\n")
    print(a,b,c)
    print(a2,b2,c2)
    print(dist2)

    if dist2 > threshold**2:
        pt1 = (a,b,c)
        pt2 = (a2,b2,c2)
        pt1_lst.append(pt1)
        pt2_lst.append(pt2)



print("points")
print(pt1_lst)
print(pt2_lst)

代码问题：这段代码比较了从points1到points2的点，但没有在points1和points2内部进行比较。

也许有更好的算法来解决这个问题，向那些想出了解决方案的人致敬。

谢谢。附言：我做了一些研究并尝试找到相关链接，但无法解决问题。仍然有一些相关链接： 更新： 我尝试了下面Stefans的代码，它对于N=10是有效的，但我尝试了N=200，使用的时间非常长（我在10分钟后停止了代码）。

有没有更有效的方法？

真的非常感谢帮助！使用Python从节点n生成长度为L的所有路径
使用Python在定义的矩形内创建随机点

- user8864088

您可以使用Scipy中的cdist（https://docs.scipy.org/doc/scipy/reference/generated/scipy.spatial.distance.cdist.html）来执行以下步骤： 1- 生成随机点。 2- 使用cdist。 3- 删除距离指定距离内的这些点。 - Mohamed Ibrahim

5个回答

1

其他一些解决方案会在盒子中选取 n 个随机点，如果任意两个点太靠近就将整个集合作废。这并不高效；更有效的方法是找到最糟糕的点并将其移动，步骤如下：

import numpy as np

def get_sphere_distribution(n, dmin, Ls, maxiter=1e4, allow_wall=True):
    """Get random points in a box with given dimensions and minimum separation.
    
    Parameters:
      
    - n: number of points
    - dmin: minimum distance
    - Ls: dimensions of box, shape (3,) array 
    - maxiter: maximum number of iterations.
    - allow_wall: whether to allow points on wall; 
       (if False: points need to keep distance dmin/2 from the walls.)
        
    Return:
        
    - ps: array (n, 3) of point positions, 
      with 0 <= ps[:, i] < Ls[i]
    - n_iter: number of iterations
    - dratio: average nearest-neighbor distance, divided by dmin.
    
    Note: with a fill density (sphere volume divided by box volume) above about
    0.53, it takes very long. (Random close-packed spheres have a fill density
    of 0.64).
    
    Author: Han-Kwang Nienhuys (2020)
    Copying: BSD, GPL, LGPL, CC-BY, CC-BY-SA
    See Stackoverflow: https://dev59.com/bqbja4cB1Zd3GeqPfmVR#62895898 
    """
    Ls = np.array(Ls).reshape(3)
    if not allow_wall:
        Ls -= dmin
    
    # filling factor; 0.64 is for random close-packed spheres
    # This is an estimate because close packing is complicated near the walls.
    # It doesn't work well for small L/dmin ratios.
    sphere_vol = np.pi/6*dmin**3
    box_vol = np.prod(Ls + 0.5*dmin)
    fill_dens = n*sphere_vol/box_vol
    if fill_dens > 0.64:
        msg = f'Too many to fit in the volume, density {fill_dens:.3g}>0.64'
        raise ValueError(msg)
    
    # initial try   
    ps = np.random.uniform(size=(n, 3)) * Ls
    
    # distance-squared matrix (diagonal is self-distance, don't count)
    dsq = ((ps - ps.reshape(n, 1, 3))**2).sum(axis=2)
    dsq[np.arange(n), np.arange(n)] = np.infty

    for iter_no in range(int(maxiter)):
        # find points that have too close neighbors
        close_counts = np.sum(dsq < dmin**2, axis=1)  # shape (n,)
        n_close = np.count_nonzero(close_counts)
        if n_close == 0:
            break
        
        # Move the one with the largest number of too-close neighbors
        imv = np.argmax(close_counts)
        
        # new positions
        newp = np.random.uniform(size=3)*Ls
        ps[imv]= newp
        
        # update distance matrix
        new_dsq_row = ((ps - newp.reshape(1, 3))**2).sum(axis=-1)
        dsq[imv, :] = dsq[:, imv] = new_dsq_row
        dsq[imv, imv] = np.inf
    else:
        raise RuntimeError(f'Failed after {iter_no+1} iterations.')

    if not allow_wall:
        ps += dmin/2
    
    dratio = (np.sqrt(dsq.min(axis=1))/dmin).mean()
    return ps, iter_no+1, dratio

我尝试了不同的策略来决定要移动哪些点（也可以查看此答案的编辑历史记录）；似乎每次迭代只移动一个错误点比尝试同时移动多个错误点更加有效。这使得收敛到填充密度0.25与0.53之间的差异变得更加明显。

以下是针对和盒子大小LxLxL的基准测试；3.33对应于问题中的大小。每个盒子大小的最高n是在3e+4次迭代内收敛的最后一个n。列是平均最近邻距离。

        L    n  n_iter  d_nn  density
0    3.33   10       9  1.39    0.123
3    3.33   20      40  1.16    0.246
7    3.33   40    5510  1.06    0.493
8    3.33   43    2591  1.03    0.530

9    5.70   10       2  1.73    0.033
14   5.70   60      45  1.22    0.199
16   5.70  150    4331  1.05    0.499
17   5.70  165   25690  1.06    0.549

18  10.00   40       3  1.97    0.030
22  10.00   70      10  1.71    0.053
23  10.00  150      47  1.40    0.113
24  10.00  300     276  1.19    0.225
25  10.00  600    4808  1.07    0.451
27  10.00  720   26418  1.05    0.541

密度值是直径为dmin的球体密度（近似校正壁效应）。随机堆积球的极限约为0.64；这个算法显然不如将弹珠扔进盒子中好。

请注意，问题要求在L=3.33*dmin的盒子中放置n=200个球，这将得到密实度为1.84，但是这是无法实现的。

相比之下，一个算法的基准是如果有任何接近的一对，则丢弃整个试验解决方案：

        L   n  n_iter  d_nn  density
0    3.33  10      52  1.38    0.123
1    3.33  14     224  1.25    0.172
2    3.33  15   24553  1.15    0.185

3    5.70  10       2  2.02    0.033
4    5.70  20      93  1.42    0.066
5    5.70  29    2089  1.36    0.096
6    5.70  30   10886  1.33    0.100

7   10.00  40      10  2.05    0.030
8   10.00  60    1120  1.79    0.045
9   10.00  68    5521  1.66    0.051
11  10.00  70   28545  1.64    0.053

这个算法需要更多的迭代，而且每次迭代本身都会变得更慢，因为所有点和所有距离都需要重新生成并在每次迭代中进行评估。

- Han-Kwang Nienhuys

0

我通过在网格中生成随机点、从随机选项范围中删除该点及其邻居并选择另一个点来解决类似问题。

- Matheus MN

1

为了说服他人，展示你的代码。 - pjs

目前你的回答不够清晰，请编辑并添加更多细节，以帮助其他人理解它如何回答问题。你可以在帮助中心找到有关如何编写好答案的更多信息。 - Community

这并没有回答问题。一旦您拥有足够的声望，您将能够评论任何帖子；相反，提供不需要询问者澄清的答案。- 来自审核 - ZF007

0

假设我有一个在x、y、z轴上长度为10埃的盒子。我想在这个盒子里随机生成10个点，使得任意两个点之间的最小距离大于3埃。

我认为这样可以实现，即在该盒子中重复生成十个随机点，直到它们之间的距离都足够大。

>>> import numpy as np
>>> from itertools import combinations

>>> while True:
        P = np.random.rand(10, 3) * 10
        if all(np.linalg.norm(p - q) > 3
               for p, q in combinations(P, 2)):
            break

>>> P
array([[ 9.02322366,  6.13576854,  3.1745708 ],
       [ 6.48005836,  7.5280536 ,  4.66442095],
       [ 5.78306167,  1.83922896,  9.48337683],
       [ 0.70507032,  0.20737532,  5.31191608],
       [ 3.71977864,  6.40278939,  3.81742814],
       [ 0.03938102,  6.7705456 ,  6.28841217],
       [ 3.27845597,  2.98811665,  4.81792286],
       [ 7.74422021,  9.30027671,  8.1770998 ],
       [ 0.28544716,  0.35155801,  9.77847352],
       [ 4.84536373,  4.21378476,  0.4456017 ]])

大约需要50次尝试才能找到一组好的点。在这里我尝试了1000次，其中有20次是好的。

>>> sum(all(np.linalg.norm(p - q) > 3
            for p, q in combinations(np.random.rand(10, 3) * 10, 2))
        for _ in range(1000))
20

- Stefan Pochmann

无法使用KD树解决它。对于更多的点，您可以使用scipy.spatial.distance.pdist和scipy.spatial.distance.squareform来加速计算。 - Joe

@Joe 是的，对于更大的数字可能会更好。不过他们计算所有距离或最小距离，对吧？我一开始也是这样做的，但后来意识到，只要找到一个距离太小的配对，立即停止并重试会快得多。这就是自己制作暴力检查的优势。 - Stefan Pochmann

@StefanPochmann，这对于N = 10非常好，但当我尝试运行N = 200的算法时，它需要太多时间（可能需要几个小时）！！！ - user8864088

0

拉丁超立方体将域划分为单元格，随机选择其中一组，并在每个单元格中放置一个随机点。在某些不幸的情况下，一些点可能会彼此靠得太近，但我们可以检查距离并可能生成新的情况。

这里是基于OpenTURNS的2D域的实现。对于所有点都进行了距离检查，但在更复杂的示例中，kd-tree可能更有效。绘图代码取自here。

"""
Generate random points in a rectangle, with a minimum distance.
"""

# %% Import.

import openturns as ot
import numpy as np
from scipy.spatial.distance import pdist
import matplotlib.pyplot as plt
import matplotlib.animation as animation


# %% Generate points.

# Boundaries.
x_min, x_max = -15, +15
y_min, y_max = -10, +10

# Number of points per case.
n_points = 8

# Number of cases to generate.
n_cases = 50

# Minimum distance.
min_dist = 4

# Uniform distribution.
distribution = ot.ComposedDistribution([ot.Uniform(x_min, x_max),
                                        ot.Uniform(y_min, y_max)])

# Initialize.
ot.RandomGenerator.SetSeed(0)
p_all = np.zeros((n_cases, n_points, 2))
i = 0

# Generate the points.
while i < n_cases:
    # Create a new Latin Hypercube experiment.
    experiment = ot.LHSExperiment(distribution, n_points, False, True)
    # Generate the points.
    dat = np.array(experiment.generate())
    # Check the distance.
    dist = np.min(pdist(dat))
    if dist >= min_dist:
        p_all[i, :, :] = dat
        i += 1
        print('Case {} / {}'.format(i, n_cases))
    else:
        print('Case discarded, min dist = {:.3f}'.format(dist))


# %% Plot.

def init():
    pathcol.set_offsets([[], []])
    return [pathcol]


def update(i, pathcol, points):
    pathcol.set_offsets(points[i])
    return [pathcol]


fig, ax = plt.subplots()
ax.grid()
ax.set_aspect('equal')
ax.set_xlim(1.1 * x_min, 1.1 * x_max)
ax.set_ylim(1.1 * y_min, 1.1 * y_max)
ax.plot([x_min, x_max, x_max, x_min, x_min],
        [y_min, y_min, y_max, y_max, y_min],
        color=(0.3, 0.3, 0.3))

pathcol = ax.scatter([], [])
anim = animation.FuncAnimation(
    fig, update, init_func=init, fargs=(pathcol, p_all), interval=1000, frames=n_cases,
    blit=True, repeat=False)
plt.show()

对于一个三维域，我们可以写成：

# Boundaries.
z_min, z_max = -20, +20

# Uniform distribution.
distribution = ot.ComposedDistribution([ot.Uniform(x_min, x_max),
                                        ot.Uniform(y_min, y_max),
                                        ot.Uniform(z_min, z_max)])

# Initialize.
p_all = np.zeros((n_cases, n_points, 3))

使用lhsdesign和pdist，在MATLAB中进行翻译应该是直接的。

正如Han-Kwang Nienhuys所指出的，在某些情况下可能会存在性能问题，此时应寻求其他解决方案。

- Jommy

有趣的是，但问题是关于在三维空间中放置点。如果我使用z维度（与y相同大小）调整您的代码，并设置“n_points = 40”，即使点的总排除体积仅为可用体积的20％，它也会运行很长时间。 - Han-Kwang Nienhuys

谢谢您的关注！确实，任何基于创建和丢弃策略的解决方案在某个时候都会面临性能问题。 - Jommy

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Severin Pappadeux · Accepted Answer

这种分布的常见名称是泊松球体采样。已知有一种O(n)算法可以实现此操作 - 请在这里查看。