多个2D np数组相交以确定区域

Question

多个2D np数组相交以确定区域

pythonnumpymultidimensional-arraynumpy-ndarray

8

使用这个小型可重现的示例，到目前为止我未能从3个数组中生成一个新的整数数组，该数组包含所有三个输入数组中的唯一分组。

这些数组与地形属性相关。

import numpy as np
asp = np.array([8,1,1,2,7,8,2,3,7,6,4,3,6,5,5,4]).reshape((4,4))  #aspect
slp = np.array([9,10,10,9,9,12,12,9,10,11,11,9,9,9,9,9]).reshape((4,4))  #slope
elv = np.array([13,14,14,13,14,15,16,14,14,15,16,14,13,14,14,13]).reshape((4,4)) #elevation

这个想法是使用GIS例行程序将地理轮廓划分为3个不同的属性：

1-8表示方位（1=朝北，2=朝东北等）
9-12表示坡度（9=缓坡...12=最陡峭的坡度）
13-16表示海拔高度（13=最低海拔...16=最高海拔）

下面的小图尝试展示我所需要的结果（在左下角显示数组）。请注意，图形中给出的“答案”只是一种可能的答案。只要最终数组包含一个整数，用于标识唯一分组的每个行/列索引，我就不关心最终排列。

例如，[0,1]和[0,2]处的数组索引具有相同的方位、坡度和海拔高度，因此在结果数组中接收相同的整数标识符。 numpy是否有内置的例行程序可以处理这种情况？

- user2256085

虽然Numpy没有groupby函数，但根据其作者所说，有一个库可以提供高效的groupby功能。请参见https://dev59.com/0VoT5IYBdhLWcg3w8S62#38015063。 - PM 2Ring

请参阅有关空间连续区域的相关后续帖子：https://dev59.com/pLHma4cB1Zd3GeqPP7kw - user2256085

5个回答

2

您可以使用numpy.unique()，然后进行映射：

代码:

combined = 10000 * asp + 100 * slp + elv
unique = dict(((v, i + 1) for i, v in enumerate(np.unique(combined))))
combined_unique = np.vectorize(unique.get)(combined)

测试代码：

import numpy as np

asp = np.array([8, 1, 1, 2, 7, 8, 2, 3, 7, 6, 4, 3, 6, 5, 5, 4]).reshape((4, 4))  # aspect
slp = np.array([9, 10, 10, 9, 9, 12, 12, 9, 10, 11, 11, 9, 9, 9, 9, 9]).reshape((4, 4))  # slope
elv = np.array([13, 14, 14, 13, 14, 15, 16, 14, 14, 15, 16, 14, 13, 14, 14, 13]).reshape((4, 4))

combined = 10000 * asp + 100 * slp + elv
unique = dict(((v, i + 1) for i, v in enumerate(np.unique(combined))))
combined_unique = np.vectorize(unique.get)(combined)

print(combined_unique)

结果:

[[12  1  1  2]
 [10 13  3  4]
 [11  9  6  4]
 [ 8  7  7  5]]

- Stephen Rauch

1

这似乎是一个类似于在图像中标记唯一区域的问题。这是我编写的一个函数来完成此操作，但您需要先将3个数组连接成一个3D数组。

def labelPix(pix):
    height, width, _ = pix.shape
    pixRows = numpy.reshape(pix, (height * width, 3))
    unique, counts = numpy.unique(pixRows, return_counts = True, axis = 0)

    unique = [list(elem) for elem in unique]

    labeledPix = numpy.zeros((height, width), dtype = int)
    offset = 0
    for index, zoneArray in enumerate(unique):
        index += offset
        zone = list(zoneArray)
        zoneArea = (pix == zone).all(-1)
        elementsArray, numElements = scipy.ndimage.label(zoneArea)

        elementsArray[elementsArray!=0] += offset

        labeledPix[elementsArray!=0] = elementsArray[elementsArray!=0]

        offset += numElements

    return labeledPix

这将对唯一的3值组合进行标记，同时为具有相同3值组合但彼此不接触的区域分配单独的标签。

asp = numpy.array([8,1,1,2,7,8,2,3,7,6,4,3,6,5,5,4]).reshape((4,4))  #aspect
slp = numpy.array([9,10,10,9,9,12,12,9,10,11,11,9,9,9,9,9]).reshape((4,4))  #slope
elv = numpy.array([13,14,14,13,14,15,16,14,14,15,16,14,13,14,14,13]).reshape((4,4)) #elevation

pix = numpy.zeros((4,4,3))
pix[:,:,0] = asp
pix[:,:,1] = slp
pix[:,:,2] = elv

print(labelPix(pix))

返回：

[[ 0  1  1  2]
 [10 12  3  4]
 [11  9  6  4]
 [ 8  7  7  5]]

- asheets

1

这里是一个使用itertools.groupby的简单Python技巧。它要求输入为1D列表，但这不应该是一个主要问题。策略是将列表与索引号一起压缩，然后对结果列进行排序。然后我们将相同的列分组在一起，在比较列时忽略索引号。然后我们收集每个组的索引号，并使用它们来构建最终输出列表。

from itertools import groupby

def show(label, seq):
    print(label, ' '.join(['{:2}'.format(u) for u in seq]))

asp = [8, 1, 1, 2, 7, 8, 2, 3, 7, 6, 4, 3, 6, 5, 5, 4] 
slp = [9, 10, 10, 9, 9, 12, 12, 9, 10, 11, 11, 9, 9, 9, 9, 9] 
elv = [13, 14, 14, 13, 14, 15, 16, 14, 14, 15, 16, 14, 13, 14, 14, 13]

size = len(asp)
a = sorted(zip(asp, slp, elv, range(size)))
groups = sorted([u[-1] for u in g] for _, g in groupby(a, key=lambda t:t[:-1]))
final = [0] * size
for i, g in enumerate(groups, 1):
    for j in g:
        final[j] = i

show('asp', asp)
show('slp', slp)
show('elv', elv)
show('out', final)

output

asp  8  1  1  2  7  8  2  3  7  6  4  3  6  5  5  4
slp  9 10 10  9  9 12 12  9 10 11 11  9  9  9  9  9
elv 13 14 14 13 14 15 16 14 14 15 16 14 13 14 14 13
out  1  2  2  3  4  5  6  7  8  9 10  7 11 12 12 13

不需要进行第二次排序，我们可以使用简单的列表推导。

groups = [[u[-1] for u in g] for _, g in groupby(a, key=lambda t:t[:-1])]

或生成器表达式

groups = ([u[-1] for u in g] for _, g in groupby(a, key=lambda t:t[:-1]))

我只是为了让我的输出与问题中的输出相匹配。

- PM 2Ring

1

这是一种使用基于字典查找的方法来解决此问题的方式。

from collections import defaultdict
import itertools

group_dict = defaultdict(list)
idx_count = 0

for a, s, e in np.nditer((asp, slp, elv)):
     asp_tuple = (a.tolist(), s.tolist(), e.tolist())
     if asp_tuple not in group_dict:
         group_dict[asp_tuple] = [idx_count+1]
         idx_count += 1
     else:
         group_dict[asp_tuple].append(group_dict[asp_tuple][-1])


list1d = list(itertools.chain(*list(group_dict.values())))

np.array(list1d).reshape(4, 4)

# result 
array([[ 1,  2,  2,  3],
       [ 4,  5,  6,  7],
       [ 7,  8,  9, 10],
       [11, 12, 12, 13]])

- kmario23

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- unutbu · Accepted Answer

每个网格位置都与一个元组相关联，由asp、slp和elv中的一个值组成。例如，左上角具有元组(8,9,13)。我们希望将此元组映射到一个唯一标识该元组的数字。

一种方法是将(8,9,13)视为3D数组np.arange(9*13*17).reshape(9,13,17)中的索引。选择这个特定的数组是为了容纳asp、slp和elv中最大的值。

In [107]: asp.max()+1
Out[107]: 9

In [108]: slp.max()+1
Out[108]: 13

In [110]: elv.max()+1
Out[110]: 17

现在，我们可以将元组（8,9,13）映射到数字1934：

In [113]: x = np.arange(9*13*17).reshape(9,13,17)

In [114]: x[8,9,13]
Out[114]: 1934

如果我们对网格中的每个位置都这样做，那么我们就会得到每个位置的唯一编号。我们可以在此停止，并让这些唯一的数字作为标签。

或者，我们可以使用np.unique并使用return_inverse=True生成较小的整数标签（从0开始递增）：

uniqs, labels = np.unique(vals, return_inverse=True)
labels = labels.reshape(vals.shape)

例如，

import numpy as np

asp = np.array([8,1,1,2,7,8,2,3,7,6,4,3,6,5,5,4]).reshape((4,4))  #aspect
slp = np.array([9,10,10,9,9,12,12,9,10,11,11,9,9,9,9,9]).reshape((4,4))  #slope
elv = np.array([13,14,14,13,14,15,16,14,14,15,16,14,13,14,14,13]).reshape((4,4)) #elevation

x = np.arange(9*13*17).reshape(9,13,17)
vals = x[asp, slp, elv]
uniqs, labels = np.unique(vals, return_inverse=True)
labels = labels.reshape(vals.shape)

产生。

array([[11,  0,  0,  1],
       [ 9, 12,  2,  3],
       [10,  8,  5,  3],
       [ 7,  6,  6,  4]])

上述方法只适用于asp、slp和elv中的值为小整数的情况。如果整数太大，它们的最大值乘积可能会超出可以传递给np.arange的最大允许值。此外，生成如此大的数组也是低效的。如果这些值是浮点数，则不能将它们解释为3D数组x中的索引。

因此，要解决这些问题，首先使用np.unique将asp、slp和elv中的值转换为唯一的整数标签：

indices = [ np.unique(arr, return_inverse=True)[1].reshape(arr.shape) for arr in [asp, slp, elv] ]
M = np.array([item.max()+1 for item in indices])
x = np.arange(M.prod()).reshape(M)
vals = x[indices]
uniqs, labels = np.unique(vals, return_inverse=True)
labels = labels.reshape(vals.shape)

这将产生与上述相同的结果，但即使asp、slp、elv是浮点数和/或大整数，也能正常工作。

最后，我们可以避免生成np.arange：

x = np.arange(M.prod()).reshape(M)
vals = x[indices]

通过将vals计算为索引和步长的乘积：

M = np.r_[1, M[:-1]]
strides = M.cumprod()
indices = np.stack(indices, axis=-1)
vals = (indices * strides).sum(axis=-1)

将所有东西放在一起：

import numpy as np

asp = np.array([8,1,1,2,7,8,2,3,7,6,4,3,6,5,5,4]).reshape((4,4))  #aspect
slp = np.array([9,10,10,9,9,12,12,9,10,11,11,9,9,9,9,9]).reshape((4,4))  #slope
elv = np.array([13,14,14,13,14,15,16,14,14,15,16,14,13,14,14,13]).reshape((4,4)) #elevation

def find_labels(*arrs):
    indices = [np.unique(arr, return_inverse=True)[1] for arr in arrs]
    M = np.array([item.max()+1 for item in indices])
    M = np.r_[1, M[:-1]]
    strides = M.cumprod()
    indices = np.stack(indices, axis=-1)
    vals = (indices * strides).sum(axis=-1)
    uniqs, labels = np.unique(vals, return_inverse=True)
    labels = labels.reshape(arrs[0].shape)
    return labels

print(find_labels(asp, slp, elv))

# [[ 3  7  7  0]
#  [ 6 10 12  4]
#  [ 8  9 11  4]
#  [ 2  5  5  1]]