How to efficiently set values of a numpy array using indices and boolean indexing?

Question

4

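What is the most efficient way to select elements of a multidimensional numpy array when using a mask that is applied at an offset? For example: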

import numpy as np

# in real application, following line would read an image
figure = np.random.uniform(size=(4, 4))  # used as a mask
canvas = np.zeros((10, 10))

# The following doesn't do anything, because a copy is modified
canvas[np.ix_(np.arange(4) + 3, range(4))][figure > 0.5] = 1.0

print(np.mean(figure > 0.5))  # should be ~ 0.5
print(canvas.max())  # prints 0.0

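There is a similar question here: "Setting values in a numpy array indexed by an indexed array", but I am using a mask, and I am not asking why it does not work.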

- Jonathan

Are the open mesh arrays always going to be arrays of consecutive numbers? - Divakar

4 Answers

1

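One approach is to use linear indexing. We take the row and column index arrays from np.ix_ and compute the equivalent linear (flat) indices from them. Then we use the mask to select the valid indices, and finally assign the new values into the data array at those linear indices.

So the implementation would be -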

# Get the open mesh arrays from np.ix_ corresponding to row, col indices
row, col = np.ix_(np.arange(4) + 3, range(4))

# Get the linear indices from those row and column index arrays 
linear_index = (row*canvas.shape[1] + col)[figure>0.5]

# Finally, assign values
np.put(canvas, linear_index, 1.0) # Or canvas.ravel()[linear_index] = 1.0

- Divakar
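
The linear-index approach above is most useful when the target rows and columns are not consecutive, which is the case the comment under the question asks about. Below is a minimal sketch under that assumption. The `rows` and `cols` values and the seeded generator are hypothetical, chosen only for illustration, and the final cross-check is not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)        # seeded only to make the sketch repeatable
figure = rng.uniform(size=(4, 4))     # stands in for the image used as a mask
mask = figure > 0.5
canvas = np.zeros((10, 10))

# Hypothetical non-consecutive target rows and columns; slices cannot express these.
rows = np.array([1, 4, 7, 8])
cols = np.array([0, 2, 5, 9])
row, col = np.ix_(rows, cols)         # shapes (4, 1) and (1, 4)

# Broadcasting row * n_columns + col yields a (4, 4) grid of flat indices,
# which the mask then filters down to the cells that should be set.
linear_index = (row * canvas.shape[1] + col)[mask]
canvas.ravel()[linear_index] = 1.0    # ravel() is a view here: canvas is C-contiguous

# Cross-check against a direct block assignment into a fresh array.
block = np.zeros(mask.shape)
block[mask] = 1.0
expected = np.zeros_like(canvas)
expected[np.ix_(rows, cols)] = block
```

Here `canvas` and `expected` should end up identical. Note that `canvas.ravel()` only writes through when the array is contiguous; `np.put`, as used in the answer, does not depend on that.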

1

I usually use a helper function to create an appropriately shaped part (a view) of the array:

arr = np.ones((10, 10)) * 10
mask = np.random.uniform(size=(4, 4))

def get_part(arr, shape, offs_x, offs_y):
    # This example just does 2D but can easily be expanded for ND-arrays
    return arr[offs_x : (offs_x + shape[0]), 
               offs_y : (offs_y + shape[1])]

get_part(arr, mask.shape, offs_x=3, offs_y=4)[mask > 0.5] = 1.0

An N-dimensional implementation could look like this:

def get_part(arr, shape, offsets):
    slices = tuple(slice(offs, offs+length) for offs, length in zip(offsets, shape))
    return arr[slices]

get_part(arr, mask.shape, (3, 4))

- MSeifert
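
As a usage illustration for the N-dimensional helper above, the following sketch applies it to a 3D array. The `volume` and `mask3d` names, their shapes, and the offsets are assumptions made up for this example; the helper itself is repeated from the answer so the block runs on its own.

```python
import numpy as np

def get_part(arr, shape, offsets):
    # One slice per axis: start at the offset and span the requested length.
    slices = tuple(slice(offs, offs + length) for offs, length in zip(offsets, shape))
    return arr[slices]

volume = np.zeros((6, 7, 8))                 # hypothetical 3D target array
mask3d = np.zeros((2, 3, 4), dtype=bool)     # hypothetical 3D mask
mask3d[0, 1, 2] = True
mask3d[1, 2, 3] = True

part = get_part(volume, mask3d.shape, (1, 2, 3))
shares = np.shares_memory(part, volume)      # basic slicing, so part is a view
part[mask3d] = 1.0                           # writes through to volume
```

The two masked cells land at `volume[1, 3, 5]` and `volume[2, 4, 6]`, that is, the mask position plus the offset on each axis. One caveat: slicing clips silently at the array edge, so if the offsets push the window past the boundary, the part no longer matches the mask's shape and the boolean assignment fails in current NumPy versions.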

0

mask=figure>0.5

If the ix indices really are ranges, they can be replaced with slices, as @jdehesa shows:

canvas[3:3+4,:4][mask]=1

If the use of arange is just a convenient example, we can use a two-stage assignment instead.

In [277]: idx=np.ix_(np.arange(4) + 3, range(4))
In [278]: canvas = np.zeros((10, 10))
In [279]: subcanvas=np.zeros_like(figure)
In [280]: subcanvas[mask] = 1
In [281]: subcanvas
Out[281]: 
array([[ 0.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  1.],
       [ 1.,  0.,  0.,  1.],
       [ 1.,  0.,  0.,  1.]])
In [282]: canvas[idx]=subcanvas
In [283]: canvas
Out[283]: 
array([[ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  1.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.]])

- hpaulj
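
One thing to keep in mind about the two-stage assignment above: because `subcanvas` starts out as zeros, writing it back also overwrites the unmasked cells of that block with zeros. That is harmless for an all-zero canvas, but not if the canvas already holds data. A minimal variant, assuming the existing values should be kept, starts from a copy of the block instead. The canvas fill value of 5.0 and the seeded generator are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)           # seeded only to make the sketch repeatable
figure = rng.uniform(size=(4, 4))
mask = figure > 0.5

canvas = np.full((10, 10), 5.0)          # hypothetical canvas that already holds data
idx = np.ix_(np.arange(4) + 3, range(4))

subcanvas = canvas[idx]                  # advanced indexing returns an independent copy
subcanvas[mask] = 1.0                    # stage 1: edit the copy
canvas[idx] = subcanvas                  # stage 2: write the whole block back
```

After this, the masked cells in rows 3 to 6 and columns 0 to 3 are 1.0, and every other cell still holds its original value.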


- jdehesa · Accepted Answer

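The problem seems to be that using the arrays returned by np.ix_ as an index means you are doing advanced indexing, and, as the NumPy documentation states:

Advanced indexing always returns a copy of the data (contrast with basic slicing that returns a view).

In this case, however, if the real application is similar to the code you posted (that is, if you only need an offset), you can solve the problem with basic slicing.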

import numpy as np

figure = np.random.uniform(size=(4, 4))
canvas = np.zeros((10, 10))

# Either of the following works fine
canvas[3:(3 + 4), :4][figure > 0.5] = 1.0
canvas[slice(3, 3 + 4), slice(4)][figure > 0.5] = 1.0

print(np.mean(figure > 0.5))  # ~ 0.5
print(canvas.max())  # Prints 1.0 now
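
The copy-versus-view distinction quoted from the documentation can be checked directly with `np.shares_memory`. This is a small sketch for illustration, not part of the original answer; the variable names are made up.

```python
import numpy as np

canvas = np.zeros((10, 10))

view = canvas[3:7, :4]                                  # basic slicing
copy = canvas[np.ix_(np.arange(4) + 3, np.arange(4))]   # advanced indexing

view_shares = np.shares_memory(view, canvas)   # the slice shares canvas's buffer
copy_shares = np.shares_memory(copy, canvas)   # the advanced-index result does not

copy[:] = 1.0                        # modifies only the temporary copy
after_copy_write = canvas.max()      # canvas is still all zeros
view[:] = 1.0                        # writes through to canvas
after_view_write = canvas.max()
```

Note that a single assignment such as `canvas[np.ix_(...)] = value` does modify `canvas`, because it is one setitem operation. Only the chained form, where a second index is applied to the result of an advanced index, ends up writing to a temporary copy.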