How to slice numpy rows using start and end indices

Question


index = np.array([[1,2],[2,4],[1,5],[5,6]])
z = np.zeros(shape = [4,10], dtype = np.float32)

How can I efficiently set z[np.arange(4), index[:,0]], z[np.arange(4), index[:,1]], and every element between them to 1?

Expected output:

array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]])

- figs_and_nuts

Please be explicit: show z after the desired change. - hpaulj

2 Answers


I think this is what you want to do, but implemented with a loop:

In [35]: z=np.zeros((4,10),int)
In [36]: index = np.array([[1,2],[2,4],[1,5],[5,6]])
In [37]: for i in range(4):
    ...:     z[i,index[i,0]:index[i,1]] = 1
    ...:     
In [38]: z
Out[38]: 
array([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0, 0, 0, 0]])

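Because the slices have different lengths, doing this with a single array expression would be tricky. Maybe not impossible, but tricky enough that it may not be worth trying.

Look at the indices of the 1s in this z: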

In [40]: np.where(z)
Out[40]: 
(array([0, 1, 1, 2, 2, 2, 2, 3], dtype=int32),
 array([1, 2, 3, 1, 2, 3, 4, 5], dtype=int32))

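Is there a regular pattern that would generate these from [0,1,2,3] and index?

I can generate the second array (the column indices) by concatenating slices: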

In [39]: np.r_[1:2, 2:4, 1:5, 5:6]
Out[39]: array([1, 2, 3, 1, 2, 3, 4, 5])

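But note that r_ involves several iterative steps: generating the input, expanding the slices, and concatenating them.

I can generate the first array of the where result (the row indices) with: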

In [41]: index[:,1]-index[:,0]
Out[41]: array([1, 2, 4, 1])
In [42]: np.arange(4).repeat(_)
Out[42]: array([0, 1, 1, 2, 2, 2, 2, 3])

As expected, these two index arrays select all the 1s:

In [43]: z[Out[42],Out[39]]
Out[43]: array([1, 1, 1, 1, 1, 1, 1, 1])

Or generate Out[39] from index:

In [50]: np.concatenate([np.arange(i,j) for i,j in index])
Out[50]: array([1, 2, 3, 1, 2, 3, 4, 5])
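
The pieces above (the repeated row indices and the concatenated column ranges) can be combined into one function that assigns through a pair of index arrays. This is only a minimal sketch: the name set_ranges_fancy is hypothetical, it treats the end index as excluded (as the loop above does), and it assumes every end index is at least as large as its start index.

```python
import numpy as np

def set_ranges_fancy(z, index):
    """Set z[i, start:stop] = 1 for each (start, stop) row of index (stop excluded)."""
    # Number of columns to fill in each row; assumes stop >= start,
    # otherwise repeat() raises on negative counts.
    lengths = index[:, 1] - index[:, 0]
    # Row index repeated once per filled column, e.g. [0, 1, 1, 2, 2, 2, 2, 3].
    rows = np.arange(len(index)).repeat(lengths)
    # Column indices of every range, concatenated in row order.
    cols = np.concatenate([np.arange(i, j) for i, j in index])
    z[rows, cols] = 1
    return z

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
z = set_ranges_fancy(np.zeros((4, 10), int), index)
print(z)
```

The list comprehension still iterates over the rows in Python, so this is not a loop-free solution; it only moves the iteration from the assignment to the index construction.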

Comparing my solution with @Divakar's.

def foo0(z,index):
    for i in range(z.shape[0]):
        z[i,index[i,0]:index[i,1]] = 1
    return z

def foo4(z,index):
    r = np.arange(z.shape[1])
    mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
    z[mask] = 1
    return z

For this small example, row iteration is faster:

In [155]: timeit foo0(z,index)
7.12 µs ± 224 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [156]: timeit foo4(z,index)
19.8 µs ± 890 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Even for a larger array, the row-iteration approach is faster:

In [157]: Z.shape
Out[157]: (1000, 1000)
In [158]: Index.shape
Out[158]: (1000, 2)
In [159]: timeit foo0(Z,Index)
1.72 ms ± 16.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [160]: timeit foo4(Z,Index)
7.47 ms ± 105 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

- hpaulj

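I also think a loop is the best way to do this. Still, I don't like it. If this can't be done efficiently without a loop, which answer should I accept? - figs_and_nuts

I wrapped both approaches in functions. For this problem and for a large sample, In[37] is consistently faster. - hpaulj

Not sure why NumPy doesn't have this indexing mode. Adding a slice version of range indexing sounds quite reasonable. - llllllllll

@hpaulj - only one trivial change is needed to stay consistent with my question: add 1 to the end index as well. - figs_and_nuts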
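
The change described in this last comment could look like the sketch below, which makes the end index inclusive so that the loop reproduces the expected output from the question. The name foo0_inclusive is hypothetical and is not part of either answer.

```python
import numpy as np

def foo0_inclusive(z, index):
    """Row loop that sets z[i, start..stop] = 1 with the end index included."""
    for i in range(z.shape[0]):
        start, stop = index[i]
        # A Python slice excludes its stop value, so add 1 to include it.
        z[i, start:stop + 1] = 1
    return z

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
z = foo0_inclusive(np.zeros((4, 10), int), index)
print(z)
```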


- Divakar · Accepted Answer

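We can use NumPy broadcasting for a vectorized solution: compare the start and end indices against a range array covering the column length to get a mask of all the positions in the output array that should be set to 1.

So the solution would be -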

ncols = z.shape[1]
r = np.arange(ncols)
mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
z[mask] = 1
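
To make the broadcasting step concrete, the sketch below prints the shapes involved for the question's sample data. The names starts and stops are introduced here only for illustration; the comparison itself is the one used above.

```python
import numpy as np

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
r = np.arange(10)                 # column positions, shape (10,)

# Indexing with None adds a trailing axis, turning each column of
# index into a (4, 1) array that broadcasts against r.
starts = index[:, 0, None]
stops = index[:, 1, None]

# (4, 1) compared with (10,) broadcasts to (4, 10): one row of
# booleans per (start, stop) pair, True where start <= column <= stop.
mask = (starts <= r) & (stops >= r)

print(starts.shape, r.shape, mask.shape)
print(mask.sum(axis=1))           # ones per row, i.e. stop - start + 1
```

The full (nrows, ncols) boolean mask is materialized in memory, which is one reason the cost of this approach grows with the number of columns.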

Sample run -

In [39]: index = np.array([[1,2],[2,4],[1,5],[5,6]])
    ...: z = np.zeros(shape = [4,10], dtype = np.float32)

In [40]: ncols = z.shape[1]
    ...: r = np.arange(ncols)
    ...: mask = (index[:,0,None] <= r) & (index[:,1,None] >= r)
    ...: z[mask] = 1

In [41]: z
Out[41]: 
array([[0., 1., 1., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 1., 1., 1., 0., 0., 0., 0., 0.],
       [0., 1., 1., 1., 1., 1., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 1., 1., 0., 0., 0.]], dtype=float32)

If z is always a zeros-initialized array, we can get the output directly from mask -

z = mask.astype(int)

Sample run -

In [37]: mask.astype(int)
Out[37]: 
array([[0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 1, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 1, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 1, 0, 0, 0]])
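
Note that mask.astype(int) yields an integer array, whereas z in the question is float32. If the original dtype matters, one option is to convert with the dtype of z instead. A small sketch, with as_int and as_z as hypothetical names:

```python
import numpy as np

index = np.array([[1, 2], [2, 4], [1, 5], [5, 6]])
z = np.zeros(shape=[4, 10], dtype=np.float32)

r = np.arange(z.shape[1])
mask = (index[:, 0, None] <= r) & (index[:, 1, None] >= r)

as_int = mask.astype(int)        # integer result, as in the sample run above
as_z = mask.astype(z.dtype)      # same values, but keeps z's float32 dtype

print(as_int.dtype, as_z.dtype)
```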

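Benchmarking

Comparing @hpaulj's foo0 and my foo4, as listed in @hpaulj's post, on datasets with a variable number of columns. We start with 10 columns, since that is how the input sample was listed, scale the number of rows up (10000 rows for the 10- and 100-column runs below, 1000 rows for the 300- and 1000-column runs), and increase the number of columns up to 1000.

Here are the timings -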

In [14]: ncols = 10
    ...: index = np.random.randint(0,ncols,(10000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [15]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
100 loops, best of 3: 6.27 ms per loop
1000 loops, best of 3: 594 µs per loop

In [16]: ncols = 100
    ...: index = np.random.randint(0,ncols,(10000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [17]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
100 loops, best of 3: 6.49 ms per loop
100 loops, best of 3: 2.74 ms per loop

In [38]: ncols = 300
    ...: index = np.random.randint(0,ncols,(1000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [39]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
1000 loops, best of 3: 657 µs per loop
1000 loops, best of 3: 600 µs per loop

In [40]: ncols = 1000
    ...: index = np.random.randint(0,ncols,(1000,2))
    ...: z = np.zeros(shape = [len(index),ncols], dtype = np.float32)

In [41]: %timeit foo0(z,index)
    ...: %timeit foo4(z,index)
1000 loops, best of 3: 673 µs per loop
1000 loops, best of 3: 1.78 ms per loop

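So the best choice between the loop and the broadcasting-based vectorized approach depends on the number of columns in the problem: in the timings above, broadcasting is clearly faster at 10 and 100 columns, the two are roughly even at 300, and the loop is faster at 1000.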
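
Since the crossover point will vary with hardware and NumPy version, it may be worth timing both approaches on your own shapes. Below is a rough sketch using the standard timeit module. The names loop_fill, mask_fill and compare are hypothetical, and unlike foo0 above the loop here includes the end index so that both functions compute the same result.

```python
import timeit
import numpy as np

def loop_fill(z, index):
    # Row loop; end index included (differs from foo0, which excludes it).
    for i in range(z.shape[0]):
        z[i, index[i, 0]:index[i, 1] + 1] = 1
    return z

def mask_fill(z, index):
    # Broadcasting-based mask, same comparison as foo4.
    r = np.arange(z.shape[1])
    mask = (index[:, 0, None] <= r) & (index[:, 1, None] >= r)
    z[mask] = 1
    return z

def compare(nrows, ncols, number=20, seed=0):
    """Return the mean seconds per call for (loop_fill, mask_fill) on random indices."""
    rng = np.random.default_rng(seed)
    index = rng.integers(0, ncols, size=(nrows, 2))
    z = np.zeros((nrows, ncols), dtype=np.float32)
    t_loop = timeit.timeit(lambda: loop_fill(z, index), number=number) / number
    t_mask = timeit.timeit(lambda: mask_fill(z, index), number=number) / number
    return t_loop, t_mask

for ncols in (10, 100, 300, 1000):
    t_loop, t_mask = compare(1000, ncols)
    print(f"ncols={ncols:5d}  loop={t_loop * 1e6:9.1f} us  mask={t_mask * 1e6:9.1f} us")
```

As in the benchmark above, the random index pairs are not sorted, so some rows have a start greater than their end and are left empty by both functions.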