Given the following NumPy array, I want to extract every group of N consecutive elements:
a = numpy.array([1,2,3,4,5,6,7,8])
I would like to get (N = 5):
array([[1,2,3,4,5],
[2,3,4,5,6],
[3,4,5,6,7],
[4,5,6,7,8]])
I want to build such an array so that I can run further functions on it, such as mean and sum. How can I create it?
One approach, using broadcasting -
import numpy as np
out = a[np.arange(a.size - N + 1)[:,None] + np.arange(N)]
Sample run -
In [31]: a
Out[31]: array([4, 2, 5, 4, 1, 6, 7, 3])
In [32]: N
Out[32]: 5
In [33]: out
Out[33]:
array([[4, 2, 5, 4, 1],
[2, 5, 4, 1, 6],
[5, 4, 1, 6, 7],
[4, 1, 6, 7, 3]])
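To make the one-liner above easier to follow, here is a minimal sketch that splits it into steps. The intermediate names starts, offsets and idx are illustrative and not part of the original answer.

```python
import numpy as np

a = np.array([4, 2, 5, 4, 1, 6, 7, 3])
N = 5

# Column vector holding the start index of each window, shape (4, 1)
starts = np.arange(a.size - N + 1)[:, None]
# Row vector holding the offset inside a window, shape (5,)
offsets = np.arange(N)
# Broadcasting the sum gives a (4, 5) matrix of positions into a
idx = starts + offsets

# Integer-array indexing copies the selected elements into a new array
out = a[idx]
print(idx)
print(out)
```

Each row of idx is one window's positions, so a[idx] gathers the windows in a single indexing step.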
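Use rolling_window to implement the sliding-window computation.
import numpy as np
def rolling_window(a, window):
    # The last axis shrinks to the number of windows; a new axis of length `window` is appended
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    # The new axis steps through memory with the same stride as the last axis
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)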
In [37]: a = np.array([1,2,3,4,5,6,7,8])
In [38]: rolling_window(a, 5)
Out[38]:
array([[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[3, 4, 5, 6, 7],
[4, 5, 6, 7, 8]])
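Since the question asks about running mean and sum on the windows, here is a short sketch of that step. It uses sliding_window_view, which is a swapped-in alternative to the hand-written rolling_window above and assumes NumPy 1.20 or later; the variable names are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # assumes NumPy >= 1.20

a = np.array([1, 2, 3, 4, 5, 6, 7, 8])

# Read-only view of shape (4, 5); no window data is copied
windows = sliding_window_view(a, 5)

# Reduce along axis 1 to get one value per window
means = windows.mean(axis=1)
sums = windows.sum(axis=1)
print(means)
print(sums)
```

Both approaches produce views that overlap in memory, so it is safest to treat the result as read-only and copy it before modifying it in place.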
I like @Divakar's solution. However, for larger arrays and windows, you may want to use rolling_window?
In [55]: a = np.arange(1000)
In [56]: %timeit rolling_window(a, 5)
100000 loops, best of 3: 9.02 µs per loop
In [57]: %timeit broadcast_f(a, 5)
10000 loops, best of 3: 87.7 µs per loop
In [58]: %timeit rolling_window(a, 100)
100000 loops, best of 3: 8.93 µs per loop
In [59]: %timeit broadcast_f(a, 100)
1000 loops, best of 3: 1.04 ms per loop
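The timings above call broadcast_f, which is not defined anywhere in the thread. The sketch below assumes it simply wraps the broadcasting one-liner, and checks that both functions return the same windows. It also suggests why the rolling_window timing barely changes with the window size: as_strided returns a view onto the original buffer, while integer-array indexing copies every element.

```python
import numpy as np

def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

def broadcast_f(a, N):
    # Hypothetical definition: assumed to wrap the broadcasting approach
    return a[np.arange(a.size - N + 1)[:, None] + np.arange(N)]

a = np.arange(1000)
strided = rolling_window(a, 5)
indexed = broadcast_f(a, 5)

same = np.array_equal(strided, indexed)        # both hold the same windows
view_shares = np.shares_memory(a, strided)     # the strided result is a view
copy_shares = np.shares_memory(a, indexed)     # the indexed result is a copy
print(same, view_shares, copy_shares)
```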
The rolling_window method is more than twice as fast as broadcasting. I hope @Divakar doesn't mind that I accepted your answer. Thank you both! - He Shiming