Does NumPy have a function similar to MATLAB's buffer?

9
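I see there are array_split and split methods, but they are not very convenient when you need to split an array whose length is not an integer multiple of the chunk size. Also, these methods take the number of slices as input rather than the slice size. I need something more like MATLAB's buffer, which is better suited to signal processing.
For example, if I want to buffer a signal into chunks of size 60, I have to do: np.vstack(np.hsplit(x.iloc[0:((len(x)//60)*60)], len(x)//60)), which is cumbersome.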

1
Have you tried np.split? It splits at specified indices, so it should handle irregular intervals. We just need to create those indices with range. - Divakar
Maybe this link helps: https://mail.scipy.org/pipermail/scipy-user/2006-November/009962.html. You can also find the code here: https://mail.scipy.org/pipermail/scipy-user/attachments/20061119/292f81e3/attachment.py - Ed Smith
1
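A quick look at the buffer docs reminded me of numpy.lib.stride_tricks.as_strided, especially its ability to handle overlap and skipping. But it may be too powerful and dangerous for this case. - hpaulj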
1
x.reshape(-1,60) splits x into equal-length rows of size 60. If the length of x is not a multiple of 60, you need to pad or truncate it first. But vstack needs that as well. - hpaulj
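The pad-or-truncate step mentioned in the last comment can be sketched roughly as follows. This is only an illustration: the array `x`, the chunk size of 60 and the variable names are made up for the example, and zero-padding is one choice among several.

```python
import numpy as np

x = np.arange(1, 131)   # illustrative signal: 130 samples, not a multiple of 60
size = 60               # chunk size (assumption for this example)

# Option 1: truncate to the largest multiple of `size`, then reshape
n_full = len(x) // size
truncated = x[:n_full * size].reshape(-1, size)

# Option 2: zero-pad up to the next multiple of `size`, then reshape
pad = -len(x) % size    # number of zeros needed at the end
padded = np.pad(x, (0, pad), mode="constant").reshape(-1, size)

print(truncated.shape)  # two full rows
print(padded.shape)     # three rows, the last one zero-padded
```

Each row here is one chunk; transpose the result if you want MATLAB-style column frames.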
7 Answers

6
I wrote the routine below to handle the use case I needed, but I have not implemented or tested "underlap". Suggestions for improvement are welcome.
def buffer(X, n, p=0, opt=None):
    '''Mimic MATLAB routine to generate buffer array

    MATLAB docs here: https://se.mathworks.com/help/signal/ref/buffer.html

    Parameters
    ----------
    X : ndarray
        Signal array
    n : int
        Frame length (number of samples per column)
    p : int
        Number of values to overlap
    opt : str
        Initial condition options. default sets the first `p` values to zero,
        while 'nodelay' begins filling the buffer immediately.

    Returns
    -------
    result : (n, m) ndarray
        Buffer array created from X, one frame per column
    '''
    import numpy as np

    if opt not in [None, 'nodelay']:
        raise ValueError('{} not implemented'.format(opt))

    i = 0
    first_iter = True
    while i < len(X):
        if first_iter:
            if opt == 'nodelay':
                # No zeros at array start
                result = X[:n]
                i = n
            else:
                # Start with `p` zeros
                result = np.hstack([np.zeros(p), X[:n-p]])
                i = n-p
            # Make 2D array and pivot
            result = np.expand_dims(result, axis=0).T
            first_iter = False
            continue

        # Create next column, add `p` results from last col if given
        col = X[i:i+(n-p)]
        if p != 0:
            col = np.hstack([result[:,-1][-p:], col])
        i += n-p

        # Append zeros if last row and not length `n`
        if len(col) < n:
            col = np.hstack([col, np.zeros(n-len(col))])

        # Combine result with next row
        result = np.hstack([result, np.expand_dims(col, axis=0).T])

    return result
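For comparison, the 'nodelay' case can also be sketched without a Python loop by building an index matrix. This is not the author's code: `buffer_nodelay_sketch` is a hypothetical helper written for this illustration, and it only covers 'nodelay' with a zero-padded final frame.

```python
import numpy as np

def buffer_nodelay_sketch(x, n, p=0):
    """Hypothetical helper: frames of length n with overlap p, as columns."""
    x = np.asarray(x)
    step = n - p                                  # hop between frame starts
    # enough frames so that every sample lands in at least one column
    m = max(1, int(np.ceil((len(x) - p) / step)))
    # zero-pad so the last frame is complete
    padded = np.zeros((m - 1) * step + n, dtype=x.dtype)
    padded[:len(x)] = x
    # idx[i, j] is the sample index of row i in column j
    idx = np.arange(n)[:, None] + step * np.arange(m)[None, :]
    return padded[idx]

data = buffer_nodelay_sketch(np.arange(1, 31), 7, 3)
print(data.shape)
print(data[:, 0], data[:, 1])
```

With the test case from the comments (np.arange(1,31), n=7, p=3), the first column holds 1..7 and the second starts at 5, so consecutive columns share three samples.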

1
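Worked for me with one small tweak. Because I am using Python 3 (my theory), the "cols" variable was being truncated. I think this is due to the change in how Python 3 handles multiplication. In the equation that computes "cols", I cast the denominator to float, and the output then matched MATLAB's exactly. cols = int(np.ceil(len(x)/float((n-p)))) - user2348114
Thanks. I am also on Python 3, but I ended up not using this, so maybe I did not notice that problem. - ryanjdillon
1
You can try the following test case. It gives the wrong output. data = buffer(np.arange(1,31),7,3,'nodelay') - Maxtron
1
I have had a look and fixed the bug you mentioned, @Maxtron. Thanks! - ryanjdillon
1
Just a small note. The algorithm throws an error when p=0 is combined with the nodelay option. Test case: data = buffer(np.arange(1,31),7,0,'nodelay') - Maxtron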

6
import numpy as np

def buffer(X = np.array([]), n = 1, p = 0):
    #buffers data vector X into length n column vectors with overlap p
    #excess data at the end of X is discarded
    n = int(n) #length of each data vector
    p = int(p) #overlap of data vectors, 0 <= p < n
    L = len(X) #length of data to be buffered
    m = int(np.floor((L-n)/(n-p)) + 1) #number of sample vectors (no padding)
    data = np.zeros([n,m]) #initialize data matrix
    #stop at L-n+1 so a final frame starting exactly at L-n is not skipped
    for startIndex,column in zip(range(0,L-n+1,n-p),range(0,m)):
        data[:,column] = X[startIndex:startIndex + n] #fill in by column
    return data
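The same "no padding, discard the tail" framing can be sketched with a strided view instead of a loop. This assumes NumPy 1.20 or newer, where numpy.lib.stride_tricks.sliding_window_view is available, and an input at least n samples long; `buffer_no_padding_sketch` is a hypothetical name used only for this example.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def buffer_no_padding_sketch(x, n, p=0):
    """Hypothetical helper: length-n frames with overlap p, tail discarded."""
    x = np.asarray(x)
    # every length-n window of x, shape (len(x) - n + 1, n), as a read-only view
    windows = sliding_window_view(x, n)
    # keep every (n - p)-th window and put one frame per column
    return windows[::n - p].T

frames = buffer_no_padding_sketch(np.arange(1, 31), 7, 3)
print(frames.shape)
print(frames[:, 0], frames[:, -1])
```

The result is a view into `x`, so copy it (for example with `.copy()`) before writing to it.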

1
This Keras function can be thought of as a Python version of MATLAB's buffer().
See the example code:
import numpy as np
S = np.arange(1,99) #A Demo Array

See the output here

import tensorflow.keras.preprocessing as kp
list(kp.timeseries_dataset_from_array(S, targets = None,sequence_length=7,sequence_stride=7,batch_size=5))

See the buffered array output here

Reference: see here


0

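A rewrite of ryanjdillon's answer for a significant performance gain: it appends columns to a list instead of concatenating arrays, since concatenation copies the whole array on every iteration and is much slower.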

import numpy as np

def buffer(x, n, p=0, opt=None):
    if opt not in ('nodelay', None):
        raise ValueError('{} not implemented'.format(opt))

    i = 0
    if opt == 'nodelay':
        # No zeros at array start
        result = x[:n]
        i = n
    else:
        # Start with `p` zeros
        result = np.hstack([np.zeros(p), x[:n-p]])
        i = n-p
    # Make 2D array, cast to list for .append()
    result = list(np.expand_dims(result, axis=0))

    while i < len(x):
        # Create next column, add `p` results from last col if given
        col = x[i:i+(n-p)]
        if p != 0:
            col = np.hstack([result[-1][-p:], col])

        # Append zeros if this is the last column and it is shorter than `n`
        if len(col) < n:
            col = np.hstack([col, np.zeros(n - len(col))])

        # Combine result with next row
        result.append(np.array(col))
        i += (n - p)

    return np.vstack(result).T

0
def buffer(X, n, p=0):
    '''
    Parameters:
    X: ndarray, Signal array, input a long vector as raw speech wav
    n: int, frame length
    p: int, Number of values to overlap
    -----------
    Returns:
    result : (n,m) ndarray, Buffer array created from X
    '''
    import numpy as np
    d = n - p           # hop size between consecutive frames
    m = len(X)//d       # number of frames
    c = n//d            # number of whole hops that fit in one frame
    if m * d != len(X):
        m = m + 1

    # zero-pad X to a whole number of hops and stack the hops as rows
    Xn = np.zeros(d*m)
    Xn[:len(X)] = X
    Xn = np.reshape(Xn,(m,d))
    Xn_out = Xn
    # append the following hops to each row until c hops are in place
    for i in range(c-1):
        Xne = np.concatenate((Xn,np.zeros((i+1,d))))
        Xn_out = np.concatenate((Xn_out, Xne[i+1:,:]),axis=1)
    # fill the remaining n - d*c samples of each frame from the next hop
    if n-d*c>0:
        Xne = np.concatenate((Xn, np.zeros((c,d))))
        Xn_out = np.concatenate((Xn_out,Xne[c:,:n-d*c]),axis=1)

    return np.transpose(Xn_out)

This is an improved version of Ali Khodabakhsh's example code that works for my case. Feel free to comment on it and use it.


0

Same as the other answers, but faster.

def buffer(X, n, p=0):

    '''
    Parameters
    ----------
    X : ndarray
        Signal array
    n : int
        Frame length
    p : int
        Number of values to overlap (this implementation assumes p <= n - p)

    Returns
    -------
    result : (n,m) ndarray
        Buffer array created from X
    '''
    import numpy as np

    d = n - p
    m = len(X)//d

    if m * d != len(X):
        m = m + 1

    Xn = np.zeros(d*m)
    Xn[:len(X)] = X

    Xn = np.reshape(Xn,(m,d))
    Xne = np.concatenate((Xn,np.zeros((1,d))))
    Xn = np.concatenate((Xn,Xne[1:,0:p]), axis = 1)

    return np.transpose(Xn[:-1])

0

Comparing the execution times of the proposed answers by running

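# assumption: `timer` is timeit.default_timer, and `buffer` is the answer under test
from timeit import default_timer as timer
import numpy as np

x = np.arange(1,200000)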
start = timer()
y = buffer(x,60,20)
end = timer()
print(end-start)

The results are:

Andrzej May: 0.005595300000095449

OverLordGoldDragon: 0.06954789999986133

ryanjdillon: 2.427092700000003
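A single timer() measurement can be noisy, so one option is to repeat the run and keep the best time. The sketch below uses the standard-library timeit module; `buffer_stand_in` is a hypothetical placeholder, not one of the answers, and should be replaced with the buffer() implementation you want to measure.

```python
import timeit
import numpy as np

def buffer_stand_in(x, n, p=0):
    """Hypothetical placeholder; swap in any buffer() from the answers."""
    step = n - p
    starts = range(0, len(x) - n + 1, step)
    return np.stack([x[s:s + n] for s in starts], axis=1)

x = np.arange(1, 200000)

# five independent runs of one call each; the minimum is the least disturbed
runs = timeit.repeat(lambda: buffer_stand_in(x, 60, 20), number=1, repeat=5)
best = min(runs)
print(f"best of {len(runs)} runs: {best:.6f} s")
```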

