Downsampling a 1D numpy array

Question

python numpy scipy signal-processing resampling

26

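I have a 1D numpy array that I want to downsample. If the downsampling raster doesn't fit the data exactly, any of the following is acceptable:

overlap the downsample intervals
convert whatever number of values remains at the end to a separate downsampled value
interpolate to fit the raster

Basically, if I have:

1 2 6 2 1

and I downsample by a factor of 3, all of the following are fine:

3 3

3 1.5

or whatever an interpolation would give me here.

I'm just looking for the fastest/easiest way to do this.

I found scipy.signal.decimate, but that sounds like it decimates the values (takes them out as needed and leaves only one in every X). scipy.signal.resample seems to have the right name, but I don't understand where they are going with the Fourier part of the description. My signal is not particularly periodic.

Could you give me a hand here? This seems like a really simple task, but all these functions are quite intricate...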

- TheChymera

1

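How would you suggest I do it? - TheChymera

I would just use scipy.ndimage.zoom. I'm sure it won't run as fast as @shx2's neighborhood mean, but it's easier to read and to use if the shapes don't line up exactly. - askewchan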
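
For readers who want to try the scipy.ndimage.zoom suggestion from the comment above, here is a minimal sketch. It is not from the thread: the test signal, the target length, and the choice of `order=1` (linear interpolation; zoom defaults to a cubic spline) are assumptions made for illustration. Note that zoom interpolates between samples; it does not average neighborhoods.

```python
import numpy as np
from scipy.ndimage import zoom

# Hypothetical test signal and target length, chosen only for illustration.
signal = np.sin(np.arange(256) / 10.0)
target_len = 100

# zoom takes a scale factor rather than a target length; the output
# length is the input length times the factor, rounded.
factor = target_len / len(signal)

# order=1 requests linear interpolation (an assumption; the default is order=3).
resampled = zoom(signal, factor, order=1)

print(len(resampled))
```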

3 Answers

3

Here are a few approaches using either linear interpolation or the Fourier method. They support both upsampling and downsampling.

import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import resample
from scipy.interpolate import interp1d

def ResampleLinear1D(original, targetLen):
    # np.float and np.int were removed from NumPy; use the builtin types
    original = np.array(original, dtype=float)
    index_arr = np.linspace(0, len(original)-1, num=targetLen, dtype=float)
    index_floor = np.array(index_arr, dtype=int) # Round down
    index_ceil = index_floor + 1
    index_rem = index_arr - index_floor # Fractional remainder

    val1 = original[index_floor]
    # The modulo only wraps the final sample, whose remainder (weight) is 0
    val2 = original[index_ceil % len(original)]
    interp = val1 * (1.0-index_rem) + val2 * index_rem
    assert(len(interp) == targetLen)
    return interp

if __name__=="__main__":

    original = np.sin(np.arange(256)/10.0)
    targetLen = 100

    # Method 1: Use scipy interp1d (linear interpolation)
    # This is the simplest conceptually as it just uses linear interpolation. Scipy
    # also offers a range of other interpolation methods.
    f = interp1d(np.arange(256), original, 'linear')
    plt.plot(f(np.linspace(0, 255, num=targetLen)))

    # Method 2: Use numpy to do linear interpolation
    # If you don't have scipy, you can do it in numpy with the above function
    plt.plot(ResampleLinear1D(original, targetLen))

    # Method 3: Use scipy's resample
    # Converts the signal to frequency space (Fourier method), then back. This
    # works efficiently on periodic functions but poorly on non-periodic functions.
    plt.plot(resample(original, targetLen))

    plt.show()
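
As an aside that is not part of the original answer, NumPy's own np.interp covers the same linear-interpolation case as Method 2 in a few lines. The function name `resample_linear` is hypothetical. Running it on the question's data shows that plain interpolation samples point values instead of averaging blocks, so peaks can be kept or skipped depending on where the new sample positions land.

```python
import numpy as np

def resample_linear(values, target_len):
    """Linearly interpolate `values` onto `target_len` evenly spaced points."""
    values = np.asarray(values, dtype=float)
    # Positions of the original samples and of the requested samples,
    # both spanning the first to the last original index.
    old_positions = np.arange(len(values))
    new_positions = np.linspace(0, len(values) - 1, num=target_len)
    return np.interp(new_positions, old_positions, values)

# With 3 output points the sample positions are 0, 2 and 4,
# so the result is the values 1, 6 and 1.
print(resample_linear([1, 2, 6, 2, 1], 3))
```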

- TimSC

Perhaps your answer would get a better response on this question: https://stackoverflow.com/questions/50301330/downsampling-signal-from-100-21-hz-to-8-hz-non-integer-decimation-factor. It is still missing an answer. - Hephaestus

0

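If the array size is not divisible by the downsampling factor, you can use np.linspace to compute split points, split the array at those points, and take the mean of each subarray. In the code below, R is the number of splits, that is, the number of output values.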

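import numpy as np

input_arr = np.arange(531)

R = 150  # number of splits

# R+1 evenly spaced integer boundaries from 0 to len(input_arr)
split_arr = np.linspace(0, len(input_arr), num=R+1, dtype=int)

# Split at the interior boundaries only, which yields exactly R subarrays
dwnsmpl_subarr = np.split(input_arr, split_arr[1:-1])

dwnsmpl_arr = np.array([np.mean(item) for item in dwnsmpl_subarr])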
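
A related shortcut, offered as a sketch and not as the answerer's method: np.array_split divides an array into a requested number of nearly equal chunks without any manually computed boundaries. Its chunk boundaries differ slightly from the np.linspace ones (the larger chunks come first), and the function name `downsample_to_n_bins` is hypothetical.

```python
import numpy as np

def downsample_to_n_bins(values, n_bins):
    """Average `values` over `n_bins` nearly equal consecutive chunks."""
    values = np.asarray(values, dtype=float)
    # array_split tolerates lengths that are not divisible by n_bins.
    # n_bins should not exceed len(values), or some chunks will be empty.
    chunks = np.array_split(values, n_bins)
    return np.array([chunk.mean() for chunk in chunks])

# Chunks are [1, 2, 6] and [2, 1], so the means are 3.0 and 1.5.
print(downsample_to_n_bins([1, 2, 6, 2, 1], 2))
```

On the question's example this gives the "3 1.5" variant, where the leftover values at the end form their own downsampled value.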

- Manoj Singh

10

Answers are generally more helpful if they explain what the code is intended to do and why it solves the problem without introducing others. - Tom Aranda

- shx2 · Accepted Answer

If your array's size is divisible by the downsampling factor (R), you can reshape the array and take the mean along the new axis.

import numpy as np
a = np.array([1., 2, 6, 2, 1, 7])
R = 3
a.reshape(-1, R)
# => array([[ 1.,  2.,  6.],
#           [ 2.,  1.,  7.]])

a.reshape(-1, R).mean(axis=1)
# => array([ 3.        ,  3.33333333])

In the general case, you can pad the array with NaNs up to a size that is a multiple of R and take the mean with np.nanmean, which ignores the padding.

import math
b = np.append(a, [4])
b.shape
# => (7,)
pad_size = math.ceil(b.size / R) * R - b.size
b_padded = np.append(b, np.full(pad_size, np.nan))
b_padded.shape
# => (9,)
np.nanmean(b_padded.reshape(-1, R), axis=1)
# => array([ 3.        ,  3.33333333,  4.])
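
The padding steps above can be wrapped in one reusable function. This is a minimal sketch under stated assumptions: the name `downsample_mean` is hypothetical, and it uses np.nanmean from NumPy. The pad size is computed with a negative modulo, which is an equivalent shortcut for the ceil arithmetic.

```python
import numpy as np

def downsample_mean(values, factor):
    """Average consecutive blocks of `factor` samples; the last block may be shorter."""
    values = np.asarray(values, dtype=float)
    # Number of NaNs needed to reach the next multiple of `factor`
    # (0 when the length is already divisible).
    pad_size = -len(values) % factor
    padded = np.concatenate([values, np.full(pad_size, np.nan)])
    # nanmean skips the NaN padding, so a short final block is averaged
    # over its real values only.
    return np.nanmean(padded.reshape(-1, factor), axis=1)

# Blocks are [1, 2, 6] and [2, 1, nan], so the means are 3.0 and 1.5.
print(downsample_mean([1, 2, 6, 2, 1], 3))
```

Applied to the question's data with a factor of 3, this reproduces the "3 1.5" result.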