Dynamically normalise a 2D numpy array

Question

3

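I have a 2D numpy array "signals" of shape (100000, 1024). Each row contains the amplitude trace of one signal, and I want to normalise it to lie within 0-1.

The signals have different amplitudes, so I can't just divide by one common factor. Is there a way to normalise each signal individually so that every value within it lies between 0 and 1?

Say the signals look like [[0,1,2,3,5,8,2,1],[0,2,5,10,7,4,2,1]] and I want them to become [[0,0.125,0.25,0.375,0.625,1,0.25,0.125],[0,0.2,0.5,1,0.7,0.4,0.2,0.1]].

Is there a way to do this without looping over all 100,000 signals, since that would surely be slow?

Thanks!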

- Beth Long

Python's scikit-learn library has a normalize function that you could try. - MUK

3 Answers

4

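Adding some benchmarks to show how significant the performance difference between the two solutions (a per-row list comprehension and a vectorised division) is: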

import numpy as np
import timeit

arr = np.arange(1024).reshape(128,8)

def using_list_comp():
    return np.array([s/np.max(s) for s in arr])

def using_vectorized_max_div():
    return arr/arr.max(axis=1)[:, np.newaxis]

result1 = using_list_comp()
result2 = using_vectorized_max_div()

print("Results equal:", (result1==result2).all())

time1 = timeit.timeit('using_list_comp()', globals=globals(), number=1000)
time2 = timeit.timeit('using_vectorized_max_div()', globals=globals(), number=1000)

print(time1)
print(time2)
print(time1/time2)

On my machine, the output is:

Results equal: True
0.9873569
0.010177099999999939
97.01750989967731

That's nearly a 100x difference!

- Adam.Er8

1

This is exactly what I expected to happen! Thanks for your comment! - Beth Long

3

Another solution is to use normalize from scikit-learn:

from sklearn.preprocessing import normalize
data = [[0,1,2,3,5,8,2,1],[0,2,5,10,7,4,2,1]]
normalize(data, axis=1, norm='max')

Result:

array([[0.   , 0.125, 0.25 , 0.375, 0.625, 1.   , 0.25 , 0.125],
       [0.   , 0.2  , 0.5  , 1.   , 0.7  , 0.4  , 0.2  , 0.1  ]])

Note the norm='max' parameter. The default value is 'l2'.

- ipj

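This is really useful, but when I tested it with the script Adam.Er8 posted above, it seems to be about 6x slower than the vectorised division approach. Thanks for your comment though! - Beth Long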

2

I have deleted my previous answer, which used a list comprehension as a loop-based solution. The vectorised way is indeed the fastest. - ipj

- Roland Deschain · Accepted Answer

4

It is simple to build a new numpy array holding the maximum along an axis and then divide by it:

import numpy as np

a = np.array([[0,1,2,3,5,8,2,1],[0,2,5,10,7,4,2,1]])

b = np.max(a, axis = 1)

print(a / b[:,np.newaxis])

Output:

[[0.    0.125 0.25  0.375 0.625 1.    0.25  0.125]
 [0.    0.2   0.5   1.    0.7   0.4   0.2   0.1  ]]
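The same division can also be written with keepdims=True, which keeps the reduced axis as a length-1 dimension so that no explicit np.newaxis is needed. The following is only a minimal sketch of that variant, reusing the example array from above; the names row_max and normalised are chosen here for illustration.

```python
import numpy as np

a = np.array([[0, 1, 2, 3, 5, 8, 2, 1], [0, 2, 5, 10, 7, 4, 2, 1]])

# keepdims=True returns the row maxima with shape (2, 1) instead of (2,),
# so they broadcast against the (2, 8) array without np.newaxis.
row_max = a.max(axis=1, keepdims=True)
normalised = a / row_max

print(normalised)
```

Note that dividing by the row maximum only maps values into 0-1 when the signals are non-negative, as they are in this example.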

- Roland Deschain

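This is great. The only problem (I should have said this earlier!) is that some of the "signals" contain no signal at all, so they are arrays made up of zeros. Is there a clever way to avoid trying to divide by 0? - Beth Long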

1

Nice answer. The OP may find some relevant information in this related post: https://dev59.com/TmIk5IYBdhLWcg3wFKlK. Best regards. - smile

1

@BethLong You can use numpy.nan_to_num() on the resulting array. This turns the nan values you get from the division by zero into zeros. - Roland Deschain
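As a rough illustration of the nan_to_num() suggestion, here is a small sketch with one all-zero row. The array contents and variable names are made up for the example; the float dtype is an assumption so that nan can be represented.

```python
import numpy as np

# Second row is an "empty" signal made up only of zeros.
signals = np.array([[0, 1, 2, 3, 5, 8, 2, 1],
                    [0, 0, 0, 0, 0, 0, 0, 0]], dtype=float)

row_max = signals.max(axis=1)[:, np.newaxis]

# 0 / 0 yields nan for the all-zero row (numpy also emits a RuntimeWarning).
with_nans = signals / row_max

# nan_to_num replaces nan with 0.0 (and would replace +/-inf with large finite numbers).
normalised = np.nan_to_num(with_nans)

print(normalised)
```

The warning is still raised with this approach; only the resulting nan values are cleaned up afterwards.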

1

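Alternatively, you can look at the documentation at https://docs.scipy.org/doc/numpy-1.14.0/reference/generated/numpy.divide.html, which explains how division by zero is handled. In particular, see the use of seterr in the last part of the linked page. Regards. - smile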
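Two ways of following up on the numpy.divide documentation are sketched below, as an illustration rather than a recommendation from the thread: np.errstate (the context-manager form of np.seterr) to silence the warning locally, and the out and where parameters of np.divide to skip the division for all-zero rows. The example data and variable names are assumptions made for this sketch.

```python
import numpy as np

signals = np.array([[0, 2, 5, 10, 7, 4, 2, 1],
                    [0, 0, 0, 0, 0, 0, 0, 0]], dtype=float)
row_max = signals.max(axis=1, keepdims=True)

# Option 1: np.errstate silences the 0/0 ("invalid") warning only inside
# the block; the resulting nans are then replaced by zeros.
with np.errstate(invalid="ignore"):
    silenced = np.nan_to_num(signals / row_max)

# Option 2: skip the division wherever the row maximum is zero.
# Positions excluded by `where` keep the value already stored in `out`.
masked = np.divide(signals, row_max,
                   out=np.zeros_like(signals), where=row_max != 0)

print(np.array_equal(silenced, masked))
```

The second option never performs the division by zero, so no warning is produced and no clean-up step is needed, but it requires a float output array to be allocated up front.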

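Fantastic, thank you both very much. I'll go with the nan_to_num option because it is so straightforward, but I'll also take a look at the other link. Much appreciated! - Beth Long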