Applying the softmax function row-wise to a numpy array

Question

5

I am trying to apply the softmax function to a numpy array, but I am not getting the desired results. This is the code I have tried:

 import numpy as np
 x = np.array([[1001,1002],[3,4]])
 softmax = np.exp(x - np.max(x)) / np.sum(np.exp(x - np.max(x)))
 print(softmax)

I think the x - np.max(x) code is not subtracting the max of each row. The max needs to be subtracted from x to prevent very large numbers.

The expected output is:

 np.array([
    [0.26894142, 0.73105858],
    [0.26894142, 0.73105858]])

But I am getting:

np.array([
    [0.26894142, 0.73105858],
    [0, 0]])
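
To see where the zeros come from, here is a small sketch (not part of the original question) that reproduces each intermediate value. With no axis argument, np.max(x) returns a single scalar, so the second row is shifted to roughly -1000 and np.exp underflows to 0 there.

```python
import numpy as np

x = np.array([[1001, 1002], [3, 4]])

# Without an axis argument, np.max reduces over the whole array.
global_max = np.max(x)            # a single scalar, 1002
shifted = x - global_max          # second row ends up near -1000

# exp(-1001) and exp(-998) are too small for float64 and underflow to 0.0.
exps = np.exp(shifted)

# Dividing by the sum over *all* elements normalizes the whole array,
# not each row, so only the first row keeps any probability mass.
result = exps / np.sum(exps)
print(result)
```

The fix discussed in the answers is to compute both the max and the sum per row and keep them in a shape that broadcasts against x.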

- Pranay Aryal

5 Answers

4

My 5-liner (which uses scipy logsumexp for the tricky bits):

import numpy as np

def softmax(a, axis=None):
    """
    Computes exp(a)/sumexp(a); relies on scipy logsumexp implementation.
    :param a: ndarray/tensor
    :param axis: axis to sum over; default (None) sums over everything
    """
    from scipy.special import logsumexp
    lse = logsumexp(a, axis=axis)  # this reduces along axis
    if axis is not None:
        lse = np.expand_dims(lse, axis)  # restore that axis for subtraction
    return np.exp(a - lse)

If you are using an older version of scipy, you may need to use from scipy.misc import logsumexp instead.
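
For readers wondering what the tricky bits are, here is a rough NumPy-only sketch of the idea behind a log-sum-exp. The names logsumexp_np and softmax_lse are hypothetical and chosen for illustration; this is not scipy's implementation, and it ignores edge cases such as infinite inputs.

```python
import numpy as np

def logsumexp_np(a, axis=None):
    """Illustrative log(sum(exp(a))) along `axis`, keeping the reduced axis."""
    a = np.asarray(a, dtype=float)
    # Factor out the maximum so the largest exponent is exp(0) = 1.
    a_max = np.max(a, axis=axis, keepdims=True)
    return a_max + np.log(np.sum(np.exp(a - a_max), axis=axis, keepdims=True))

def softmax_lse(a, axis=None):
    """softmax(a) = exp(a - logsumexp(a)), computed without overflow."""
    a = np.asarray(a, dtype=float)
    return np.exp(a - logsumexp_np(a, axis=axis))

x = np.array([[1001, 1002], [3, 4]])
rows = softmax_lse(x, axis=1)   # normalize each row separately
print(rows)
```

Because the sketch uses keepdims=True, the reduced axis stays in place and no np.expand_dims step is needed before the subtraction.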

- Yibo Yang

1

Simply beautiful. - Soren

2

Edit: As of version 1.2.0, scipy includes softmax as a special function:

https://scipy.github.io/devdocs/generated/scipy.special.softmax.html
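
A minimal usage sketch for the question's array, assuming scipy 1.2.0 or later is installed:

```python
import numpy as np
from scipy.special import softmax  # available in scipy 1.2.0 and later

x = np.array([[1001.0, 1002.0], [3.0, 4.0]])

# axis=1 normalizes each row independently; the default (axis=None)
# would normalize over the whole array instead.
rows = softmax(x, axis=1)
print(rows)
```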

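I wrote a very general softmax function that operates over an arbitrary axis, including the tricky max-subtraction bit. The function is below, and I also wrote a blog post about it.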

def softmax(X, theta = 1.0, axis = None):
    """
    Compute the softmax of each element along an axis of X.

    Parameters
    ----------
    X: ND-Array. Probably should be floats. 
    theta (optional): float parameter, used as a multiplier
        prior to exponentiation. Default = 1.0
    axis (optional): axis to compute values along. Default is the 
        first non-singleton axis.

    Returns an array the same size as X. The result will sum to 1
    along the specified axis.
    """

    # make X at least 2d
    y = np.atleast_2d(X)

    # find axis
    if axis is None:
        axis = next(j[0] for j in enumerate(y.shape) if j[1] > 1)

    # multiply y against the theta parameter, 
    y = y * float(theta)

    # subtract the max for numerical stability
    y = y - np.expand_dims(np.max(y, axis = axis), axis)

    # exponentiate y
    y = np.exp(y)

    # take the sum along the specified axis
    ax_sum = np.expand_dims(np.sum(y, axis = axis), axis)

    # finally: divide elementwise
    p = y / ax_sum

    # flatten if X was 1D
    if len(X.shape) == 1: p = p.flatten()

    return p
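
As a rough illustration of what the theta multiplier does, the sketch below uses a condensed, hypothetical helper called softmax_theta. It is not the function above, only the same steps written with keepdims, and it assumes the last axis is the one to normalize.

```python
import numpy as np

def softmax_theta(X, theta=1.0, axis=-1):
    """Condensed illustration: scale by theta, shift by the max, normalize."""
    y = np.asarray(X, dtype=float) * float(theta)
    y = y - np.max(y, axis=axis, keepdims=True)   # numerical stability
    y = np.exp(y)
    return y / np.sum(y, axis=axis, keepdims=True)

scores = np.array([[1.0, 2.0, 3.0],
                   [1.0, 1.0, 1.0]])

soft = softmax_theta(scores, theta=0.5)    # smaller theta: flatter rows
sharp = softmax_theta(scores, theta=5.0)   # larger theta: mass moves to the max
print(soft)
print(sharp)
```

A row of equal scores stays uniform for any theta, while a row with a clear maximum becomes more peaked as theta grows.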

- Nolan Conaway

1

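The x - np.max(x) code is not doing row-wise subtraction. Let's go through it step by step. First, we will make a "maxes" array by tiling, that is, making copies of, the column: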

maxes = np.tile(np.max(x,1), (2,1)).T

This creates a 2x2 matrix in which each row is filled with that row's maximum, built by tiling duplicate columns. After this you can do:

 x = np.exp(x - maxes) / np.sum(np.exp(x - maxes), axis=1, keepdims=True)

You should get the result with this. The axis = 1 is for the row-wise softmax you mentioned in the title of your question. Hope this helps.
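
The (2,1) passed to np.tile hardcodes the shape of this particular example. A sketch of the same tiling idea for an array with any number of rows and columns might look like the following; the variable names are illustrative, and the per-row sums are tiled as well so that the final division is purely elementwise.

```python
import numpy as np

x = np.array([[1001, 1002, 1003],
              [2, 3, 4]])
n_rows, n_cols = x.shape

# Repeat the per-row maxima once per column, then transpose so that
# row i of `maxes` holds the maximum of row i of `x` in every position.
maxes = np.tile(np.max(x, axis=1), (n_cols, 1)).T    # shape (n_rows, n_cols)

exps = np.exp(x - maxes)

# Tile the per-row sums the same way so the division is elementwise.
sums = np.tile(np.sum(exps, axis=1), (n_cols, 1)).T  # shape (n_rows, n_cols)

result = exps / sums
print(result)
```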

- Pranay Aryal

1

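How about this? To take the max along the rows, just specify the argument axis=1, and then convert the result to a column vector (actually a 2D array) using np.newaxis/None.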

In [40]: x
Out[40]: 
array([[1001, 1002],
       [   3,    4]])

In [41]: z = x - np.max(x, axis=1)[:, np.newaxis]

In [42]: z
Out[42]: 
array([[-1,  0],
       [-1,  0]])

In [44]: softmax = np.exp(z) / np.sum(np.exp(z), axis=1)[:, np.newaxis]

In [45]: softmax
Out[45]: 
array([[ 0.26894142,  0.73105858],
       [ 0.26894142,  0.73105858]])

In the last step, when you take the sum again, just specify axis=1 to sum along the rows.

- kmario23

1

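You have to do the [:, np.newaxis] on the softmax line (line 44) as well. With the given example you may get the right result, but that is essentially a coincidence. (It works because the two row sums happen to have the same value, so it does not matter which way they are broadcast.) Try x = [[1001, 1002], [1, 4]] to get a wrong result, or x = [[1001, 1002, 1003], [2, 3, 4]] to get an outright error. - Paul Panzer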

@PaulPanzer Thanks a lot! What is the best way to notice such bugs? This is too subtle for my understanding of NumPy. - kmario23

1

Don't use square arrays in your toy examples ;-] Seriously, that catches at least half of my errors. - Paul Panzer
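
The point made in these comments can be checked directly. The sketch below uses a deliberately broken, hypothetical helper called softmax_buggy that leaves out the final [:, np.newaxis], and runs it on the two arrays suggested above.

```python
import numpy as np

def softmax_buggy(x):
    """Row-wise softmax with the final [:, np.newaxis] left out (wrong on purpose)."""
    z = x - np.max(x, axis=1)[:, np.newaxis]
    return np.exp(z) / np.sum(np.exp(z), axis=1)   # sums have shape (n_rows,)

# Square array: the shapes happen to broadcast, so the bug is silent.
square = np.array([[1001, 1002], [1, 4]])
wrong = softmax_buggy(square)
print(wrong.sum(axis=1))    # rows no longer sum to 1

# Non-square array: shapes (2, 3) and (2,) cannot broadcast.
try:
    softmax_buggy(np.array([[1001, 1002, 1003], [2, 3, 4]]))
    raised = False
except ValueError:
    raised = True
print(raised)
```

On the square input the per-row sums are divided into the columns instead of the rows, which is why the result looks plausible but is wrong.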

- Paul Panzer · Accepted Answer

A convenient way to keep the axes consumed by "reduce" operations such as max or sum is the keepdims keyword:

mx = np.max(x, axis=-1, keepdims=True)
mx
# array([[1002],
#        [   4]])
x - mx
# array([[-1,  0],
#        [-1,  0]])
numerator = np.exp(x - mx)
denominator = np.sum(numerator, axis=-1, keepdims=True)
denominator
# array([[ 1.36787944],
#        [ 1.36787944]])
numerator/denominator
# array([[ 0.26894142,  0.73105858],
#        [ 0.26894142,  0.73105858]])
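
The same steps can be wrapped in a small function and tried on a non-square array. This is a minimal sketch; softmax_rows is a hypothetical helper name, and it assumes the last axis is the one to normalize.

```python
import numpy as np

def softmax_rows(x):
    """Softmax along the last axis using keepdims (hypothetical helper name)."""
    x = np.asarray(x, dtype=float)
    mx = np.max(x, axis=-1, keepdims=True)                   # shape (..., 1)
    numerator = np.exp(x - mx)
    denominator = np.sum(numerator, axis=-1, keepdims=True)  # shape (..., 1)
    return numerator / denominator

# A deliberately non-square input, so a broadcasting slip would not go unnoticed.
x = np.array([[1001, 1002, 1003],
              [2, 3, 4]])
p = softmax_rows(x)
print(p)
```

Since both reductions keep their axis, mx and denominator broadcast against x row by row without any reshaping.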