Why does numpy.dot behave in this way?

Question


python numpy linear-algebra matrix-multiplication array-broadcasting

5

I'm trying to understand why numpy's dot function behaves the way it does:

import numpy as np

M = np.ones((9, 9))
V1 = np.ones((9,))
V2 = np.ones((9, 5))
V3 = np.ones((2, 9, 5))
V4 = np.ones((3, 2, 9, 5))

Now np.dot(M, V1) and np.dot(M, V2) behave as expected. But for V3 and V4 the result surprises me:

>>> np.dot(M, V3).shape
(9, 2, 5)
>>> np.dot(M, V4).shape
(9, 3, 2, 5)

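I would have expected (2, 9, 5) and (3, 2, 9, 5), respectively. np.matmul, on the other hand, does what I expect: the matrix multiplication is broadcast over the first N - 2 dimensions of the second argument, and the result has the same shape.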

>>> np.matmul(M, V3).shape
(2, 9, 5)
>>> np.matmul(M, V4).shape
(3, 2, 9, 5)

So my question is: why does np.dot behave as it does? Does it serve some particular purpose, or is it the result of applying some general rule?

- fadriaensen

1

This is actually explained in numpy's documentation for dot and matmul. - MaxNoe

4 Answers

4

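From the documentation of numpy.matmul:

matmul differs from dot in two important ways.

Multiplication by scalars is not allowed.

Stacks of matrices are broadcast together as if the matrices were elements.

In short, this is the standard matrix-matrix multiplication you would expect.

numpy.dot, on the other hand, is equivalent to matrix multiplication only for 2-D arrays. For larger dimensions, ...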

it is a sum product over the last axis of a and the second-to-last of b:
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])

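[source: documentation of numpy.dot]

This resembles the inner (dot) product. For vectors, numpy.dot returns the dot product. Higher-dimensional arrays are treated as collections of vectors, and the dot products between them are returned.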
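
As a rough check of the formula quoted above, the following sketch rebuilds np.dot(a, b) element by element for two small random arrays. The array shapes and the names a, b, result and expected are chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.random((2, 3, 4))   # last axis has length 4
b = rng.random((5, 4, 6))   # second-to-last axis has length 4

result = np.dot(a, b)

# Rebuild the result directly from the documented formula:
# dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])
expected = np.empty((2, 3, 5, 6))
for i in range(2):
    for j in range(3):
        for k in range(5):
            for m in range(6):
                expected[i, j, k, m] = np.sum(a[i, j, :] * b[k, :, m])

print(result.shape)                   # (2, 3, 5, 6)
print(np.allclose(result, expected))  # True
```

Every row vector a[i, j, :] is paired with every column vector b[k, :, m], which is why the free axes of a come first in the result, followed by the free axes of b.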

- Alexander Vogt

3

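Why:

dot and matmul are both generalizations of 2D*2D matrix multiplication. But there are many possible choices, depending on mathematical properties, broadcasting rules, and so on.

The choices made for dot and matmul are very different:

For dot, some dimensions (green here) are dedicated to the first array, and others (blue) to the second.

matmul requires the stacks to be aligned according to the broadcasting rules.

Numpy was born in an image-analysis context, and dot can handle some tasks easily with out=dot(image(s),transformation(s)) (see the dot documentation in early editions of the numpy book, p92).

As an illustration: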

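from pylab import *

# Assumes 'stackoverflow.png' is an RGB image, i.e. an array of shape (H, W, 3)
image = imread('stackoverflow.png')

identity = eye(3)
NB = ones((3, 3)) / 3                     # average the three channels
swap_rg = identity[[1, 0, 2]]             # swap the red and green channels
randoms = [rand(3, 3) for _ in range(6)]

# Nine 3x3 colour transformations; dot converts the list to a (9, 3, 3) array
transformations = [identity, NB, swap_rg] + randoms
out = dot(image, transformations)         # shape (H, W, 9, 3)

for k in range(9):
    subplot(3, 3, k + 1)
    imshow(out[..., k, :])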

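The modern matmul can do the same things as the old dot, but the stacking of the matrices has to be taken into account (here, matmul(image,transformations[:,None]), with transformations converted to an array first).

No doubt it is more convenient in other situations.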
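
The relationship between the two calls can be checked without an image file. This is a minimal sketch that assumes a small random array as a stand-in for the image and uses only three transformations; the names out_dot and out_matmul are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((4, 6, 3))   # stand-in for an (H, W, RGB) image

transformations = np.array([
    np.eye(3),                  # identity
    np.ones((3, 3)) / 3,        # average the three channels
    np.eye(3)[[1, 0, 2]],       # swap red and green
])                              # shape (3, 3, 3): a stack of three 3x3 matrices

# dot: axes of image first, then the stack axis of transformations
out_dot = np.dot(image, transformations)                 # (4, 6, 3, 3)

# matmul: stack axes are broadcast and come first
out_matmul = np.matmul(image, transformations[:, None])  # (3, 4, 6, 3)

print(out_dot.shape, out_matmul.shape)
# Same numbers, different axis order
print(np.allclose(np.moveaxis(out_dot, 2, 0), out_matmul))  # True
```

The extra axis added by [:, None] lets the transformation stack broadcast against the image rows instead of having to match them.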

- B. M.

1

The einsum equivalents of the matmul calls are:

In [92]: np.einsum('ij,kjm->kim',M,V3).shape
Out[92]: (2, 9, 5)
In [93]: np.einsum('ij,lkjm->lkim',M,V4).shape
Out[93]: (3, 2, 9, 5)

Expressed this way, the dot equivalent, 'ij,lkjm->ilkm', looks just as natural as the 'matmul' equivalents.
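
To see the two subscript strings side by side, here is a short sketch that compares each einsum call with the function it mirrors. Random inputs are used so that the comparison is meaningful; the names dot_like and matmul_like are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((9, 9))
V4 = rng.random((3, 2, 9, 5))

# dot: the free axis of M (i) comes first, then the free axes of V4
dot_like = np.einsum('ij,lkjm->ilkm', M, V4)

# matmul: the stack axes (l, k) lead, and i sits next to m
matmul_like = np.einsum('ij,lkjm->lkim', M, V4)

print(dot_like.shape, np.allclose(dot_like, np.dot(M, V4)))           # (9, 3, 2, 5) True
print(matmul_like.shape, np.allclose(matmul_like, np.matmul(M, V4)))  # (3, 2, 9, 5) True
```

Both strings contract the same index j; only the order of the output indices differs.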

- hpaulj


- ali_m · Accepted Answer

From the documentation for np.dot:

For 2-D arrays it is equivalent to matrix multiplication, and for 1-D arrays to inner product of vectors (without complex conjugation). For N dimensions it is a sum product over the last axis of a and the second-to-last of b:
dot(a, b)[i,j,k,m] = sum(a[i,j,:] * b[k,:,m])

For np.dot(M, V3),

(9, [9]), (2, [9], 5) --> (9, 2, 5)

For np.dot(M, V4),

(9, [9]), (3, 2, [9], 5) --> (9, 3, 2, 5)

where square brackets mark the dimensions that are summed over and are therefore absent from the result.
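
The shape rule can be written down as plain tuple arithmetic. The helper dot_result_shape below is hypothetical (it is not part of numpy) and is only meant as a sketch for the case where b has at least two dimensions.

```python
import numpy as np

def dot_result_shape(a_shape, b_shape):
    """Hypothetical helper: shape of np.dot(a, b) for a.ndim >= 1, b.ndim >= 2."""
    if a_shape[-1] != b_shape[-2]:
        raise ValueError("last axis of a must match second-to-last axis of b")
    # Drop the summed axes, then concatenate what is left of a and of b
    return tuple(a_shape[:-1]) + tuple(b_shape[:-2]) + tuple(b_shape[-1:])

M = np.ones((9, 9))
for shape in [(9, 5), (2, 9, 5), (3, 2, 9, 5)]:
    V = np.ones(shape)
    print(shape, '->', dot_result_shape(M.shape, V.shape), np.dot(M, V).shape)
```

For each of the three shapes the predicted tuple matches the shape numpy reports, including the (9, 2, 5) and (9, 3, 2, 5) cases from the question.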

In contrast, np.matmul treats N-dimensional arrays as 'stacks' of 2D matrices:


The behavior depends on the arguments in the following way.

If both arguments are 2-D they are multiplied like conventional matrices.
If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.

The same reductions are performed in both cases, but the order of the axes is different. np.matmul essentially does the equivalent of:

out1 = np.empty((2, 9, 5))  # same shape as np.matmul(M, V3)
for ii in range(V3.shape[0]):
    out1[ii, :, :] = np.dot(M[:, :], V3[ii, :, :])

and

out2 = np.empty((3, 2, 9, 5))  # same shape as np.matmul(M, V4)
for ii in range(V4.shape[0]):
    for jj in range(V4.shape[1]):
        out2[ii, jj, :, :] = np.dot(M[:, :], V4[ii, jj, :, :])
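
Since only the axis order differs, one result can be turned into the other by moving a single axis. The sketch below assumes random inputs with the shapes from the question and uses np.moveaxis to shift the row axis of M from the front of the np.dot result to the second-to-last position.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((9, 9))
V3 = rng.random((2, 9, 5))
V4 = rng.random((3, 2, 9, 5))

for V in (V3, V4):
    # np.dot puts M's row axis first; matmul expects it second to last
    rearranged = np.moveaxis(np.dot(M, V), 0, -2)
    print(rearranged.shape, np.allclose(rearranged, np.matmul(M, V)))
```

This prints (2, 9, 5) True and (3, 2, 9, 5) True, so the two functions compute the same numbers here and differ only in layout.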