How to multiply a vector by a matrix in Tensorflow without reshaping?

Question

12

This:

import numpy as np
a = np.array([1, 2, 1])
w = np.array([[.5, .6], [.7, .8], [.7, .8]])

print(np.dot(a, w))
# [ 2.6  3. ] # plain nice old matrix multiplication n x (n, m) -> m

import tensorflow as tf

a = tf.constant(a, dtype=tf.float64)
w = tf.constant(w)

with tf.Session() as sess:
    print(tf.matmul(a, w).eval())

results in:

C:\_\Python35\python.exe C:/Users/MrD/.PyCharm2017.1/config/scratches/scratch_31.py
[ 2.6  3. ]
# bunch of errors in windows...
Traceback (most recent call last):
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 671, in _call_cpp_shape_fn_impl
    input_tensors_as_shapes, status)
  File "C:\_\Python35\lib\contextlib.py", line 66, in __exit__
    next(self.gen)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\errors_impl.py", line 466, in raise_exception_on_not_ok_status
    pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul') with input shapes: [3], [3,2].

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:/Users/MrD/.PyCharm2017.1/config/scratches/scratch_31.py", line 14, in <module>
    print(tf.matmul(a, w).eval())
  File "C:\_\Python35\lib\site-packages\tensorflow\python\ops\math_ops.py", line 1765, in matmul
    a, b, transpose_a=transpose_a, transpose_b=transpose_b, name=name)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\ops\gen_math_ops.py", line 1454, in _mat_mul
    transpose_b=transpose_b, name=name)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\op_def_library.py", line 763, in apply_op
    op_def=op_def)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 2329, in create_op
    set_shapes_for_outputs(ret)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1717, in set_shapes_for_outputs
    shapes = shape_func(op)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\ops.py", line 1667, in call_with_requiring
    return call_cpp_shape_fn(op, require_shape_fn=True)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 610, in call_cpp_shape_fn
    debug_python_shape_fn, require_shape_fn)
  File "C:\_\Python35\lib\site-packages\tensorflow\python\framework\common_shapes.py", line 676, in _call_cpp_shape_fn_impl
    raise ValueError(err.message)
ValueError: Shape must be rank 2 but is rank 1 for 'MatMul' (op: 'MatMul') with input shapes: [3], [3,2].

Process finished with exit code 1

(Not sure why the same exception is raised again while it is being handled.)

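The solution suggested in Tensorflow exception with matmul is to reshape the vector into a matrix, but that makes the code needlessly complicated - is there still no other way to multiply a vector by a matrix?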
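For reference, the reshape workaround mentioned above looks roughly like the sketch below. It uses NumPy rather than TensorFlow so that it runs without a session; the assumption is that the shape bookkeeping (promote to rank 2, multiply, drop the extra axis) carries over unchanged. The names `row`, `product` and `result` are illustrative only.

```python
import numpy as np

a = np.array([1.0, 2.0, 1.0])                  # shape (3,)
w = np.array([[.5, .6], [.7, .8], [.7, .8]])   # shape (3, 2)

row = a.reshape(1, -1)         # (3,) -> (1, 3): promote the vector to a one-row matrix
product = np.matmul(row, w)    # (1, 3) x (3, 2) -> (1, 2)
result = product.reshape(-1)   # (1, 2) -> (2,): drop the extra axis again

print(row.shape, product.shape, result.shape)
```

The two extra reshapes around a single multiplication are the overhead being objected to here.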

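Incidentally, using expand_dims with its default arguments (as suggested in the link above) raises a ValueError - this is not mentioned in the docs and defeats the purpose of having a default argument.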

- Mr_and_Mrs_D

1

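The accepted answer works, but this is really an API bug - reported: https://github.com/tensorflow/tensorflow/issues/9055 - Mr_and_Mrs_D

Thanks for raising the question, this behavior was bothering me too. For a nicer solution and more use cases, see my answer. - dsalaj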

3 Answers

11

Matmul was written for tensors of rank two or higher. Honestly, I'm not sure why, since numpy does allow matrix-vector multiplication.

import numpy as np
a = np.array([1, 2, 1])
w = np.array([[.5, .6], [.7, .8], [.7, .8]])

print(np.dot(a, w))
# [ 2.6  3. ] # plain nice old matrix multiplication n x (n, m) -> m
print(np.sum(np.expand_dims(a, -1) * w , axis=0))
# equivalent result [2.6, 3]

import tensorflow as tf

a = tf.constant(a, dtype=tf.float64)
w = tf.constant(w)

with tf.Session() as sess:
  # they all produce the same result as numpy above
  print(tf.matmul(tf.expand_dims(a,0), w).eval())
  print((tf.reduce_sum(tf.multiply(tf.expand_dims(a,-1), w), axis=0)).eval())
  print((tf.reduce_sum(tf.multiply(a, tf.transpose(w)), axis=1)).eval())

  # Note tf.multiply is equivalent to "*"
  print((tf.reduce_sum(tf.expand_dims(a,-1) * w, axis=0)).eval())
  print((tf.reduce_sum(a * tf.transpose(w), axis=1)).eval())

- Steven

1

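Oh, thanks - so it's not matmul after all; are these two equivalent? Could you explain what the reduce sum does? Sorry, I've had too much tf for one day and my head is spinning. - Mr_and_Mrs_D

So the "*" multiply operation supports regular numpy broadcasting semantics (it may be missing some of the fancy indexing features). In the example above it multiplies the vector a with each vector in w. reduce_sum then collapses one dimension by summing along it. So we go from a * w -> reduce_sum(product) -> ans; ([n * nxm]) -> [nxm] -> [m]. The axis argument determines which axis is summed over; in this case we want axis 0, to end up with a result of dimension m. - Steven

Sorry, print(tf.reduce_sum(a * w, axis=0).eval()) in the code from the question fails with ValueError: Dimensions must be equal, but are 3 and 2 for 'mul' (op: 'Mul') with input shapes: [3], [3,2]. - Mr_and_Mrs_D

Sorry, I got the broadcasting mixed up. I've fixed the code and provided two examples that produce the same result in numpy and tf. - Steven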

Thanks - I've reported it here: https://github.com/tensorflow/tensorflow/issues/9055 - Mr_and_Mrs_D
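To make the broadcasting discussion in these comments concrete, here is a small NumPy sketch of why a plain a * w fails and why the two corrected forms work. It assumes, as stated in the comments above, that tf's "*" follows the same broadcasting rules as NumPy; the variable names are illustrative.

```python
import numpy as np

a = np.array([1.0, 2.0, 1.0])                  # shape (3,)
w = np.array([[.5, .6], [.7, .8], [.7, .8]])   # shape (3, 2)

# Shapes are aligned from the right, so (3,) vs (3, 2) compares 3 with 2 and fails.
try:
    a * w
    broadcast_failed = False
except ValueError:
    broadcast_failed = True

# Fix 1: give a a trailing axis so it lines up with the rows of w.
col = np.expand_dims(a, -1)        # (3, 1)
scaled = col * w                   # (3, 1) * (3, 2) -> (3, 2), row i of w times a[i]
via_axis0 = scaled.sum(axis=0)     # sum over n -> (2,)

# Fix 2: transpose w instead, so the trailing axes already match.
via_axis1 = (a * w.T).sum(axis=1)  # (3,) * (2, 3) -> (2, 3), sum over n -> (2,)

print(broadcast_failed, via_axis0.shape, via_axis1.shape)
```

Both forms materialize the full (n, m) product before summing, which a real matrix multiplication does not need to do.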

4

You can use tf.tensordot and set axes=1. For the simple operation of a vector times a matrix, this is a bit cleaner than tf.einsum.

tf.tensordot(a, w, 1)
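
As a rough illustration of what axes=1 means, the sketch below uses np.tensordot, on the assumption that it contracts axes the same way tf.tensordot does for this case: the last axis of a is summed against the first axis of w.

```python
import numpy as np

a = np.array([1.0, 2.0, 1.0])                  # shape (3,)
w = np.array([[.5, .6], [.7, .8], [.7, .8]])   # shape (3, 2)

# axes=1: contract the last 1 axis of a with the first 1 axis of w.
short_form = np.tensordot(a, w, axes=1)

# The same contraction with the axes spelled out explicitly.
explicit_form = np.tensordot(a, w, axes=([0], [0]))

print(short_form.shape, explicit_form.shape)
```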

- Hooked

- dsalaj · Accepted Answer

tf.einsum lets you do exactly what you need in a concise and intuitive form:

with tf.Session() as sess:
    print(tf.einsum('n,nm->m', a, w).eval())
    # [ 2.6  3. ]

You even get to write out the comment n x (n, m) -> m explicitly as the subscript string. In my opinion this is more readable and intuitive.
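
For readers new to the notation, the subscript string 'n,nm->m' can be read as nested loops: indices that appear in the output are kept, indices that do not are summed over. The pure-Python function below is only a sketch of that reading; `einsum_n_nm_to_m` is a made-up name, not part of any library.

```python
def einsum_n_nm_to_m(a, w):
    """Loop form of 'n,nm->m': out[m] = sum over n of a[n] * w[n][m]."""
    n_len = len(a)
    m_len = len(w[0])
    out = [0.0] * m_len
    for m in range(m_len):        # m appears in the output, so it is kept
        for n in range(n_len):    # n does not, so it is summed away
            out[m] += a[n] * w[n][m]
    return out

a = [1, 2, 1]
w = [[.5, .6], [.7, .8], [.7, .8]]
result = einsum_n_nm_to_m(a, w)
print(result)  # approximately [2.6, 3.0], up to floating-point rounding
```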

My favorite use case is when you want to multiply a batch of matrices by a weight vector:

n_in = 10
n_step = 6
input = tf.placeholder(dtype=tf.float32, shape=(None, n_step, n_in))
weights = tf.Variable(tf.truncated_normal((n_in, 1), stddev=1.0/np.sqrt(n_in)))
Y_predict = tf.einsum('ijk,kl->ijl', input, weights)
print(Y_predict.get_shape())
# (?, 6, 1)

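This way you can easily multiply by the weights across all batches, with no transformations or duplication. You cannot do this by expanding dimensions as in the other answer, so you avoid the tf.matmul requirement that the batch and other outer dimensions match: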

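The inputs must, following any transpositions, be tensors of rank >= 2 where the inner 2 dimensions specify valid matrix multiplication arguments, and any further outer dimensions match.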
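
To show what 'ijk,kl->ijl' computes in the batched example, the NumPy sketch below compares it with the manual route of flattening the batch and step axes, doing one rank-2 multiplication, and reshaping back. NumPy stands in for TensorFlow here, and the batch size of 4 and the random data are assumptions made only for the illustration. Note that np.matmul is more permissive than the requirement quoted above, so this demonstrates the index arithmetic only, not the restriction itself.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, n_step, n_in = 4, 6, 10                 # batch size chosen arbitrarily
x = rng.normal(size=(batch, n_step, n_in))     # stands in for the placeholder
weights = rng.normal(size=(n_in, 1))

# One call: contract the last axis of x (k) with the first axis of weights (k).
y_einsum = np.einsum('ijk,kl->ijl', x, weights)          # (4, 6, 1)

# Manual equivalent: merge batch and step axes, multiply, split them again.
flat = x.reshape(-1, n_in)                               # (24, 10)
y_manual = (flat @ weights).reshape(batch, n_step, 1)    # (4, 6, 1)

print(y_einsum.shape, np.allclose(y_einsum, y_manual))
```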