Multiplying the elements of a sparse array by the rows of a matrix

Question

7

If you have a sparse matrix X:

>>> from scipy.sparse import csr_matrix
>>> X = csr_matrix([[0,2,0,2],[0,2,0,1]])
>>> print(type(X))
<class 'scipy.sparse.csr.csr_matrix'>
>>> print(X.todense())
[[0 2 0 2]
 [0 2 0 1]]

and a matrix Y:

>>> print(type(Y))
<class 'numpy.matrixlib.defmatrix.matrix'>
>>> print(Y)
[[8]
 [5]]

How can I multiply every element of X by the corresponding row of Y? For example:

[[0*8 2*8 0*8 2*8]
 [0*5 2*5 0*5 1*5]]

or:

[[0 16 0 16]
 [0 10 0 5]]

I tried the following, but it obviously doesn't work because the dimensions don't match: Z = X.data * Y
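
For context, here is a minimal sketch of why those shapes disagree. It assumes X and Y are built as in the question (Y as an np.matrix column); X.data holds only the stored nonzero values as a flat array, so its length is the number of nonzeros, not the number of rows.

```python
import numpy as np
from scipy.sparse import csr_matrix

X = csr_matrix([[0, 2, 0, 2], [0, 2, 0, 1]])
Y = np.matrix([[8], [5]])  # assumed construction of Y; the question only shows its type and values

# X.data is a flat array of the stored nonzeros, without any row structure.
data_shape = X.data.shape   # one entry per stored nonzero
y_shape = Y.shape           # one entry per row of X

print(data_shape, y_shape)
```

The row structure lives in X.indptr, which is what the answers below use to line up each stored value with its row factor.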

- Zach

3 Answers

1

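The approach I use for row-wise (or column-wise) multiplication is a matrix product with a diagonal matrix on the left (or on the right, respectively):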

import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix([[0,2,0,2],
                   [0,2,0,1]])
Y = np.array([8, 5])

D = sp.diags(Y) # produces a diagonal matrix whose entries are the values of Y
Z = D.dot(X) # performs D @ X, multiplication on the left for row-wise action

Sparsity is preserved, and the result stays in CSR format:

print(type(Z))
>>> <class 'scipy.sparse.csr.csr_matrix'>

and the output is correct as well:

print(Z.toarray()) # Z is still sparse and gives the right output
[[ 0. 16.  0. 16.]
 [ 0. 10.  0.  5.]]
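
The column-wise case mentioned above can be sketched the same way, with the diagonal matrix on the right. The weight vector w is a hypothetical example, not part of the question; it needs one entry per column of X.

```python
import numpy as np
import scipy.sparse as sp

X = sp.csr_matrix([[0, 2, 0, 2],
                   [0, 2, 0, 1]])
w = np.array([1, 10, 100, 1000])  # hypothetical per-column factors

# A diagonal matrix on the right scales column j of X by w[j].
Z_cols = X.dot(sp.diags(w)).tocsr()

print(Z_cols.toarray())
```

The explicit .tocsr() is only there to pin the output format; the product itself never builds a dense matrix.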

- ngazagna

0

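I ran into the same problem. Personally, I didn't find the scipy.sparse documentation very helpful, and I couldn't find a function that handles this directly. So I wrote my own code, which solved it for me: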

Z = X.copy()
for row_y_idx in range(Y.shape[0]):
    Z.data[Z.indptr[row_y_idx]:Z.indptr[row_y_idx+1]] *= Y[row_y_idx, 0]

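The idea: for each element of Y at position row_y_idx, perform a scalar multiplication with row row_y_idx of X. For more on accessing elements of a CSR matrix, see here (where A is data and IA is indptr).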

Given X and Y as you defined them:

import numpy as np
import scipy.sparse as sps

X = sps.csr_matrix([[0,2,0,2],[0,2,0,1]])
Y = np.matrix([[8], [5]])

Z = X.copy()
for row_y_idx in range(Y.shape[0]):
    Z.data[Z.indptr[row_y_idx]:Z.indptr[row_y_idx+1]] *= Y[row_y_idx, 0]

print(type(Z))
print(Z.todense())

The output matches yours:

<class 'scipy.sparse.csr.csr_matrix'>
 [[ 0 16  0 16]
  [ 0 10  0  5]]

- Andrea

- seberg · Accepted Answer

Unfortunately, the CSR matrix's .multiply method seems to densify the result when the other operand is dense. Here is one way to avoid that:

import numpy as np
import scipy.sparse

# Assuming that Y is 1D; an np.matrix column may need Y = Y.A.ravel() or similar first.

# just to make the point that this works only with CSR:
if not isinstance(X, scipy.sparse.csr_matrix):
    raise ValueError('Matrix must be CSR.')

Z = X.copy()
# simply repeat each value in Y by the number of nnz elements in each row: 
Z.data *= Y.repeat(np.diff(Z.indptr))

This creates some temporaries, but at least it is fully vectorized and does not densify the sparse matrix.
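
To make the repeat step concrete, here is a small worked sketch. The 3x4 matrix with an empty middle row and the values in Y are made up for illustration; the point is that np.diff(indptr) counts the stored entries per row, so rows with no entries simply contribute nothing.

```python
import numpy as np
import scipy.sparse as sp

# Hypothetical example with an empty middle row.
X = sp.csr_matrix([[0, 2, 0, 2],
                   [0, 0, 0, 0],
                   [0, 2, 0, 1]])
Y = np.array([8, 3, 5])

nnz_per_row = np.diff(X.indptr)   # stored entries in each row
scale = Y.repeat(nnz_per_row)     # one factor per stored entry, in data order

Z = X.copy()
Z.data *= scale                   # in-place scaling; the sparsity pattern is untouched

print(nnz_per_row, scale)
print(Z.toarray())
```

Because only Z.data is modified, indices and indptr stay the same as in X.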

For a COO matrix, the equivalent is:

Z.data *= Y[Z.row] # np.take(Y, Z.row) may be faster than fancy indexing
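
As a self-contained sketch of the COO variant, assuming the X from the question and a 1D Y: each stored entry carries its own row index in Z.row, so the factor can be looked up directly.

```python
import numpy as np
import scipy.sparse as sp

X = sp.coo_matrix([[0, 2, 0, 2],
                   [0, 2, 0, 1]])
Y = np.array([8, 5])

Z = X.copy()
# Z.row[k] is the row of the k-th stored entry, so this picks the right factor for each entry.
Z.data *= np.take(Y, Z.row)

print(Z.toarray())
```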

For a CSC matrix, the equivalent would be:

Z.data *= Y[Z.indices]
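
A self-contained sketch of the CSC variant, again assuming the X from the question and a 1D Y. In CSC format the indices array stores the row index of each stored entry, which is why it can be used to index Y.

```python
import numpy as np
import scipy.sparse as sp

X = sp.csc_matrix([[0, 2, 0, 2],
                   [0, 2, 0, 1]])
Y = np.array([8, 5])

Z = X.copy()
# For CSC, Z.indices holds row indices (for CSR it would hold column indices instead).
Z.data *= Y[Z.indices]

print(Z.toarray())
```

Note that the same line applied to a CSR matrix would scale by column rather than by row, so it is worth checking the format first.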