NumPy“智能”对称矩阵

Question

NumPy“智能”对称矩阵

84

在numpy中是否有一种智能且空间高效的对称矩阵，当写入[i][j]位置时，它会自动（且透明地）填充[j][i]位置？

import numpy
a = numpy.symmetric((3, 3))
a[0][1] = 1
a[1][0] == a[0][1]
# True
print(a)
# [[0 1 0], [1 0 0], [0 0 0]]

assert numpy.all(a == a.T) # for any symmetric matrix

一个自动的厄米矩阵也很好，尽管在我写作时我不需要它。

- Debilski

1

如果解决了您的问题，您可以考虑将答案标记为已接受。 :) - Eric O. Lebigot

2

我想等待更好的（即内置和内存高效）答案出现。当然，你的答案没有问题，所以我仍会接受它。 - Debilski

我认为到目前为止，你只能进行子类化（不需要）或者包装numpy，例如通过更改自己的setter函数来填充矩阵，以获得类似的接口。我认为你也可以加入掩码数组，以避免双重下游计算，只要掩码数组支持足够的矩阵操作场景。没有内置的方法，也没有通用的健壮方式。 - matanster

numpy.all(a == a.T) 似乎无法处理对角线上带有 nan 的对称矩阵。 - Geremia

6个回答

24

对于在numpy中处理对称矩阵的最佳方法这个更一般的问题也困扰着我。

经过调查，我认为答案可能是numpy在处理对称矩阵时受到底层BLAS例程支持的内存布局的限制。

虽然一些BLAS例程确实利用对称性来加速对称矩阵的计算，但它们仍然使用与完整矩阵相同的内存结构，即n^2的空间，而不是n(n+1)/2。只是它们被告知矩阵是对称的，并且只使用上三角或下三角中的值。

一些scipy.linalg例程确实接受标志（例如在linalg.solve中设置sym_pos=True），这些标志会传递给BLAS例程，尽管numpy对此提供更多支持会很好，特别是为像DSYRK（对称秩k更新）这样的例程提供包装器，这将比dot(M.T, M)更快地计算出一个格拉姆矩阵。

（也许担心将时间和/或空间优化为2倍常数因子似乎有些吹毛求疵，但它可以对你能够在单台机器上处理多大问题的临界点产生影响...）

- Matt

这个问题是关于如何通过分配单个条目来自动创建对称矩阵的，而不是关于如何指示BLAS在其计算中使用对称矩阵或原则上如何更有效地存储对称矩阵的。 - Eric O. Lebigot

4

这个问题也涉及到空间效率，因此BLAS问题是相关的。 - jmmcd

@EOL，这个问题不是关于如何通过分配单个条目来自动创建对称矩阵的。 - Alexey

可以将“创建”更恰当地替换为“更新”。由于问题明确涉及在设置M_ji时透明地设置M_ji，而这个答案并不是关于这个问题的，所以你理解这实质上是我提出的观点。问题是如何高效地做到这一点（而不是高效地处理对称矩阵，尽管这可能是正确的问题：最好在评论中提出，或者给出一个解决更一般问题而不仅仅是讨论它的答案）。 - Eric O. Lebigot

10

有许多众所周知的方法可以存储对称矩阵，使它们不需要占用n^2个存储元素。此外，可以重写常见操作以访问这些修订后的存储方式。权威著作是Golub和Van Loan的《Matrix Computations》，第3版1996年，Johns Hopkins University Press，1.27-1.2.9节。例如，引用他们从形式（1.2.2）开始，在对称矩阵中只需要存储A = [a_{i,j} ] for i >= j。然后，假设保存矩阵的向量表示为V，并且A是n乘n的，请将a_{i,j}放入

V[(j-1)n - j(j-1)/2 + i]

这假设索引从1开始。

Golub和Van Loan提供了一个算法1.2.3，展示了如何访问存储的V来计算y = V x + y。

Golub和Van Loan还提供了一种将矩阵存储为对角线优势形式的方法。这并不节省存储空间，但支持对某些其他操作的快速访问。

- user1544219

1

还有矩形完全打包存储（RFP），例如Lapack ZPPTRF使用它。NumPy支持吗？ - isti_spl

@isti_spl：不，但你可以实现一个包装器来完成。 - Eric

1

为了构建一个沿主对角线对称，且主对角线上为0的NxN矩阵，可以按如下操作：

a = np.array([1, 2, 3, 4, 5])
b = np.zeros(shape=(a.shape[0], a.shape[0]))
upper = np.triu(b + a)
lower = np.tril(np.transpose(b + a))
D = (upper + lower) * (np.full(a.shape[0], fill_value=1) - np.eye(a.shape[0]))

这是一个比较特殊的情况，但最近我使用了这种矩阵来表示网络邻接关系。

希望对你有所帮助。干杯。

- Michael

1

这是纯Python而不是NumPy，但我刚刚编写了一个例程来填充对称矩阵（以及一个测试程序以确保它的正确性）。

import random

# fill a symmetric matrix with costs (i.e. m[x][y] == m[y][x]
# For demonstration purposes, this routine connect each node to all the others
# Since a matrix stores the costs, numbers are used to represent the nodes
# so the row and column indices can represent nodes

def fillCostMatrix(dim):        # square array of arrays
    # Create zero matrix
    new_square = [[0 for row in range(dim)] for col in range(dim)]
    # fill in main diagonal
    for v in range(0,dim):
        new_square[v][v] = random.randrange(1,10)

    # fill upper and lower triangles symmetrically by replicating diagonally
    for v in range(1,dim):
        iterations = dim - v
        x = v
        y = 0
        while iterations > 0:
            new_square[x][y] = new_square[y][x] = random.randrange(1,10)
            x += 1
            y += 1
            iterations -= 1
    return new_square

# sanity test
def test_symmetry(square):
    dim = len(square[0])
    isSymmetric = ''
    for x in range(0, dim):
        for y in range(0, dim):
            if square[x][y] != square[y][x]:
                isSymmetric = 'NOT'
    print "Matrix is", isSymmetric, "symmetric"

def showSquare(square):
    # Print out square matrix
    columnHeader = ' '
    for i in range(len(square)):
        columnHeader += '  ' + str(i)
    print columnHeader

    i = 0;
    for col in square:
        print i, col    # print row number and data
        i += 1

def myMain(argv):
    if len(argv) == 1:
        nodeCount = 6
    else:
        try:
            nodeCount = int(argv[1])
        except:
            print  "argument must be numeric"
            quit()

    # keep nodeCount <= 9 to keep the cost matrix pretty
    costMatrix = fillCostMatrix(nodeCount)
    print  "Cost Matrix"
    showSquare(costMatrix)
    test_symmetry(costMatrix)   # sanity test
if __name__ == "__main__":
    import sys
    myMain(sys.argv)

# vim:tabstop=8:shiftwidth=4:expandtab

- Davidka

0

如果填充[j][i]，则使用Python填充[i][j]很容易。存储问题则更有趣一些。可以通过增加一个packed属性来扩展numpy数组类，这对于节省存储空间和以后读取数据都非常有用。

class Sym(np.ndarray):

    # wrapper class for numpy array for symmetric matrices. New attribute can pack matrix to optimize storage.
    # Usage:
    # If you have a symmetric matrix A as a shape (n,n) numpy ndarray, Sym(A).packed is a shape (n(n+1)/2,) numpy array 
    # that is a packed version of A.  To convert it back, just wrap the flat list in Sym().  Note that Sym(Sym(A).packed)


    def __new__(cls, input_array):
        obj = np.asarray(input_array).view(cls)

        if len(obj.shape) == 1:
            l = obj.copy()
            p = obj.copy()
            m = int((np.sqrt(8 * len(obj) + 1) - 1) / 2)
            sqrt_m = np.sqrt(m)

            if np.isclose(sqrt_m, np.round(sqrt_m)):
                A = np.zeros((m, m))
                for i in range(m):
                    A[i, i:] = l[:(m-i)]
                    A[i:, i] = l[:(m-i)]
                    l = l[(m-i):]
                obj = np.asarray(A).view(cls)
                obj.packed = p

            else:
                raise ValueError('One dimensional input length must be a triangular number.')

        elif len(obj.shape) == 2:
            if obj.shape[0] != obj.shape[1]:
                raise ValueError('Two dimensional input must be a square matrix.')
            packed_out = []
            for i in range(obj.shape[0]):
                packed_out.append(obj[i, i:])
            obj.packed = np.concatenate(packed_out)

        else:
            raise ValueError('Input array must be 1 or 2 dimensional.')

        return obj

    def __array_finalize__(self, obj):
        if obj is None: return
        self.packed = getattr(obj, 'packed', None)

```

- Charles F

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Eric O Lebigot · Accepted Answer

如果您能在计算之前对矩阵进行对称化处理，那么以下方法应该相当快：

def symmetrize(a):
    """
    Return a symmetrized version of NumPy array a.

    Values 0 are replaced by the array value at the symmetric
    position (with respect to the diagonal), i.e. if a_ij = 0,
    then the returned array a' is such that a'_ij = a_ji.

    Diagonal values are left untouched.

    a -- square NumPy array, such that a_ij = 0 or a_ji = 0, 
    for i != j.
    """
    return a + a.T - numpy.diag(a.diagonal())

在合理的假设下（比如在运行symmetrize之前不要同时执行a[0, 1] = 42和相互矛盾的a[1, 0] = 123），这将起作用。

如果你真的需要透明的对称化，你可以考虑子类化numpy.ndarray并简单地重新定义__setitem__：

class SymNDArray(numpy.ndarray):
    """
    NumPy array subclass for symmetric matrices.

    A SymNDArray arr is such that doing arr[i,j] = value
    automatically does arr[j,i] = value, so that array
    updates remain symmetrical.
    """

    def __setitem__(self, (i, j), value):
        super(SymNDArray, self).__setitem__((i, j), value)                    
        super(SymNDArray, self).__setitem__((j, i), value)                    

def symarray(input_array):
    """
    Return a symmetrized version of the array-like input_array.

    The returned array has class SymNDArray. Further assignments to the array
    are thus automatically symmetrized.
    """
    return symmetrize(numpy.asarray(input_array)).view(SymNDArray)

# Example:
a = symarray(numpy.zeros((3, 3)))
a[0, 1] = 42
print a  # a[1, 0] == 42 too!

(或者根据您的需求使用矩阵而不是数组进行等效处理)。该方法甚至可以处理更复杂的赋值操作，如a[:, 1] = -1，这样正确地设置了a[1, :]元素。

请注意，Python 3已经删除了写入def …(…, (i, j),…)的可能性，因此在运行Python 3之前，代码必须稍作调整：def __setitem__(self, indexes, value): (i, j) = indexes...