How to implement Householder-based QR decomposition in Python?

10

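I'm currently trying to implement Householder-based QR decomposition for rectangular matrices, as described in http://eprints.ma.man.ac.uk/1192/1/qrupdating_12nov08.pdf (pages 3, 4 and 5).

Apparently I got some of the pseudocode wrong, though, since (1) my results differ from those of numpy.linalg.qr() and (2) the matrix R produced by my routines is not upper triangular.

My code (also available at https://pyfiddle.io/fiddle/afcc2e0e-0857-4cb2-adb5-06ff9b80c9d3/?i=true):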

import math
import argparse
import numpy as np
from typing import Union

def householder(alpha: float, x: np.ndarray) -> Union[np.ndarray, int]:
    """
    Computes Householder vector for alpha and x.
    :param alpha:
    :param x:
    :return:
    """

    s = math.pow(np.linalg.norm(x, ord=2), 2)
    v = x

    if s == 0:
        tau = 0
    else:
        t = math.sqrt(alpha * alpha + s)
        v_one = alpha - t if alpha <= 0 else -s / (alpha + t)

        tau = 2 * v_one * v_one / (s + v_one * v_one)
        v /= v_one

    return v, tau


def qr_decomposition(A: np.ndarray, m: int, n: int) -> Union[np.ndarray, np.ndarray]:
    """
    Applies Householder-based QR decomposition on specified matrix A.
    :param A:
    :param m:
    :param n:
    :return:
    """
    H = []
    R = A
    Q = A
    I = np.eye(m, m)

    for j in range(0, n):
        # Apply Householder transformation.
        x = A[j + 1:m, j]
        v_householder, tau = householder(np.linalg.norm(x), x)
        v = np.zeros((1, m))
        v[0, j] = 1
        v[0, j + 1:m] = v_householder

        res = I - tau * v * np.transpose(v)
        R = np.matmul(res, R)
        H.append(res)

    return Q, R

m = 10
n = 8

A = np.random.rand(m, n)
q, r = np.linalg.qr(A)
Q, R = qr_decomposition(A, m, n)

print("*****")
print(Q)
print(q)
print("-----")
print(R)
print(r)

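I'm not sure how to introduce the zeros into my R matrix, or which part of my code is wrong. I'd be grateful for any pointers. Thanks a lot for your time.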


1
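The basic problem you're having is that the notes you linked are complete garbage. Their pseudocode for the householder algorithm is incomplete, and their description of the actual Householder matrix H is just plain confusing. After consulting some other sources, I was able to put together a working version in my answer below. - tel
1 Answer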

12

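There are a number of problems and missing details in the notes you linked to. After consulting a few other sources (including this very useful textbook), I was able to come up with a working implementation of a similar algorithm.

A working algorithm

Here's the code for a working version of qr_decomposition: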

import numpy as np
from typing import Tuple

def householder(x: np.ndarray) -> Tuple[np.ndarray, float]:
    # Returns (v, tau) such that (I - tau * v v^T) @ x is zero below its first entry.
    alpha = x[0]
    s = np.power(np.linalg.norm(x[1:]), 2)
    v = x.copy()

    if s == 0:
        # Nothing to eliminate below x[0]: tau = 0 makes H the identity.
        tau = 0.0
    else:
        t = np.sqrt(alpha**2 + s)
        v[0] = alpha - t if alpha <= 0 else -s / (alpha + t)

        tau = 2 * v[0]**2 / (s + v[0]**2)
        v /= v[0]

    return v, tau

def qr_decomposition(A: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    m,n = A.shape
    R = A.copy()
    Q = np.identity(m)

    for j in range(0, n):
        # Apply Householder transformation.
        v, tau = householder(R[j:, j])
        H = np.identity(m)
        # v is 1-D here, so form the rank-one update v v^T with np.outer.
        H[j:, j:] -= tau * np.outer(v, v)
        R = H @ R
        Q = H @ Q

    return Q[:n].T, R[:n]

m = 5
n = 4

A = np.random.rand(m, n)
q, r = np.linalg.qr(A)
Q, R = qr_decomposition(A)

with np.printoptions(linewidth=9999, precision=20, suppress=True):
    print("**** Q from qr_decomposition")
    print(Q)
    print("**** Q from np.linalg.qr")
    print(q)
    print()
    
    print("**** R from qr_decomposition")
    print(R)
    print("**** R from np.linalg.qr")
    print(r)

Output:

**** Q from qr_decomposition
[[ 0.5194188817843675  -0.10699353671401633  0.4322294754656072  -0.7293293270703678 ]
 [ 0.5218635773595086   0.11737804362574514 -0.5171653705211056   0.04467925806590414]
 [ 0.34858177783013133  0.6023104248793858  -0.33329256746256875 -0.03450824948274838]
 [ 0.03371048915852807  0.6655221685383623   0.6127023580593225   0.28795294754791   ]
 [ 0.5789790833500734  -0.411189947884951    0.24337120818874305  0.618041080584351  ]]
**** Q from np.linalg.qr
[[-0.5194188817843672    0.10699353671401617   0.4322294754656068    0.7293293270703679  ]
 [-0.5218635773595086   -0.11737804362574503  -0.5171653705211053   -0.044679258065904115]
 [-0.3485817778301313   -0.6023104248793857   -0.33329256746256863   0.03450824948274819 ]
 [-0.03371048915852807  -0.665522168538362     0.6127023580593226   -0.2879529475479097  ]
 [-0.5789790833500733    0.41118994788495106   0.24337120818874317  -0.6180410805843508  ]]

**** R from qr_decomposition
[[ 0.6894219296137802      1.042676051151294       1.3418719684631446      1.2498925815126485    ]
 [ 0.00000000000000000685  0.7076056836914905      0.29883043386651403     0.41955370595004277   ]
 [-0.0000000000000000097  -0.00000000000000007292  0.5304551654027297      0.18966088433421135   ]
 [-0.00000000000000000662  0.00000000000000008718  0.00000000000000002322  0.6156558913022807    ]]
**** R from np.linalg.qr
[[-0.6894219296137803  -1.042676051151294   -1.3418719684631442  -1.2498925815126483 ]
 [ 0.                  -0.7076056836914905  -0.29883043386651376 -0.4195537059500425 ]
 [ 0.                   0.                   0.53045516540273     0.18966088433421188]
 [ 0.                   0.                   0.                  -0.6156558913022805 ]]

This version of qr_decomposition reproduces the output of np.linalg.qr almost exactly. The differences are discussed below.
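To make it concrete what a single reflection does, here is a minimal sketch that builds one Householder matrix for a small vector and checks that it zeroes everything below the first entry. The example vector is made up, and the sketch uses the textbook form v = x - ||x|| e1 directly. The householder function above computes the same v[0], but switches to -s / (alpha + t) when alpha is positive to avoid cancellation.

```python
import numpy as np

# Made-up example vector with ||x|| = 13.
x = np.array([3.0, 4.0, 12.0])

u = x.copy()
u[0] -= np.linalg.norm(x)      # u = x - ||x|| * e1
v = u / u[0]                   # normalise so that v[0] == 1
tau = 2.0 / (v @ v)            # same value as 2 * u[0]**2 / (u @ u)

H = np.identity(x.size) - tau * np.outer(v, v)
Hx = H @ x                     # expected to be ||x|| * e1 up to rounding

print(Hx)
```

H is symmetric and orthogonal, so applying it to every column of R preserves column norms while clearing the entries below the diagonal in column j.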

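Numerical precision of the output

The values in the outputs of np.linalg.qr and qr_decomposition agree to high precision. However, the combination of computations that qr_decomposition uses to produce the zeros in R does not cancel exactly, so those zeros are not actually quite equal to zero.

It turns out that np.linalg.qr isn't doing any fancy floating-point tricks to ensure that the zeros in its output are 0.0. It simply calls np.triu, which forcibly sets those values to 0.0. So to get the same result, just change the return line in qr_decomposition to: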

return Q[:n].T, np.triu(R[:n])
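As a small illustration of what np.triu does to that kind of residue, here is a sketch on a made-up 3x3 matrix whose below-diagonal entries are around 1e-17, similar in size to the ones in the output above:

```python
import numpy as np

# Made-up "almost upper triangular" matrix with rounding residue below the diagonal.
R_noisy = np.array([
    [0.69,     1.04,     1.34],
    [6.85e-18, 0.71,     0.30],
    [-9.7e-18, -7.3e-17, 0.53],
])

# np.triu keeps the diagonal and everything above it, and sets the rest to exactly 0.0.
R_clean = np.triu(R_noisy)

print(R_clean)
```

Entries on or above the diagonal are copied unchanged, so this only affects the values that should have been zero anyway.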

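Signs (+/-) in the output

Some of the +/- signs in Q and R differ between the outputs of np.linalg.qr and qr_decomposition, but this isn't really a problem, since there are many valid choices of sign (see the discussion of the uniqueness of Q and R). You can match the sign convention that np.linalg.qr uses exactly by generating v and tau with an alternative algorithm: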

def householder_vectorized(a):
    """Use this version of householder to reproduce the output of np.linalg.qr 
    exactly (specifically, to match the sign convention it uses)
    
    based on https://rosettacode.org/wiki/QR_decomposition#Python
    """
    v = a / (a[0] + np.copysign(np.linalg.norm(a), a[0]))
    v[0] = 1
    tau = 2 / (v.T @ v)
    
    return v,tau
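If you only need two factorizations to be comparable, another option is to normalise the signs afterwards instead of changing the reflector. The sketch below flips the sign of each column of Q together with the matching row of R so that the diagonal of R is non-negative; the product Q @ R is unchanged because each sign is applied twice. The helper name normalize_qr_signs is made up for this example and is not part of NumPy.

```python
import numpy as np

def normalize_qr_signs(Q, R):
    """Hypothetical helper: flip signs so that diag(R) >= 0 without changing Q @ R."""
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0                  # leave zero pivots alone
    # Scale column k of Q and row k of R by the same +/-1 factor.
    return Q * signs, signs[:, np.newaxis] * R

rng = np.random.default_rng(0)
A = rng.random((5, 4))

q, r = np.linalg.qr(A)
q2, r2 = normalize_qr_signs(q, r)

print(np.diag(r))    # mixed signs are possible here
print(np.diag(r2))   # all non-negative after normalisation
```

Applying the same normalisation to the output of qr_decomposition puts both results into the same convention.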

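Matching the output of np.linalg.qr exactly

Putting it all together, this version of qr_decomposition (which uses householder_vectorized from above) matches the output of np.linalg.qr up to rounding: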

import numpy as np
from typing import Tuple

def qr_decomposition(A: np.ndarray) -> Tuple[np.ndarray, np.ndarray]:
    m,n = A.shape
    R = A.copy()
    Q = np.identity(m)
    
    for j in range(0, n):
        # Apply Householder transformation.
        v, tau = householder_vectorized(R[j:, j, np.newaxis])
        
        H = np.identity(m)
        H[j:, j:] -= tau * (v @ v.T)
        R = H @ R
        Q = H @ Q
        
    return Q[:n].T, np.triu(R[:n])

m = 5
n = 4

A = np.random.rand(m, n)
q, r = np.linalg.qr(A)
Q, R = qr_decomposition(A)

with np.printoptions(linewidth=9999, precision=20, suppress=True):
    print("**** Q from qr_decomposition")
    print(Q)
    print("**** Q from np.linalg.qr")
    print(q)
    print()
    
    print("**** R from qr_decomposition")
    print(R)
    print("**** R from np.linalg.qr")
    print(r)

Output:

**** Q from qr_decomposition
[[-0.10345123000824041   0.6455437884382418    0.44810714367794663  -0.03963544711256745 ]
 [-0.55856415402318     -0.3660716543156899    0.5953932791844518    0.43106504879433577 ]
 [-0.30655198880585594   0.6606757192118904   -0.21483067305535333   0.3045011114089389  ]
 [-0.48053620675695174  -0.11139783377793576  -0.6310958848894725    0.2956864520726446  ]
 [-0.5936453158283703   -0.01904935140131578  -0.016510508076204543 -0.79527388379824    ]]
**** Q from np.linalg.qr
[[-0.10345123000824041   0.6455437884382426    0.44810714367794663  -0.039635447112567376]
 [-0.5585641540231802   -0.3660716543156898    0.5953932791844523    0.4310650487943359  ]
 [-0.30655198880585594   0.6606757192118904   -0.21483067305535375   0.30450111140893893 ]
 [-0.48053620675695186  -0.1113978337779356   -0.6310958848894725    0.29568645207264455 ]
 [-0.5936453158283704   -0.01904935140131564  -0.0165105080762043   -0.79527388379824    ]]

**** R from qr_decomposition
[[-1.653391466100325   -1.0838054573405895  -1.0632037969249921  -1.1825735233596888 ]
 [ 0.                   0.7263519982452554   0.7798481878600413   0.5496287509656425 ]
 [ 0.                   0.                  -0.26840760341581243 -0.2002757085967938 ]
 [ 0.                   0.                   0.                   0.48524469321440966]]
**** R from np.linalg.qr
[[-1.6533914661003253 -1.0838054573405895 -1.0632037969249923 -1.182573523359689 ]
 [ 0.                  0.7263519982452559  0.7798481878600418  0.5496287509656428]
 [ 0.                  0.                 -0.2684076034158126 -0.2002757085967939]
 [ 0.                  0.                  0.                  0.4852446932144096]]

Apart from the inevitable rounding error in the trailing digits, the outputs now match.
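Rather than comparing 20 printed digits by eye, the match can be checked numerically. This is a minimal sketch: check_qr is a made-up helper name, the tolerance is an arbitrary choice, and it assumes a reduced factorization with m >= n.

```python
import numpy as np

def check_qr(A, Q, R, tol=1e-12):
    """Hypothetical helper: True if (Q, R) looks like a valid reduced QR of A."""
    n = A.shape[1]
    reconstructs = np.allclose(Q @ R, A, atol=tol)                 # A == Q R
    orthonormal = np.allclose(Q.T @ Q, np.identity(n), atol=tol)   # Q^T Q == I
    upper = np.allclose(R, np.triu(R), atol=tol)                   # R upper triangular
    return reconstructs and orthonormal and upper

rng = np.random.default_rng(0)
A = rng.random((5, 4))
q, r = np.linalg.qr(A)

print(check_qr(A, q, r))
```

The same call works on the Q and R returned by qr_decomposition, and np.allclose(Q, q) and np.allclose(R, r) compare the two results directly once they share a sign convention.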

1
Awesome, thank you! I agree that the book you linked explains the process much more clearly. I really appreciate the in-depth answer. - justonemorething
2
@tel Great example! What exactly does "@" do in your example? - BirdANDBird
1
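@ is the matrix multiplication operator - tel
I downvoted this because the implementation isn't very good. For example, you don't need to allocate a potentially huge identity matrix H = np.identity(m) at every step. H should be applied implicitly. - Nico Schlömer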
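A minimal sketch of what applying H implicitly could look like, for readers following up on the last comment. Instead of forming H, it uses H @ R = R - tau * v (v^T R) and Q @ H = Q - tau * (Q v) v^T, so only rank-one updates are needed. The name qr_implicit is made up here, the sketch assumes m >= n, and it uses the same sign choice as householder_vectorized. It is an illustration, not the code from the answer.

```python
import numpy as np

def qr_implicit(A):
    """Hypothetical sketch: Householder QR without building any m x m matrix H."""
    m, n = A.shape
    R = A.astype(float)          # astype returns a copy, so A is left untouched
    Q = np.identity(m)

    for j in range(min(m, n)):
        x = R[j:, j]
        norm_x = np.linalg.norm(x)
        if norm_x == 0.0:
            continue             # column is already zero from row j down

        # Householder vector, normalised so that v[0] == 1.
        v = x.copy()
        v[0] += np.copysign(norm_x, x[0])
        v /= v[0]
        tau = 2.0 / (v @ v)

        # R <- H R, touching only rows j: of R.
        R[j:, :] -= tau * np.outer(v, v @ R[j:, :])
        # Q <- Q H, touching only columns j: of Q.
        Q[:, j:] -= tau * np.outer(Q[:, j:] @ v, v)

    return Q[:, :n], np.triu(R[:n])

rng = np.random.default_rng(1)
A = rng.random((6, 4))
Q, R = qr_implicit(A)
print(np.allclose(Q @ R, A))
```

Each step costs O(mn) work for the updates instead of the O(m^2 n) needed for the dense products H @ R and H @ Q.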
