勘误(划掉+错误)Berlekamp-Massey用于Reed-Solomon解码

26
我正在尝试在Python中实现一个Reed-Solomon编码器-解码器,支持纠删码的解码,但这让我感到非常困惑。
目前的实现只支持解码错误或擦除,但不能同时支持两者(即使它低于2*错误+擦除的理论界限)。
从Blahut的论文(此处此处)中看来,我们只需要使用擦除定位多项式初始化错误定位多项式,就可以隐式地在Berlekamp-Massey中计算出纠正位置多项式。
这种方法对我来说部分有效:当2*errors+erasures < (n-k)/2时,它可以使用,但实际上在调试后只是因为BM计算一个错误定位多项式,该多项式得到与擦除定位多项式完全相同的值(因为我们低于仅纠错的误差限制),因此通过伽罗华域截短,我们最终获得了正确的擦除定位多项式的值(至少我是这样理解的,我可能是错的)。
然而,当我们超过(n-k)/2时,例如如果n=20且k=11,则我们有(n-k)=9个擦除的符号可以纠正,如果我们输入5个擦除,则BM就会出现问题。如果我们输入4个擦除+1个错误(由于我们仍然远低于界限,因为2*errors+erasures = 2+4 = 6 < 9),BM仍然会出现问题。
我实现的Berlekamp-Massey算法的确切算法可以在此演示文稿(第15-17页)中找到,但非常类似的描述可以在这里这里找到,并在此处附上数学描述的副本:

Berlekamp-Massey algorithm

现在,我已经将这个数学算法几乎完全复制到了Python代码中。我想扩展它以支持纠删码,我尝试通过使用纠删定位器初始化错误定位器sigma来实现:
def _berlekamp_massey(self, s, k=None, erasures_loc=None):
    '''Computes and returns the error locator polynomial (sigma) and the
    error evaluator polynomial (omega).
    If the erasures locator is specified, we will return an errors-and-erasures locator polynomial and an errors-and-erasures evaluator polynomial.
    The parameter s is the syndrome polynomial (syndromes encoded in a
    generator function) as returned by _syndromes. Don't be confused with
    the other s = (n-k)/2

    Notes:
    The error polynomial:
    E(x) = E_0 + E_1 x + ... + E_(n-1) x^(n-1)

    j_1, j_2, ..., j_s are the error positions. (There are at most s
    errors)

    Error location X_i is defined: X_i = a^(j_i)
    that is, the power of a corresponding to the error location

    Error magnitude Y_i is defined: E_(j_i)
    that is, the coefficient in the error polynomial at position j_i

    Error locator polynomial:
    sigma(z) = Product( 1 - X_i * z, i=1..s )
    roots are the reciprocals of the error locations
    ( 1/X_1, 1/X_2, ...)

    Error evaluator polynomial omega(z) is here computed at the same time as sigma, but it can also be constructed afterwards using the syndrome and sigma (see _find_error_evaluator() method).
    '''
    # For errors-and-erasures decoding, see: Blahut, Richard E. "Transform techniques for error control codes." IBM Journal of Research and development 23.3 (1979): 299-315. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.600&rep=rep1&type=pdf and also a MatLab implementation here: http://www.mathworks.com/matlabcentral/fileexchange/23567-reed-solomon-errors-and-erasures-decoder/content//RS_E_E_DEC.m
    # also see: Blahut, Richard E. "A universal Reed-Solomon decoder." IBM Journal of Research and Development 28.2 (1984): 150-158. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.84.2084&rep=rep1&type=pdf
    # or alternatively see the reference book by Blahut: Blahut, Richard E. Theory and practice of error control codes. Addison-Wesley, 1983.
    # and another good alternative book with concrete programming examples: Jiang, Yuan. A practical guide to error-control coding using Matlab. Artech House, 2010.
    n = self.n
    if not k: k = self.k

    # Initialize:
    if erasures_loc:
        sigma = [ Polynomial(erasures_loc.coefficients) ] # copy erasures_loc by creating a new Polynomial
        B = [ Polynomial(erasures_loc.coefficients) ]
    else:
        sigma =  [ Polynomial([GF256int(1)]) ] # error locator polynomial. Also called Lambda in other notations.
        B =    [ Polynomial([GF256int(1)]) ] # this is the error locator support/secondary polynomial, which is a funky way to say that it's just a temporary variable that will help us construct sigma, the error locator polynomial
    omega =  [ Polynomial([GF256int(1)]) ] # error evaluator polynomial. We don't need to initialize it with erasures_loc, it will still work, because Delta is computed using sigma, which itself is correctly initialized with erasures if needed.
    A =  [ Polynomial([GF256int(0)]) ] # this is the error evaluator support/secondary polynomial, to help us construct omega
    L =      [ 0 ] # necessary variable to check bounds (to avoid wrongly eliminating the higher order terms). For more infos, see https://www.cs.duke.edu/courses/spring11/cps296.3/decoding_rs.pdf
    M =      [ 0 ] # optional variable to check bounds (so that we do not mistakenly overwrite the higher order terms). This is not necessary, it's only an additional safe check. For more infos, see the presentation decoding_rs.pdf by Andrew Brown in the doc folder.

    # Polynomial constants:
    ONE = Polynomial(z0=GF256int(1))
    ZERO = Polynomial(z0=GF256int(0))
    Z = Polynomial(z1=GF256int(1)) # used to shift polynomials, simply multiply your poly * Z to shift

    s2 = ONE + s

    # Iteratively compute the polynomials 2s times. The last ones will be
    # correct
    for l in xrange(0, n-k):
        K = l+1
        # Goal for each iteration: Compute sigma[K] and omega[K] such that
        # (1 + s)*sigma[l] == omega[l] in mod z^(K)

        # For this particular loop iteration, we have sigma[l] and omega[l],
        # and are computing sigma[K] and omega[K]

        # First find Delta, the non-zero coefficient of z^(K) in
        # (1 + s) * sigma[l]
        # This delta is valid for l (this iteration) only
        Delta = ( s2 * sigma[l] ).get_coefficient(l+1) # Delta is also known as the Discrepancy, and is always a scalar (not a polynomial).
        # Make it a polynomial of degree 0, just for ease of computation with polynomials sigma and omega.
        Delta = Polynomial(x0=Delta)

        # Can now compute sigma[K] and omega[K] from
        # sigma[l], omega[l], B[l], A[l], and Delta
        sigma.append( sigma[l] - Delta * Z * B[l] )
        omega.append( omega[l] - Delta * Z * A[l] )

        # Now compute the next B and A
        # There are two ways to do this
        # This is based on a messy case analysis on the degrees of the four polynomials sigma, omega, A and B in order to minimize the degrees of A and B. For more infos, see https://www.cs.duke.edu/courses/spring10/cps296.3/decoding_rs_scribe.pdf
        # In fact it ensures that the degree of the final polynomials aren't too large.
        if Delta == ZERO or 2*L[l] > K \
            or (2*L[l] == K and M[l] == 0):
            # Rule A
            B.append( Z * B[l] )
            A.append( Z * A[l] )
            L.append( L[l] )
            M.append( M[l] )

        elif (Delta != ZERO and 2*L[l] < K) \
            or (2*L[l] == K and M[l] != 0):
            # Rule B
            B.append( sigma[l] // Delta )
            A.append( omega[l] // Delta )
            L.append( K - L[l] )
            M.append( 1 - M[l] )

        else:
            raise Exception("Code shouldn't have gotten here")

    return sigma[-1], omega[-1]

Polynomial和GF256int分别是多项式和2^8上的伽罗瓦域的通用实现。这些类已经过单元测试,通常是无缺陷的。Reed-Solomon编码/解码方法的其余部分(例如Forney和Chien搜索)也是如此。这里有一个快速测试案例的完整代码:http://codepad.org/l2Qi0y8o
以下是示例输出:
Encoded message:
hello world�ꐙ�Ī`>
-------
Erasures decoding:
Erasure locator: 189x^5 + 88x^4 + 222x^3 + 33x^2 + 251x + 1
Syndrome: 149x^9 + 113x^8 + 29x^7 + 231x^6 + 210x^5 + 150x^4 + 192x^3 + 11x^2 + 41x
Sigma: 189x^5 + 88x^4 + 222x^3 + 33x^2 + 251x + 1
Symbols positions that were corrected: [19, 18, 17, 16, 15]
('Decoded message: ', 'hello world', '\xce\xea\x90\x99\x8d\xc4\xaa`>')
Correctly decoded:  True
-------
Errors+Erasures decoding for the message with only erasures:
Erasure locator: 189x^5 + 88x^4 + 222x^3 + 33x^2 + 251x + 1
Syndrome: 149x^9 + 113x^8 + 29x^7 + 231x^6 + 210x^5 + 150x^4 + 192x^3 + 11x^2 + 41x
Sigma: 101x^10 + 139x^9 + 5x^8 + 14x^7 + 180x^6 + 148x^5 + 126x^4 + 135x^3 + 68x^2 + 155x + 1
Symbols positions that were corrected: [187, 141, 90, 19, 18, 17, 16, 15]
('Decoded message: ', '\xf4\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00.\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00P\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\xe3\xe6\xffO> world', '\xce\xea\x90\x99\x8d\xc4\xaa`>')
Correctly decoded:  False
-------
Errors+Erasures decoding for the message with erasures and one error:
Erasure locator: 77x^4 + 96x^3 + 6x^2 + 206x + 1
Syndrome: 49x^9 + 107x^8 + x^7 + 109x^6 + 236x^5 + 15x^4 + 8x^3 + 133x^2 + 243x
Sigma: 38x^9 + 98x^8 + 239x^7 + 85x^6 + 32x^5 + 168x^4 + 92x^3 + 225x^2 + 22x + 1
Symbols positions that were corrected: [19, 18, 17, 16]
('Decoded message: ', "\xda\xe1'\xccA world", '\xce\xea\x90\x99\x8d\xc4\xaa`>')
Correctly decoded:  False

在这里,擦除解码总是正确的,因为它根本不使用BM来计算擦除定位器。通常,另外两个测试用例应该输出相同的sigma,但它们却没有。

当你比较前两个测试用例时,问题来自于BM是显而易见的:综合症和擦除定位器是相同的,但结果sigma完全不同(在第二个测试中使用了BM,而在只有擦除的第一个测试用例中没有调用BM)。

非常感谢您的任何帮助或任何关于如何调试此问题的想法。请注意,您的答案可以是数学或代码,但请解释我的方法出了什么问题。

/编辑:仍然没有找到如何正确实现勘误BM解码器的方法(请参见下面的答案)。将奖励提供给能够解决问题(或至少指导我解决问题)的人。

/编辑2:我真傻,抱歉,我刚刚重新阅读了模式,并发现我错过了赋值L = r - L - erasures_count的更改... 我已经更新了代码以修复并重新接受了我的答案。


l不是一个好的变量名,因为它与1相似。i通常用作循环计数器。 - Craig McQueen
@CraigMcQueen 抱歉,但我使用了Blahut论文中发现的命名法,但确实最好将其更改为另一个字母(在我的实际代码中,我使用不同的字母)。这也是为什么我直接创建变量“K”以避免在其余代码中使用“l”的原因。 - gaborous
3个回答

12
阅读了许多研究论文和书籍后,我唯一找到答案的地方是在这本书中(在线可读谷歌图书链接,但不提供PDF版本):

“数据传输的代数编码”,Blahut, Richard E.,2003年,剑桥大学出版社。

以下是这本书的一些摘录,详细描述了我实现的Berlekamp-Massey算法的确切描述(除了多项式运算的矩阵/向量表示)。

Berlekamp-Massey algorithm for Reed-Solomon

这里是Reed-Solomon的勘误表(错误和擦除)Berlekamp-Massey算法:

Errors-and-erasures Berlekamp-Massey algorithm for Reed-Solomon

如您所见--与通常的描述相反,您不仅需要使用先前计算出来的纠删码定位多项式的值对Lambda进行初始化,还需要跳过前v次迭代,其中v是纠删码的数量。请注意,这并不等同于跳过最后v次迭代:您需要跳过前v次迭代,因为r(迭代计数器,在我的实现中为K)不仅用于计算迭代次数,还用于生成正确的差错因子Delta。

以下是经过修改以支持除了错误外的纠删码,且误差范围小于v+2*e <= (n-k)的代码:

def _berlekamp_massey(self, s, k=None, erasures_loc=None, erasures_eval=None, erasures_count=0):
    '''Computes and returns the errata (errors+erasures) locator polynomial (sigma) and the
    error evaluator polynomial (omega) at the same time.
    If the erasures locator is specified, we will return an errors-and-erasures locator polynomial and an errors-and-erasures evaluator polynomial, else it will compute only errors. With erasures in addition to errors, it can simultaneously decode up to v+2e <= (n-k) where v is the number of erasures and e the number of errors.
    Mathematically speaking, this is equivalent to a spectral analysis (see Blahut, "Algebraic Codes for Data Transmission", 2003, chapter 7.6 Decoding in Time Domain).
    The parameter s is the syndrome polynomial (syndromes encoded in a
    generator function) as returned by _syndromes.

    Notes:
    The error polynomial:
    E(x) = E_0 + E_1 x + ... + E_(n-1) x^(n-1)

    j_1, j_2, ..., j_s are the error positions. (There are at most s
    errors)

    Error location X_i is defined: X_i = α^(j_i)
    that is, the power of α (alpha) corresponding to the error location

    Error magnitude Y_i is defined: E_(j_i)
    that is, the coefficient in the error polynomial at position j_i

    Error locator polynomial:
    sigma(z) = Product( 1 - X_i * z, i=1..s )
    roots are the reciprocals of the error locations
    ( 1/X_1, 1/X_2, ...)

    Error evaluator polynomial omega(z) is here computed at the same time as sigma, but it can also be constructed afterwards using the syndrome and sigma (see _find_error_evaluator() method).

    It can be seen that the algorithm tries to iteratively solve for the error locator polynomial by
    solving one equation after another and updating the error locator polynomial. If it turns out that it
    cannot solve the equation at some step, then it computes the error and weights it by the last
    non-zero discriminant found, and delays the weighted result to increase the polynomial degree
    by 1. Ref: "Reed Solomon Decoder: TMS320C64x Implementation" by Jagadeesh Sankaran, December 2000, Application Report SPRA686

    The best paper I found describing the BM algorithm for errata (errors-and-erasures) evaluator computation is in "Algebraic Codes for Data Transmission", Richard E. Blahut, 2003.
    '''
    # For errors-and-erasures decoding, see: "Algebraic Codes for Data Transmission", Richard E. Blahut, 2003 and (but it's less complete): Blahut, Richard E. "Transform techniques for error control codes." IBM Journal of Research and development 23.3 (1979): 299-315. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.600&rep=rep1&type=pdf and also a MatLab implementation here: http://www.mathworks.com/matlabcentral/fileexchange/23567-reed-solomon-errors-and-erasures-decoder/content//RS_E_E_DEC.m
    # also see: Blahut, Richard E. "A universal Reed-Solomon decoder." IBM Journal of Research and Development 28.2 (1984): 150-158. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.84.2084&rep=rep1&type=pdf
    # and another good alternative book with concrete programming examples: Jiang, Yuan. A practical guide to error-control coding using Matlab. Artech House, 2010.
    n = self.n
    if not k: k = self.k

    # Initialize, depending on if we include erasures or not:
    if erasures_loc:
        sigma = [ Polynomial(erasures_loc.coefficients) ] # copy erasures_loc by creating a new Polynomial, so that we initialize the errata locator polynomial with the erasures locator polynomial.
        B = [ Polynomial(erasures_loc.coefficients) ]
        omega =  [ Polynomial(erasures_eval.coefficients) ] # to compute omega (the evaluator polynomial) at the same time, we also need to initialize it with the partial erasures evaluator polynomial
        A =  [ Polynomial(erasures_eval.coefficients) ] # TODO: fix the initial value of the evaluator support polynomial, because currently the final omega is not correct (it contains higher order terms that should be removed by the end of BM)
    else:
        sigma =  [ Polynomial([GF256int(1)]) ] # error locator polynomial. Also called Lambda in other notations.
        B =    [ Polynomial([GF256int(1)]) ] # this is the error locator support/secondary polynomial, which is a funky way to say that it's just a temporary variable that will help us construct sigma, the error locator polynomial
        omega =  [ Polynomial([GF256int(1)]) ] # error evaluator polynomial. We don't need to initialize it with erasures_loc, it will still work, because Delta is computed using sigma, which itself is correctly initialized with erasures if needed.
        A =  [ Polynomial([GF256int(0)]) ] # this is the error evaluator support/secondary polynomial, to help us construct omega
    L = [ 0 ] # update flag: necessary variable to check when updating is necessary and to check bounds (to avoid wrongly eliminating the higher order terms). For more infos, see https://www.cs.duke.edu/courses/spring11/cps296.3/decoding_rs.pdf
    M = [ 0 ] # optional variable to check bounds (so that we do not mistakenly overwrite the higher order terms). This is not necessary, it's only an additional safe check. For more infos, see the presentation decoding_rs.pdf by Andrew Brown in the doc folder.

    # Fix the syndrome shifting: when computing the syndrome, some implementations may prepend a 0 coefficient for the lowest degree term (the constant). This is a case of syndrome shifting, thus the syndrome will be bigger than the number of ecc symbols (I don't know what purpose serves this shifting). If that's the case, then we need to account for the syndrome shifting when we use the syndrome such as inside BM, by skipping those prepended coefficients.
    # Another way to detect the shifting is to detect the 0 coefficients: by definition, a syndrome does not contain any 0 coefficient (except if there are no errors/erasures, in this case they are all 0). This however doesn't work with the modified Forney syndrome (that we do not use in this lib but it may be implemented in the future), which set to 0 the coefficients corresponding to erasures, leaving only the coefficients corresponding to errors.
    synd_shift = 0
    if len(s) > (n-k): synd_shift = len(s) - (n-k)

    # Polynomial constants:
    ONE = Polynomial(z0=GF256int(1))
    ZERO = Polynomial(z0=GF256int(0))
    Z = Polynomial(z1=GF256int(1)) # used to shift polynomials, simply multiply your poly * Z to shift

    # Precaching
    s2 = ONE + s

    # Iteratively compute the polynomials n-k-erasures_count times. The last ones will be correct (since the algorithm refines the error/errata locator polynomial iteratively depending on the discrepancy, which is kind of a difference-from-correctness measure).
    for l in xrange(0, n-k-erasures_count): # skip the first erasures_count iterations because we already computed the partial errata locator polynomial (by initializing with the erasures locator polynomial)
        K = erasures_count+l+synd_shift # skip the FIRST erasures_count iterations (not the last iterations, that's very important!)

        # Goal for each iteration: Compute sigma[l+1] and omega[l+1] such that
        # (1 + s)*sigma[l] == omega[l] in mod z^(K)

        # For this particular loop iteration, we have sigma[l] and omega[l],
        # and are computing sigma[l+1] and omega[l+1]

        # First find Delta, the non-zero coefficient of z^(K) in
        # (1 + s) * sigma[l]
        # Note that adding 1 to the syndrome s is not really necessary, you can do as well without.
        # This delta is valid for l (this iteration) only
        Delta = ( s2 * sigma[l] ).get_coefficient(K) # Delta is also known as the Discrepancy, and is always a scalar (not a polynomial).
        # Make it a polynomial of degree 0, just for ease of computation with polynomials sigma and omega.
        Delta = Polynomial(x0=Delta)

        # Can now compute sigma[l+1] and omega[l+1] from
        # sigma[l], omega[l], B[l], A[l], and Delta
        sigma.append( sigma[l] - Delta * Z * B[l] )
        omega.append( omega[l] - Delta * Z * A[l] )

        # Now compute the next support polynomials B and A
        # There are two ways to do this
        # This is based on a messy case analysis on the degrees of the four polynomials sigma, omega, A and B in order to minimize the degrees of A and B. For more infos, see https://www.cs.duke.edu/courses/spring10/cps296.3/decoding_rs_scribe.pdf
        # In fact it ensures that the degree of the final polynomials aren't too large.
        if Delta == ZERO or 2*L[l] > K+erasures_count \
            or (2*L[l] == K+erasures_count and M[l] == 0):
        #if Delta == ZERO or len(sigma[l+1]) <= len(sigma[l]): # another way to compute when to update, and it doesn't require to maintain the update flag L
            # Rule A
            B.append( Z * B[l] )
            A.append( Z * A[l] )
            L.append( L[l] )
            M.append( M[l] )

        elif (Delta != ZERO and 2*L[l] < K+erasures_count) \
            or (2*L[l] == K+erasures_count and M[l] != 0):
        # elif Delta != ZERO and len(sigma[l+1]) > len(sigma[l]): # another way to compute when to update, and it doesn't require to maintain the update flag L
            # Rule B
            B.append( sigma[l] // Delta )
            A.append( omega[l] // Delta )
            L.append( K - L[l] ) # the update flag L is tricky: in Blahut's schema, it's mandatory to use `L = K - L - erasures_count` (and indeed in a previous draft of this function, if you forgot to do `- erasures_count` it would lead to correcting only 2*(errors+erasures) <= (n-k) instead of 2*errors+erasures <= (n-k)), but in this latest draft, this will lead to a wrong decoding in some cases where it should correctly decode! Thus you should try with and without `- erasures_count` to update L on your own implementation and see which one works OK without producing wrong decoding failures.
            M.append( 1 - M[l] )

        else:
            raise Exception("Code shouldn't have gotten here")

    # Hack to fix the simultaneous computation of omega, the errata evaluator polynomial: because A (the errata evaluator support polynomial) is not correctly initialized (I could not find any info in academic papers). So at the end, we get the correct errata evaluator polynomial omega + some higher order terms that should not be present, but since we know that sigma is always correct and the maximum degree should be the same as omega, we can fix omega by truncating too high order terms.
    if omega[-1].degree > sigma[-1].degree: omega[-1] = Polynomial(omega[-1].coefficients[-(sigma[-1].degree+1):])

    # Return the last result of the iterations (since BM compute iteratively, the last iteration being correct - it may already be before, but we're not sure)
    return sigma[-1], omega[-1]

def _find_erasures_locator(self, erasures_pos):
    '''Compute the erasures locator polynomial from the erasures positions (the positions must be relative to the x coefficient, eg: "hello worldxxxxxxxxx" is tampered to "h_ll_ worldxxxxxxxxx" with xxxxxxxxx being the ecc of length n-k=9, here the string positions are [1, 4], but the coefficients are reversed since the ecc characters are placed as the first coefficients of the polynomial, thus the coefficients of the erased characters are n-1 - [1, 4] = [18, 15] = erasures_loc to be specified as an argument.'''
    # See: http://ocw.usu.edu/Electrical_and_Computer_Engineering/Error_Control_Coding/lecture7.pdf and Blahut, Richard E. "Transform techniques for error control codes." IBM Journal of Research and development 23.3 (1979): 299-315. http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.600&rep=rep1&type=pdf and also a MatLab implementation here: http://www.mathworks.com/matlabcentral/fileexchange/23567-reed-solomon-errors-and-erasures-decoder/content//RS_E_E_DEC.m
    erasures_loc = Polynomial([GF256int(1)]) # just to init because we will multiply, so it must be 1 so that the multiplication starts correctly without nulling any term
    # erasures_loc is very simple to compute: erasures_loc = prod(1 - x*alpha[j]**i) for i in erasures_pos and where alpha is the alpha chosen to evaluate polynomials (here in this library it's gf(3)). To generate c*x where c is a constant, we simply generate a Polynomial([c, 0]) where 0 is the constant and c is positionned to be the coefficient for x^1. See https://en.wikipedia.org/wiki/Forney_algorithm#Erasures
    for i in erasures_pos:
        erasures_loc = erasures_loc * (Polynomial([GF256int(1)]) - Polynomial([GF256int(self.generator)**i, 0]))
    return erasures_loc

注意:Sigma、Omega、A、B、L和M都是多项式列表(因此我们保留了每次迭代计算的所有中间多项式的完整历史记录)。当然,这可以进行优化,因为我们实际上只需要Sigma[l]、Sigma[l-1]、Omega[l]、Omega[l-1]、A[l]、B[l]、L[l]和M[l](因此只有Sigma和Omega需要在内存中保留前一个迭代,其他变量不需要)。
注2:更新标志L很棘手:在一些实现中,像Blahut的模式一样做会导致解码错误。在我以前的实现中,必须使用L = K-L-erasures_count才能正确解码错误和擦除直到Singleton界限,但在我最新的实现中,即使存在擦除,我也必须使用L = K-L(避免错误解码失败)。您应该在自己的实现中尝试两种方法,看哪一种不会产生任何错误的解码失败。请参见下面的问题以获取更多信息。
这个算法唯一的问题是它没有描述如何同时计算Omega,即错误评估多项式(该书仅描述了如何为错误初始化Omega,但未描述在解码错误和擦除时如何初始化)。我尝试了几种变化,上面的方法可以工作,但不完全:最终,Omega将包括本应被取消的高阶项。可能Omega或A(错误评估支持多项式)未用正确的值初始化。
然而,您可以通过修剪Omega多项式中过高阶的项来解决这个问题(因为它应该与Lambda / Sigma具有相同的次数):
if omega[-1].degree > sigma[-1].degree: omega[-1] = Polynomial(omega[-1].coefficients[-(sigma[-1].degree+1):])

或者您可以在BM后使用总是正确计算的勘误定位器Lambda/Sigma完全重新计算Omega:
def _find_error_evaluator(self, synd, sigma, k=None):
    '''Compute the error (or erasures if you supply sigma=erasures locator polynomial) evaluator polynomial Omega from the syndrome and the error/erasures/errata locator Sigma. Omega is already computed at the same time as Sigma inside the Berlekamp-Massey implemented above, but in case you modify Sigma, you can recompute Omega afterwards using this method, or just ensure that Omega computed by BM is correct given Sigma (as long as syndrome and sigma are correct, omega will be correct).'''
    n = self.n
    if not k: k = self.k

    # Omega(x) = [ Synd(x) * Error_loc(x) ] mod x^(n-k+1) -- From Blahut, Algebraic codes for data transmission, 2003
    return (synd * sigma) % Polynomial([GF256int(1)] + [GF256int(0)] * (n-k+1)) # Note that you should NOT do (1+Synd(x)) as can be seen in some books because this won't work with all primitive generators.

我正在寻找一个更好的解决方案,来解决 CSTheory 上以下问题

/编辑:我将描述一些我遇到的问题以及如何解决它们:

  • 不要忘记使用纠错位置多项式(从综合和擦除位置中轻松计算)初始化错误定位器多项式。
  • 如果您只能无误地解码错误和擦除,但限制为 2*errors + erasures <= (n-k)/2,那么您忘记跳过前面的 v 次迭代。
  • 如果您可以解码擦除和错误,但仅限于 2*(errors+erasures) <= (n-k),那么您忘记更新 L 的赋值:L = i+1 - L - erasures_count 而不是 L = i+1 - L。但这可能会导致您的解码器在某些情况下失败,具体取决于您如何实现解码器,请参见下一点。
  • 我的第一个解码器仅限于一个生成器/素多项式/fcr,但当我将其更新为通用版本并添加了严格的单元测试时,解码器在不应该失败的情况下失败了。似乎 Blahut 上面的模式对于 L(更新标志)是错误的:必须使用 L = K - L 而不是 L = K - L - erasures_count 更新它,因为这样会导致解码器有时候失败,即使我们处于 Singleton 界限之下(因此我们应该正确解码!)。这似乎得到了证实,即计算 L = K - L 不仅可以修复这些解码问题,而且还将给出与不使用更新标志 L 的另一种更新方式完全相同的结果(即条件 if Delta == ZERO or len(sigma[l+1]) <= len(sigma[l]):)。但这很奇怪:在我的过去实现中,L = K - L - erasures_count 对于错误和擦除解码是必需的,但现在似乎会产生错误的失败。因此,您应该尝试在自己的实现中尝试使用或不使用,并查看哪个会为您产生错误的失败。
  • 请注意,条件 2*L[l] > K 更改为 2*L[l] > K+erasures_count。您可能在没有添加条件 +erasures_count 的情况下不会注意到任何副作用,但在某些情况下,解码将失败,而不应该失败。
  • 如果您只能修复一个错误或擦除,请检查您的条件是否为 2*L[l] > K+erasures_count 而不是 2*L[l] >= K+erasures_count(注意 > 而不是 >=)。
  • 如果您可以纠正 2*errors + erasures <= (n-k-2)(接近极限,例如,如果您有 10 个 ECC 符号,则只能纠正 4 个错误,而不是通常的 5 个),则请检查您的综合和 BM 算法内部的循环:如果综合以常数项 x^0 的 0 系数开头(有时建议在书中),那么您的综合将被移位,然后您的 BM 内部循环必须从 1 开始,并以 n-k+1 结束,而不是从未移位的 0:(n-k)
  • 如果您可以修复除最后一个符号(最后一个

我在源代码中犯了一个错误:不是 2*L[l] >= K 而是 2*L[l] > K,否则你只能修复一个错误。我编辑了我的答案以纠正这个错误。 - gaborous
对于那些感兴趣的人,我在Python中实现了一个完整的通用Reed-Solomon编码器/解码器(包括上面提到的BM算法):unireedsolomon。该库的更快但等效(但注释较少)版本在此处:reedsolomon - gaborous
你是否解决了 L = K - L - {擦除计数} 的问题?还有一种选择是使用修改后的(Forney)综合值,从修改后的综合值中消除擦除,然后使用任何 RS 解码器方法与修改后的综合值生成错误定位多项式,然后将其与擦除定位多项式相结合(相乘)。 - rcgldr
@rcgldr非常感谢您的建议。然而,我仍然不知道,这是一个未解决的问题(尽管我提到的使用L = K - L的替代方法已经为我解决了问题)。我对您所描述的解码方法一无所知,因此可能剩下的解码方法与L = K - L - {erasures_count}不太匹配,这可能就是为什么它在我的特定解码器上无法正常工作的原因... - gaborous
请点击链接查看NASA文章,该文章解释了如何在第202页上生成修改后的(Forney)综合症(它从第200页开始,因此是第三页)。关键方程式是第202页上的(7)。 - rcgldr

2

我参考了你的Python代码并用C ++进行了重写。

它能够正常工作,你提供的信息和示例代码真的非常有帮助。

我发现错误可能是由于M值引起的。

根据“数据传输的代数编码”,M值不应该是if-else语句的成员。

在移除M后,我没有遇到任何错误失败。(或者仅是还未出现)

非常感谢您分享知识。

    // calculate C
    Ref<ModulusPoly> T = C;

    // M just for shift x
    ArrayRef<int> m_temp(2);
    m_temp[0]=1;
    m_poly = new ModulusPoly(field_, m_temp);

    // C = C - d*B*x
    ArrayRef<int> d_temp(1);
    d_temp[0] = d;
    Ref<ModulusPoly> d_poly (new ModulusPoly(field_, d_temp));
    d_poly = d_poly->multiply(m_poly);
    d_poly = d_poly->multiply(B);
    C = C->subtract(d_poly);

    if(2*L<=n+e_size-1 && d!=0)
    {
        // b = d^-1
        ArrayRef<int> b_temp(1);
        b_temp[0] = field_.inverse(d); 
        b_poly = new ModulusPoly(field_, b_temp);

        L = n-L-e_size;
        // B = B*b = B*d^-1
        B = T->multiply(b_poly);
    }
    else
    {
        // B = B*x
        B = B->multiply(m_poly);
    }

你能否澄清一下你对于 M 所做的代码更改,以使其正常工作? - Craig McQueen
你不应该移除 M。确实,你可以在特定条件下移除它。M 值只是一个额外的标志,以确保在必要时正确更新。我有另一段代码移除了 M,但在某些情况下提供了错误的结果。因此,我强烈建议保留 M “标志”,除非你能找到数学证明表明它不是必要的。 - gaborous
我已经更新了我的代码,从if-else语句中删除了M。 我认为唯一的区别发生在: (2*L[l] == K+erasures_count and M[l] != 0) 你的代码将与我的不同。据我所知,如果有太多的错误和擦除码字,BM算法无法确保Sigma是正确的。 有时它可能会生成错误的位置多项式并修复错误的码字。我是对的吗? - Cliff
是的,BM算法可能会出错,但只有在达到或超过单态界限时才会出错。在此之下,数学上保证始终能够正确解码(从我的大量测试中看来,这似乎是正确的)。 - gaborous
非常感谢分享,Cliff。我相信这会对其他人有所帮助(并且在不同语言中拥有算法以便复现总是很好的)。 - gaborous

1
这个答案是针对gaborous的评论提供的。它没有展示如何修改Berlekamp Massey来处理擦除。相反,它展示了生成修改(Forney)综合数的替代方法,该方法通过消除综合数中的擦除,然后可以将修改后的综合数与任何标准错误解码器算法一起使用来生成错误定位多项式。然后将擦除和错误定位多项式组合(相乘)以创建勘误定位多项式。
这种方法并不是最优的,因为有方法可以增强常见的解码器来处理擦除和错误,但它更加通用。
以下是用于在C中生成修改后的综合数的示例遗留代码。此代码中的“向量”包括一个大小和一个数组。vErsf是一个与数据(码字)大小相同的数组,在非擦除位置为零,在擦除位置为一。擦除被转换为擦除定位多项式(Root2Poly),然后用于将标准综合数转换为修改后的(Forney)综合数。
typedef unsigned char ELEM;             /* element type */

typedef struct{                         /* vector structure */
    ELEM  size;
    ELEM  data[255];
}VECTOR;

static VECTOR   vData;                  /* data */
static VECTOR   vErsf;                  /* erasure flags (same size as data) */
static VECTOR   vSyndromes;             /* syndromes */
static VECTOR   vErsLocators;           /* erasure locators */
static VECTOR   pErasures;              /* erasure poly */
static VECTOR   vModSyndromes;          /* modified syndromes */

/*----------------------------------------------------------------------*/
/*      Root2Poly(pPDst, pVSrc)         convert roots into polynomial   */
/*----------------------------------------------------------------------*/
static void Root2Poly(VECTOR *pPDst, VECTOR *pVSrc)
{
int i, j;

    pPDst->size = pVSrc->size+1;
    pPDst->data[0] = 1;
    memset(&pPDst->data[1], 0, pVSrc->size*sizeof(ELEM));
    for(j = 0; j < pVSrc->size; j++){
        for(i = j; i >= 0; i--){
            pPDst->data[i+1] = GFSub(pPDst->data[i+1],
                    GFMpy(pPDst->data[i], pVSrc->data[j]));}}
}

/*----------------------------------------------------------------------*/
/*      GenErasures     generate vErsLocat...and pErasures              */
/*                                                                      */
/*      Scan vErsf right to left; for each non-zero flag byte,          */
/*      set vErsLocators to Alpha**power of corresponding location.     */
/*      Then convert these locators into a polynomial.                  */
/*----------------------------------------------------------------------*/
static int GenErasures(void)
{
int     i, j;
ELEM    eLcr;                           /* locator */

    j = 0;                              /* j = index into vErs... */
    eLcr = 1;                           /* init locator */
    for(i = vErsf.size; i;){
        i--;
        if(vErsf.data[i]){              /* if byte flagged */
            vErsLocators.data[j] = eLcr; /*     set locator */
            j++;
            if(j > vGenRoots.size){      /*    exit if too many */
                return(1);}}
        eLcr = GFMpy(eLcr, eAlpha);}     /* bump locator */

    vErsLocators.size = j;              /* set size */

    Root2Poly(&pErasures, &vErsLocators); /* generate poly */

    return(0);
}

/*----------------------------------------------------------------------*/
/*      GenModSyndromes         Generate vModSyndromes                  */
/*                                                                      */
/*      generate modified syndromes                                     */
/*----------------------------------------------------------------------*/
static void     GenModSyndromes(void)
{
int     iM;                             /* vModSyn.. index */
int     iS;                             /* vSyn.. index */
int     iP;                             /* pErs.. index */

    vModSyndromes.size = vSyndromes.size-vErsLocators.size;

    if(vErsLocators.size == 0){         /* if no erasures, copy */
        memcpy(vModSyndromes.data, vSyndromes.data, 
               vSyndromes.size*sizeof(ELEM));
        return;}

    iS = 0;
    for(iM = 0; iM < vModSyndromes.size; iM++){ /* modify */
        vModSyndromes.data[iM] = 0;
        iP = pErasures.size;
        while(iP){
            iP--;
            vModSyndromes.data[iM] = GFAdd(vModSyndromes.data[iM],
                GFMpy(vSyndromes.data[iS], pErasures.data[iP]));
            iS++;}
        iS -= pErasures.size-1;}
}

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接