智能算法识别符号错误的数字。

3

我有一个包含1070个数字(账户余额)的列表,从一个源系统中提取出来,但由于技术问题,其中一些值的符号错误(例如,我得到100而不是-100)

我只知道一件事,即所有值都有正确的符号后,它们的总和将为零(但我不知道多少个数字的符号是错误的)。

我尝试使用下面的算法进行暴力解决方案,但仅当我只组合3个数字时(即对1070个元素中的3个数字进行符号翻转)就已经需要很长时间。

我认为,即使我优化代码,例如,如果我反转1070个元素中的565个元素,则组合数量如此之大(5,740547E+319),以至于任何暴力尝试都会失败。

基本上,我正在尝试反转列表中的一个值,然后检查总和是否为0,然后是2个值、3个值等等。

当然,如果只有6个数字,它运行得很好,如下所示。

有没有人有更聪明的方法来解决这个问题?

import itertools
import numpy as np

test_list = np.array([1,2,3,4,5,9]) 
index_list = np.arange(0, len(test_list), dtype=np.intp)
sum_list = test_list.sum()

for L in range(1, len(test_list)+1):
    print('current list lenght: ' + str(L))
    for subset in itertools.combinations(index_list, L):
        cur_list = np.array(test_list)        
        check = sum_list - (2 * cur_list[[subset]].sum())
        if check == 0:
            print('found:', subset)

这个简单的例子有多个解决方案,实际值更加独特(正负都有),所以不必担心存在多个解决方案。
我的列表如下(已删除零值):
[511720.02, 39409.62, 14680.43, 731.29, 10387.38, 2256.56, 262681.13, 5897.66, 11060.76, 144271.31, 25914.87, 13015.5, 19316.85, 40208606.69, -5846350.44, -33538929.18, -540682.74, 68089.94, 561696.88, 58050.78, 469706.4, 524.67, 472701.44, 2158456.56, 1158292.03, 100024.37, 18134.69, 337825.89, 10180.76, 2228315.58, 772.53, 8549.91, 43307.27, 39642.75, 72176.45, 323822.22, 962998.36, 135.5, 134388.73, 1473523.1, 275221.49, 34267.25, 549705.04, 247.87, 12958.72, 847714.28, 1781791.26, 3010.0, 1895276.98, 14434.11, 104587.1, 38056.23, 24962.62, 92497.96, 1227702.25, 110346.36, 391.85, 208.41, 8756.85, 782736.47, 3147519.19, 194767.75, 295568.31, 100694.2, 1893.93, 1133698.81, 523.2, 280280.91, 3078.82, 289841.77, 515.2, 175112.67, 24084.99, 35048.85, 49398.15, 41109.64, 1023.53, 782319.87, 28.14, 20.75, 13227.61, 14765.65, 41182.6, 4297.29, 1340.99, 826014.08, 6111.16, 965389.36, 673482.76, 109298.96, 39807.35, 311.79, 437149.28, 1950968.88, 325.21, 2764917.36, 8838.14, 202885.02, 25.49, 999.66, 743027.37, 759.22, 2950.0, 96146.21, 165.2, 21956.69, 1811289.92, 58.0, 6982282.03, 169464.71, 537268.28, 32655.3, 2474912.35, 1066214.61, 87962.0, 81526.67, 49053309.4, 45032405.02, 49548828.59, 48930853.47, 3533339.79, 3343489.4, 61758.14, 52772.61, 1095708.2, 819624.62, 141.17, 8611.75, 88752.23, 14669477.14, 22308370.28, 1215208.59, -183222.0, 99579.07, 621585.75, -26967.75, -787397.0, 18509117.59, 1212170.47, 60214.78, -749.37, 77455.94, 787350.1, 32760.21, 8117.1, 18493.03, 8267483.6, -15627.16, 58625.11, 38622.21, 57.11, 47565.71, 3523.6, 3205.0, 113242.59, 73.76, 782949.24, 5368.09, -4300.0, 2160047.12, -3533.08, 2460435.1, 50990.31, 93463.6, 1487.07, 22470.0, 22470.0, 999344.79, 135237.18, 9281.62, 3430281.72, 7126642.53, -53338.0, 9412.32, 3344946.88, 83075155.43, 17856133.2, 13449.34, 3845.0, 689998.56, 337645.3, 5937356.47, -2491.16, 2199069.77, 167652.42, 3331515.65, 1124710.59, 34075.85, 494639.09, 385540.84, 5891.39, 14378783.44, 7111253.62, 1247695.94, 1.0, 1.0, 138841.22, 0.5, 5979482.63, 1263134.4, 1.0, 5700.4, 12000000.0, 6100000.0, 1850000.0, 1000000.0, 500000.0, 51000.0, 77081.0, 19288853.74, 11600000.0, 4094000.0, 100000.0, 32400.0, 80000.0, 100000.0, 40810000.0, 33500000.0, 2336976.0, 812290.0, 177655.1, 1187432.01, 5224021.89, 4733986.5, 6174765.0, 1646604.82, 535146.3, 695690.19, 2858833.86, 863125.08, 859612.55, 631772.99, 100000.0, -200000.0, -177654.1, 10950.0, 25040.2, 1659670.0, 1143444.46, 3861357.87, 505930.63, 573072.0, 105000.0, 57500.0, -505929.63, -57499.0, 3402777.8, 54270.0, 105000.0, 44717.27, 1725619.5, 928972.0, 382924.78, 110015.09, 500.0, 100000.0, -1779887.33, -105000.0, -44717.27, -1311895.78, 19257544.9, 786109.73, 3885330.13, 9200000.0, 4347805.0, 464486.0, 13300000.0, 500000.0, 900000.0, 4000000.0, 550000.0, 46633144.87, 3657123.02, 2400000.0, 775000.0, 600000.0, 4980000.0, 715092.4, 250000.0, -715092.4, 5407573.99, 6516939.56, 71570143.67, -59186479.4, 1058557.22, -772031.22, 210000.0, 1753591.07, -1630557.07, 37550879.01, 2346460.63, -25169822.77, -1890843.63, 727919.84, -528559.84, 1073986.29, -556999.29, 5547811.48, -3130409.48, 8207549.38, 81942.17, 4489158.99, -7039209.16, -81942.17, -2143400.99, 89544215.17, -37091376.17, 1572011.0, -355928.0, 7478446.23, -595640.23, 32986206.95, -6325980.34, 13565577.0, -2178294.0, 32642339.14, -1005400.0, 3047129.66, 4586021.0, 67050.27, 60878226.61, -2654606.01, -2904000.0, 8393687.95, -18019508.06, -24442013.89, 1215208.59, 177570.85, -18047334.03, -1345725.77, 60214.78, -18212.52, -1823768.35, -354430.28, -8385048.47, 74505.96, -113659.2, 663309.32, -7694.18, -14335.0, 2160047.12, 4953.83, -228189.6, -100945.64, -8846.8, -2000000.0, -1102232.0, 205.2, 700000.0, 1550000.0, 500000.0, -29170503.05, -9741127.06, 2489697.93, -13599529.61, -8942775.78, -32726828.34, -14763682.06, -627846.99, -2340642.03, -4760527.23, -23148108.83, -33891335.24, -24840602.25, -39314900.39, -22877589.68, -2846969.98, -581650.3, -1269764.39, -665159.0, -2846969.98, -418.13, -4545969.04, -2281584.61, -15697827.3, -14778873.52, -18421816.45, -6420171.98, -781896.2, -1038820.41, -80650.97, -142900.19, -614263.27, -15871.81, -3386450.0, 3301075.71, -4058.89, -10407668.35, -8988470.62, -2236.0, -60178.92, -6202.25, -200000.0, -91700.3, -303039.14, -18543.12, -25149.83, -51468.44, -10100.0, -11755.05, -254974.16, -121982.45, -434.1, -853763.93, -100.08, 233961.15, 3521.99, -3776.15, -46650.33, -11412.02, -7970.43, -9798.65, -3942496.09, -1196495.36, -246462.91, -178049.1, -0.01, -18351592.93, -6551763.1, -5598.84, -264535.23, -425063.65, -48942.22, -6434834.64, -684003.27, -7617.96, -543416.91, -596612.84, -2896846.65, -203847.0, 5389.94, -37406548.94, -2609.35, -3543154.17, -83500.0, -1685476.4, -261289.34, -3342018.62, -4850000.0, 1500000.0, 4125000.0, -0.04, -89352.19, -178114.13, -54520.6, -15193100.0, -30960000.0, -3900000.0, -1000000.0, -10000000.0, -8361980.61, -8516941.31, -25000000.0, -5000000.0, -5000000.0, -8170843.24, -472127.77, -25000000.0, -1615441.58, -5697557.72, -18000.0, -810000.0, -362623.0, -526089.98, -8300000.0, -220000.0, 7.62, 15542.39, -25800625.31, -33795061.06, -20780610.38, -4665030.75, -39482892.35, -6840474.74, -18446631.31, -10152810.85, 2216320.89, -83747.0, -403572.0, 21763733.0, -3865209.08, -42267747.62, -263520555.75, 195087.06, -542668069.06, -54499963.22, -46824.57, -80703953.44, -34388289.6, -1120605.64, -1029611.15, -542066.51, -154157.05, -7932466.7, -13257498.78, -11093844.91, -2844716.35, -1361609.65, -5349483.43, -1439399.68, -2694.5, -1181.62, -418050.47, -400674.41, -246043195.73, -44799841.1, -326603330.98, -63937178.0, -37506337.11, -15076181.32, -488966.6, -230800995.11, -10509.4, -7575.0, -106584.18, -36375883.12, -45115231.08, -4305941.08, -9238.75, -516324.15, -10065.15, -2316745.72, -643656.45, -53267.0, -298191.95, -3537748.57, -399554.25, -13437.34, -2112.0, -93177.32, -306454.74, 24222.0, -12000.0, 14364.81, 19851.3, 39903.56, 12000.0, 175747.84, -353732.71, 926.66, -1311895.78, 1365487.35, 623561637.34, 1103900.52, 74.0, 36243106.75, 400408.55, 31379746.27, 1152340.15, 277020.97, 351518.86, 24432.73, 370113.15, 57822.7, 14334602.55, 5085111.06, 141482.62, 2960.2, 17357.8, 933574.99, 116678.06, 7191271.0, 1686605.85, 867515.1, -113006.87, 29156.65, 143305.39, 38947.38, 2992693.68, -504.46, -40954.64, 114478.4, 194370206.88, 24548647.12, 745995.52, 603106175.89, 15577814.31, 38526313.97, 43750465.62, 3.1, 35806829.4, 26656590.6, 1957311.8, 8649.7, 7575.15, 5230.3, 10083.33, 1076.43, 1047611.42, 172125.45, 3686.0, 134997.45, 8605.48, 677961.04, 82822.75, 133569.18, 1961196.0, 91776.21, 245448.12, 821.0, 1341.74, 137930.46, -8990.0, -26766176.74, 424769.32, 142428.86, -18789.13, 268399.38, 50263.7, -85550.06, 125538.91, 6263.29, 19500.75, 245829.52, -1365487.35, 40307915.86, 295222.37, -96036.81, 3166665.78, 1932755.66, 40400.0, -525840.25, 266808.43, 199000.0, -3737.2, 492669.16, -1969978.35, 1708500.83, 2748188.97, 643513.78, 2964400.04, 642333.18, 341338.4, 15289.73, 380235.94, 310.13, 108315.14, 638.16, 11507.44, 48799.64, -382413.8, 406563.18, 11080.63, 129207.66, 32419.8, 465.23, 53502.74, 263531.85, 370042.5, 123341.62, -9786.44, -124800.25, 71673.89, 3318959.25, 597466.94, 182928.82, 28019.14, 319467.85, 1563.75, 1660609.86, -1651940.0, 2806793.29, 84600.0, 27733.67, 714.71, 31198.43, 224970.45, 10020.87, 68734.49, 103156.32, 246821.22, 19097.3, 256539.34, 144102.4, 12122.3, 2604333.41, 191148.48, 94008.23, 6807.85, 2723.89, 5438.84, 122772.35, 156237.16, 24024.39, 705426.92, 187992.86, 132421.04, 241566.49, 102682.38, 41216.24, 4088.52, 1070878.31, 409464.57, 170804.67, 187236.66, 1298978.37, 42352.12, 1483572.01, 2083796.08, 9745.32, 15594.62, 58735.3, 83374.68, 20845.17, 739655.98, 45957.1, 24405.0, 111086.39, 150567.3, 23445.92, 2990.48, 8795.7, 26405.51, 459362.72, 51837.97, 144697.12, 4458.01, 300.0, 29541.52, 1954234.69, 66521.56, 40637.54, 29210.19, 11358.95, 38928.39, 95187.44, 113840.33, 66843.61, 34180.65, 19445.79, 52893.81, 138057.3, 205782.91, 98563.65, 111475.3, 197494.77, 6158.15, 1601.86, 116963.83, -1182305.05, 233113.73, -401948.82, 129.03, 497791.24, 6686.47, 131146.21, 26251.23, 368095.15, 25589.06, 39885.15, 882466.81, 193038.84, 293224.67, 123307.79, -35000.0, 3231.0, -5653952.45, 5469195.26, 243.15, 32610.33, 11495.93, 689585.9, 745.95, 221338.04, 83678.41, 52225.97, 14265.34, 4517.46, 2348.95, 98884.76, 1929.14, 8676.78, 126999.52, 48450.0, 23018.67, 642.63, 11311.66, 86076.58, 59632.21, 1200.0, 244068.03, 738908.91, 1111.06, 53129.02, 18323.65, 254206.26, 16409.53, 22472.94, 49361.21, 796671.0, 2618127.73, 158343.65, 5502.42, -21168.21, -33261.68, -3124673.81, -52827.56, -75.82, -71499.0, -185842.54, 776135.12, -56041.39, -151492.94, 146416.64, -363807.65, -1041536159.36, 1042111293.86, -343423458.56, 343105397.4, -8215190.59, 8189738.0, 10790777.98, 88377.0, 3968265.59, 1089147.0, 177964.0, 203973.0, 249376.0, 1107136.21, -1302612.04, -20665748.07, -268427.98, -1032648.0, -255228.63, -627322.81, -2680023.09, -2759982.21, -1260600.0, 121498.6, 64347.8, 76153.8, -146127.65, -317598.55, -147069.85, 245057.06, -245058.0, -85600.0, -728220.16, -121788.4, 41998.97, -236262.33, 330582.33, 225235.91, -86822.99, 3280734.73, 1734714.87, 14327.0, 6795.0, 49137.0, 334939.46, -375322.2, 2108892.54, 205427.44]

1
这可能会对你有所帮助:https://zh.wikipedia.org/wiki/%E5%AD%90%E9%9B%86%E5%92%8C%E9%97%AE%E9%A2%98 - jrbergen
3
这只是不同形式的子集和问题。如果S是所有元素绝对值之和,那么你要寻找的是总和为S/2的子集。子集中的数字可以给予正号,其他数字则给予负号。由于问题陈述存在歧义,所以通常没有唯一解。 - Paul Hankin
我认为这个问题没有一个明确的解决方案 - 也就是说,在同一组数据中可能有几种解决方案。其中一种方法是查看当前的总和。假设它是200。你需要做的是找到数据集中任何总和为200的数字组合,然后反转它们的符号。目前只是一些想法。我会尝试实现这个方法。 - user2668284
1
最有前途的方法是减少搜索空间。检查所有2^1070种组合是不可能的。因此,我强烈建议通过应用一些商业知识来缩小搜索空间。如果这些是真实的银行账户,那么就会有一些模式,比如月工资等等... - Hans
1
值的范围是多少? - Kelly Bundy
显示剩余4条评论
1个回答

0

我可以想到很多解决这个问题的方法,但归根结底,你永远不会知道方程式的哪一边应该是负数。

+9-6-3 = 0

但这也适用于

-9+6+3 = 0

换句话说:即使你设法解决了方程式的两侧,你仍然会得到两个同样可能的结果。


1
我认为OP已经意识到了这一点,但即使算法无法找到明确的罪犯,它仍然可以提供帮助,例如,它可以要求用户仔细检查一些数字(但不是全部)。 - tobias_k
实际上,三个错误标志比1067更有可能。 - Olivier
@Olivier 实际上,一个错误的值(例如42而不是69)可能更有可能发生。 - Kelly Bundy
我完全同意这一点,可能只有5%到10%的值是错误的(基于一些样本检查),但是不可能知道哪些是错误的(很遗憾,那里没有逻辑...) - marcel f

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接