如何高效地将大量小浮点数相加？

Question

如何高效地将大量小浮点数相加？

13

假设你有一个包含100000000个32位浮点数的数组，每个浮点数的值都在0.0和1.0之间。如果你尝试这样累加它们：

result = 0.0;
for (i = 0; i < 100000000; i++) {
    result += array[i];
}

当 result 大于 1.0 时，你会遇到问题。

那么有哪些更准确的方法来执行求和操作呢？

- splicer

2

你为什么期望结果小于1？我感到困惑！ - lexu

我认为他的意思是，一旦结果超过1.0，问题就开始出现了。具体是什么问题我不知道，但这是我的理解。 - Adam Robinson

1

我认为从示例代码中可以推断出它不是Python。 - Ken

@splicer：你能具体说明一下吗？你所说的“问题”是指什么？ - ire_and_curses

我正在尝试避免积累数值误差。Daniel Pryden的答案链接到的维基百科文章很好地解释了这个问题。 - splicer

显示剩余2条评论

5个回答

1

如果你可以容忍一点额外的空间（在Java中）：

float temp = new float[1000000];
float temp2 = new float[1000];
float sum = 0.0f;
for (i=0 ; i<1000000000 ; i++) temp[i/1000] += array[i];
for (i=0 ; i<1000000 ; i++) temp2[i/1000] += temp[i];
for (i=0 ; i<1000 ; i++) sum += temp2[i];

基本上是标准的分治算法。但这仅适用于数值随机分布的情况，如果前500000000个数字是1e-12而后500000000个数字要大得多，则无法奏效。

但在进行任何操作之前，最好将结果累加到double中。这将有很大帮助。

- Rex Kerr

1

假设使用C或C++，将结果设为double。

- Tuomas Pelkonen

是的，那会有所帮助，但如果您要对远超100000000个值进行求和怎么办？我选择100000000只是为了举例。 - splicer

0

如果在.NET中使用LINQ .Sum()扩展方法，该方法存在于IEnumerable上。那么代码就是这样的：

var result = array.Sum();

- Rob Packwood

谢谢，但我应该更具体：我正在使用C和OpenCL进行工作。 - splicer

这也并没有真正解决错误积累的问题。 - recursive

0

绝对最优的方法是使用优先队列，具体做法如下：

PriorityQueue<Float> q = new PriorityQueue<Float>();
for(float x : list) q.add(x);
while(q.size() > 1) q.add(q.pop() + q.pop());
return q.pop();

（这段代码假设数字都是正数；一般来说，队列应该按绝对值排序）

解释：给定一个数字列表，为了尽可能精确地将它们相加，你应该努力使这些数字接近，即消除小数字和大数字之间的差距。这就是为什么你想要将最小的两个数字相加，从而增加列表中的最小值，减少列表中最小值和最大值之间的差异，并将问题规模减少1。

不幸的是，我不知道这个如何向量化，考虑到你正在使用OpenCL。但我几乎确定它是可以的。你可以看一下关于向量算法的书，它们实际上非常强大：数据并行计算的向量模型

- jkff

2

实际上，这不是一个最优解。你想要最小化中间结果的绝对值，并不意味着你总是应该先加最小的数。例如，如果你想要求和[1.01, -0.001, -1.02, 0.0012]，最好表达为(0.0012 - 0.001) + (1.01 - 1.02)。 - quant_dev

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Daniel Pryden · Accepted Answer

看起来你想使用Kahan求和算法。

根据维基百科的介绍，

The Kahan summation algorithm (also known as compensated summation) significantly reduces the numerical error in the total obtained by adding a sequence of finite precision floating point numbers, compared to the obvious approach. This is done by keeping a separate running compensation (a variable to accumulate small errors).

In pseudocode, the algorithm is:
function kahanSum(input)
 var sum = input[1]
 var c = 0.0          //A running compensation for lost low-order bits.
 for i = 2 to input.length
  y = input[i] - c    //So far, so good: c is zero.
  t = sum + y         //Alas, sum is big, y small, so low-order digits of y are lost.
  c = (t - sum) - y   //(t - sum) recovers the high-order part of y; subtracting y recovers -(low part of y)
  sum = t             //Algebraically, c should always be zero. Beware eagerly optimising compilers!
 next i               //Next time around, the lost low part will be added to y in a fresh attempt.
return sum