如果有帮助的话,这里是我用于此类事情的一些C代码。它对Mad Physicist所说的内容没有太大的补充。
首先,一个公式。如果你通过一些点拟合一条直线y^ : x->a*x+b,那么误差由以下公式给出:
E = Sum{ sqr(y[i]-y^(x[i])) }/ N = Vy - Cxy*Cxy/Vx
where
Vx is the variance of the xs
Vy that of the ys
Cxy the covariance of the as and the ys
下面的代码使用一个结构体来保存均值、方差、协方差和计数。
函数moms_acc_pt()在添加新点时更新这些值。函数moms_line()返回线的a和b,以及上述误差。返回值中的fmax(0,)是为了防止近乎完美拟合的情况下舍入误差导致结果变为负数(在数学上是非负数)。
虽然有可能编写一个从momentsT中删除点的函数,但我发现通过复制momentsT并将点累加到副本中、获取线并将副本保留在适合该点的一侧、原始momentsT保留在另一侧更容易处理。
typedef struct
{ int n; // number points
double xbar,ybar; // means of x,y
double Vx, Vy; // variances of x,y
double Cxy; // covariance of x,y
} momentsT;
// update the moments to include the point x,y
void moms_acc_pt( momentsT* M, double x, double y)
{ M->n += 1;
double f = 1.0/M->n;
double dx = x-M->xbar;
double dy = y-M->ybar;
M->xbar += f*dx;
M->ybar += f*dy;
double g = 1.0 - f;
M->Vx = g*(M->Vx + f*dx*dx);
M->Cxy = g*(M->Cxy + f*dx*dy);
M->Vy = g*(M->Vy + f*dy*dy);
}
// return the moments for the combination of A and B (assumed disjoint)
momentsT moms_combine( const momentsT* A, const momentsT* B)
{
momentsT C;
C.n = A->n + B->n;
double alpha = (double)A->n/(double)C.n;
double beta = (double)B->n/(double)C.n;
C.xbar = alpha*A->xbar + beta*B->xbar;
C.ybar = alpha*A->ybar + beta*B->ybar;
double dx = A->xbar - B->xbar;
double dy = A->ybar - B->ybar;
C.Vx = alpha*A->Vx + beta*B->Vx + alpha*beta*dx*dx;
C.Cxy= alpha*A->Cxy+ beta*B->Cxy+ alpha*beta*dx*dy;
C.Vy = alpha*A->Vy + beta*B->Vy + alpha*beta*dy*dy;
return C;
}
// line is y^ : x -> a*x + b; return Sum{ sqr( y[i] - y^(x[i])) }/N
double moms_line( momentsT* M, double* a, double *b)
{ *a = M->Cxy/M->Vx;
*b = M->ybar - *a*M->xbar;
return fmax( 0.0, M->Vy - *a*M->Cxy);
}