Gradient descent linear regression in Java

Question

3

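This is a bit of a long shot, but I was wondering if someone could take a look at this. Am I doing batch gradient descent for linear regression correctly here? It gives the expected answers for a single independent variable plus an intercept, but not for multiple independent variables.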

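import cern.colt.matrix.DoubleMatrix1D;
import cern.colt.matrix.DoubleMatrix2D;
import cern.colt.matrix.linalg.Algebra;
import cern.jet.math.Functions;

/**
 * (using Colt Matrix library)
 * @param alpha Learning Rate
 * @param thetas Current Thetas
 * @param independent Examples, M rows by N features (first column is the intercept term)
 * @param dependent Actual Y value for each of the M examples
 * @return new Thetas (the thetas argument, updated in place)
 */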
public DoubleMatrix1D descent(double         alpha,
                              DoubleMatrix1D thetas,
                              DoubleMatrix2D independent,
                              DoubleMatrix1D dependent ) {
    Algebra algebra     = new Algebra();

    // alpha * (1/M) folded into a single factor.
    double  modifier    = alpha / (double)independent.rows();

    // No explicit transpose of theta is needed here: multiplying the MxN example
    // matrix by the Nx1 theta vector runs every example Xi through the hypothesis fn,
    // i.e. each feature Xj is multiplied by its theta and the products are summed.
    DoubleMatrix1D hypothesies = algebra.mult( independent, thetas );

    // hypothesis - Y
    // For each Xi, the difference between the value predicted by the hypothesis and the actual Yi.
    hypothesies.assign(dependent, Functions.minus);

    // Transpose the examples (MxN) to NxM so they can be multiplied by the Mx1 difference vector.
    DoubleMatrix2D transposed = algebra.transpose(independent);

    DoubleMatrix1D deltas     = algebra.mult(transposed, hypothesies );


    // Scale the deltas by 1/M and the learning rate alpha (alpha/M).
    deltas.assign(Functions.mult(modifier));

    // Theta = Theta - Deltas
    thetas.assign( deltas, Functions.minus );

    return( thetas );
}

- Jeremy

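I can't see anything wrong with the steps of the algorithm or with the math. I'm not familiar with the Colt library, but the function names look expressive and self-explanatory. I assume the first column of your independent matrix is a vector of all 1s, used to estimate the intercept. How do the values differ in the multiple-regression case? - iTech

The first column is the intercept term. I think the code may actually be correct, and my test data simply had collinearity. I built the test data with x1 and x2, where x2 is just 2 * x1, and set the dependent variable to y = .5 * x1 + (1/3) * x2. It converged, but not to what I expected. - Jeremy

For example, in the case above I got Theta values of .6333 (x1) and .2666 (x2). It does correctly pick up whatever intercept I put into the function (e.g. y = .5 * x1 + (1/3) * x2 + 10). If I run WEKA on the same data set, it handles the collinearity automatically and just fits 1.1666 * x1. - Jeremy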
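
The numbers quoted in these comments are consistent with each other: with x2 = 2 * x1, every prediction depends only on theta1 + 2 * theta2, and .5 + 2 * (1/3), .6333 + 2 * .2666 and 1.1666 all agree to about three decimal places. A minimal sketch in plain Java compares the predictions of the three coefficient pairs. The class and method names are made up for illustration, and the x1 values 1 to 10 are an assumed sample, not the data from the question.

```java
public class CollinearityDemo {

    // Prediction of a model without intercept: theta1 * x1 + theta2 * x2.
    static double predict(double[] theta, double x1, double x2) {
        return theta[0] * x1 + theta[1] * x2;
    }

    // Largest absolute difference between the predictions of two coefficient
    // pairs over the assumed sample x1 = 1..10 with x2 = 2 * x1 (collinear).
    static double maxGap(double[] a, double[] b) {
        double worst = 0.0;
        for (int i = 1; i <= 10; i++) {
            double x1 = i;
            double x2 = 2.0 * x1;
            double gap = Math.abs(predict(a, x1, x2) - predict(b, x1, x2));
            worst = Math.max(worst, gap);
        }
        return worst;
    }

    public static void main(String[] args) {
        double[] generating = {0.5, 1.0 / 3.0}; // coefficients used to build y
        double[] descent    = {0.6333, 0.2666}; // rounded values reported for gradient descent
        double[] weka       = {1.1666, 0.0};    // rounded value reported for WEKA

        // Both gaps are small and come only from the rounding of the reported values.
        System.out.println("generating vs descent: " + maxGap(generating, descent));
        System.out.println("generating vs WEKA:    " + maxGap(generating, weka));
    }
}
```

All three pairs describe the same fitted line on this kind of data, so the data alone cannot tell them apart.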

2 Answers

0

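I think adding

  // ALPHA*(1/M) in one.
double  modifier    = alpha / (double)independent.rows();

is a bad idea, because it mixes the gradient function with the gradient descent algorithm. It is better to put the gradientDescent algorithm in its own public method in Java, like this: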

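import org.la4j.Matrix;
import org.la4j.Vector;

// gradient(x, y, thetas) is a separate method that is not shown in this answer.
public Vector gradientDescent(Matrix x, Matrix y, int kmax, double alpha)
{
    // Start from zero for two parameters (intercept and one coefficient).
    Vector thetas = Vector.fromArray(new double[] { 0.0, 0.0 });
    // Take kmax steps of size alpha against the gradient.
    for (int k = 0; k < kmax; k++)
    {
        thetas = thetas.subtract(gradient(x, y, thetas).multiply(alpha));
    }
    return thetas;
}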
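
Read against the question's code, the separation suggested here can be written out as follows. This assumes the usual mean-squared-error cost for linear regression, which the answer does not state; X is the matrix of examples with m rows, y the vector of dependent values, and the symbol J for the cost is introduced only for this note.

```latex
J(\theta) = \frac{1}{2m}\,\lVert X\theta - y \rVert^{2},
\qquad
\nabla J(\theta) = \frac{1}{m}\, X^{\top}\,(X\theta - y),
\qquad
\theta \leftarrow \theta - \alpha\, \nabla J(\theta)
```

Under that assumption the factor 1/m belongs to the gradient and the descent step only knows alpha. The product alpha * (1/m) is exactly the question's modifier, so both formulations perform the same update.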

- Jose Luis Soto Posada

- iTech · Accepted Answer

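Based on your comments, there is nothing wrong with your implementation. The problem is the collinearity you introduced when generating x2, which is a problem for regression estimation because the individual coefficients are no longer uniquely determined.

To test your algorithm, generate two independent series of random numbers. Pick values for w0, w1 and w2, i.e. the intercept and the coefficients of x1 and x2. Compute the dependent value y.

Then see whether your stochastic/batch gradient descent algorithm recovers the values of w0, w1 and w2.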
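
The check described above can be sketched in plain Java, using arrays instead of Colt so that it runs on its own. Everything named here is an assumption made for illustration: the class and method names, the noise-free data, the uniform random features, and the sample size, learning rate, iteration count and seed. The update step is the same batch rule as in the question.

```java
import java.util.Random;

public class RecoveryCheck {

    // One batch gradient descent step: theta - (alpha / m) * X^T * (X * theta - y).
    static double[] step(double alpha, double[] theta, double[][] x, double[] y) {
        int m = x.length;
        int n = theta.length;
        double[] grad = new double[n];
        for (int i = 0; i < m; i++) {
            // Error of the hypothesis on example i.
            double err = -y[i];
            for (int j = 0; j < n; j++) {
                err += x[i][j] * theta[j];
            }
            // Accumulate this example's contribution to X^T * error.
            for (int j = 0; j < n; j++) {
                grad[j] += err * x[i][j];
            }
        }
        double[] next = new double[n];
        for (int j = 0; j < n; j++) {
            next[j] = theta[j] - alpha * grad[j] / m;
        }
        return next;
    }

    // Builds noise-free data y = w0 + w1 * x1 + w2 * x2 from two independent
    // random columns, then runs batch gradient descent starting from zero.
    static double[] recover(double[] w, int m, int iterations, double alpha, long seed) {
        Random rng = new Random(seed);
        double[][] x = new double[m][3];
        double[] y = new double[m];
        for (int i = 0; i < m; i++) {
            x[i][0] = 1.0;              // intercept column
            x[i][1] = rng.nextDouble(); // x1, uniform in [0, 1)
            x[i][2] = rng.nextDouble(); // x2, drawn independently of x1
            y[i] = w[0] + w[1] * x[i][1] + w[2] * x[i][2];
        }
        double[] theta = new double[3];
        for (int k = 0; k < iterations; k++) {
            theta = step(alpha, theta, x, y);
        }
        return theta;
    }

    public static void main(String[] args) {
        double[] w = {10.0, 0.5, 1.0 / 3.0}; // chosen w0, w1, w2
        double[] theta = recover(w, 200, 5000, 0.5, 42L);
        for (int j = 0; j < theta.length; j++) {
            System.out.println("w" + j + " chosen=" + w[j] + " recovered=" + theta[j]);
        }
    }
}
```

Because x1 and x2 are independent here, the chosen weights are the only exact fit, so a correct implementation should return values very close to them. Replacing the x2 column with 2 * x1 reproduces the situation from the question, where many coefficient pairs fit equally well.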