Simple Linear Regression in Python


I am trying to implement this single-variable algorithm to find the intercept and the slope:

(Image: the algorithm for linear regression)
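The image itself is not reproduced here; judging from the code below, the update rule it shows is presumably the standard least-squares gradient step (with learning rate η):

w_0 \leftarrow w_0 + 2\eta \sum_i \bigl(y_i - (w_0 + w_1 x_i)\bigr)
w_1 \leftarrow w_1 + 2\eta \sum_i \bigl(y_i - (w_0 + w_1 x_i)\bigr) x_i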

Below is my Python code for updating the intercept and the slope, but it is not converging. The RSS increases with each iteration instead of decreasing, and after some iterations it becomes infinite. I cannot find any error in my implementation of the algorithm. How can I solve this problem? I have attached the CSV file.

Here is the code:

import pandas as pd
import numpy as np

#Defining gradient_decend
#This function takes the X values, the Y values and a vector of w0 (intercept), w1 (slope)
#INPUT FEATURES = X (sq. feet of house size)
#TARGET VALUE = Y (price of the house)
#W=np.array([w0,w1]).reshape(2,1)
#W=[w0,
#    w1]

def gradient_decend(X,Y,W):
    intercept=W[0][0]
    slope=W[1][0]

    #This builds a 2-element list, one entry per parameter:
    #gd=[sum(y-(intercept+slope*x)),
    #    sum((y-(intercept+slope*x))*x)]
    gd=[sum(y-(intercept+slope*x) for x,y in zip(X,Y)),
        sum(((y-(intercept+slope*x))*x) for x,y in zip(X,Y))]
    return np.array(gd).reshape(2,1)

#Defining Residual sum of squares
def RSS(X,Y,W):
    return sum((y-(W[0][0]+W[1][0]*x))**2 for x,y in zip(X,Y))


#Reading Training Data
training_data=pd.read_csv("kc_house_train_data.csv")

#Defining fixed parameters
#Learning Rate
n=0.0001
iteration=1500
#Intercept
w0=0
#Slope
w1=0

#Creating 2,1 vector of w0,w1 parameters
W=np.array([w0,w1]).reshape(2,1)

#Running gradient Decend
for i in range(iteration):
    W = W + (2*n) * gradient_decend(training_data["sqft_living"], training_data["price"], W)
    print(RSS(training_data["sqft_living"], training_data["price"], W))

Here is the CSV file.


This is from the University of Washington machine learning course, which I also took; it is very interesting and enlightening. I suggest you use the forums on Coursera, where you can get very good answers from the mentors, volunteers and other students. https://www.coursera.org/learn/ml-regression/discussions - alvas
3 Answers

First up, I find that when writing machine learning code, it is best NOT to use complex list comprehensions, because anything that you can iterate over
  • is easier to read when written as a normal loop with indentation, or
  • can be done with numpy broadcasting (see the sketch just below).
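For instance, a minimal sketch of the broadcasting point, with made-up numbers (not part of the original answer):

import numpy as np

X = np.array([1.0, 2.0, 3.0])   # made-up feature values
w0, w1 = 0.5, 2.0               # intercept and slope

# list-comprehension version
predictions_loop = [w0 + w1 * x for x in X]

# numpy broadcasting: one expression, no explicit Python loop
predictions_vec = w0 + w1 * X

assert np.allclose(predictions_loop, predictions_vec)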

And using proper variable names helps you understand the code much better. Using Xs, Ys, Ws as shorthand is only nice if you are good at math. Personally, I don't use them in my code, at least not in Python. From import this: explicit is better than implicit.

My rule of thumb: if I write code that I cannot read one week later, it is bad code.


First, let's settle the input parameters for gradient descent. You will need:

  • feature_matrix (the X matrix, type: numpy.array, a matrix of size N * D, where N is the number of rows/data points and D is the number of columns/features)
  • output (the Y vector, type: numpy.array, a vector of size N)
  • initial_weights (type: numpy.array, a vector of size D).

Additionally, to check for convergence you will need:

  • step_size (the magnitude of change to the weights on each iteration; type: float, usually a small number)
  • tolerance (the criterion for breaking out of the iteration: when the gradient magnitude is smaller than the tolerance, assume that your weights have converged; type: float, usually a small number, but much bigger than the step size).

Now to the code.

def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance):
    converged = False # Set a boolean to check for convergence
    weights = np.array(initial_weights) # make sure it's a numpy array

    while not converged:
        # compute the predictions based on feature_matrix and weights.
        # iterate through the row and find the single scalar predicted
        # value for each weight * column.
        # hint: a dot product can solve this easily
        predictions = [??? for row in feature_matrix]
        # compute the errors as predictions - output
        errors = predictions - output
        gradient_sum_squares = 0 # initialize the gradient sum of squares
        # while we haven't reached the tolerance yet, update each feature's weight
        for i in range(len(weights)): # loop over each weight
            # Recall that feature_matrix[:, i] is the feature column associated with weights[i]
            # compute the derivative for weight[i]:
            # Hint: the derivative is = 2 * dot product of feature_column  and errors.
            derivative = 2 * ????
            # add the squared value of the derivative to the gradient magnitude (for assessing convergence)
            gradient_sum_squares += (derivative * derivative)
            # subtract the step size times the derivative from the current weight
            weights[i] -= (step_size * derivative)

        # compute the square-root of the gradient sum of squares to get the gradient magnitude:
        gradient_magnitude = ???
        # Then check whether the magnitude is lower than the tolerance.
        if ???:
            converged = True
    # Once the while loop breaks, return the weights.
    return weights

I hope this extended pseudo-code helps you understand the gradient descent algorithm better. I won't fill in the ??? so as not to spoil your homework.


Note that your RSS code is also unreadable and hard to maintain. It can be done simply with:
>>> import numpy as np
>>> prediction = np.array([1,2,3])
>>> output = np.array([1,1,5])
>>> residual = output - prediction
>>> RSS = sum(residual * residual)
>>> RSS
5

Learning the basics of numpy is important for machine learning and matrix-vector operations; it saves you from doing silly iterations: http://docs.scipy.org/doc/numpy-1.10.1/user/basics.html
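In the same vein (my sketch, not part of the course material), the per-weight derivative loop in the pseudo-code above collapses into a single matrix-vector product:

import numpy as np

feature_matrix = np.array([[1., 1.],
                           [1., 2.],
                           [1., 3.]])      # N x D, first column is the constant
output = np.array([1., 2., 2.])
weights = np.array([0., 0.])

errors = feature_matrix.dot(weights) - output
# gradient of RSS with respect to all weights at once: 2 * X^T (Xw - y)
gradient = 2 * feature_matrix.T.dot(errors)
gradient_magnitude = np.sqrt((gradient ** 2).sum())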


You can easily change the tolerance code into a number of iterations (a for loop); you only need to change how the outer loop is controlled. But my preference is convergence by tolerance (a while loop). - alvas
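A minimal sketch of that for-loop variant (toy data and names of my own choosing):

import numpy as np

feature_matrix = np.array([[1., 1.], [1., 2.], [1., 3.]])  # toy N x D matrix
output = np.array([1., 2., 2.])
weights = np.zeros(2)
step_size = 0.01
iterations = 1500   # a fixed iteration count replaces the tolerance check

for _ in range(iterations):
    errors = feature_matrix.dot(weights) - output
    weights = weights - step_size * 2 * feature_matrix.T.dot(errors)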

I have solved my problem!
Here is the solution:
import numpy as np
import pandas as pd
import math
from sys import stdout

#This function takes the pandas dataframe, the input feature list and the target column name
def get_numpy_data(data, features, output):

    #Adding a constant column with value 1 in the dataframe.
    data['constant'] = 1    
    #Adding the name of the constant column in the feature list.
    features = ['constant'] + features
    #Creating Feature matrix(Selecting columns and converting to matrix).
    features_matrix=data[features].to_numpy()
    #Target column is converted to the numpy array
    output_array=np.array(data[output])
    return(features_matrix, output_array)

def predict_outcome(feature_matrix, weights):
    weights=np.array(weights)
    predictions = np.dot(feature_matrix, weights)
    return predictions

def errors(output,predictions):
    errors=predictions-output
    return errors

def feature_derivative(errors, feature):
    derivative=2*np.dot(feature,errors)
    return derivative


def regression_gradient_descent(feature_matrix, output, initial_weights, step_size, tolerance):
    converged = False
    #Initital weights are converted to numpy array
    weights = np.array(initial_weights)
    while not converged:
        # compute the predictions based on feature_matrix and weights:
        predictions=predict_outcome(feature_matrix,weights)
        # compute the errors as predictions - output:
        error=errors(output,predictions)
        gradient_sum_squares = 0 # initialize the gradient
        # while not converged, update each weight individually:
        for i in range(len(weights)):
            # Recall that feature_matrix[:, i] is the feature column associated with weights[i]
            feature=feature_matrix[:, i]
            # compute the derivative for weight[i]:
            #predict=predict_outcome(feature,weights[i])
            #err=errors(output,predict)
            deriv=feature_derivative(error,feature)
            # add the squared derivative to the gradient magnitude
            gradient_sum_squares=gradient_sum_squares+(deriv**2)
            # update the weight based on step size and derivative:
            weights[i]=weights[i] - step_size*deriv

        gradient_magnitude = math.sqrt(gradient_sum_squares)
        stdout.write("\r%d" % int(gradient_magnitude))
        stdout.flush()
        if gradient_magnitude < tolerance:
            converged = True
    return weights


#Example of Implementation
#Importing Training and Testing Data
# train_data=pd.read_csv("kc_house_train_data.csv")
# test_data=pd.read_csv("kc_house_test_data.csv")

# simple_features = ['sqft_living', 'sqft_living15']
# my_output= 'price'
# (simple_feature_matrix, output) = get_numpy_data(train_data, simple_features, my_output)
# initial_weights = np.array([-100000., 1., 1.])
# step_size = 7e-12
# tolerance = 2.5e7
# simple_weights = regression_gradient_descent(simple_feature_matrix, output,initial_weights, step_size,tolerance)
# print(simple_weights)
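As a sanity check (my own addition, with toy data), the weights returned by regression_gradient_descent should agree with numpy's closed-form least-squares solution:

# Toy data: y = 1 + 2*x, with the constant column already included,
# so the exact least-squares weights are [1., 2.].
X = np.array([[1., 0.], [1., 1.], [1., 2.], [1., 3.]])
y = np.array([1., 3., 5., 7.])

w_gd = regression_gradient_descent(X, y, np.array([0., 0.]), step_size=0.01, tolerance=1e-6)
w_exact = np.linalg.lstsq(X, y, rcond=None)[0]
print(w_gd, w_exact)   # both close to [1., 2.]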


It's quite simple:

def mean(values):
    return sum(values)/float(len(values))

def variance(values, mean):
    # Sum of squared deviations (not divided by n; the factor cancels
    # when the slope b1 = covariance/variance is formed).
    return sum([(x-mean)**2 for x in values])

def covariance(x, mean_x, y, mean_y):
    # Sum of products of paired deviations from the means.
    covar = 0.0
    for i in range(len(x)):
        covar += (x[i]-mean_x) * (y[i]-mean_y)
    return covar

def coefficients(dataset):
    x = []
    y = []

    for line in dataset:
        xi, yi = map(float, line.split(','))

        x.append(xi)
        y.append(yi)

    dataset.close()                             

    x_mean, y_mean = mean(x), mean(y)

    # Least-squares estimates: slope b1, then intercept b0.
    b1 = covariance(x, x_mean, y, y_mean)/variance(x, x_mean)
    b0 = y_mean-b1*x_mean

    return [b0, b1]

dataset = open('trainingdata.txt')

b0, b1 = coefficients(dataset)

n=float(input())  # read an x value to predict from stdin

print(b0+b1*n)
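A quick sanity check (my addition, using the helper functions above with made-up points): the closed-form slope cov(x, y)/var(x) should agree with numpy's polyfit on the same data:

import numpy as np

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 8.0]

x_mean, y_mean = mean(x), mean(y)
b1 = covariance(x, x_mean, y, y_mean) / variance(x, x_mean)
b0 = y_mean - b1 * x_mean

slope, intercept = np.polyfit(x, y, 1)   # degree-1 polynomial fit
print(b0, b1)                            # should match intercept, slope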

Reference: www.machinelearningmastery.com/implement-simple-linear-regression-scratch-python/

