TensorFlow linear regression results do not match NumPy/Scikit-Learn

Question

python numpy tensorflow scikit-learn linear-regression

5

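I am working through the TensorFlow examples in Aurelien Geron's book "Hands-On Machine Learning". However, I cannot reproduce the simple linear regression example from the supporting notebook. Why don't TensorFlow's results match those of NumPy / Scikit-Learn?

As far as I can tell, no optimization is involved (we use the normal equation, so only matrix computations take place), and the answers look too different to blame on precision errors.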
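For reference, the normal equation that each of the snippets below evaluates can be written as follows, assuming X denotes the feature matrix with the bias column of ones prepended and y the column vector of targets:

$$
\hat{\theta} = (X^{\top} X)^{-1} X^{\top} y
$$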

import numpy as np
import tensorflow as tf
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing()
m, n = housing.data.shape
housing_data_plus_bias = np.c_[np.ones((m, 1)), housing.data]

X = tf.constant(housing_data_plus_bias, dtype=tf.float32, name="X")
y = tf.constant(housing.target.reshape(-1, 1), dtype=tf.float32, name="y")
XT = tf.transpose(X)
theta = tf.matmul(tf.matmul(tf.matrix_inverse(tf.matmul(XT, X)), XT), y)

with tf.Session() as sess:
    theta_value = theta.eval()

theta_value

Output:

array([[ -3.74651413e+01],
       [  4.35734153e-01],
       [  9.33829229e-03],
       [ -1.06622010e-01],
       [  6.44106984e-01],
       [ -4.25131839e-06],
       [ -3.77322501e-03],
       [ -4.26648885e-01],
       [ -4.40514028e-01]], dtype=float32)

###### Comparison with pure NumPy

X = housing_data_plus_bias
y = housing.target.reshape(-1, 1)
theta_numpy = np.linalg.inv(X.T.dot(X)).dot(X.T).dot(y)

print(theta_numpy)

Output:

[[ -3.69419202e+01]
 [  4.36693293e-01]
 [  9.43577803e-03]
 [ -1.07322041e-01]
 [  6.45065694e-01]
 [ -3.97638942e-06]
 [ -3.78654265e-03]
 [ -4.21314378e-01]
 [ -4.34513755e-01]]

###### Comparison with Scikit-Learn

from sklearn.linear_model import LinearRegression
lin_reg = LinearRegression()
lin_reg.fit(housing.data, housing.target.reshape(-1, 1))

print(np.r_[lin_reg.intercept_.reshape(-1, 1), lin_reg.coef_.T])

Output:

[[ -3.69419202e+01]
 [  4.36693293e-01]
 [  9.43577803e-03]
 [ -1.07322041e-01]
 [  6.45065694e-01]
 [ -3.97638942e-06]
 [ -3.78654265e-03]
 [ -4.21314378e-01]
 [ -4.34513755e-01]]

Update: My question sounds similar to this one, but following the recommendation there did not fix the problem.

- atkat12

1 Answer

- ChoF · Accepted Answer

I just compared the results from TensorFlow and NumPy. Since you used dtype=tf.float32 for X and y, I will use np.float32 in the following NumPy example:

X_numpy = housing_data_plus_bias.astype(np.float32)
y_numpy = housing.target.reshape(-1, 1).astype(np.float32)

Now let's compare the result of tf.matmul(XT, X) in TensorFlow with that of X.T.dot(X) in NumPy:

with tf.Session() as sess:
    XTX_value = tf.matmul(XT, X).eval()
XTX_numpy = X_numpy.T.dot(X_numpy)

np.allclose(XTX_value, XTX_numpy, rtol=1e-06) # True
np.allclose(XTX_value, XTX_numpy, rtol=1e-07) # False

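This is a floating-point precision issue. The two products agree only to about six significant digits, which is roughly the limit of float32, and inverting X^T X amplifies such small differences. If you change the precision to tf.float64 and np.float64, the results for theta will be the same.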
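The effect can be reproduced without TensorFlow. The following is a minimal NumPy-only sketch, not the code from the question: it uses synthetic data as a stand-in for the housing set (the `make_data` and `normal_equation` helpers, the feature scales, and the coefficients are all assumptions made for illustration) and evaluates the same normal equation once in float64 and once in float32, comparing both against `np.linalg.lstsq`.

```python
import numpy as np


def make_data(m=5000, seed=0):
    """Build a synthetic design matrix (hypothetical stand-in for the housing data)."""
    rng = np.random.default_rng(seed)
    features = np.c_[
        rng.normal(0.0, 1.0, m),     # a well-scaled feature
        rng.normal(35.0, 2.0, m),    # latitude-like: large mean, small spread
        rng.normal(-119.0, 2.0, m),  # longitude-like: large mean, small spread
    ]
    X = np.c_[np.ones((m, 1)), features]  # prepend the bias column
    true_theta = np.array([[-37.0], [0.4], [-0.4], [-0.4]])  # made-up coefficients
    y = X @ true_theta + rng.normal(0.0, 0.5, (m, 1))
    return X, y


def normal_equation(X, y):
    """Evaluate theta = (X^T X)^-1 X^T y in the dtype of the inputs."""
    XT = X.T
    return np.linalg.inv(XT.dot(X)).dot(XT).dot(y)


X, y = make_data()

theta64 = normal_equation(X, y)
theta32 = normal_equation(X.astype(np.float32), y.astype(np.float32))

# Least-squares reference computed in float64 without forming X^T X.
theta_ref = np.linalg.lstsq(X, y, rcond=None)[0]

err64 = np.abs(theta64 - theta_ref).max()
err32 = np.abs(theta32 - theta_ref).max()

print("cond(X^T X):", np.linalg.cond(X.T @ X))
print("max abs error, float64:", err64)
print("max abs error, float32:", err32)
```

Columns with a large mean and a small spread make X^T X poorly conditioned next to the bias column, so the float32 run should drift from the reference far more than the float64 run does.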