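I am trying to use MLPRegressor to train and test on my datasets. I have two datasets (a training set and a test set), and both have exactly the same feature and label columns. Here is a sample of my data:
Training dataset: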
Full,Gold Standard
1.176,3.571
4.231,3.467
3.75,4.333
3.519,3.5
1.154,2.833
3.2,3.643
2.692,3.4
3.611,2.733
4.0,2.393
2.727,1.933
3.529,3.423
2.647,2.733
1.538,2.786
2.0,2.967
2.647,2.533
1.786,2.552
5.0,5.0
3.158,4.6
1.875,2.733
Testing dataset:
Full,Gold Standard
1.667,2.345
3.056,1.9
1.765,2.2
0.714,0.0
1.538,2.586
2.188,1.667
3.333,2.8
2.5,2.481
1.667,2.433
1.842,0.0
2.381,0.793
0.588,1.0
1.176,1.433
1.538,2.3
0.588,1.655
0.909,2.333
0.833,3.333
1.111,2.5
0.0,2.067
Here is my code:
import csv
import numpy as np
import random
import os.path
from sklearn import preprocessing as pre
from sklearn.neural_network import MLPRegressor
with open('FullFeatures2017.csv') as f:
    reader = csv.reader(f)
    next(reader, None)  # skip the header row
    training = list(reader)
# the with block closes the file automatically
#print("Row number of data training : ", len(training))
print()
#------------training-------------------
train_data = [list(map(float, row)) for row in training]
data1 = np.array(train_data)
print("Row number of training data : ", len(train_data))
X_train = data1[:, :-1]   # every column except the last is a feature
y_train = data1[:, -1:]   # the last column is the label
scaler = pre.StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
print()
#------------open csv data testing-------------------
with open('FullFeatures2016.csv') as f:
    reader = csv.reader(f)
    next(reader, None)  # skip the header row
    testing = list(reader)
# the with block closes the file automatically
#print("Row number data testing : ", len(testing))
print()
#------------testing-------------------
test_data = [list(map(float, row)) for row in testing]
data2 = np.array(test_data)
print("Row number of testing data : ", len(test_data))
X_test = data2[:, :-1]
y_test = data2[:, -1:]
X_test_scaled = scaler.fit_transform(X_test)
print()
#------------Model Training-------------------
mlp = MLPRegressor(max_iter=500, learning_rate_init=0.1, random_state=1, solver='lbfgs', tol=0.001)
y_train2 = np.ravel(y_train)  # flatten the (n, 1) label column to shape (n,)
mlp.fit(X_train_scaled, y_train2)
#print(mlp.fit(X_train, y_train2))
print()
#------------Model Testing or Prediction-------------------
prediction = mlp.predict(X_test_scaled)
print(len(prediction))
print()
print(prediction)
print()
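The program runs without errors, but the results are different on every run. I have already tried different random seed values (such as 1, 2, or 3 instead of 0), but the results still change.
Does anyone know how to get the same, consistent predictions from MLPRegressor?
Thanks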
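Try setting NumPy's seed, np.random.seed(0), for example. - erasmortg
Don't call fit() or fit_transform() on the test data, as you are doing in this line: X_test_scaled = scaler.fit_transform(X_test). That scales the test data differently from the training data, which leads to wrong results. Just use X_test_scaled = scaler.transform(X_test). - Vivek Kumar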
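To make the scaling advice concrete, here is a minimal sketch rather than a drop-in fix. It assumes Python 3 with scikit-learn installed, and it uses a few rows copied from the sample data in the question instead of the CSV files. The helper fit_and_predict is a hypothetical name introduced only for this illustration. The scaler is fitted on the training features only and then reused for the test features, and the model is trained twice with the same random_state so the two sets of predictions can be compared.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

# A few rows copied from the question's sample data
# (column 0: Full, column 1: Gold Standard).
train = np.array([[1.176, 3.571], [4.231, 3.467], [3.75, 4.333],
                  [3.519, 3.5], [1.154, 2.833], [3.2, 3.643]])
test = np.array([[1.667, 2.345], [3.056, 1.9], [1.765, 2.2], [0.714, 0.0]])

X_train, y_train = train[:, :-1], train[:, -1]
X_test, y_test = test[:, :-1], test[:, -1]

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std from the training data only
X_test_scaled = scaler.transform(X_test)        # reuse the training mean/std, no refitting


def fit_and_predict(seed):
    """Hypothetical helper: train a fresh model with a fixed seed and predict."""
    mlp = MLPRegressor(max_iter=500, random_state=seed, solver='lbfgs', tol=0.001)
    mlp.fit(X_train_scaled, y_train)
    return mlp.predict(X_test_scaled)


first = fit_and_predict(1)
second = fit_and_predict(1)
# Compare two independent runs that share the same seed and the same inputs.
print(np.allclose(first, second))
```

If the comparison on your own data comes back False even with a fixed random_state, the difference is probably coming from the inputs (for example, the files or the scaling step) rather than from the model's weight initialization.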