from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

# Feature matrix: all columns aside from 'output'
data = df_train[train_vars].to_numpy()
target = df_train['output'].to_numpy()
# Get training and testing splits
splits = train_test_split(data, target, test_size=0.2)
data_train, data_test, target_train, target_test = splits
# Fit the model to the training data
model = RandomForestRegressor(n_estimators=100)
model.fit(data_train, target_train)
# Make predictions on the held-out test set
expected = target_test
predicted = model.predict(data_test)
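When I run this code to predict the variable "output" as a function of all the other variables in this file: https://www.dropbox.com/s/cgyh09q2liew85z/uuu.csv?dl=0 the expected and predicted arrays are exactly the same. It looks like I might be overfitting or doing something wrong. How can I fix this?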
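
Identical arrays on a held-out split can be a sign that one of the input columns encodes the target directly, rather than a sign of overfitting. A minimal sketch of one way to check this is below. It does not read the linked file: the data is synthetic, and the helper `leakage_report` and the column names `a`, `b`, `c`, `leak` are hypothetical, made up for illustration. It reports the test-set R^2 and ranks the features by the forest's importances.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split


def leakage_report(X, y, names, seed=0):
    """Fit a forest on a train split; return (test R^2, features ranked by importance)."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    model = RandomForestRegressor(n_estimators=100, random_state=seed)
    model.fit(X_train, y_train)
    r2 = model.score(X_test, y_test)  # R^2 on data the model has not seen
    ranked = sorted(
        zip(names, model.feature_importances_), key=lambda p: p[1], reverse=True
    )
    return r2, ranked


# Synthetic stand-in data (an assumption, NOT the contents of uuu.csv):
# three noise columns plus a "leak" column that is an exact copy of the target.
rng = np.random.default_rng(0)
y = rng.normal(size=500)
X = np.column_stack([rng.normal(size=(500, 3)), y])
names = ["a", "b", "c", "leak"]

r2, ranked = leakage_report(X, y, names)
print("test R^2:", r2)
for name, importance in ranked:
    print(name, importance)
```

If a single column dominates the importances and the test R^2 is close to 1, that column is worth inspecting by hand to see whether it is a copy or a simple transformation of "output".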