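I'm trying to learn machine-learning-based prediction by following this tutorial, but I have two questions:
Question 1: How do I set n_estimators in the code below? As it stands, the classifier always falls back to its default value.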
from sklearn.cross_validation import KFold

def run_cv(X, y, clf_class, **kwargs):
    # Construct a kfolds object
    kf = KFold(len(y), n_folds=5, shuffle=True)
    y_pred = y.copy()
    # Iterate through folds
    for train_index, test_index in kf:
        X_train, X_test = X[train_index], X[test_index]
        y_train = y[train_index]
        # Initialize a classifier with key word arguments
        clf = clf_class(**kwargs)
        clf.fit(X_train, y_train)
        y_pred[test_index] = clf.predict(X_test)
    return y_pred
It is called like this:
from sklearn.svm import SVC
print "%.3f" % accuracy(y, run_cv(X, y, SVC))
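
Since run_cv forwards **kwargs to the classifier constructor, my working guess is that any extra keyword argument given to run_cv ends up on the classifier. Below is a minimal sketch of that idea, not the tutorial's own code. It assumes a recent scikit-learn (sklearn.model_selection instead of the old sklearn.cross_validation), uses synthetic data from make_classification in place of my real X and y, and replaces the tutorial's accuracy helper (not shown here) with a plain mean of matches. It also uses RandomForestClassifier, because n_estimators is a parameter of ensemble classifiers; SVC does not accept it, and SVC(n_estimators=...) raises a TypeError.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import KFold


def run_cv(X, y, clf_class, **kwargs):
    # Same idea as the function above, written against sklearn.model_selection
    kf = KFold(n_splits=5, shuffle=True, random_state=0)
    y_pred = y.copy()
    for train_index, test_index in kf.split(X):
        # Every keyword argument given to run_cv reaches the constructor here
        clf = clf_class(**kwargs)
        clf.fit(X[train_index], y[train_index])
        y_pred[test_index] = clf.predict(X[test_index])
    return y_pred


# Synthetic stand-in data (assumption: the real X and y are NumPy arrays)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# n_estimators and random_state are passed through **kwargs to the classifier
y_pred = run_cv(X, y, RandomForestClassifier, n_estimators=50, random_state=0)

# Stand-in for the tutorial's accuracy(): fraction of matching labels
cv_accuracy = np.mean(y == y_pred)
print("%.3f" % cv_accuracy)
```

With this shape, the call site is the only place that changes: run_cv(X, y, SomeClassifier, param=value) for whatever parameters that classifier's constructor accepts.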
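Question 2: How do I use an already-trained model file (for example, one obtained from an SVM) to predict on additional (test) data that was not used for training?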
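
To make Question 2 concrete, here is a minimal sketch of the workflow I have in mind, assuming the model is saved and reloaded with joblib (which is installed alongside scikit-learn). The file name svm_model.joblib, the temporary directory, and the synthetic data are placeholders for illustration only; X_new stands in for data that was never used for training. The new data must have the same features, in the same order and with the same preprocessing, as the training data.

```python
import os
import tempfile

import joblib
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data; X_new plays the role of unseen data
X, y = make_classification(n_samples=200, n_features=8, random_state=0)
X_train, X_new, y_train, y_new = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Train once on the training portion only
clf = SVC()
clf.fit(X_train, y_train)

# Save the fitted model to a file (placeholder path)
model_path = os.path.join(tempfile.mkdtemp(), "svm_model.joblib")
joblib.dump(clf, model_path)

# Later, possibly in another script: load the file and predict on new data
loaded = joblib.load(model_path)
new_pred = loaded.predict(X_new)
print(new_pred[:10])
```

A file saved this way should only be loaded from a trusted source and with a scikit-learn version compatible with the one used to save it.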