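Is there a way to use GridSearchCV, or any other built-in sklearn function, to find the best hyperparameters for a OneClassSVM classifier?
What I currently do is run the search myself on a train/test split, like this:
The gamma and nu values are defined as follows: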
import numpy as np

gammas = np.logspace(-9, 3, 13)
nus = np.linspace(0.01, 0.99, 99)
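To give a sense of how large this grid is, here is a small sketch that inspects it. The variable `n_combinations` is only introduced here for illustration and is not part of my actual code.

```python
import numpy as np

gammas = np.logspace(-9, 3, 13)    # one value per decade, from 1e-9 up to 1e3
nus = np.linspace(0.01, 0.99, 99)  # 0.01, 0.02, ..., 0.99

# Hypothetical helper variable: number of (gamma, nu) pairs the search has to fit.
n_combinations = len(gammas) * len(nus)
print(n_combinations)
```

So every full search fits the model once per pair, which is part of why a built-in alternative would be welcome.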
The code that explores every hyperparameter combination and finds the best one:
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)
from sklearn.svm import OneClassSVM

clf = OneClassSVM()
results = []

# vectorizer, train_contents, test_contents and y_true are not shown here
train_x = vectorizer.fit_transform(train_contents)
test_x = vectorizer.transform(test_contents)

for gamma in gammas:
    for nu in nus:
        clf.set_params(gamma=gamma, nu=nu)
        clf.fit(train_x)
        y_pred = clf.predict(test_x)

        if 1. in y_pred:  # Check if at least 1 review is predicted to be in the class
            results.append(((gamma, nu), (accuracy_score(y_true, y_pred),
                                          precision_score(y_true, y_pred),
                                          recall_score(y_true, y_pred),
                                          f1_score(y_true, y_pred),
                                          roc_auc_score(y_true, y_pred),
                                          ))
                           )

# Determine and print the best parameter settings and their performance
print_best_parameters(results, best_parameters(results))
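For reference, the same manual loop could probably be written more compactly with sklearn's `ParameterGrid`, which enumerates the combinations but does not do the fitting or scoring for you. The following is only a sketch under assumptions: the data is synthetic (my real input comes from a vectorizer), the grid is reduced to keep it fast, and selecting on F1 is just one possible choice.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import ParameterGrid
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Synthetic stand-in data (assumption): inliers around the origin, outliers far away.
train_x = rng.normal(0.0, 1.0, size=(200, 2))
test_x = np.vstack([rng.normal(0.0, 1.0, size=(50, 2)),
                    rng.normal(6.0, 1.0, size=(50, 2))])
y_true = np.array([1] * 50 + [-1] * 50)  # +1 = in class, -1 = outlier

# Reduced grid for illustration; ParameterGrid yields one dict per combination.
param_grid = ParameterGrid({"gamma": np.logspace(-3, 1, 5),
                            "nu": [0.05, 0.1, 0.5]})

results = []
for params in param_grid:
    clf = OneClassSVM(**params).fit(train_x)
    y_pred = clf.predict(test_x)
    if 1 in y_pred:  # skip settings that reject every sample
        results.append((params,
                        accuracy_score(y_true, y_pred),
                        f1_score(y_true, y_pred)))

# Pick the setting with the highest F1 score (index 2 of each result tuple).
best_params, best_acc, best_f1 = max(results, key=lambda r: r[2])
print(best_params, best_acc, best_f1)
```

This removes the nested loops, but the scoring and bookkeeping are still done by hand.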
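The results are stored in a list of tuples of the form:
((gamma, nu), (accuracy_score, precision_score, recall_score, f1_score, roc_auc_score))
To find the best accuracy, f1, and roc_auc scores together with their parameters, I wrote my own function: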
best_parameters(results)
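The body of `best_parameters` is not shown above. As a rough idea of what it does, a hypothetical version consistent with the tuple layout might look like this; the returned dictionary format and the example entries are made up for illustration and are not real results.

```python
def best_parameters(results):
    """Hypothetical sketch: for each metric of interest, return the
    ((gamma, nu), score) pair with the highest score."""
    # Positions inside the score tuple:
    # (accuracy, precision, recall, f1, roc_auc)
    metric_index = {"accuracy": 0, "f1": 3, "roc_auc": 4}
    best = {}
    for name, idx in metric_index.items():
        params, scores = max(results, key=lambda r: r[1][idx])
        best[name] = (params, scores[idx])
    return best


# Made-up placeholder entries in the ((gamma, nu), (scores...)) format.
example = [
    ((1e-3, 0.1), (0.80, 0.75, 0.90, 0.82, 0.79)),
    ((1e-2, 0.2), (0.85, 0.80, 0.70, 0.75, 0.84)),
]
best = best_parameters(example)
print(best)
```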