GridSearchCV with LogisticRegression in scikit-learn

11

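I am trying to optimize a logistic regression model in scikit-learn with a cross-validated grid search over its parameters, but I can't get it to work.

It says that LogisticRegression does not implement get_params(), but the documentation says it does. How can I go about optimizing this model on my real data?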

>>> param_grid = {'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000] }
>>> clf = GridSearchCV(LogisticRegression(penalty='l2'), param_grid)
>>> clf
GridSearchCV(cv=None,
       estimator=LogisticRegression(C=1.0, intercept_scaling=1, dual=False, fit_intercept=True,
          penalty='l2', tol=0.0001),
       fit_params={}, iid=True, loss_func=None, n_jobs=1,
       param_grid={'C': [0.001, 0.01, 0.1, 1, 10, 100, 1000]},
       pre_dispatch='2*n_jobs', refit=True, score_func=None, verbose=0)
>>> clf = clf.fit(gt_features, labels)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14_git-py2.7-macosx-10.8-x86_64.egg/sklearn/grid_search.py", line 351, in fit
    base_clf = clone(self.estimator)
  File "/Library/Python/2.7/site-packages/scikit_learn-0.14_git-py2.7-macosx-10.8-x86_64.egg/sklearn/base.py", line 42, in clone
    % (repr(estimator), type(estimator)))
TypeError: Cannot clone object 'LogisticRegression(C=1.0, intercept_scaling=1, dual=False, fit_intercept=True,
          penalty='l2', tol=0.0001)' (type <class 'scikits.learn.linear_model.logistic.LogisticRegression'>): it does not seem to be a scikit-learn estimator a it does not implement a 'get_params' methods.
>>> 
3 Answers

8

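The class name scikits.learn.linear_model.logistic.LogisticRegression refers to a very old version of scikit-learn. The top-level package has been named sklearn for at least 2 or 3 releases now. Most likely you have old versions of scikit-learn installed side by side on your Python path. Uninstall all of them, reinstall 0.14 or later, and try again.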
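A quick way to see which installation Python actually picks up is to print the package version, its location, and the module the class comes from. This is only a minimal sketch; the variable name has_get_params is mine and is not part of the question.

```python
import sklearn
from sklearn.linear_model import LogisticRegression

# Version and on-disk location of the package that "import sklearn" resolves to.
print(sklearn.__version__)
print(sklearn.__file__)

# With a current install this module path starts with "sklearn.", not "scikits.learn.".
print(LogisticRegression.__module__)

# GridSearchCV clones the estimator through get_params(), so it must exist.
has_get_params = hasattr(LogisticRegression(), "get_params")
print(has_get_params)
```

If the printed module path still starts with scikits.learn, the old package is being imported and the import statement or the environment needs fixing.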


1
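Thanks a lot. All I had to do was switch the import statement and it worked. Easy. Can you (or anyone) tell me which parameters can be optimized in a grid search for logistic regression? Is it just C? Here is the full object, clf.best_estimator_: LogisticRegression(C=1, class_weight=None, dual=False, fit_intercept=True, intercept_scaling=1, penalty='l2', random_state=None, tol=0.0001) - genekogan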
3
Yes, "C" is the most important one. - Andreas Mueller
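For anyone who wants the full list of names that can go into param_grid, get_params() on the estimator returns them. A minimal sketch; the exact set of names depends on the installed scikit-learn version.

```python
from sklearn.linear_model import LogisticRegression

# Every key returned here is a valid key for GridSearchCV's param_grid.
params = LogisticRegression().get_params()
for name in sorted(params):
    print(name, "=", params[name])
```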

8

You can search over the penalty in addition to the C parameter. For example:

grid_values = {'penalty': ['l1','l2'], 'C': [0.001,0.01,0.1,1,10,100,1000]}
model_lr = GridSearchCV(lr, param_grid=grid_values)
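Here is a self-contained sketch of that search that should run as-is. Two things are my assumptions and not part of the answer: the synthetic data from make_classification stands in for your own features and labels, and solver="liblinear" is chosen because the default solver in recent releases does not support the l1 penalty.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in data (assumption); replace with your own features and labels.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# liblinear supports both 'l1' and 'l2' (assumption: solver picked for this sketch).
lr = LogisticRegression(solver="liblinear")

# A single dict means every penalty is combined with every C: 2 * 7 = 14 candidates.
grid_values = {"penalty": ["l1", "l2"], "C": [0.001, 0.01, 0.1, 1, 10, 100, 1000]}
model_lr = GridSearchCV(lr, param_grid=grid_values, cv=5)
model_lr.fit(X, y)

print(model_lr.best_params_)
print(model_lr.best_score_)
```

After fitting, model_lr.best_estimator_ is the model refit on all the data with the best combination.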


4
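from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

logreg = LogisticRegression()  # assumed: the snippets below use logreg without defining it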

Depending on how powerful your computer is, you could go for this:

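# A list of dicts is searched as separate grids: penalty is varied with the
# default C, then C is varied with the default penalty (not every combination).
# 'l1' only fits with a solver that supports it, e.g. liblinear or saga.
parameters = [{'penalty':['l1','l2']}, 
              {'C':[1, 10, 100, 1000]}]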
grid_search = GridSearchCV(estimator = logreg,  
                           param_grid = parameters,
                           scoring = 'accuracy',
                           cv = 5,
                           verbose=0)


grid_search.fit(X_train, y_train)   

Or the more exhaustive one:
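# Each dict is its own grid, so solver, penalty and C are varied one at a time
# around the defaults. Values the default solver does not support
# (e.g. 'elasticnet') will fail to fit.
parameters = [{'solver': ['newton-cg', 'lbfgs', 'liblinear', 'sag', 'saga']},
              {'penalty':['none', 'elasticnet', 'l1', 'l2']},
              {'C':[0.001, 0.01, 0.1, 1, 10, 100]}]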



grid_search = GridSearchCV(estimator = logreg,  
                           param_grid = parameters,
                           scoring = 'accuracy',
                           cv = 5,
                           verbose=0)


grid_search.fit(X_train, y_train)
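One thing worth knowing about the list-of-dicts form used above: each dict is expanded on its own, so the candidates are added up, not multiplied. A small sketch with ParameterGrid makes the difference visible; the names separate and combined are mine.

```python
from sklearn.model_selection import ParameterGrid

# List of dicts: each dict is its own grid, so 2 + 4 candidates.
separate = [{"penalty": ["l1", "l2"]}, {"C": [1, 10, 100, 1000]}]

# Single dict: full cross product, so 2 * 4 candidates.
combined = {"penalty": ["l1", "l2"], "C": [1, 10, 100, 1000]}

n_separate = len(ParameterGrid(separate))
n_combined = len(ParameterGrid(combined))
print(n_separate, n_combined)  # prints: 6 8
```

So if the goal is to find the best penalty and C together, a single dict is probably what you want; the list form is cheaper but only moves one parameter at a time.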
