Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].


I am working on a multiclass classification problem and want to compute my classifier's accuracy. I am using the following:

model = RandomForestClassifier(random_state=2)
model.fit(X_train, y_train)
preds = model.predict(X_test)
Accuracy=accuracy_score(y_test, preds, average='micro')

This raises an error:

TypeError: accuracy_score() got an unexpected keyword argument 'average'

When I use this instead:

model = RandomForestClassifier(random_state=2)
model.fit(X_train, y_train)
preds = model.predict(X_test)
Accuracy=accuracy_score(y_test, preds)

I get this error:

ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].

Can anyone help me compute the accuracy for a multiclass classification problem?
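For reference, here is a minimal sketch of how the two kinds of metric behave on multiclass labels. The label lists are made up for illustration and are not the asker's data. It assumes a reasonably recent scikit-learn; the exact wording of the exception messages may differ between versions.

```python
from sklearn.metrics import accuracy_score, f1_score

# Hypothetical three-class labels, invented for this example.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 2, 2, 2, 1, 0]

# accuracy_score handles multiclass targets directly and has no `average` parameter.
acc = accuracy_score(y_true, y_pred)

# Passing `average` to accuracy_score is rejected as an unexpected keyword.
try:
    accuracy_score(y_true, y_pred, average='micro')
    accuracy_rejected_average = False
except TypeError:
    accuracy_rejected_average = True

# f1_score defaults to average='binary', which is rejected for multiclass targets.
try:
    f1_score(y_true, y_pred)
    f1_rejected_default = False
except ValueError:
    f1_rejected_default = True

# Choosing an explicit averaging mode makes f1_score work on multiclass targets.
f1_micro = f1_score(y_true, y_pred, average='micro')

print(acc, f1_micro, accuracy_rejected_average, f1_rejected_default)
```

This suggests the ValueError in the question comes from a metric such as f1_score, precision_score, or recall_score rather than from accuracy_score itself.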

Below is the code from my XGBoost function:

scorers = {
        'f1_score':make_scorer(f1_score),
        'precision_score': make_scorer(precision_score),
        'recall_score': make_scorer(recall_score),
        'accuracy_score': make_scorer(accuracy_score)
      }
#fitting the training dataset to the model
xgb_model = XGBClassifier(n_jobs=-1, objective='multi:softmax')
#setattr(xgb_model, 'verbosity', 2)
param_dist = {'n_estimators': stats.randint(150, 1000),
              'learning_rate': stats.uniform(0.01, 0.59),
              'subsample': stats.uniform(0.3, 0.6),
              'max_depth': [3, 4, 5, 6, 7, 8, 9],
              'colsample_bytree': stats.uniform(0.5, 0.4),
              'min_child_weight': [1, 2, 3, 4]
             }

# numFolds = 5
# kfold_5 = cross_validation.KFold(n = len(X), shuffle = True, n_folds = numFolds)
skf = StratifiedKFold(n_splits=3, shuffle = True)
gridCV = RandomizedSearchCV(xgb_model,
                            param_distributions = param_dist,
                            cv = skf,
                            n_iter = 5,
                            scoring = scorers,
                            verbose = 3,
                            n_jobs = -1,
                            return_train_score=True,
                            refit = False)
gridCV.fit(x_train,y_train)

When I tried make_scorer(f1_score, average='micro'), I got the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-68-8b24047fa926> in <module>
      1 print("********** Xgboost classifier *************")
      2 start_time = time.monotonic()
----> 3 y_test, xgb_predict, xgb_pred_prob = xgboost_classifier(x,y)
      4 end_time = time.monotonic()
      5 print(timedelta(seconds=end_time - start_time))

<ipython-input-67-2661dd9c3c1a> in xgboost_classifier(x, y)
     36     scorers = {
     37             'f1_score':make_scorer(f1_score,average='micro'),
---> 38             'precision_score': make_scorer(precision_score()),
     39             'recall_score': make_scorer(recall_score()),
     40             'accuracy_score': make_scorer(accuracy_score())

TypeError: precision_score() missing 2 required positional arguments: 'y_true' and 'y_pred'

I don't understand why gridCV.fit(x_train, y_train) isn't passing the y values to the scorers?
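A note on the traceback above: it points at line 38 of xgboost_classifier, where precision_score() is written with parentheses. That calls the metric straight away with no arguments while the dictionary is being built, so the TypeError is raised before gridCV.fit is ever reached. A minimal sketch of the difference, using only make_scorer and precision_score from scikit-learn:

```python
from sklearn.metrics import make_scorer, precision_score

# Writing precision_score() calls the metric immediately with no arguments,
# so Python raises TypeError before make_scorer even receives anything.
call_error = None
try:
    make_scorer(precision_score())
except TypeError as err:
    call_error = err

# Passing the function object (no parentheses) lets the scorer call it later
# with y_true and y_pred; extra keyword arguments are forwarded to the metric.
scorer = make_scorer(precision_score, average='micro')

print(type(call_error).__name__, callable(scorer))
```

The same applies to recall_score() and accuracy_score() on lines 39 and 40 of that traceback.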

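It is correct that accuracy_score does not take an average keyword argument. Are you sure the second error message comes from the accuracy_score function? I would expect that kind of error from something like f1_score when you have multiple classes and forget to set the average keyword. Which sklearn version are you using? Can you give us the full error stack trace? - Tinu
This is the entire code I'm using; I tested accuracy separately to check whether the error came from there. The sklearn version is 0.22.1. - tunned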
1 Answer


I solved this by adding the average argument to F1, precision, and recall. Only accuracy doesn't need this parameter!

scorers = {
            'f1_score': make_scorer(f1_score, average='micro'),
            'precision_score': make_scorer(precision_score, average='micro'),
            'recall_score': make_scorer(recall_score, average='micro'),
            'accuracy_score': make_scorer(accuracy_score)
          }
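To see a dictionary like this working end to end, here is a rough, self-contained sketch. It is not the original setup: the data comes from make_classification, RandomForestClassifier stands in for XGBClassifier so that xgboost is not required, and the small parameter grid is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, make_scorer,
                             precision_score, recall_score)
from sklearn.model_selection import RandomizedSearchCV, StratifiedKFold

# Synthetic three-class data (an assumption; the asker's data is not shown).
X, y = make_classification(n_samples=300, n_features=10, n_informative=5,
                           n_classes=3, random_state=0)

scorers = {
    'f1_score': make_scorer(f1_score, average='micro'),
    'precision_score': make_scorer(precision_score, average='micro'),
    'recall_score': make_scorer(recall_score, average='micro'),
    'accuracy_score': make_scorer(accuracy_score),
}

# Hypothetical search space, kept tiny so the example runs quickly.
param_dist = {'n_estimators': [10, 20], 'max_depth': [3, 5]}

skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
search = RandomizedSearchCV(RandomForestClassifier(random_state=2),
                            param_distributions=param_dist,
                            cv=skf,
                            n_iter=2,
                            scoring=scorers,
                            refit=False,  # with several scorers, refit is False or a scorer name
                            random_state=0)
search.fit(X, y)

# Each scorer name gets its own columns in cv_results_.
for name in scorers:
    print(name, search.cv_results_['mean_test_' + name])
```

One thing worth knowing: for single-label multiclass targets, micro-averaged precision, recall, and F1 all equal accuracy, so the four columns above come out identical. If the goal is to see per-class differences reflected in the score, average='macro' or average='weighted' may be more informative.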

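Please edit your question. This section is for answers. This post will be deleted, because posting a non-answer in the answer section is against the site's rules. - ZygD
Thanks. If you want to add something to your question, please use the "edit" function. - Tinu
The likely problem is make_scorer(f1_score). You also have to pass any keyword arguments to make_scorer. You could try make_scorer(f1_score, average='micro') - Tinu
Thanks. I couldn't edit my question, which is why I posted again. - tunned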
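The error message lists four average settings, and they can give different numbers on the same predictions. A small sketch with made-up labels (not from the question) to compare them:

```python
from sklearn.metrics import f1_score

# Hypothetical imbalanced three-class labels, invented for this comparison.
y_true = [0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 1, 2, 2, 2]

# average=None returns one F1 value per class instead of a single number.
per_class = f1_score(y_true, y_pred, average=None)

# 'micro' pools all decisions, 'macro' is the plain mean of per-class scores,
# and 'weighted' is the mean of per-class scores weighted by class support.
micro = f1_score(y_true, y_pred, average='micro')
macro = f1_score(y_true, y_pred, average='macro')
weighted = f1_score(y_true, y_pred, average='weighted')

print(per_class, micro, macro, weighted)
```

Which setting is appropriate depends on whether rare classes should count as much as common ones; that is a judgment call for the problem at hand.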
