Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].

Question

I am working on a multiclass classification problem and want to measure the accuracy of my classifier. I am using the following:

model = RandomForestClassifier(random_state=2)
model.fit(X_train, y_train)
preds = model.predict(X_test)
Accuracy=accuracy_score(y_test, preds, average='micro')

This raises an error:

TypeError: accuracy_score() got an unexpected keyword argument 'average'

When I use:

model = RandomForestClassifier(random_state=2)
model.fit(X_train, y_train)
preds = model.predict(X_test)
Accuracy=accuracy_score(y_test, preds)

I get this error:

ValueError: Target is multiclass but average='binary'. Please choose another average setting, one of [None, 'micro', 'macro', 'weighted'].

Can anyone help me compute the accuracy for a multiclass classification problem?

Here is the code from my XGBoost function:

scorers = {
        'f1_score':make_scorer(f1_score),
        'precision_score': make_scorer(precision_score),
        'recall_score': make_scorer(recall_score),
        'accuracy_score': make_scorer(accuracy_score)
      }
#fitting the training dataset to the model
xgb_model = XGBClassifier(n_jobs=-1, objective='multi:softmax')
#setattr(xgb_model, 'verbosity', 2)
param_dist = {'n_estimators': stats.randint(150, 1000),
              'learning_rate': stats.uniform(0.01, 0.59),
              'subsample': stats.uniform(0.3, 0.6),
              'max_depth': [3, 4, 5, 6, 7, 8, 9],
              'colsample_bytree': stats.uniform(0.5, 0.4),
              'min_child_weight': [1, 2, 3, 4]
             }

# numFolds = 5
# kfold_5 = cross_validation.KFold(n = len(X), shuffle = True, n_folds = numFolds)
skf = StratifiedKFold(n_splits=3, shuffle = True)
gridCV = RandomizedSearchCV(xgb_model,
                            param_distributions = param_dist,
                            cv = skf,
                            n_iter = 5,
                            scoring = scorers,
                            verbose = 3,
                            n_jobs = -1,
                            return_train_score=True,
                            refit = False)
gridCV.fit(x_train,y_train)

When I tried make_scorer(f1_score, average='micro'), I got the following error:

TypeError                                 Traceback (most recent call last)
<ipython-input-68-8b24047fa926> in <module>
      1 print("********** Xgboost classifier *************")
      2 start_time = time.monotonic()
----> 3 y_test, xgb_predict, xgb_pred_prob = xgboost_classifier(x,y)
      4 end_time = time.monotonic()
      5 print(timedelta(seconds=end_time - start_time))

<ipython-input-67-2661dd9c3c1a> in xgboost_classifier(x, y)
     36     scorers = {
     37             'f1_score':make_scorer(f1_score,average='micro'),
---> 38             'precision_score': make_scorer(precision_score()),
     39             'recall_score': make_scorer(recall_score()),
     40             'accuracy_score': make_scorer(accuracy_score())

TypeError: precision_score() missing 2 required positional arguments: 'y_true' and 'y_pred'

I don't understand why gridCV.fit(x_train, y_train) is not passing the y values to the scorers.

- tunned

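It is correct that accuracy_score does not take an average keyword argument. Are you sure the second error message comes from accuracy_score? I would expect that kind of error from something like f1_score, when you have multiple classes and forget to set the average keyword. Which sklearn version are you using? Can you give us the full error stack trace? - Tinu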

This is the entire code I am using; I added the accuracy test to check whether the error comes from there. The sklearn version is 0.22.1. - tunned

1 Answer

- tunned · Accepted Answer

I solved this by adding the average argument to the F1, precision, and recall scorers. Only accuracy does not need this parameter!

scorers = {
            'f1_score': make_scorer(f1_score, average='micro'),
            'precision_score': make_scorer(precision_score, average='micro'),
            'recall_score': make_scorer(recall_score, average='micro'),
            'accuracy_score': make_scorer(accuracy_score)
          }
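
For anyone who wants to try this without the full XGBoost setup, here is a minimal sketch of the same scorers dictionary in a cross-validation run. The iris dataset, the RandomForestClassifier settings, and the use of cross_validate instead of RandomizedSearchCV are my own assumptions for illustration, not the code from the question. Note that each metric is passed to make_scorer as a function object (no parentheses); writing make_scorer(precision_score()) calls the metric immediately with no arguments, which is where the "missing 2 required positional arguments" TypeError comes from.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (accuracy_score, f1_score, make_scorer,
                             precision_score, recall_score)
from sklearn.model_selection import StratifiedKFold, cross_validate

# Toy three-class data; an assumed stand-in for x_train / y_train.
X, y = load_iris(return_X_y=True)

# Pass the metric function itself; keyword arguments such as average=
# are forwarded to the metric each time the scorer is called.
scorers = {
    'f1_score': make_scorer(f1_score, average='micro'),
    'precision_score': make_scorer(precision_score, average='micro'),
    'recall_score': make_scorer(recall_score, average='micro'),
    'accuracy_score': make_scorer(accuracy_score),  # no average= here
}

model = RandomForestClassifier(n_estimators=20, random_state=2)
skf = StratifiedKFold(n_splits=3, shuffle=True, random_state=2)
results = cross_validate(model, X, y, cv=skf, scoring=scorers)

# cross_validate stores each metric under the key 'test_<scorer name>'.
for name in scorers:
    print(name, results['test_' + name].mean())

# With the default average='binary', f1_score rejects multiclass targets.
try:
    f1_score(y, y)
    default_f1_error = None
except ValueError as exc:
    default_f1_error = str(exc)
print(default_f1_error)
```

One thing to keep in mind: for a single-label multiclass problem, micro-averaged precision, recall, and F1 all come out equal to accuracy, so average='macro' or average='weighted' may be more informative if you want these scores to tell you something different.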