I am trying to use MAPE as the evaluation metric in xgboost, but I am getting strange results:
    import numpy as np
    import xgboost as xgb
    from sklearn.cross_validation import KFold  # old scikit-learn API, matches KFold(n=..., n_folds=...)

    def xgb_mape(preds, dtrain):
        # MAPE-style metric; note the +1 in the denominator, so this is not the textbook MAPE
        labels = dtrain.get_label()
        return ('mape', np.mean(np.abs((labels - preds) / (labels + 1))))

    xgp = {"colsample_bytree": 0.9,
           "min_child_weight": 24,
           "subsample": 0.9,
           "eta": 0.05,
           "objective": "reg:linear",
           "seed": 70}

    cv = xgb.cv(params=xgp,
                dtrain=xgb.DMatrix(train_set[cols_to_use], label=train_set.y),
                folds=KFold(n=len(train_set), n_folds=4, random_state=707, shuffle=True),
                feval=xgb_mape,
                early_stopping_rounds=10,
                num_boost_round=1000,
                verbose_eval=10,
                maximize=False)
It returns:
[0] train-mape:0.780683+0.00241932 test-mape:0.779896+0.0024619
[10] train-mape:0.84939+0.0196102 test-mape:0.858054+0.0184669
[20] train-mape:1.0778+0.0313676 test-mape:1.10751+0.0293785
[30] train-mape:1.26066+0.0343771 test-mape:1.30707+0.0323237
[40] train-mape:1.37713+0.0347438 test-mape:1.43339+0.030565
[50] train-mape:1.45653+0.042433 test-mape:1.52176+0.0383677
[60] train-mape:1.52268+0.0386395 test-mape:1.5909+0.0353497
[70] train-mape:1.5636+0.0383622 test-mape:1.63482+0.0301809
[80] train-mape:1.59408+0.0378158 test-mape:1.66748+0.0315529
[90] train-mape:1.61712+0.0403532 test-mape:1.69134+0.0325177
[100] train-mape:1.63028+0.0389446 test-mape:1.70578+0.0316045
[110] train-mape:1.63556+0.0375842 test-mape:1.71153+0.031564
[120] train-mape:1.63509+0.0393198 test-mape:1.7117+0.0320471
With maximize=False, the train and test scores keep increasing, yet early stopping does not work as expected. Where is the mistake?
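To make the expectation concrete, here is a minimal sketch of a generic patience-based early-stopping rule applied to the mean test-mape values from the log. This is not xgboost's internal implementation; `early_stop_index` is a hypothetical helper written only for illustration. Because the log prints one value per 10 rounds, `patience=1` checkpoint is used as a rough stand-in for `early_stopping_rounds=10`.

```python
def early_stop_index(scores, patience, maximize=False):
    """Return (best_index, stop_index) under a generic patience rule.

    stop_index is None if the rule never triggers within `scores`.
    Hypothetical helper for illustration, not part of xgboost.
    """
    best_index = 0
    for i in range(1, len(scores)):
        if maximize:
            improved = scores[i] > scores[best_index]
        else:
            improved = scores[i] < scores[best_index]
        if improved:
            best_index = i
        elif i - best_index >= patience:
            # No improvement for `patience` checkpoints: stop here.
            return best_index, i
    return best_index, None


# Mean test-mape values copied from the log (one value per 10 rounds).
test_mape = [0.779896, 0.858054, 1.10751, 1.30707, 1.43339, 1.52176, 1.5909,
             1.63482, 1.66748, 1.69134, 1.70578, 1.71153, 1.7117]

minimize_result = early_stop_index(test_mape, patience=1, maximize=False)
maximize_result = early_stop_index(test_mape, patience=1, maximize=True)
print(minimize_result)  # (0, 1): best at the first checkpoint, stop at the next one
print(maximize_result)  # (12, None): every checkpoint counts as an improvement
```

Under this simplified rule, a metric that is being minimized would stop right after the first checkpoint, whereas the logged run keeps going, which is what one would see if the metric were being treated as something to maximize.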
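Update: after adding -1* to the value returned by xgb_mape, the problem was resolved. It looks like the maximize parameter does not work properly with a custom evaluation function.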
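For reference, the workaround from the update can be sketched roughly as follows. The name `xgb_neg_mape` and the metric label `'neg_mape'` are my own choices, and `FakeDMatrix` is a hypothetical stand-in for `xgb.DMatrix` so that the snippet runs without xgboost installed. How the negated value interacts with `maximize` may depend on the xgboost version, which is not stated here.

```python
import numpy as np


def xgb_neg_mape(preds, dtrain):
    """Same metric as xgb_mape (with labels + 1 in the denominator), multiplied by -1."""
    labels = dtrain.get_label()
    mape = np.mean(np.abs((labels - preds) / (labels + 1)))
    return 'neg_mape', -1 * mape


class FakeDMatrix:
    """Hypothetical stand-in exposing only get_label(), for a quick check."""

    def __init__(self, label):
        self._label = np.asarray(label, dtype=float)

    def get_label(self):
        return self._label


# Labels [1, 1] and predictions [1, 3] give per-row errors 0/2 and 2/2, mean 0.5.
name, value = xgb_neg_mape(np.array([1.0, 3.0]), FakeDMatrix([1.0, 1.0]))
print(name, value)  # neg_mape -0.5
```

In the call from the question, this would be passed as feval=xgb_neg_mape in place of xgb_mape; the reported values are then the negative of the error, so they need to be flipped back when reading the log.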