XGBoost prediction contributions as probabilities

Question

python machine-learning data-science xgboost

I am using XGBoost's pred_contribs feature to get some interpretability (Shapley values) for each sample scored by my model.

booster.predict(test, pred_contribs=True)

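It returns an array of contributions of shape (number of samples) x (number of features + 1), where the last column is the bias term. The contributions in each row sum to that sample's margin score.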

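However, I would like to work with probabilities instead of margin scores, and for simplicity I would like to convert the contributions to probabilities, even if only approximately.

Is there a way to do this?

Code example: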

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import xgboost as xgb

X, y = make_classification()
X_train, X_test, y_train, y_test = train_test_split(X, y)

dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)

param = {
    'max_depth': 2,
    'eta': 1,
    'verbosity': 0,  # replaces the deprecated 'silent' option
    'objective': 'binary:logistic',
    'eval_metric': 'auc'
}

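booster = xgb.train(param, dtrain, 50)

# Predicted probabilities of the positive class
probabilities = booster.predict(dtest)

# Raw margin (log-odds) scores, before the logistic transform
margin_score = booster.predict(dtest, output_margin=True)

# Per-feature contributions; the last column is the bias term
contributions = booster.predict(dtest, pred_contribs=True)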
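
With the binary:logistic objective the margin is a log-odds score, so the three outputs above are linked through the logistic function. Below is a minimal sketch of that relationship. It uses a small made-up contribution array rather than real XGBoost output, and the `sigmoid` helper is defined here for illustration only.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping log-odds to probabilities."""
    return 1.0 / (1.0 + np.exp(-x))


# Hypothetical contributions for 2 samples and 3 features.
# The last column stands in for the bias term (assumed values, not model output).
contributions = np.array([
    [0.8, -0.3, 0.1, 0.4],
    [-1.2, 0.2, -0.5, 0.4],
])

# Each row of contributions adds up to that sample's margin score.
margin_score = contributions.sum(axis=1)

# The probability is the sigmoid of the margin, not a sum of per-feature terms.
probabilities = sigmoid(margin_score)

# Applying the sigmoid to each contribution separately does NOT recover
# the probability, because the sigmoid is nonlinear.
naive_sum = sigmoid(contributions).sum(axis=1)
```

Since `naive_sum` differs from `probabilities`, an additive decomposition on the probability scale can only be an approximation.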

- Thomas

1 Answer

- StefanPopov · Answer 1

I'm not sure whether this is exactly the same question, but you may want to look at my answer to a similar question here.

Basically, you need to divide the contribution vector by its sum and multiply it by the predicted probability:

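contributions = contributions / sum(contributions) * predicted_probability

where contributions is the contribution vector of a single sample and predicted_probability is the probability of the class of interest.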
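
For a whole batch of samples, the same rescaling can be applied row by row. This is a sketch under stated assumptions: `rescale_contributions` is a hypothetical helper name, and the input array is made-up data standing in for the output of `booster.predict(dtest, pred_contribs=True)`.

```python
import numpy as np


def rescale_contributions(contributions, predicted_probability):
    """Scale each row so that it sums to that sample's predicted probability."""
    contributions = np.asarray(contributions, dtype=float)
    predicted_probability = np.asarray(predicted_probability, dtype=float)
    # Row sums are the margin scores; keepdims allows broadcasting per row.
    row_sums = contributions.sum(axis=1, keepdims=True)
    return contributions / row_sums * predicted_probability[:, np.newaxis]


# Made-up contributions (3 features + bias column) for 2 samples.
contributions = np.array([
    [0.8, -0.3, 0.1, 0.4],
    [-1.2, 0.2, -0.5, 0.4],
])
margin_score = contributions.sum(axis=1)
predicted_probability = 1.0 / (1.0 + np.exp(-margin_score))

scaled = rescale_contributions(contributions, predicted_probability)
```

Each row of `scaled` now sums to the corresponding probability. Two properties of this heuristic are worth checking on your own data: when a sample's margin is negative, dividing by it flips the sign of every contribution in that row, and when the margin is close to zero the division becomes numerically unstable.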

Again, I'm not 100% sure this is the correct approach, but it works for my use case.