I am using xgboost's pred_contribs option to get some interpretability (Shapley values) for each sample scored by my model.
booster.predict(test, pred_contribs=True)
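It returns an array of contributions with shape (number of samples) x (number of features + 1); the last column is the bias term. Each row sums to that sample's margin score.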
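To make that relationship concrete, here is a minimal sketch using made-up contribution values (the array is illustrative, not output from a real model). It assumes a `binary:logistic` objective, so the margin is a log-odds value, and that the last column of each row holds the bias term.

```python
import numpy as np

# Hypothetical contributions for 2 samples and 3 features.
# The last column is assumed to be the bias term.
contribs = np.array([
    [0.5, -0.2, 0.1, -0.4],
    [-1.0, 0.3, 0.0, -0.4],
])

# Summing each row recovers the margin (log-odds) for that sample.
margin = contribs.sum(axis=1)

# The logistic function maps the margin to the predicted probability.
prob = 1.0 / (1.0 + np.exp(-margin))
```

The contributions are additive in margin space only; because the logistic function is nonlinear, they do not add up to the probability.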
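However, I would like to work with probabilities instead of margin scores, so for simplicity I want to convert the contributions into (approximate) probability-space values.
Is there a way to do this?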
Example code:
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
import xgboost as xgb
X, y = make_classification()
X_train, X_test, y_train, y_test = train_test_split(X, y)
dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
param = {
'max_depth': 2,
'eta': 1,
'verbosity': 0,  # replaces the deprecated 'silent' option
'objective': 'binary:logistic',
'eval_metric': 'auc'
}
booster = xgb.train(param, dtrain, 50)
probabilities = booster.predict(dtest)
margin_score = booster.predict(dtest, output_margin=True)
contributions = booster.predict(dtest, pred_contribs=True)
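One possible approximation, sketched here under stated assumptions, is to rescale each row so that the feature contributions add up to the gap between the predicted probability and the baseline probability implied by the bias alone. This is a heuristic, not something xgboost provides: the helper name `contribs_to_prob` is hypothetical, the last column is assumed to be the bias term, and the objective is assumed to be `binary:logistic`.

```python
import numpy as np

def sigmoid(x):
    """Logistic function: maps log-odds to a probability."""
    return 1.0 / (1.0 + np.exp(-x))

def contribs_to_prob(contribs, eps=1e-12):
    """Rescale margin-space contributions into probability space.

    contribs: array of shape (n_samples, n_features + 1), last column = bias.
    Returns an array of the same shape whose rows sum to sigmoid(margin).
    """
    contribs = np.asarray(contribs, dtype=float)
    feature_part = contribs[:, :-1]
    bias = contribs[:, -1]

    base_prob = sigmoid(bias)               # probability from the bias alone
    prob = sigmoid(contribs.sum(axis=1))    # full predicted probability

    # Spread (prob - base_prob) over the features in proportion to their
    # margin-space contributions; guard against a zero feature sum.
    feature_sum = feature_part.sum(axis=1)
    safe_sum = np.where(np.abs(feature_sum) < eps, 1.0, feature_sum)
    scale = (prob - base_prob) / safe_sum

    out = np.empty_like(contribs)
    out[:, :-1] = feature_part * scale[:, None]
    out[:, -1] = base_prob
    return out

# Illustrative input only; the second row has features that cancel out.
demo = np.array([
    [0.5, -0.2, 0.1, -0.4],
    [0.3, -0.3, 0.0, -0.4],
])
demo_prob = contribs_to_prob(demo)
```

Applied to the example above, this would be `contribs_to_prob(contributions)`, and each row should then sum approximately to the matching entry of `booster.predict(dtest)`. The rescaled values keep the sign and relative size of the original contributions within a row, but they are no longer true Shapley values in probability space.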