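After searching around the web, I am still confused about what exactly the linear booster gblinear is, and I am not alone (see).
According to the documentation, it has only 3 parameters: lambda, lambda_bias and alpha (maybe it should say "additional parameters").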
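If I understand this correctly, the linear booster does (rather standard) linear boosting (with regularization). In that context I can only make sense of the 3 parameters above and eta (the boosting rate). That is also how it is described on GitHub.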
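To make my reading concrete, this is the objective I would expect gblinear to minimize. It is only my understanding and is not taken from the source code: the symbols w (feature weights), b (bias) and the loss \ell are my own notation, and the exact scaling of the penalty terms (for example the factor 1/2) is an assumption.

```latex
\min_{w,\,b}\; \sum_{i=1}^{n} \ell\bigl(y_i,\; w^\top x_i + b\bigr)
  \;+\; \frac{\lambda}{2}\,\lVert w \rVert_2^2
  \;+\; \alpha\,\lVert w \rVert_1
  \;+\; \frac{\lambda_{\text{bias}}}{2}\, b^2
```

Here lambda, alpha and lambda_bias would be the three documented parameters, and eta would scale each update to w and b. Nothing in this form leaves room for gamma, max_depth or min_child_weight.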
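Nevertheless, I see that the parameters gamma, max_depth and min_child_weight also have an impact on the algorithm.
How can this be? Is there a completely clear description of the linear booster anywhere on the web?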
See my examples:
library(xgboost)
data(agaricus.train, package='xgboost')
data(agaricus.test, package='xgboost')
train <- agaricus.train
test <- agaricus.test
The setup
set.seed(100)
model <- xgboost(data = train$data, label = train$label, nrounds = 5,
objective = "binary:logistic",
params = list(booster = "gblinear", eta = 0.5, lambda = 1, lambda_bias = 1,gamma = 2,
early_stopping_rounds = 3))
gives
> [1] train-error:0.018271 [2] train-error:0.003071
> [3] train-error:0.001075 [4] train-error:0.001075
> [5] train-error:0.000614
whereas gamma = 1
set.seed(100)
model <- xgboost(data = train$data, label = train$label, nrounds = 5,
objective = "binary:logistic",
params = list(booster = "gblinear", eta = 0.5, lambda = 1, lambda_bias = 1,gamma = 1,
early_stopping_rounds = 3))
leads to
> [1] train-error:0.013051 [2] train-error:0.001842
> [3] train-error:0.001075 [4] train-error:0.001075
> [5] train-error:0.001075
which is a different "path".
The same holds for max_depth:
set.seed(100)
model <- xgboost(data = train$data, label = train$label, nrounds = 5,
objective = "binary:logistic",
params = list(booster = "gblinear", eta = 0.5, lambda = 1, lambda_bias = 1, max_depth = 3,
early_stopping_rounds = 3))
> [1] train-error:0.016122 [2] train-error:0.002764
> [3] train-error:0.001075 [4] train-error:0.001075
> [5] train-error:0.000768
and
set.seed(100)
model <- xgboost(data = train$data, label = train$label, nrounds = 10,
objective = "binary:logistic",
params = list(booster = "gblinear", eta = 0.5, lambda = 1, lambda_bias = 1, max_depth = 4,
early_stopping_rounds = 3))
> [1] train-error:0.014740 [2] train-error:0.004453
> [3] train-error:0.001228 [4] train-error:0.000921
> [5] train-error:0.000614
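One check that would help separate the effect of these parameters from run-to-run variation is to repeat an identical call and compare the results. The following is a minimal sketch, not a definitive test: it assumes that xgb.DMatrix, xgb.train and predict behave as in the xgboost version used above (newer versions may warn about unused or renamed parameters), and fit_once is a hypothetical helper introduced only for this illustration.

```r
library(xgboost)
data(agaricus.train, package = "xgboost")

# Same hyperparameters as in the first example; gamma is held fixed on purpose.
params <- list(booster = "gblinear", objective = "binary:logistic",
               eta = 0.5, lambda = 1, lambda_bias = 1, gamma = 2)

# Hypothetical helper: fit once with a fixed seed and return
# the predicted probabilities on the training data.
fit_once <- function() {
  set.seed(100)
  dtrain <- xgb.DMatrix(agaricus.train$data, label = agaricus.train$label)
  model <- xgb.train(params = params, data = dtrain, nrounds = 5)
  predict(model, dtrain)
}

p1 <- fit_once()
p2 <- fit_once()

# Largest absolute difference between two identical runs.
# Zero would mean identical calls are reproducible; a non-zero value
# would mean the training itself varies between runs.
print(max(abs(p1 - p2)))
```

If the difference is zero across several repetitions, the changed "paths" would have to come from the parameter itself; if it is not, the comparison between gamma = 1 and gamma = 2 above says little about gamma.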