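I am trying to understand the 5-fold cross-validation algorithm in the caret package, but I cannot find out how to get the training set and test set for each fold, and the similar suggested questions did not answer this either. To cross-validate a random forest, I would proceed as follows: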
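library(caret)
set.seed(12)
# 5-fold CV; savePredictions = TRUE keeps the hold-out predictions of each resample
train_control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
rfmodel <- train(Species ~ ., data = iris, trControl = train_control, method = "rf")
# Rows of rfmodel$pred that are labelled Fold1
first_holdout <- subset(rfmodel$pred, Resample == "Fold1")
str(first_holdout)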
'data.frame': 90 obs. of 5 variables:
$ pred : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
$ obs : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
$ rowIndex: int 2 3 9 11 25 29 35 36 41 50 ...
$ mtry : num 2 2 2 2 2 2 2 2 2 2 ...
$ Resample: chr "Fold1" "Fold1" "Fold1" "Fold1" ...
Are these 90 observations used as the training set for Fold1? If so, where is the test set for this fold?
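
For reference, here is a minimal sketch of one way to make the per-fold split explicit. It assumes the folds are built up front with `createFolds` and passed to `trainControl` through its `index` argument, which is not what the code above does. The names `train_idx`, `fold1_train`, `fold1_test` and `fold1_pred` are made up for illustration. As far as I understand, `rfmodel$pred` holds predictions on held-out rows only, with one row per held-out observation per candidate `mtry`, so the count for a fold can be a multiple of the hold-out size.

```r
library(caret)

set.seed(12)
# Build the 5 folds explicitly; returnTrain = TRUE returns the TRAINING row
# indices of each fold as a list named Fold1 ... Fold5
train_idx <- createFolds(iris$Species, k = 5, returnTrain = TRUE)

# Hand the folds to caret so the resampling uses exactly these splits
train_control <- trainControl(method = "cv", index = train_idx,
                              savePredictions = TRUE)
rfmodel <- train(Species ~ ., data = iris,
                 trControl = train_control, method = "rf")

# Training and test (held-out) rows for Fold1
fold1_train <- iris[train_idx$Fold1, ]
fold1_test  <- iris[-train_idx$Fold1, ]

# Saved predictions for Fold1; rowIndex refers to rows of the original data
fold1_pred <- subset(rfmodel$pred, Resample == "Fold1")

# Number of saved prediction rows per candidate mtry value
table(fold1_pred$mtry)
```

If this reading is right, `unique(fold1_pred$rowIndex)` should match the rows left out of `train_idx$Fold1`, and the rows used for fitting in that fold are the ones in `fold1_train`.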