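I have a binary classification dataset with a YES/NO response variable. I am using the code below to fit a random forest model, but I run into a problem when I try to get the confusion matrix.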
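library(readxl)  # read_excel()
library(caret)   # createDataPartition(), train(), trainControl(), confusionMatrix()

dataR <- read_excel("*:/*.xlsx")
# Hold out 30% of the rows for testing, stratified on Class
Train <- createDataPartition(dataR$Class, p = 0.7, list = FALSE)
training <- dataR[Train, ]
testing <- dataR[-Train, ]
# Random forest tuned over 3 values of mtry with 5-fold cross-validation
model_rf <- train(Class ~ ., data = training, method = "rf",
                  tuneLength = 3, importance = TRUE,
                  trControl = trainControl(method = "cv", number = 5))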
Output:
Random Forest
3006 samples
82 predictor
2 classes: 'NO', 'YES'
No pre-processing
Resampling: Cross-Validated (5 fold)
Summary of sample sizes: 2405, 2406, 2405, 2404, 2404
Addtional sampling using SMOTE
Resampling results across tuning parameters:
mtry Accuracy Kappa
2 0.7870921 0.2750655
44 0.7787721 0.2419762
87 0.7767760 0.2524898
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mtry = 2.
So far so good, but when I run this code:
# Apply threshold of 0.50: p_class
class_log <- ifelse(model_rf[,1] > 0.50, "YES", "NO")
# Create confusion matrix
p <-confusionMatrix(class_log, testing[["Class"]])
##gives the accuracy
p$overall[1]
I get this error message:
Error in model_rf[, 1] : incorrect number of dimensions
I would appreciate any help getting the confusion matrix.
Print model_rf[, 1] to the console and take a look at it. - Samuel
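
Following up on that comment: the object returned by train() is a list-like model object, not a matrix of predictions, which is most likely why model_rf[, 1] fails with "incorrect number of dimensions". One way around it is to ask the model for class probabilities on the test set with predict(..., type = "prob") and threshold those instead. Below is a minimal sketch of that idea, not a drop-in fix. Since the original Excel file is not available, it uses caret's twoClassSim() as stand-in data with the labels renamed to NO/YES; the names sim, idx, probs, class_pred and cm are made up for illustration.

```r
library(caret)  # train(), predict(), confusionMatrix(), twoClassSim()

set.seed(42)

# Hypothetical stand-in data (NOT the asker's data): a simulated
# two-class problem, with labels renamed to mirror the question.
sim <- twoClassSim(300)
sim$Class <- factor(ifelse(sim$Class == "Class1", "NO", "YES"),
                    levels = c("NO", "YES"))

idx      <- createDataPartition(sim$Class, p = 0.7, list = FALSE)
training <- sim[idx, ]
testing  <- sim[-idx, ]

model_rf <- train(Class ~ ., data = training, method = "rf",
                  tuneLength = 3,
                  trControl = trainControl(method = "cv", number = 5))

# The fitted object is a "train" object (a list), so it has no rows or
# columns to index with [, 1].
class(model_rf)

# Class probabilities for the held-out rows: one column per class level.
probs <- predict(model_rf, newdata = testing, type = "prob")

# Apply the 0.50 threshold to the YES column (selected by name, not by
# position) and keep the result a factor with the reference's levels.
class_pred <- factor(ifelse(probs[, "YES"] > 0.50, "YES", "NO"),
                     levels = levels(testing$Class))

cm <- confusionMatrix(class_pred, testing$Class, positive = "YES")
cm$table                 # the confusion matrix itself
cm$overall["Accuracy"]   # overall accuracy
```

Two things to watch for when adapting this to the original data. First, the probability columns are named after the class levels, so column 1 would be NO rather than YES; indexing by name avoids flipping the labels. Second, a column read with read_excel() is probably a character vector rather than a factor, so the reference may need to be wrapped, e.g. factor(testing[["Class"]], levels = c("NO", "YES")), because confusionMatrix() expects both arguments to be factors with the same levels.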