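I was wondering whether anyone can explain the behavior of dplyr::slice_min() / dplyr::slice_max() with respect to the with_ties argument. For grouped data, why does the function exclude NA values when with_ties = TRUE, but include them when with_ties = FALSE? Here is a reproducible example: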
library(tidyverse)
tbl <- tibble(ID = rep(c("a", "b", "c", "d"), each = 3),
              measure = c(NA, NA, NA, NA, 1, 1, 2, 3, 4, NA, NA, NA))
tbl |>
  group_by(ID) |>
  slice_max(measure, with_ties = TRUE)
# A tibble: 3 × 2
# Groups: ID [2]
ID measure
<chr> <dbl>
1 b 1
2 b 1
3 c 4
tbl |>
  group_by(ID) |>
  slice_max(measure, with_ties = FALSE)
# A tibble: 4 × 2
# Groups: ID [4]
ID measure
<chr> <dbl>
1 a NA
2 b 1
3 c 4
4 d NA
When called with with_ties = TRUE, it calls several other functions, such as smaller_ranks. If it is FALSE, the index is created from the ordering: idx <- function(x, n) head(order(x, decreasing = TRUE), size(n))
- akrun
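
The difference described in the comment can be illustrated with a small base-R sketch. This is not dplyr's actual source; it only assumes that the with_ties = FALSE path takes the first n positions of an ordering, and that the with_ties = TRUE path first counts how many values have a rank within n while ignoring NA ranks. The variable names (idx_no_ties, idx_ties, keep) are made up for illustration.

```r
# Base-R sketch (NOT dplyr's real implementation) of the two strategies,
# applied to a single all-NA group such as ID "a" in the question.
x <- c(NA_real_, NA_real_, NA_real_)
n <- 1

# with_ties = FALSE analogue: take the first n positions of the ordering.
# order() puts NA last by default, but it still returns their positions,
# so an all-NA group yields a row anyway.
idx_no_ties <- head(order(x, decreasing = TRUE), n)

# with_ties = TRUE analogue: count the values whose rank is <= n.
# With na.last = "keep" the NA values get NA ranks, and na.rm = TRUE
# drops them from the count, so nothing is kept for an all-NA group.
ranks <- rank(-x, ties.method = "min", na.last = "keep")
keep <- sum(ranks <= n, na.rm = TRUE)
idx_ties <- head(order(x, decreasing = TRUE), keep)

x[idx_no_ties]    # one NA value is selected
length(idx_ties)  # no positions are selected
```

Under these assumptions, the rank-based path has no non-NA rank to compare against n, so groups "a" and "d" disappear, while the order-based path always returns n positions as long as the group has at least n rows.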