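Computing medians seems to be a bit tricky in R (i.e., there is no data.frame method). What is the least amount of typing needed to get group medians from a data frame using dplyr?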
my_data <- structure(list(group = c("Group 1", "Group 1", "Group 1", "Group 1",
"Group 1", "Group 1", "Group 1", "Group 1", "Group 1", "Group 1",
"Group 1", "Group 1", "Group 1", "Group 1", "Group 1", "Group 2",
"Group 2", "Group 2", "Group 2", "Group 2", "Group 2", "Group 2",
"Group 2", "Group 2", "Group 2", "Group 2", "Group 2", "Group 2",
"Group 2", "Group 2"), value = c("5", "3", "6", "8", "10", "13",
"1", "4", "18", "4", "7", "9", "14", "15", "17", "7", "3", "9",
"10", "33", "15", "18", "6", "20", "30", NA, NA, NA, NA, NA)), .Names = c("group",
"value"), class = c("tbl_df", "data.frame"), row.names = c(NA,
-30L))
library(dplyr)
# groups 1 & 2
my_data_groups_1_and_2 <- my_data[my_data$group %in% c("Group 1", "Group 2"), ]
# compute medians per group
medians <- my_data_groups_1_and_2 %>%
  group_by(group) %>%
  summarize(the_medians = median(value, na.rm = TRUE))
This gives:
Error in summarise_impl(.data, dots) :
STRING_ELT() can only be applied to a 'character vector', not a 'double'
In addition: Warning message:
In mean.default(sort(x, partial = half + 0L:1L)[half + 0L:1L]) :
argument is not numeric or logical: returning NA
What is the least-effort workaround here?
Is is.character(my_data_groups_1_and_2$value) TRUE? Adding a mutate that converts value to double lets the median be computed. - Matt Upson
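A minimal sketch of the fix suggested in the comment, assuming the only problem is that value is stored as character. The data below reproduces the question's values in a plain data.frame for brevity (the original is a tbl_df), and as.numeric() is one reasonable way to do the conversion:

```r
library(dplyr)

# Same values as in the question, stored as character strings
my_data <- data.frame(
  group = rep(c("Group 1", "Group 2"), each = 15),
  value = c("5", "3", "6", "8", "10", "13", "1", "4", "18", "4", "7", "9",
            "14", "15", "17", "7", "3", "9", "10", "33", "15", "18", "6",
            "20", "30", NA, NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

medians <- my_data %>%
  mutate(value = as.numeric(value)) %>%   # character -> double; NA stays NA
  group_by(group) %>%
  summarize(the_medians = median(value, na.rm = TRUE))

print(medians)
```

With this data the medians come out as 8 for Group 1 and 12.5 for Group 2. If typing is the main concern, the mutate step can be folded into the summary as median(as.numeric(value), na.rm = TRUE).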