I'm trying to combine text data for the same user from different rows, grouped by name:
df <- read.table(header = TRUE, text = 'name text
"katy" "tomorrow I go"
"lauren" "and computing"
"katy" "to the store"
"stephanie" "foo and foos"')
to get a result like this:
df2 <- read.table(header=TRUE, text='name text
"katy" "tomorrow I go to the store"
"lauren" "and computing"
"stephanie" "foo and foos"')
Any suggestions?
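One possible approach in base R is `aggregate` with `paste` as the summary function. The following is only a minimal sketch, assuming the data frame `df` from the question and a single space as the separator (the separator is an assumption read off the desired output):

```r
# Rebuild the example data from the question.
df <- read.table(header = TRUE, text = 'name text
"katy" "tomorrow I go"
"lauren" "and computing"
"katy" "to the store"
"stephanie" "foo and foos"')

# For each name, paste all of its text values into one string.
# `collapse = " "` is forwarded by aggregate() to paste().
df2 <- aggregate(text ~ name, data = df, FUN = paste, collapse = " ")

print(df2)
```

Within each name the pieces are joined in their original row order, so the two "katy" rows come out as one sentence. Note that the formula interface drops rows containing `NA` by default.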
aggregate does not handle NAs! The dplyr solution, however, deals with them perfectly. For example:
EnsemblID <- c("ENSG00000138594", "ENSG00000138594", "ENSG00000138594", "ENSG00000253251", "ENSG00000001629", "ENSG00000001629", "ENSG00000001629", "ENSG00000001629", "ENSG00000005513", "ENSG00000186448")
EntrezIDs <- c(112268148, 112268148, 112268148, 112441434, NA, NA, NA, NA, NA, 110354863)
- emr2
GO_function <- c("GO:0003779", "GO:0005523", "GO:0098641", "GO:0005515", "GO:0004842", "GO:0004843", "GO:0004844", "GO:0004845", "GO:0016740", "GO:0016743")
data <- data.frame(EnsemblID, EntrezIDs, GO_function)
data_collapsed = aggregate(data=data, GO_function~., FUN=paste, collapse=", ") # the rows with NAs are dropped
library(dplyr)
data_collapsed3 = data %>% group_by(EnsemblID, EntrezIDs) %>% summarise(GO_function=paste(GO_function, collapse=',')) # we keep the NAs
- emr2
When using the aggregate function, you need to add the na.action = NULL argument and use function(x) paste(x[!is.na(x)], collapse = " ") as the function. - akrun
aggregate(data=data, GO_function~., na.action = NULL, function(x) paste(x[!is.na(x)], collapse = " "))? But I still lose the NA values. - emr2
Try aggregate(data=replace(data, is.na(data), "NA"), GO_function~., na.action = NULL, function(x) paste(x[!is.na(x)], collapse = " ")). - akrun
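To make the exchange above easier to follow, here is a hedged sketch that runs the three variants side by side on the data from the comments. It uses `na.action = na.pass` where the thread uses `na.action = NULL`; that substitution is my assumption, on the understanding that both leave the NA rows in the model frame. The object names `dropped`, `still_dropped`, `data_chr` and `kept` are made up for illustration.

```r
EnsemblID <- c("ENSG00000138594", "ENSG00000138594", "ENSG00000138594",
               "ENSG00000253251", "ENSG00000001629", "ENSG00000001629",
               "ENSG00000001629", "ENSG00000001629", "ENSG00000005513",
               "ENSG00000186448")
EntrezIDs <- c(112268148, 112268148, 112268148, 112441434,
               NA, NA, NA, NA, NA, 110354863)
GO_function <- c("GO:0003779", "GO:0005523", "GO:0098641", "GO:0005515",
                 "GO:0004842", "GO:0004843", "GO:0004844", "GO:0004845",
                 "GO:0016740", "GO:0016743")
data <- data.frame(EnsemblID, EntrezIDs, GO_function)

# 1) Default formula interface: rows containing NA are removed first.
dropped <- aggregate(GO_function ~ ., data = data,
                     FUN = paste, collapse = ", ")

# 2) Keeping the NA rows in the model frame is not enough, because
#    aggregate() also omits rows whose grouping variables are NA.
still_dropped <- aggregate(GO_function ~ ., data = data,
                           FUN = paste, collapse = ", ",
                           na.action = na.pass)

# 3) Recode NA as the literal string "NA" so it becomes an ordinary
#    group label (this turns EntrezIDs into a character column).
data_chr <- replace(data, is.na(data), "NA")
kept <- aggregate(GO_function ~ ., data = data_chr,
                  FUN = paste, collapse = ", ")

print(nrow(dropped))
print(nrow(still_dropped))
print(kept)
```

The trade-off in the third variant is that `EntrezIDs` is no longer numeric and the missing values are now the string "NA", so they would need converting back if real `NA`s are wanted downstream.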