Paste together the rows of a data frame that belong to the same id

Question

6

I am trying to combine text data that belongs to the same user, spread over different rows, by name:

df <- read.table(header = TRUE, text = 'name text
"katy" "tomorrow I go"
"lauren" "and computing"
"katy" "to the store"
"stephanie" "foo and foos"')

to get this result:

df2 <- read.table(header=TRUE, text='name text
"katy" "tomorrow I go to the store"
"lauren" "and computing"
"stephanie" "foo and foos"')

Any suggestions?

- lmcshane

1 Answer

- akrun · Accepted Answer

9

We can use data.table, dplyr, or aggregate to group by 'name' and paste the 'text' column together. With data.table, first convert the 'data.frame' to a 'data.table' with setDT(df).

library(data.table)
setDT(df)[, list(text=paste(text, collapse=' ')), by = name]

Using dplyr:

library(dplyr)
df %>%
   group_by(name) %>%
   summarise(text=paste(text, collapse=' '))

Or using base R:

aggregate(text~name, df, FUN= paste, collapse=' ')

- akrun
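
The base R variant can be checked end to end against the expected output in the question. The following is a minimal sketch that only re-creates the question's df. The name res is introduced here for illustration, and stringsAsFactors = FALSE is an added assumption so that both columns are character vectors on older R versions as well.

```r
# Rebuild the input from the question; keep both columns as character
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = 'name text
"katy" "tomorrow I go"
"lauren" "and computing"
"katy" "to the store"
"stephanie" "foo and foos"')

# Collapse all text rows that share a name into one space-separated string.
# `res` is a hypothetical name used only for this illustration.
res <- aggregate(text ~ name, data = df, FUN = paste, collapse = " ")
print(res)
```

Within each group the rows are pasted in their original order, so the two "katy" rows come out as one sentence.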

If you want to group by two columns (the name and another column, different from the text) and one of them contains NAs, aggregate does not handle the NAs: the rows that contain them are dropped! The dplyr solution, however, handles this case perfectly. For example:

EnsemblID <- c("ENSG00000138594", "ENSG00000138594", "ENSG00000138594", "ENSG00000253251", "ENSG00000001629", "ENSG00000001629", "ENSG00000001629", "ENSG00000001629", "ENSG00000005513", "ENSG00000186448")
EntrezIDs <- c(112268148, 112268148, 112268148, 112441434, NA, NA, NA, NA, NA, 110354863)

- emr2

GO_function <- c("GO:0003779", "GO:0005523", "GO:0098641", "GO:0005515", "GO:0004842", "GO:0004843", "GO:0004844", "GO:0004845", "GO:0016740", "GO:0016743")

data <- data.frame(EnsemblID, EntrezIDs, GO_function)
data_collapsed <- aggregate(data = data, GO_function ~ ., FUN = paste, collapse = ", ") # the NA rows are gone

data_collapsed3 <- data %>%
  group_by(EnsemblID, EntrezIDs) %>%
  summarise(GO_function = paste(GO_function, collapse = ",")) # the NA rows are kept

- emr2

@emr2 With aggregate you need to add na.action = NULL and use function(x) paste(x[!is.na(x)], collapse = " ") as the function. - akrun

Like this? aggregate(data=data, GO_function~., na.action = NULL, function(x) paste(x[!is.na(x)], collapse = " ")) But I still lose the NA values. - emr2

1

@emr2, the missing values are in the grouping column. Maybe you could convert them to character:

aggregate(data=replace(data, is.na(data), "NA"), GO_function~., na.action = NULL,  function(x) paste(x[!is.na(x)], collapse = " "))

- akrun
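
The behaviour discussed in these comments can be reproduced on a much smaller table. The sketch below uses a hypothetical data frame d with made-up columns key, id, and term, none of which come from the thread. It uses na.action = na.pass as a stand-in for the na.action = NULL suggested above, on the assumption that both simply skip the formula method's NA filtering. It illustrates that aggregate still omits groups whose grouping value is NA, and that converting the NA keys to the string "NA" keeps them.

```r
# Hypothetical data: the grouping column `id` contains NA for key "b"
d <- data.frame(
  key  = c("a", "a", "b", "b", "c"),
  id   = c(1, 1, NA, NA, 3),
  term = c("t1", "t2", "t3", "t4", "t5"),
  stringsAsFactors = FALSE
)

# Default: the formula method removes rows with NA (na.action = na.omit)
dropped <- aggregate(term ~ key + id, data = d, FUN = paste, collapse = ", ")

# Passing the NAs through does not help: aggregate also omits rows
# whose grouping variables are missing
still_dropped <- aggregate(term ~ key + id, data = d, FUN = paste,
                           collapse = ", ", na.action = na.pass)

# Workaround: make the NA keys an ordinary string before grouping
d_chr <- d
d_chr$id <- ifelse(is.na(d_chr$id), "NA", as.character(d_chr$id))
kept <- aggregate(term ~ key + id, data = d_chr, FUN = paste, collapse = ", ")

print(dropped)
print(still_dropped)
print(kept)
```

The trade-off of the workaround is that id becomes a character column, so it may need converting back with as.numeric afterwards if numeric ids matter downstream.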