Merging 2 data frames by row and column overlap

4

I would like to merge two data frames additively, so that

      taxonomy A B C
1          rat 0 1 2
2          dog 1 2 3
3          cat 2 3 0

and

      taxonomy A D C
1          rat 0 1 9
2        Horse 0 2 6
3          cat 2 0 2

produce

      taxonomy A B  C D
1          rat 0 1 11 1
2        Horse 0 0  6 2
3          cat 4 3  2 0
4          dog 1 2  3 0

I have tried aggregate, merge, apply, and ddply, but none of them worked. This will be done on two data frames, each with a few hundred rows and columns.

3 Answers

4
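Using bind_rows from dplyr:
library(dplyr)

# Stack both data frames; columns missing from one side are filled with NA
bind_rows(df1, df2) %>%
  # One group per taxonomy value
  group_by(taxonomy) %>%
  # Sum every remaining column, treating the NA fill values as 0
  summarize_all(sum, na.rm = TRUE)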

Output:

# A tibble: 4 x 5
  taxonomy     A     B     C     D
  <chr>    <int> <int> <int> <int>
1 cat          4     3     2     0
2 dog          1     2     3     0
3 Horse        0     0     6     2
4 rat          0     1    11     1
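
For readers without dplyr, the same stack-then-sum idea can be sketched in base R. This is only an illustration under stated assumptions: `pad_columns` is a hypothetical helper (not part of any package) that adds the missing columns as zeros so that `rbind` accepts both data frames, and the example data are copied from the question.

```r
# Example data copied from the question
df1 <- data.frame(taxonomy = c("rat", "dog", "cat"),
                  A = 0:2, B = 1:3, C = c(2L, 3L, 0L),
                  stringsAsFactors = FALSE)
df2 <- data.frame(taxonomy = c("rat", "Horse", "cat"),
                  A = c(0L, 0L, 2L), D = c(1L, 2L, 0L), C = c(9L, 6L, 2L),
                  stringsAsFactors = FALSE)

# Hypothetical helper: add any column in `cols` that `df` lacks, filled with 0,
# and return the columns in the order given by `cols`
pad_columns <- function(df, cols) {
  for (col in setdiff(cols, names(df))) {
    df[[col]] <- 0L
  }
  df[, cols, drop = FALSE]
}

# Union of column names from both data frames: taxonomy, A, B, C, D
all_cols <- union(names(df1), names(df2))

# Stack the padded data frames row-wise
stacked <- rbind(pad_columns(df1, all_cols), pad_columns(df2, all_cols))

# Sum every non-grouping column within each taxonomy group
result <- aggregate(. ~ taxonomy, data = stacked, FUN = sum)
print(result)
```

The row order of `result` follows the sorted group labels, which can depend on the locale (for example, where "Horse" sorts relative to the lowercase names).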

Data:

df1 <- structure(list(taxonomy = c("rat", "dog", "cat"), A = 0:2, B = 1:3, 
    C = c(2L, 3L, 0L)), .Names = c("taxonomy", "A", "B", "C"), class = "data.frame", row.names = c("1", 
"2", "3"))

df2 <- structure(list(taxonomy = c("rat", "Horse", "cat"), A = c(0L, 
0L, 2L), D = c(1L, 2L, 0L), C = c(9L, 6L, 2L)), .Names = c("taxonomy", 
"A", "D", "C"), class = "data.frame", row.names = c("1", "2", 
"3"))

2
A data.table equivalent of @avid_useR's answer:
library(data.table)
rbindlist(list(df1, df2), fill = TRUE)[, lapply(.SD, sum, na.rm = TRUE), by = taxonomy]
#   taxonomy A B  C D
#1:      rat 0 1 11 1
#2:      dog 1 2  3 0
#3:      cat 4 3  2 0
#4:    Horse 0 0  6 2

This should also be the fastest answer. - 5th

1

You could do...

> library(reshape2)
> dcast(rbind(melt(DF1), melt(DF2)), taxonomy ~ variable, fun.aggregate = sum)
Using taxonomy as id variables
Using taxonomy as id variables
  taxonomy A B  C D
1      cat 4 3  2 0
2      dog 1 2  3 0
3    Horse 0 0  6 2
4      rat 0 1 11 1

This sorts the rows and columns alphabetically, but I guess that could be avoided by using a factor.
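
A minimal sketch of the factor idea, assuming the desired row order is order of first appearance across both data frames (`row_order` and `long` are names introduced here for illustration). To my understanding, `dcast` orders its output by factor levels when the id column is a factor, so fixing the levels up front should keep that order; treat this as an assumption to verify on your own data.

```r
library(reshape2)

# Example data copied from the answer
DF1 <- data.frame(taxonomy = c("rat", "dog", "cat"),
                  A = 0:2, B = 1:3, C = c(2L, 3L, 0L),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(taxonomy = c("rat", "Horse", "cat"),
                  A = c(0L, 0L, 2L), D = c(1L, 2L, 0L), C = c(9L, 6L, 2L),
                  stringsAsFactors = FALSE)

# Assumed desired row order: first appearance across both data frames
row_order <- unique(c(DF1$taxonomy, DF2$taxonomy))

# Give both data frames the same factor levels so the order is explicit
DF1$taxonomy <- factor(DF1$taxonomy, levels = row_order)
DF2$taxonomy <- factor(DF2$taxonomy, levels = row_order)

# Reshape to long format, stack, then cast back to wide while summing
long <- rbind(melt(DF1, id.vars = "taxonomy"),
              melt(DF2, id.vars = "taxonomy"))
res <- dcast(long, taxonomy ~ variable,
             value.var = "value", fun.aggregate = sum)
print(res)
```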
Data:
DF1 = structure(list(taxonomy = c("rat", "dog", "cat"), A = 0:2, B = 1:3, 
    C = c(2L, 3L, 0L)), .Names = c("taxonomy", "A", "B", "C"), row.names = c(NA, 
-3L), class = "data.frame")
DF2 = structure(list(taxonomy = c("rat", "Horse", "cat"), A = c(0L, 
0L, 2L), D = c(1L, 2L, 0L), C = c(9L, 6L, 2L)), .Names = c("taxonomy", 
"A", "D", "C"), row.names = c(NA, -3L), class = "data.frame")
