How to spread a single column based on multiple columns in R?

3
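Each unique combination of Year, Site, Quadrant, and Species has two "Val" values in the dataset. I want to spread these values into two columns, "Val1" and "Val2". I tried the regular spread function, but it doesn't seem to fit. Any suggestions?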
Year Site Quadrant Species Val
2019   1     1        A    20
2019   1     1        A    30
2019   1     1        B    20
2019   1     1        B    25
2019   1     2        A    20
2019   1     2        A    10
2019   1     2        B    11
2019   1     2        B    22

Expected output

Year Site Quadrant Species Val1 Val2
2019   1     1        A    20    30
2019   1     1        B    20    25
2019   1     2        A    20    10
2019   1     2        B    11    22
3 Answers

3

You can group_by the identifying columns, use mutate to create the new column headers, and then spread (or pivot_wider):

library(dplyr)
library(tidyr)  # spread() and pivot_wider() live in tidyr, not dplyr

mydata %>% 
  group_by(Year, Site, Quadrant, Species) %>% 
  mutate(Var = paste0("Val", row_number())) %>% 
  spread(Var, Val) %>%
  ungroup()

Result:

# A tibble: 4 x 6
   Year  Site Quadrant Species  Val1  Val2
  <int> <int>    <int> <chr>   <int> <int>
1  2019     1        1 A          20    30
2  2019     1        1 B          20    25
3  2019     1        2 A          20    10
4  2019     1        2 B          11    22

Data:

mydata <- read.table(text = "Year Site Quadrant Species Val
2019   1     1        A    20
2019   1     1        A    30
2019   1     1        B    20
2019   1     1        B    25
2019   1     2        A    20
2019   1     2        A    10
2019   1     2        B    11
2019   1     2        B    22", header = TRUE)
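
As for why the plain spread call does not fit: the four grouping columns alone do not identify a row uniquely, so spread has no key column to build the new headers from, and row_number() within each group supplies one. As a quick sanity check before reshaping, here is a minimal sketch that counts rows per group (n_per_group is an illustrative name, not part of the original answer):

```r
library(dplyr)

mydata <- read.table(text = "Year Site Quadrant Species Val
2019 1 1 A 20
2019 1 1 A 30
2019 1 1 B 20
2019 1 1 B 25
2019 1 2 A 20
2019 1 2 A 10
2019 1 2 B 11
2019 1 2 B 22", header = TRUE)

# One row per Year/Site/Quadrant/Species combination, with its row count in n
n_per_group <- mydata %>%
  count(Year, Site, Quadrant, Species, name = "n")

# If every group has exactly two rows, Val1 and Val2 will have no missing cells
every_group_has_two <- all(n_per_group$n == 2)
every_group_has_two
```

If some group had one or three rows, the reshaped table would gain NA cells or a Val3 column, so it is worth checking first.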

2
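You can do it like this, using lead (this relies on the two values of each group sitting in consecutive rows, as they do in the sample data):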
library(tidyverse)
df %>% 
  mutate(id = row_number(),
         Val2 = lead(Val)) %>% 
  filter(id %% 2 == 1) %>% 
  select(-id, Val1  = Val)

Output:

   Year  Site Quadrant Species  Val1  Val2
  <dbl> <dbl>    <dbl> <chr>   <dbl> <dbl>
1  2019     1        1 A          20    30
2  2019     1        1 B          20    25
3  2019     1        2 A          20    10
4  2019     1        2 B          11    22

Data:

df <- tribble(
  ~Year, ~Site, ~Quadrant, ~Species, ~Val,
  2019, 1, 1, "A", 20,
  2019, 1, 1, "A", 30,
  2019, 1, 1, "B", 20,
  2019, 1, 1, "B", 25,
  2019, 1, 2, "A", 20,
  2019, 1, 2, "A", 10,
  2019, 1, 2, "B", 11,
  2019, 1, 2, "B", 22)
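
Because the lead approach depends on row order, it can silently pair the wrong values if the data is sorted differently. A possible order-independent variant, offered only as a sketch: it assumes exactly two values per group and takes them by position within the group (paired is an illustrative name):

```r
library(dplyr)
library(tibble)

df <- tribble(
  ~Year, ~Site, ~Quadrant, ~Species, ~Val,
  2019, 1, 1, "A", 20,
  2019, 1, 1, "A", 30,
  2019, 1, 1, "B", 20,
  2019, 1, 1, "B", 25,
  2019, 1, 2, "A", 20,
  2019, 1, 2, "A", 10,
  2019, 1, 2, "B", 11,
  2019, 1, 2, "B", 22)

# Take the first and second value inside each group instead of
# relying on odd/even row positions in the whole table
paired <- df %>%
  group_by(Year, Site, Quadrant, Species) %>%
  summarise(Val1 = first(Val), Val2 = nth(Val, 2), .groups = "drop")

paired
```

Within each group the values still come out in their original row order, so Val1 is whichever value appears first for that group.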

1
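Using data.table::dcast with rowid:
library(data.table)
# Assumption: dtt is a data.table holding the question's data,
# e.g. built from mydata as defined in the first answer
dtt <- as.data.table(mydata)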
dcast(dtt,
    Year + Site + Quadrant + Species ~ rowid(Year, Site, Quadrant, Species),
    value.var = 'Val')
#    Year Site Quadrant Species  1  2
# 1: 2019    1        1       A 20 30
# 2: 2019    1        1       B 20 25
# 3: 2019    1        2       A 20 10
# 4: 2019    1        2       B 11 22
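
The dcast result names the new columns "1" and "2" rather than Val1 and Val2. One way to match the requested output, sketched here with the data rebuilt inline so the block runs on its own (wide is an illustrative name), is to rename them afterwards with setnames:

```r
library(data.table)

# Same eight rows as in the question, rebuilt inline for a self-contained example
dtt <- data.table(
  Year     = 2019L,
  Site     = 1L,
  Quadrant = rep(1:2, each = 4),
  Species  = rep(c("A", "A", "B", "B"), times = 2),
  Val      = c(20L, 30L, 20L, 25L, 20L, 10L, 11L, 22L)
)

wide <- dcast(dtt,
    Year + Site + Quadrant + Species ~ rowid(Year, Site, Quadrant, Species),
    value.var = "Val")

# Rename the generated "1" and "2" columns by reference
setnames(wide, old = c("1", "2"), new = c("Val1", "Val2"))
wide
```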

If you prefer, you can do something similar the tidyverse way:

library(dplyr)
library(tidyr)

dtt %>%
    group_by(Year, Site, Quadrant, Species) %>%
    mutate(grp = row_number()) %>%
    pivot_wider(names_from = grp, values_from = Val, names_prefix = 'Val') %>%
    ungroup()

#  A tibble: 4 x 6
#    Year  Site Quadrant Species  Val1  Val2
#   <int> <int>    <int> <chr>   <int> <int>
# 1  2019     1        1 A          20    30
# 2  2019     1        1 B          20    25
# 3  2019     1        2 A          20    10
# 4  2019     1        2 B          11    22
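
For completeness, the same index-then-widen idea can probably be done without any packages. This is only a sketch in base R, not something from the answers above: ave numbers the rows within each group and reshape widens on that index (idx and wide are illustrative names):

```r
mydata <- read.table(text = "Year Site Quadrant Species Val
2019 1 1 A 20
2019 1 1 A 30
2019 1 1 B 20
2019 1 1 B 25
2019 1 2 A 20
2019 1 2 A 10
2019 1 2 B 11
2019 1 2 B 22", header = TRUE)

# Number the rows within each Year/Site/Quadrant/Species group (1, 2, ...)
mydata$idx <- ave(mydata$Val, mydata$Year, mydata$Site,
                  mydata$Quadrant, mydata$Species, FUN = seq_along)

# Widen on the within-group index; sep = "" yields Val1, Val2 instead of Val.1, Val.2
wide <- reshape(mydata,
                idvar = c("Year", "Site", "Quadrant", "Species"),
                timevar = "idx",
                direction = "wide",
                sep = "")
wide
```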
