In R, how do I transpose specific columns into rows while keeping them aligned with their original values?

4
I organized my data table like this.
A <- data.frame(
  Season = rep(2023, 5),
  crop = rep("Soybean", 5),
  treatment = rep("Inside panel", 5),
  plot = rep(6, 5),
  row = 1:5,
  rep1 = c(24.30, 30.30, 21.80, 22.90, 20.60),
  rep2 = c(10.30, 0.00, 29.30, 0.00, 30.80),
  rep3 = c(30.70, 28.40, 15.50, 22.30, 17.40),
  SS1 = c("tr1", "tr1", "tr1", "tr3", "tr1"),
  SS2 = c("tr3", "tr2", "tr2", "tr2", "tr3"),
  SS3 = c("tr2", "tr3", "tr3", "tr1", "tr2")
)

  Season    crop    treatment plot row rep1 rep2 rep3 SS1 SS2 SS3
1   2023 Soybean Inside panel    6   1 24.3 10.3 30.7 tr1 tr3 tr2
2   2023 Soybean Inside panel    6   2 30.3  0.0 28.4 tr1 tr2 tr3
3   2023 Soybean Inside panel    6   3 21.8 29.3 15.5 tr1 tr2 tr3
4   2023 Soybean Inside panel    6   4 22.9  0.0 22.3 tr3 tr2 tr1
5   2023 Soybean Inside panel    6   5 20.6 30.8 17.4 tr1 tr3 tr2

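SS1 corresponds to rep1, SS2 to rep2, and SS3 to rep3. For example, in the first row, 24.3 belongs to tr1, 10.3 to tr3, and 30.7 to tr2.
What I want is to transpose the SS columns into rows, like this: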
B=data.frame(
  Season = rep(2023, 15),
  crop = rep("Soybean", 15),
  treatment = rep("Inside panel", 15),
  plot = rep(6, 15),
  row = rep(1:5, each = 3),
  rep = rep(1:3, times = 5),
  SS = c("tr1", "tr3", "tr2", "tr1", "tr2", "tr3", "tr1", "tr2", "tr3", "tr3", "tr2", "tr1", "tr1", "tr3", "tr2"),
  yield = c(24.30, 10.30, 30.70, 30.30, 0.00, 28.40, 21.80, 29.30, 15.50, 22.90, 0.00, 22.30, 20.60, 30.80, 17.40))

   Season    crop    treatment plot row rep  SS yield
1    2023 Soybean Inside panel    6   1   1 tr1  24.3
2    2023 Soybean Inside panel    6   1   2 tr3  10.3
3    2023 Soybean Inside panel    6   1   3 tr2  30.7
4    2023 Soybean Inside panel    6   2   1 tr1  30.3
5    2023 Soybean Inside panel    6   2   2 tr2   0.0
6    2023 Soybean Inside panel    6   2   3 tr3  28.4
7    2023 Soybean Inside panel    6   3   1 tr1  21.8
8    2023 Soybean Inside panel    6   3   2 tr2  29.3
9    2023 Soybean Inside panel    6   3   3 tr3  15.5
10   2023 Soybean Inside panel    6   4   1 tr3  22.9
11   2023 Soybean Inside panel    6   4   2 tr2   0.0
12   2023 Soybean Inside panel    6   4   3 tr1  22.3
13   2023 Soybean Inside panel    6   5   1 tr1  20.6
14   2023 Soybean Inside panel    6   5   2 tr3  30.8
15   2023 Soybean Inside panel    6   5   3 tr2  17.4

Could you tell me how to transpose the columns into rows while keeping them aligned with the original values?
3 Answers

4
Update: see @Onyambu's input.
library(dplyr)
library(tidyr)

A %>%
  pivot_longer(
    cols = c(starts_with("rep"), starts_with("SS")),
    names_to = c(".value", NA),
    names_pattern = "(rep|SS)(\\d+)"
  ) 

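First answer: We can do it this way. This is a compact form that replaces two separate pivot_longer calls:

We use names_to = c(".value", "set") together with names_pattern = "(rep|SS)(\\d+)" to split each column name into two parts. .value keeps the shared part of the name (rep or SS) and uses it as the output column name; set holds the numeric suffix (which is then dropped).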

library(dplyr)
library(tidyr)

A %>%
  pivot_longer(
    cols = c(starts_with("rep"), starts_with("SS")),
    names_to = c(".value", "set"),
    names_pattern = "(rep|SS)(\\d+)"
  ) %>%
  select(-set)

  Season crop    treatment     plot   row   rep SS   
    <dbl> <chr>   <chr>        <dbl> <int> <dbl> <chr>
 1   2023 Soybean Inside panel     6     1  24.3 tr1  
 2   2023 Soybean Inside panel     6     1  10.3 tr3  
 3   2023 Soybean Inside panel     6     1  30.7 tr2  
 4   2023 Soybean Inside panel     6     2  30.3 tr1  
 5   2023 Soybean Inside panel     6     2   0   tr2  
 6   2023 Soybean Inside panel     6     2  28.4 tr3  
 7   2023 Soybean Inside panel     6     3  21.8 tr1  
 8   2023 Soybean Inside panel     6     3  29.3 tr2  
 9   2023 Soybean Inside panel     6     3  15.5 tr3  
10   2023 Soybean Inside panel     6     4  22.9 tr3  
11   2023 Soybean Inside panel     6     4   0   tr2  
12   2023 Soybean Inside panel     6     4  22.3 tr1  
13   2023 Soybean Inside panel     6     5  20.6 tr1  
14   2023 Soybean Inside panel     6     5  30.8 tr3  
15   2023 Soybean Inside panel     6     5  17.4 tr2 
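
To make the role of the two capture groups concrete, here is a small base-R sketch that applies the same regular expression to the six column names. It is only an illustration: `sub()` is used as a stand-in, and tidyr's internal matching may differ in detail.

```r
# The six columns that get pivoted in the example data frame A
cols <- c("rep1", "rep2", "rep3", "SS1", "SS2", "SS3")
pattern <- "(rep|SS)(\\d+)"

# First capture group: the shared prefix, which ".value" turns into output columns
value_part <- sub(pattern, "\\1", cols)

# Second capture group: the numeric suffix, which ends up in "set"
set_part <- sub(pattern, "\\2", cols)

# Side-by-side view of how each column name is split
data.frame(column = cols, value_part = value_part, set = set_part)
```

Columns that share a suffix (rep1 and SS1, for instance) land in the same output row, which is what keeps each yield aligned with its treatment.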

1
Thank you so much!! You saved me a lot of time!! Thank you very much!!
2
Use names_to = c('.value', NA) instead of select(-set), or use names_to = '.value' with a pattern that captures only the first part, i.e. names_pattern = "(rep|SS)\\d+"
1
This is very useful. Thank you, @Onyambu!
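
The second variant from the comment above (a single `.value` entry in `names_to`, with only the prefix captured) could look roughly like this. `A_small` is a hypothetical cut-down copy of the question's data (two rows, two replicates), used only to keep the sketch short.

```r
library(tidyr)

# Hypothetical cut-down copy of the question's data frame A
A_small <- data.frame(
  row  = 1:2,
  rep1 = c(24.3, 30.3), rep2 = c(10.3, 0.0),
  SS1  = c("tr1", "tr1"), SS2  = c("tr3", "tr2")
)

# Only the prefix is captured, so no helper column has to be dropped afterwards
long <- pivot_longer(
  A_small,
  cols = c(starts_with("rep"), starts_with("SS")),
  names_to = ".value",
  names_pattern = "(rep|SS)\\d+"
)

long
```

The trade-off is that the replicate number is discarded entirely, so this form fits best when the suffix is not needed in the result.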

3
The pivot_longer() function in tidyr can solve this. Use names_pattern to separate the leading part (rep or SS) from the test/trial number.
library(tidyr)
pivot_longer(A, cols= -c("Season", "crop", "treatment", "plot", "row"), 
             names_pattern="(\\D+)(\\d)", names_to = c(".value", "test"))
# or, selecting every column whose name ends in a digit:
# pivot_longer(A, cols = matches("\\d$"), names_pattern = "(\\D+)(\\d)",
#              names_to = c(".value", "test"))

# A tibble: 15 × 8
Season crop    treatment     plot   row test    rep SS   
<dbl> <chr>   <chr>        <dbl> <int> <chr> <dbl> <chr>
1   2023 Soybean Inside panel     6     1 1      24.3 tr1  
2   2023 Soybean Inside panel     6     1 2      10.3 tr3  
3   2023 Soybean Inside panel     6     1 3      30.7 tr2  
4   2023 Soybean Inside panel     6     2 1      30.3 tr1  
5   2023 Soybean Inside panel     6     2 2       0   tr2  
6   2023 Soybean Inside panel     6     2 3      28.4 tr3  
...

You will need to rename the columns to match the final desired output.
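
One possible way to do that renaming is sketched below. It assumes the target names from the question (`rep` for the replicate number, `yield` for the measured value) and uses a hypothetical cut-down data frame `A_small`. Both names are assigned in a single step because `rep` is both an existing column name and a target name.

```r
library(tidyr)

# Hypothetical cut-down copy of the question's data frame A
A_small <- data.frame(
  row  = 1:2,
  rep1 = c(24.3, 30.3), rep2 = c(10.3, 0.0),
  SS1  = c("tr1", "tr1"), SS2  = c("tr3", "tr2")
)

long <- pivot_longer(A_small, cols = matches("\\d$"),
                     names_pattern = "(\\D+)(\\d)",
                     names_to = c(".value", "test"))

# Rename both columns at once: the measured values become "yield",
# and the replicate number takes over the name "rep"
names(long)[match(c("rep", "test"), names(long))] <- c("yield", "rep")

# The replicate number is character after pivoting; convert it to integer
long$rep <- as.integer(long$rep)

long
```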

Thank you so much!!! I solved my problem!!

2
Basically, what you want is
> reshape(A, var=list(6:8, 9:11), sep='', dir='l')

To get the desired column names, we can fine-tune the call.
> reshape(A, var=list(6:8, 9:11), sep='', dir='l', v.names=c('yield', 'SS'), 
+         timevar='rep')

Finally, to sort the rows so they match the desired output shown in the question:
> reshape(A, var=list(6:8, 9:11), sep='', dir='l', v.names=c('yield', 'SS'),
+         timevar='rep') |> 
+   {\(.) .[with(., order(row, rep)), setdiff(names(.), 'id')]}()
    Season    crop    treatment plot row rep yield  SS
1.1   2023 Soybean Inside panel    6   1   1  24.3 tr1
1.2   2023 Soybean Inside panel    6   1   2  10.3 tr3
1.3   2023 Soybean Inside panel    6   1   3  30.7 tr2
2.1   2023 Soybean Inside panel    6   2   1  30.3 tr1
2.2   2023 Soybean Inside panel    6   2   2   0.0 tr2
2.3   2023 Soybean Inside panel    6   2   3  28.4 tr3
3.1   2023 Soybean Inside panel    6   3   1  21.8 tr1
3.2   2023 Soybean Inside panel    6   3   2  29.3 tr2
3.3   2023 Soybean Inside panel    6   3   3  15.5 tr3
4.1   2023 Soybean Inside panel    6   4   1  22.9 tr3
4.2   2023 Soybean Inside panel    6   4   2   0.0 tr2
4.3   2023 Soybean Inside panel    6   4   3  22.3 tr1
5.1   2023 Soybean Inside panel    6   5   1  20.6 tr1
5.2   2023 Soybean Inside panel    6   5   2  30.8 tr3
5.3   2023 Soybean Inside panel    6   5   3  17.4 tr2
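
The calls above rely on R's partial argument matching (`var` for `varying`, `dir='l'` for `direction = "long"`). For readers who find that terse, here is a sketch of the same idea with the arguments spelled out and the columns given by name. `A_small` is a hypothetical cut-down copy of the data, not the full table from the question.

```r
# Hypothetical cut-down copy of the question's data frame A
A_small <- data.frame(
  row  = 1:2,
  rep1 = c(24.3, 30.3), rep2 = c(10.3, 0.0),
  SS1  = c("tr1", "tr1"), SS2  = c("tr3", "tr2")
)

long <- reshape(
  A_small,
  varying   = list(c("rep1", "rep2"), c("SS1", "SS2")),  # one vector per output column
  v.names   = c("yield", "SS"),                          # names of the stacked columns
  timevar   = "rep",                                     # holds the replicate index
  direction = "long"
)

# Sort by row and replicate, and drop the helper "id" column that reshape() adds
long <- long[order(long$row, long$rep), setdiff(names(long), "id")]

long
```

Naming the columns instead of using positions such as 6:8 keeps the call valid if columns are later added or reordered.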

Data:

> dput(A)
structure(list(Season = c(2023, 2023, 2023, 2023, 2023), crop = c("Soybean", 
"Soybean", "Soybean", "Soybean", "Soybean"), treatment = c("Inside panel", 
"Inside panel", "Inside panel", "Inside panel", "Inside panel"
), plot = c(6, 6, 6, 6, 6), row = 1:5, rep1 = c(24.3, 30.3, 21.8, 
22.9, 20.6), rep2 = c(10.3, 0, 29.3, 0, 30.8), rep3 = c(30.7, 
28.4, 15.5, 22.3, 17.4), SS1 = c("tr1", "tr1", "tr1", "tr3", 
"tr1"), SS2 = c("tr3", "tr2", "tr2", "tr2", "tr3"), SS3 = c("tr2", 
"tr3", "tr3", "tr1", "tr2")), class = "data.frame", row.names = c(NA, 
-5L))
