将单列数据透视成多列数据。

4

我试图一次转置多个值。

我有如下代码:

>  head(data)
# A tibble: 6 x 3
  variable       `Mean (SD)`    `Median (IQR)`
  <chr>          <glue>         <glue>        
1 VarA_VVV_Cond1 268.59 (80.6)  276 (86)      
2 VarA_WWW_Cond1 149.07 (39.79) 155 (40.5)    
3 VarA_XXX_Cond1 147.71 (39.65) 155 (41)      
4 VarA_YYY_Cond1 18.85 (10.76)  18 (15.5)     
5 VarA_ZZZ_Cond1 20.98 (11.34)  20 (14)       
6 VarA_VVV_Cond2 228.49 (83.77) 241 (116)  

所需的输出是这样的:
          VVV(Mean/SD) VVV(Median/IQR) XXX(Mean/SD) XXX(Median/IQR)... 
VAR_A_Cond_1
VAR_A_Cond_2
VAR_B_Cond_1
VAR_B_Cond_2

这是我的数据集:
>  dput(data)
structure(list(variable = c("VarA_VVV_Cond1", "VarA_WWW_Cond1", 
"VarA_XXX_Cond1", "VarA_YYY_Cond1", "VarA_ZZZ_Cond1", "VarA_VVV_Cond2", 
"VarA_WWW_Cond2", "VarA_XXX_Cond2", "VarA_YYY_Cond2", "VarA_ZZZ_Cond2", 
"VarB_VVV_Cond1", "VarB_WWW_Cond1", "VarB_XXX_Cond1", "VarB_YYY_Cond1", 
"VarB_ZZZ_Cond1", "VarB_VVV_Cond2", "VarB_WWW_Cond2", "VarB_XXX_Cond2", 
"VarB_YYY_Cond2", "VarB_ZZZ_Cond2"), `Mean (SD)` = structure(c("268.59 (80.6)", 
"149.07 (39.79)", "147.71 (39.65)", "18.85 (10.76)", "20.98 (11.34)", 
"228.49 (83.77)", "113.66 (35.91)", "112.64 (35.75)", "24.07 (15.79)", 
"26.36 (16.51)", "250.72 (61.53)", "140.71 (30.52)", "138.93 (30.37)", 
"21.02 (10.46)", "22.72 (11.05)", "225.98 (81.32)", "112.43 (36.09)", 
"111.1 (36.41)", "24.71 (16.77)", "26.59 (17.49)"), class = c("glue", 
"character")), `Median (IQR)` = structure(c("276 (86)", "155 (40.5)", 
"155 (41)", "18 (15.5)", "20 (14)", "241 (116)", "116 (51)", 
"116 (48)", "23 (21.5)", "24 (22.5)", "259 (60)", "142 (36)", 
"142 (34)", "21 (15)", "21 (15)", "244.5 (93.5)", "107.5 (51.5)", 
"107 (50.75)", "24 (20.75)", "24.5 (21.75)"), class = c("glue", 
"character"))), row.names = c(NA, -20L), class = c("tbl_df", 
"tbl", "data.frame"))

我尝试了很多方法,但都没有解决问题:

就像这样:

data1 <- data %>% 
   tidyr::pivot_wider(.,
                      names_from = "variable",
                      values_from = c("Mean (SD)", "Median (IQR)")) %>% 
   pivot_longer(cols = 1:40,
   names_to = c("Names"),
   values_to = c("Mean_SD", "Median_IQR"))

有什么想法吗?

提前感谢您!

附加信息:原始“数据”中有20行/观测值。每个称为:VarA_VVV_Cond1,这意味着我有2个变量:(AB),5个测试(VVV,WWW,XXX,YYY,ZZZ)和2个条件(Cond1Cond2)。

鉴于此,我还有测试Mean(SD)Median(IQR)。这就是想法。

注:我看到了很多关于枢轴的帖子,但似乎没有考虑这一点(例如,例子…我真的希望能得到一些帮助!


原始的Mean (SD)Median (IQR)值 :) 这样我就有了十列(每个测试一个),像这样:````VVV(Mean/SD), VVV(Median/IQR), .....``` - Larissa Cury
3个回答

4

在对“宽”进行旋转之前,我们可能需要分离列“变量”

library(dplyr)
library(tidyr)
library(stringr)
data %>%
   mutate(variable = str_replace(variable, "^(\\w+)_(\\w+)_(\\w+)",
        "\\1_\\3,\\2")) %>% 
   separate(variable, into = c("variable", "newcol"), sep = ",") %>% 
   pivot_wider(names_from = newcol, values_from = c(`Mean (SD)`,
      `Median (IQR)`), names_glue = "{newcol}({.value})")%>% 
    rename_with(~ str_remove(str_replace(.x, "\\s+\\(", "/"), "\\)"), -variable)

-输出
# A tibble: 4 × 11
  variable   `VVV(Mean/SD)` `WWW(Mean/SD)` `XXX(Mean/SD)` `YYY(Mean/SD)` `ZZZ(Mean/SD)` `VVV(Median/IQR)` `WWW(Median/IQR)` `XXX(Median/IQR)` `YYY(Median/IQR)`
  <chr>      <glue>         <glue>         <glue>         <glue>         <glue>         <glue>            <glue>            <glue>            <glue>           
1 VarA_Cond1 268.59 (80.6)  149.07 (39.79) 147.71 (39.65) 18.85 (10.76)  20.98 (11.34)  276 (86)          155 (40.5)        155 (41)          18 (15.5)        
2 VarA_Cond2 228.49 (83.77) 113.66 (35.91) 112.64 (35.75) 24.07 (15.79)  26.36 (16.51)  241 (116)         116 (51)          116 (48)          23 (21.5)        
3 VarB_Cond1 250.72 (61.53) 140.71 (30.52) 138.93 (30.37) 21.02 (10.46)  22.72 (11.05)  259 (60)          142 (36)          142 (34)          21 (15)          
4 VarB_Cond2 225.98 (81.32) 112.43 (36.09) 111.1 (36.41)  24.71 (16.77)  26.59 (17.49)  244.5 (93.5)      107.5 (51.5)      107 (50.75)       24 (20.75)       
# … with 1 more variable: `ZZZ(Median/IQR)` <glue>

哇,我简直不敢相信解决方案竟然这么简单!!!非常感谢你!如果可以的话,能否解释一下这个正则表达式部分"^(\\w+)_(\\w+)_(\\w+)", "\\1_\\3,\\2")) - Larissa Cury
@LarissaCury (..) 语法用于捕获一些字符。\\w+ - 匹配字符串开头 (^) 的单词,并进行捕获,然后匹配 _,接着按照之前描述的方式捕获第二个和第三个单词。然后,我们只需重新排列捕获组的反向引用,以便我们可以通过唯一分隔符 , 进行拆分。rename_with 也是如此完成,以匹配您期望的列名称模式。 - akrun
这是我需要更深入学习的东西(正则表达式),谢谢!如果我想在 ZZZ(Median/IQR) 之间加一个空格,比如 ZZZ<这里>(Median/IQR),该怎么办?我尝试过 " ",\s,但都不起作用... @akrun - Larissa Cury
@LarissaCury 你可以在这里或者这里查看。 - akrun
1
谢谢你的建议! - Larissa Cury
显示剩余2条评论

4

这样的东西:

library(tidyverse)

df %>% 
  separate(variable, into = c("Var", "XXX", "Cond"), sep = "_") %>% 
  pivot_wider(names_from = XXX,
              values_from = c(`Mean (SD)`,`Median (IQR)`)) %>% 
  mutate(x = paste(Var, Cond, sep = "_"), .keep="unused", .before=1) %>% 
  column_to_rownames("x")

    Mean (SD)_VVV  Mean (SD)_WWW  Mean (SD)_XXX Mean (SD)_YYY Mean (SD)_ZZZ Median (IQR)_VVV Median (IQR)_WWW Median (IQR)_XXX Median (IQR)_YYY Median (IQR)_ZZZ
VarA_Cond1  268.59 (80.6) 149.07 (39.79) 147.71 (39.65) 18.85 (10.76) 20.98 (11.34)         276 (86)       155 (40.5)         155 (41)        18 (15.5)          20 (14)
VarA_Cond2 228.49 (83.77) 113.66 (35.91) 112.64 (35.75) 24.07 (15.79) 26.36 (16.51)        241 (116)         116 (51)         116 (48)        23 (21.5)        24 (22.5)
VarB_Cond1 250.72 (61.53) 140.71 (30.52) 138.93 (30.37) 21.02 (10.46) 22.72 (11.05)         259 (60)         142 (36)         142 (34)          21 (15)          21 (15)
VarB_Cond2 225.98 (81.32) 112.43 (36.09)  111.1 (36.41) 24.71 (16.77) 26.59 (17.49)     244.5 (93.5)     107.5 (51.5)      107 (50.75)       24 (20.75)     24.5 (21.75)
> 

3
这也可以运行:
df[c('Var', 'Group', 'Cond')] <- str_split_fixed(df$variable, "_", n = Inf)

df %>%
  pivot_wider(id_cols = c(Var, Cond), 
              values_from = c(`Mean (SD)`, `Median (IQR)`), 
              names_from = Group)


Var   Cond  `Mean (SD)_VVV` `Mean (SD)_WWW` `Mean (SD)_XXX` `Mean (SD)_YYY` `Mean (SD)_ZZZ` `Median (IQR)_VVV`
  <chr> <chr> <glue>          <glue>          <glue>          <glue>          <glue>          <glue>            
1 VarA  Cond1 268.59 (80.6)   149.07 (39.79)  147.71 (39.65)  18.85 (10.76)   20.98 (11.34)   276 (86)          
2 VarA  Cond2 228.49 (83.77)  113.66 (35.91)  112.64 (35.75)  24.07 (15.79)   26.36 (16.51)   241 (116)         
3 VarB  Cond1 250.72 (61.53)  140.71 (30.52)  138.93 (30.37)  21.02 (10.46)   22.72 (11.05)   259 (60)          
4 VarB  Cond2 225.98 (81.32)  112.43 (36.09)  111.1 (36.41)   24.71 (16.77)   26.59 (17.49)   244.5 (93.5) 

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接