How to create a table with two headers using expss

3
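I have been reading about using the expss package to build labelled tables with two headers (see here and here), but the code I found online did not work for me. My goal is to create a table very similar to this image: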

enter image description here

The data frame is as follows:

df <- data.frame(Categoria = c("gender", "gender" , "gender", "gender", "gender", "gender", 
                                 "religion", "religion", "religion", "religion", "religion",
                                 "religion", "religion", "religion", "religion", "religion", 
                                 "religion", "religion"),
                 Opcoes_da_categoria = c("Mulher", "Homem", "Mulher", "Homem", "Mulher", 
                                           "Homem", "Outra religião", "Católico", "Agnóstico ou ateu",
                                           "Evangélico", "Outra religião", "Católico", 
                                           "Agnóstico ou ateu", "Evangélico", "Outra religião",
                                           "Católico", "Agnóstico ou ateu", "Evangélico"),
                 Resposta = c("A Favor", "A Favor", "Contra",  "Contra",  "Não sei", "Não sei",
                              "A Favor", "A Favor", "A Favor", "A Favor", "Contra", "Contra",
                              "Contra", "Contra", "Não sei", "Não sei", "Não sei", "Não sei"),
                 value_perc = c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5))

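My code for creating the two headers is below, but it does not work as intended. The result fails these requirements:

  • The table should have two headers
  • The column names should not appear in the table
  • The values should not have decimal places

library(expss)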

my_table <- df %>%
  tab_cells(Resposta) %>%
  tab_weight(value_perc) %>% 
  tab_cols(Opcoes_da_categoria, Categoria) %>%
  tab_stat_cpct(total_label = NULL) %>%
  tab_pivot()

library(gridExtra)

png("my_table.png", height = 50*nrow(my_table), width = 200*ncol(my_table))
grid.table(my_table)
dev.off()
  

enter image description here


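I'm not familiar with expss, but this can be done with knitr::kable() and kableExtra. I don't know the exact styling you want, but here is another option: vignette here - Andrew
I also tried knitr::kable() and kableExtra, but they did not work for me either. Using those packages instead of expss is not a problem. - polo
@polo I recently developed a package that automates something similar to what you are trying to achieve. The output differs slightly from your image, but you may want to take a look at it here. - Dan Chaltiel
Thanks, @DanChaltiel - polo
3 Answers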

2
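I don't know expss, but I recently used flextable and found it works well. I'm no expert, but I managed to produce a decent-looking table that is close to what you want. Starting from your data frame, a few changes are needed to get it into the shape of the target table: reshape it to wide format, then rename the columns by extracting the part before the underscore. Next, build the typology data frame, which describes how each column maps to its header labels (see the link above). Then comes the flextable part: first build a flextable, then apply the typology and the other formatting commands. The final result is shown in the attached image.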

library(tidyverse)
library(flextable)
#> 
#> Attache Paket: 'flextable'
#> The following object is masked from 'package:purrr':
#> 
#>     compose
df <- data.frame(
  Categoria = c(
    "gender", "gender", "gender", "gender", "gender", "gender",
    "religion", "religion", "religion", "religion", "religion",
    "religion", "religion", "religion", "religion", "religion",
    "religion", "religion"
  ),
  Opcoes_da_categoria = c(
    "Mulher", "Homem", "Mulher", "Homem", "Mulher",
    "Homem", "Outra religião", "Católico", "Agnóstico ou ateu",
    "Evangélico", "Outra religião", "Católico",
    "Agnóstico ou ateu", "Evangélico", "Outra religião",
    "Católico", "Agnóstico ou ateu", "Evangélico"
  ),
  Resposta = c(
    "A Favor", "A Favor", "Contra", "Contra", "Não sei", "Não sei",
    "A Favor", "A Favor", "A Favor", "A Favor", "Contra", "Contra",
    "Contra", "Contra", "Não sei", "Não sei", "Não sei", "Não sei"
  ),
  value_perc = c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5)
)


# reshape df to wide format with tidyverse, then strip the "_<category>" suffix from the column names
dfa <- df %>%
  pivot_wider(names_from =c('Opcoes_da_categoria', 'Categoria'), values_from = 'value_perc')
nam <- str_extract(colnames(dfa),'^[^_]+')
colnames(dfa) <- nam

typology <- data.frame(
  col_keys = c( "Resposta",
                "Mulher", "Homem",
                "Outra religião", "Católico",
                "Agnóstico ou ateu", "Evangélico"),
  what = c("", "Genero", "Genero", "Religio",
           "Religio", "Religio", 'Religio'),
  measure = c( "Resposta", 
               "Mulher", "Homem",
               "Outra religião", "Católico",
               "Agnóstico ou ateu", "Evangélico"),
  stringsAsFactors = FALSE )

library(officer) # needed for making border
dftab <- flextable::flextable(dfa)

border_v = fp_border(color="gray")
dftab <- dftab %>% 
  set_header_df(mapping = typology, key = "col_keys" ) %>% 
  merge_h(part = "header") %>% 
  merge_v(part = "header") %>% 
  theme_booktabs() %>% 
  vline(border = border_v, j =3, part = 'body') %>% 
  vline(border = border_v, j =3, part = 'header')
print(dftab)
#> a flextable object.
#> col_keys: `Resposta`, `Mulher`, `Homem`, `Outra religião`, `Católico`, `Agnóstico ou ateu`, `Evangélico` 
#> header has 2 row(s) 
#> body has 3 row(s) 
#> original dataset sample: 
#>   Resposta Mulher Homem Outra religião Católico Agnóstico ou ateu Evangélico
#> 1  A Favor     65    50             67       64                56         28
#> 2   Contra     33    43             31       34                35         66
#> 3  Não sei      2     7              2        2                10          5

enter image description here
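
The typology data frame above is typed by hand. As a minimal sketch, and not part of the original answer, it could instead be derived from the wide column names, assuming they follow the "option_category" pattern that pivot_wider() produces with its default "_" separator. The names wide_names and typology_auto are illustrative, and the top header labels come out as "gender" and "religion" rather than the hand-written "Genero" and "Religio".

```r
# Column names as pivot_wider() produces them in the code above
# (option first, category second, joined by "_").
wide_names <- c("Resposta", "Mulher_gender", "Homem_gender",
                "Outra religião_religion", "Católico_religion",
                "Agnóstico ou ateu_religion", "Evangélico_religion")

has_group <- grepl("_", wide_names, fixed = TRUE)

# Part before the underscore: the column key and the lower header label
short <- sub("_.*$", "", wide_names)

# Part after the underscore: the upper header label ("" for Resposta)
group <- ifelse(has_group, sub("^[^_]*_", "", wide_names), "")

typology_auto <- data.frame(
  col_keys = short,
  what = group,
  measure = short,
  stringsAsFactors = FALSE
)
print(typology_auto)
```

This only works if the stripped names stay unique; two categories sharing an option label would need a different key.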


1

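Here is a flexible kable solution that can be adapted to different tables, as long as you can get your data into wide format. I hope it helps; let me know if you have questions!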

library(dplyr)
library(tidyr)
library(knitr)
library(kableExtra)

df_wide <- df %>% # transform data to wide format, "drop" name for Resposta
  pivot_wider(names_from = c(Categoria, Opcoes_da_categoria), 
              values_from = value_perc, names_sep = "_") %>%
  rename(" " = Resposta)

cols <- sub("(.*?)_(.*)", "\\2", names(df_wide)) # grab everything after the _
grps <- sub("(.*?)_(.*)", "\\1", names(df_wide)) # grab everything before the _

df_wide %>%
  kable(col.names = cols) %>% 
  kable_styling(c("striped"), full_width = FALSE) %>% # check out ?kable_styling for other options
  add_header_above(table(grps)[unique(grps)]) # unique makes sure it is the correct order
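
To see what the two sub() calls and the table(grps)[unique(grps)] expression produce, here is a small base R sketch that needs no packages. The vector nms is an illustrative stand-in for names(df_wide), assuming pivot_wider() orders the new columns by first appearance in the data.

```r
# Stand-in for names(df_wide) after Resposta has been renamed to " "
nms <- c(" ", "gender_Mulher", "gender_Homem",
         "religion_Outra religião", "religion_Católico",
         "religion_Agnóstico ou ateu", "religion_Evangélico")

cols <- sub("(.*?)_(.*)", "\\2", nms)  # text after the first underscore
grps <- sub("(.*?)_(.*)", "\\1", nms)  # text before the first underscore

# table() returns counts in sorted name order; indexing by unique(grps)
# puts them back in the order the columns actually appear.
spans <- table(grps)[unique(grps)]

print(cols)
print(spans)
```

The resulting named counts are what add_header_above() uses as the label and column span of each top-level header cell.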

0

You are trying to view the table in the RStudio Data Viewer, which displays expss tables as ordinary data.frames.

You can view expss tables in the RStudio Viewer (rather than the Data Viewer) by calling expss_output_viewer():

df <- data.frame(Categoria = c("gender", "gender" , "gender", "gender", "gender", "gender", 
                               "religion", "religion", "religion", "religion", "religion",
                               "religion", "religion", "religion", "religion", "religion", 
                               "religion", "religion"),
                 Opcoes_da_categoria = c("Mulher", "Homem", "Mulher", "Homem", "Mulher", 
                                         "Homem", "Outra religião", "Católico", "Agnóstico ou ateu",
                                         "Evangélico", "Outra religião", "Católico", 
                                         "Agnóstico ou ateu", "Evangélico", "Outra religião",
                                         "Católico", "Agnóstico ou ateu", "Evangélico"),
                 Resposta = c("A Favor", "A Favor", "Contra",  "Contra",  "Não sei", "Não sei",
                              "A Favor", "A Favor", "A Favor", "A Favor", "Contra", "Contra",
                              "Contra", "Contra", "Não sei", "Não sei", "Não sei", "Não sei"),
                 value_perc = c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5))

library(expss)

my_table <- df %>%
    tab_cells(Resposta) %>%
    tab_weight(value_perc) %>% 
    tab_cols(Opcoes_da_categoria, Categoria) %>%
    tab_stat_cpct(total_label = NULL) %>%
    tab_pivot()

expss_digits(0) # turn off decimal digits
expss_output_viewer() # turn on displaying tables in the viewer
my_table

expss_output_default() # turn off displaying tables in the viewer

This code produces the following result: enter image description here
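
One detail worth checking, offered as an aside rather than as part of the original answer: the code passes value_perc as a weight and lets tab_stat_cpct() recompute the column percentages. Assuming each weighted cell is divided by its column total, the displayed values match the inputs only where a column sums to exactly 100. The base R sketch below checks those sums; the compact df and col_totals are illustrative names.

```r
# Same values as the question's data frame, built compactly with rep()
df <- data.frame(
  Opcoes_da_categoria = c(
    rep(c("Mulher", "Homem"), 3),
    rep(c("Outra religião", "Católico", "Agnóstico ou ateu", "Evangélico"), 3)
  ),
  Resposta = c(
    rep(c("A Favor", "Contra", "Não sei"), each = 2),
    rep(c("A Favor", "Contra", "Não sei"), each = 4)
  ),
  value_perc = c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5)
)

# Sum of the supplied percentages within each column of the target table
col_totals <- tapply(df$value_perc, df$Opcoes_da_categoria, sum)
print(col_totals)
```

Two of the columns do not sum to 100, so their recomputed percentages may differ slightly from the supplied values once rounded.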

If you really want to display the table in the Data Viewer, you can convert it to an ordinary data.frame. There is a dedicated function for this, split_table_to_df:
View(split_table_to_df(my_table))

The result is as follows: enter image description here

Update:

df <- data.frame(Categoria = c("gender", "gender" , "gender", "gender", "gender", "gender", 
                               "religion", "religion", "religion", "religion", "religion",
                               "religion", "religion", "religion", "religion", "religion", 
                               "religion", "religion"),
                 Opcoes_da_categoria = c("Mulher", "Homem", "Mulher", "Homem", "Mulher", 
                                         "Homem", "Outra religião", "Católico", "Agnóstico ou ateu",
                                         "Evangélico", "Outra religião", "Católico", 
                                         "Agnóstico ou ateu", "Evangélico", "Outra religião",
                                         "Católico", "Agnóstico ou ateu", "Evangélico"),
                 Resposta = c("A Favor", "A Favor", "Contra",  "Contra",  "Não sei", "Não sei",
                              "A Favor", "A Favor", "A Favor", "A Favor", "Contra", "Contra",
                              "Contra", "Contra", "Não sei", "Não sei", "Não sei", "Não sei"),
                 value_perc = c(65, 50, 33, 43, 2, 7, 67, 64, 56, 28, 31, 34, 35, 66, 2, 2, 10, 5))

library(expss)

my_table <- df %>%
    apply_labels(
        Resposta = "",
        Opcoes_da_categoria = "",
        Categoria = ""
    ) %>% 
    tab_cells(Resposta) %>%
    tab_weight(value_perc) %>% 
    tab_cols(Categoria, Opcoes_da_categoria) %>%
    tab_stat_cpct(total_row_position = "none") %>%
    tab_pivot()

expss_digits(0) # turn off decimal digits
View(my_table)

enter image description here


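Thank you for your answer, Gregory Demin. But my problem is that the column names (Opcoes_da_categoria and Categoria) should not appear in the table. The table should have two headers (the text of the Categoria column, then the text of Opcoes_da_categoria). So "gender" and "religion" should come first... Also, how do I remove the "#Total cases" row? - polo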
