How do I make base R-style boxplots with ggplot2?

13

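I need to make a lot of boxplots for an upcoming publication. I would like to use ggplot2 because I think it will be better for future projects, but my PI insists that I make these plots in base-R style. He specifically wants dashed lines, so that they look similar to plots we have made before. I have made an example with the iris dataset to show you, using the following code: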

plot(iris$Species,
     iris$Sepal.Length,
     xlab='Species',
     ylab='Sepal Length',
     main='Sepal Variation Across Species',
     col='white')

Base R plot

My question is: how do I make a similar plot using ggplot2?

Here is my attempt:

library("ggplot2")
ggplot(iris) +
  geom_boxplot(aes(x=Species,y=Sepal.Length),linetype="dashed") +
  ggtitle("Sepal Variation Across Species")

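ggplot attempt

I need a combination of dashed and solid lines, but I can't get it to work. I have looked at https://stats.stackexchange.com/questions/8137/how-to-add-horizontal-lines-to-ggplot2-boxplot, which is very close but has no dashed lines, and we need them. Also, the outliers are filled circles, unlike in base R.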


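Use outlier.color = 'black', outlier.fill = 'white' to replicate the circles. - VFreguglia
Unfortunately there is no separate aesthetic mapping for the IQR box or the median line, but you could create your own version of geom_boxplot() and modify what gets passed to https://github.com/tidyverse/ggplot2/blob/master/R/geom-boxplot.r#L254-L259 by adding such parameters. - hrbrmstr
3 Answers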

17

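To produce "base R-style" boxplots with ggplot2, we can layer four boxplot objects on top of each other. The order matters, so keep that in mind if you modify the code. I strongly suggest exploring this code by plotting each boxplot layer on its own; that way you can see how the layers interact.

The layers are stacked as follows (bottom to top):

  • (1) a fully dashed boxplot is laid down first, which supplies the vertical dashed whisker lines
  • (2) a solid box, including the median line, which covers the dashed box from (1)
  • (3) and (4) the solid whisker end caps, drawn with error bars whose minimum is set to the maximum and vice versa, so that each bar collapses to a single horizontal line

I also added custom breaks to match your base R plot; change them as needed. panel.border creates a thin border like the one in base R. To get the open circles you want, we use outlier.shape = 1.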
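One way to follow the advice about exploring the layers is to build the plot up step by step and look at what the boxplot stat computes. The sketch below is only an illustration: the object names base, p1, p2 and box_stats are made up here, and it uses after_stat(), the newer spelling of the ..lower.. notation, which assumes ggplot2 3.3.0 or later.

```r
library(ggplot2)

base <- ggplot(iris, aes(x = Species, y = Sepal.Length))

# Layer (1) alone: everything dashed, open circles for outliers
p1 <- base + geom_boxplot(linetype = "dashed", outlier.shape = 1)

# Layers (1) + (2): a second boxplot whose whiskers are collapsed onto the
# box (ymin = lower hinge, ymax = upper hinge), so only the solid box and
# median line are drawn on top of the dashed one
p2 <- p1 +
  stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)),
               outlier.shape = 1)

# The computed variables that the aes() overrides refer to, one row per species
box_stats <- layer_data(p1, 1)[, c("ymin", "lower", "middle", "upper", "ymax")]
print(box_stats)

# print(p1); print(p2)  # draw each stage to compare them
```

Here ymin and ymax are the whisker ends and lower and upper are the box hinges, which is why swapping them in aes() hides or collapses individual parts of a layer.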

Code:

library("ggplot2")

ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
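  # (1) fully dashed boxplot; open circles for outliers
  geom_boxplot(linetype = "dashed", outlier.shape = 1) +
  # (2) solid box and median line drawn over the dashed box
  stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..), outlier.shape = 1) +
  # (3) solid cap at the upper whisker end
  stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..)) +
  # (4) solid cap at the lower whisker end
  stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..)) +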
  scale_y_continuous(breaks = seq(4.5, 8.0, 0.5)) +
  labs(title = "Sepal Variation Across Species",
       x = "Species",
       y = "Sepal Length") +
  theme_classic() + # remove panel background and gridlines
  theme(plot.title = element_text(hjust = 0.5,  # hjust = 0.5 centers the title
                                  size = 14,
                                  face = "bold"),
        panel.border = element_rect(linetype = "solid",
                                    colour = "black", fill = NA, size = 0.5))

Plot:

[image: the resulting ggplot2 boxplot]

It is not identical, but it looks like a pretty good approximation. I hope this is enough for your needs. Good luck and happy plotting!
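On recent ggplot2 releases this approach still runs but emits deprecation warnings: the ..name.. notation was superseded by after_stat() in 3.3.0 and deprecated in 3.4.0, and the size argument for line widths in theme elements was replaced by linewidth in 3.4.0. A sketch of the same four layers in the newer spelling, assuming ggplot2 3.4.0 or later (the object name p is only for illustration):

```r
library(ggplot2)  # assumes ggplot2 >= 3.4.0

p <- ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  # (1) dashed boxplot underneath
  geom_boxplot(linetype = "dashed", outlier.shape = 1) +
  # (2) solid box and median line on top
  stat_boxplot(aes(ymin = after_stat(lower), ymax = after_stat(upper)),
               outlier.shape = 1) +
  # (3) and (4) zero-height error bars act as solid whisker caps
  stat_boxplot(geom = "errorbar", aes(ymin = after_stat(ymax))) +
  stat_boxplot(geom = "errorbar", aes(ymax = after_stat(ymin))) +
  scale_y_continuous(breaks = seq(4.5, 8.0, 0.5)) +
  labs(title = "Sepal Variation Across Species",
       x = "Species",
       y = "Sepal Length") +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5, size = 14, face = "bold"),
        panel.border = element_rect(linetype = "solid", colour = "black",
                                    fill = NA, linewidth = 0.5))

print(p)
```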


5
Here is a wrapper around @Marcus's great solution, to make it more convenient and more flexible to use:
geom_boxplot2 <- function(mapping = NULL, data = NULL, stat = "boxplot", position = "dodge2", 
                          ..., outlier.colour = NULL, outlier.color = NULL, outlier.fill = NULL, 
                          outlier.shape = 1, outlier.size = 1.5, outlier.stroke = 0.5, 
                          outlier.alpha = NULL, notch = FALSE, notchwidth = 0.5, varwidth = FALSE, 
                          na.rm = FALSE, show.legend = NA, inherit.aes = TRUE,
                          linetype = "dashed"){
  list(
    geom_boxplot(mapping = mapping, data = data, stat = stat, position = position,
                 outlier.colour = outlier.colour, outlier.color = outlier.color, 
                 outlier.fill = outlier.fill, outlier.shape = outlier.shape, 
                 outlier.size = outlier.size, outlier.stroke = outlier.stroke, 
                 outlier.alpha = outlier.alpha, notch = notch, 
                 notchwidth = notchwidth, varwidth = varwidth, na.rm = na.rm, 
                 show.legend = show.legend, inherit.aes = inherit.aes, 
                 linetype = linetype, ...),
    stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..), outlier.shape = 1) ,
    stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..)) ,
    stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..)) ,
    theme_classic(), # remove panel background and gridlines
    theme(plot.title = element_text(hjust = 0.5,  # hjust = 0.5 centers the title
                                    size = 14,
                                    face = "bold"),
          panel.border = element_rect(linetype = "solid",
                                      colour = "black", fill = NA, size = 0.5))
  )
}

ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot2() +
  scale_y_continuous(breaks = seq(4.5, 8.0, 0.5)) + # not sure how to generalize this
  labs(title = "Sepal Variation Across Species", y = "Sepal Length")
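The hard-coded breaks flagged in the comment above could be generalized by passing a function to breaks, since scale_y_continuous() accepts a function that takes the axis limits and returns break positions. A minimal sketch, where breaks_every() is a hypothetical helper (not part of ggplot2) that places a break at every multiple of step covering the limits; a plain geom_boxplot() stands in for the wrapper so the example runs on its own:

```r
library(ggplot2)

# Hypothetical helper: returns a function that, given the axis limits,
# places breaks at every multiple of `step` spanning those limits
breaks_every <- function(step) {
  function(limits) {
    seq(floor(limits[1] / step) * step,
        ceiling(limits[2] / step) * step,
        by = step)
  }
}

p <- ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot(linetype = "dashed", outlier.shape = 1) +
  scale_y_continuous(breaks = breaks_every(0.5)) +
  labs(title = "Sepal Variation Across Species", y = "Sepal Length")

print(p)
```

Breaks that fall outside the plotted range are simply not drawn. The scales package's breaks_width() is another option for the same purpose.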

3

Building further on what @Marcus and @Moody_Mudskipper provided:

geom_boxplotMod <- function(mapping = NULL, data = NULL, stat = "boxplot", 
    position = "dodge2", ..., outlier.colour = NULL, outlier.color = NULL, 
    outlier.fill = NULL, outlier.shape = 1, outlier.size = 1.5, 
    outlier.stroke = 0.5, outlier.alpha = NULL, notch = FALSE, notchwidth = 0.5,
    varwidth = FALSE, na.rm = FALSE, show.legend = NA, inherit.aes = TRUE,
    linetype = "dashed") # to see where these arguments come from, run args(geom_boxplot)
    {
    list(geom_boxplot(
            mapping = mapping, data = data, stat = stat, position = position,
            outlier.colour = outlier.colour, outlier.color = outlier.color, 
            outlier.fill = outlier.fill, outlier.shape = outlier.shape, 
            outlier.size = outlier.size, outlier.stroke = outlier.stroke, 
            outlier.alpha = outlier.alpha, notch = notch, 
            notchwidth = notchwidth, varwidth = varwidth, na.rm = na.rm, 
            show.legend = show.legend, inherit.aes = inherit.aes, linetype = 
            linetype, ...),
        stat_boxplot(geom = "errorbar", aes(ymin = ..ymax..), width = 0.25),
        # the error-bar caps are made narrower (width = 0.25)
        stat_boxplot(geom = "errorbar", aes(ymax = ..ymin..), width = 0.25),
        stat_boxplot(aes(ymin = ..lower.., ymax = ..upper..),
            outlier.shape = 1),
        theme(panel.background = element_blank(),
            panel.border = element_rect(size = 1.5, fill = NA),
            plot.title = element_text(hjust = 0.5),
            axis.title = element_text(size = 12),
            axis.text = element_text(size = 10.5))
        )
    }

library(ggplot2)
ggplot(iris, aes(x=Species,y=Sepal.Length, colour = Species)) +
    geom_boxplotMod() +
    ggtitle("Sepal Variation Across Species")

Created on 2020-07-20 by the reprex package (v0.3.0)

