ggplot2: more complex faceting

Question


I have a heatmap that keeps getting more and more complex. A sample of the melted data:

head(df2)
  Class     Subclass         Family               variable value
1     A chemosensory family_1005117 caenorhabditis_elegans    10
2     A chemosensory family_1011230 caenorhabditis_elegans     4
3     A chemosensory family_1022539 caenorhabditis_elegans    10
4     A        other family_1025293 caenorhabditis_elegans    NA
5     A chemosensory family_1031345 caenorhabditis_elegans    10
6     A chemosensory family_1033309 caenorhabditis_elegans    10
tail(df2)
     Class Subclass        Family        variable value
6496     C  class c family_455391 trichuris_muris     1
6497     C  class c family_812893 trichuris_muris    NA
6498     F  class f family_225491 trichuris_muris     1
6499     F  class f family_236822 trichuris_muris     1
6500     F  class f family_276074 trichuris_muris     1
6501     F  class f family_768194 trichuris_muris    NA

Using ggplot2 and geom_tile, I was able to generate a beautiful heatmap of the data. I'm proud of this code (it is my first time coding in R), so I have posted it below:

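library(ggplot2)
library(data.table)

# Treat zero counts as missing and cap large counts at 10 (value column only)
df2$value[which(df2$value == 0)] <- NA
df2$value[which(df2$value > 11)] <- 10
df2.t <- data.table(df2)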
df2.t[, clade := ifelse(variable %in% c("pristionchus_pacificus", "caenorhabditis_elegans", "ancylostoma_ceylanicum", "necator_americanus", "nippostrongylus_brasiliensis", "angiostrongylus_costaricensis", "dictyocaulus_viviparus", "haemonchus_contortus"), "Clade V",
                 ifelse(variable %in% c("meloidogyne_hapla","panagrellus_redivivus", "rhabditophanes_kr3021", "strongyloides_ratti"), "Clade IV",
                 ifelse(variable %in% c("toxocara_canis", "dracunculus_medinensis", "loa_loa", "onchocerca_volvulus", "ascaris_suum", "brugia_malayi", "litomosoides_sigmodontis", "syphacia_muris", "thelazia_callipaeda"), "Clade III",
                 ifelse(variable %in% c("romanomermis_culicivorax", "trichinella_spiralis", "trichuris_muris"), "Clade I",
                 ifelse(variable %in% c("echinococcus_multilocularis", "hymenolepis_microstoma", "mesocestoides_corti", "taenia_solium", "schistocephalus_solidus"), "Cestoda",
                 ifelse(variable %in% c("clonorchis_sinensis", "fasciola_hepatica", "schistosoma_japonicum", "schistosoma_mansoni"), "Trematoda", NA))))))]
df2.t$clade <- factor(df2.t$clade, levels = c("Clade I", "Clade III", "Clade IV", "Clade V", "Cestoda", "Trematoda"))
plot2 <- ggplot(df2.t, aes(variable, Family))
tile2 <- plot2 + geom_tile(aes(fill = value)) + facet_grid(Class ~ clade, scales = "free", space = "free")
tile2 <- tile2 + scale_x_discrete(expand = c(0,0)) + scale_y_discrete(expand = c(0,0))
# Hide the (very numerous) family labels on the y axis and rotate the species labels
tile2 <- tile2 + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank(), legend.position = "right", axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.55), panel.border = element_rect(fill=NA,color="grey", size=0.5, linetype="solid"))
tile2 <- tile2 + xlab(NULL)
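# `palette` is assumed to be a user-defined colour-ramp function (e.g. built with colorRampPalette()); its definition is not shown in the post
tile2 <- tile2 + scale_fill_gradientn(breaks = c(1,2,3,4,5,6,7,8,9,10),labels = c("1", "2", "3", "4", "5", "6", "7", "8", "9", ">10"), limits = c(1, 10), colours = palette(11), na.value = "white", name = "Members")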

As you can see, it takes quite a bit of manual reordering, but otherwise the code is pretty simple. Here is the image output:

[Image: heatmap]
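
As a side note on the clade assignment in the code above: the nested ifelse() chain could, as one alternative, be written as a lookup table. The sketch below is not part of the original post. The names `clade_members`, `species_to_clade`, and `demo` are hypothetical; the species lists are copied from the ifelse() chain, and the third row of `demo` is a made-up species used only to show that unmatched names become NA, as in the original code.

```r
# Species grouped by clade, copied from the ifelse() chain in the post.
# The order of the list also defines the factor level order.
clade_members <- list(
  "Clade I"   = c("romanomermis_culicivorax", "trichinella_spiralis", "trichuris_muris"),
  "Clade III" = c("toxocara_canis", "dracunculus_medinensis", "loa_loa",
                  "onchocerca_volvulus", "ascaris_suum", "brugia_malayi",
                  "litomosoides_sigmodontis", "syphacia_muris", "thelazia_callipaeda"),
  "Clade IV"  = c("meloidogyne_hapla", "panagrellus_redivivus",
                  "rhabditophanes_kr3021", "strongyloides_ratti"),
  "Clade V"   = c("pristionchus_pacificus", "caenorhabditis_elegans",
                  "ancylostoma_ceylanicum", "necator_americanus",
                  "nippostrongylus_brasiliensis", "angiostrongylus_costaricensis",
                  "dictyocaulus_viviparus", "haemonchus_contortus"),
  "Cestoda"   = c("echinococcus_multilocularis", "hymenolepis_microstoma",
                  "mesocestoides_corti", "taenia_solium", "schistocephalus_solidus"),
  "Trematoda" = c("clonorchis_sinensis", "fasciola_hepatica",
                  "schistosoma_japonicum", "schistosoma_mansoni")
)

# Invert the list into a named character vector: species name -> clade name
species_to_clade <- setNames(
  rep(names(clade_members), lengths(clade_members)),
  unlist(clade_members, use.names = FALSE)
)

# Hypothetical demo data; "not_in_table" is a made-up species name
demo <- data.frame(
  variable = c("trichuris_muris", "caenorhabditis_elegans", "not_in_table"),
  stringsAsFactors = FALSE
)

# Look up each species; names missing from the table yield NA
demo$clade <- factor(unname(species_to_clade[demo$variable]),
                     levels = names(clade_members))
print(demo)
```

With the data.table from the post, the same lookup would presumably be applied to `as.character(variable)`, since indexing a named vector by a factor uses the integer codes rather than the labels.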

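However, you may notice that an entire column of information, "Subclass", goes unused. Basically, each Subclass fits within one Class. It would be perfect if I could display these subclasses inside the Class facets that are already shown. In other words, only Class A has distinct subclasses; the other classes simply mirror their class name (F = class f). Is there another way to organize this heatmap so that I can display all of the relevant information? The missing subclasses contain some of the most critical data, and they are the most necessary for drawing conclusions from it.

An alternative would be to facet on Subclass instead of Class, manually reorder the subclasses so that each Class is clustered together, and then draw a box around them to demarcate each Class. I have no idea how to accomplish this.

Any help would be much appreciated. Please let me know if you need any other information.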

- Nic

Change facet_grid(Class ~ clade, to facet_grid(Class + Subclass ~ clade,. - Gregor Thomas

Though for the ordering of the labels, you might want Subclass + Class. - Gregor Thomas
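
A minimal sketch of the suggestion in the two comments above, on toy data rather than the real df2. The data frames `families`, `species`, and `toy` and all of their values are made up for illustration; only the column names and the facet_grid() formula come from the thread.

```r
library(ggplot2)

# Hypothetical families: Class A has two subclasses, Class C mirrors its own name
families <- data.frame(
  Family   = paste0("family_", 1:4),
  Class    = c("A", "A", "A", "C"),
  Subclass = c("chemosensory", "chemosensory", "other", "class c"),
  stringsAsFactors = FALSE
)

# Two species with their clades
species <- data.frame(
  variable = c("caenorhabditis_elegans", "trichuris_muris"),
  clade    = c("Clade V", "Clade I"),
  stringsAsFactors = FALSE
)

# merge() with no shared columns returns the Cartesian product (family x species)
toy <- merge(families, species)
toy$value <- seq_len(nrow(toy))  # made-up counts

# Facet rows on both Class and Subclass, columns on clade
p <- ggplot(toy, aes(variable, Family)) +
  geom_tile(aes(fill = value)) +
  facet_grid(Class + Subclass ~ clade, scales = "free", space = "free")

print(p)
```

Swapping the formula to `Subclass + Class ~ clade` changes the order in which the two strip labels are drawn, which is what the second comment refers to.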

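That is one way to do it, thanks. It isn't exactly what I envisioned (nested facets), because it just splits everything into subclasses and then adds a Class label. I guess it isn't as pretty as I had imagined, but maybe that is as good as it gets. Thanks for your input, @Gregor. - Nic

I guess I'm not sure what you want, if that isn't it. - Gregor Thomas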

2 Answers


Turning my comment into an answer with some simple demo data:

This isn't hard (there are even examples in ?facet_grid, though they are at the bottom).

library(ggplot2)

# generate some nested data
dat = data.frame(x = rnorm(12), y = rnorm(12), class = rep(LETTERS[1:2], each = 6),
                 subclass = rep(letters[1:6], each = 2))

# plot it
ggplot(dat, aes(x, y)) + geom_point() +
    facet_grid(subclass + class ~ .)

You can put multiple factors on either side of the ~ to do this!
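
Since facet_grid() is given no information about how the factors relate to each other, it can be useful to check that relationship in the data directly. The sketch below is an illustration, not part of the original answer. It rebuilds only the two factor columns of the demo data and tests whether subclass is nested in class; the names `classes_per_subclass`, `is_nested`, and `n_combinations` are hypothetical.

```r
# Same factor layout as the demo data above (x and y are not needed here)
dat <- data.frame(
  class    = rep(LETTERS[1:2], each = 6),
  subclass = rep(letters[1:6], each = 2),
  stringsAsFactors = FALSE
)

# For each subclass, count how many distinct classes it occurs in
classes_per_subclass <- tapply(dat$class, dat$subclass,
                               function(x) length(unique(x)))

# subclass is nested in class if every subclass belongs to exactly one class
is_nested <- all(classes_per_subclass == 1)

# Distinct subclass/class combinations present in the data
n_combinations <- nrow(unique(dat[, c("subclass", "class")]))

print(classes_per_subclass)
print(is_nested)
print(n_combinations)
```

For nested factors, the number of combinations equals the number of subclasses, so each class label is repeated once per subclass in the strips.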

- Gregor Thomas

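Functionally, this is exactly what I want. Aesthetically, it isn't. But that may be my problem rather than ggplot2's. I don't like how it repeats the class label for every nested subclass. Am I making sense? To me, the right way would be one larger class strip (the grey label area) spanning all the data it contains, with the subclass labels nested inside it. @Gregor - Nic

@Nic Yes, I get it now. The problem is that ggplot doesn't know these factors are nested, so the solution is general enough to handle factors that are either nested or crossed. - Gregor Thomas

@Nic, something like this: http://stackoverflow.com/.../annotating-facet-title-as-strip-over-facet..... - Sandy Muspratt

@Sandy Yes! That is exactly what I was looking for. Thank you! - Nic

@Sandy I guess I don't have enough reputation to comment on the source... Anyway, I think I managed to position the grob in the right place, but now it is much larger than it should be. I think what is happening is that it gets placed in the same column as the legend, which is relatively wide, so it completely covers the legend and spans quite a large area. I imagine this has to do with how the original gtable is organized. Unfortunately, the table is so large that gtable_show_layout isn't much help. Can you help me? - Nic

I have posted a new answer here. It deals with the legend and is more general than the answer behind the link. - Sandy Muspratt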

- Sandy Muspratt · Accepted Answer

This adds a new strip to the right of the original strip and to the left of the legend.

library(ggplot2)
library(gtable)
library(grid)

p <- ggplot(mtcars, aes(mpg, wt, colour = factor(vs))) + geom_point()
p <- p + facet_grid(cyl ~ gear)

# Convert the plot to a grob
gt <- ggplotGrob(p)

# Get the positions of the right strips in the layout: t = top, l = left, ...
strip <-c(subset(gt$layout, grepl("strip-r", gt$layout$name), select = t:r))

#  New column to the right of current strip
gt <- gtable_add_cols(gt, gt$widths[max(strip$r)], max(strip$r))  

# Add grob, the new strip, into new column
gt <- gtable_add_grob(gt, 
  list(rectGrob(gp = gpar(col = NA, fill = "grey85", size = .5)),
  textGrob("Number of Cylinders", rot = -90, vjust = .27, 
        gp = gpar(cex = .75, fontface = "bold", col = "black"))), 
        t = min(strip$t), l = max(strip$r) + 1, b = max(strip$b), name = c("a", "b"))

# Add small gap between strips
gt <- gtable_add_cols(gt, unit(1/5, "line"), max(strip$r))

# Draw it
grid.newpage()
grid.draw(gt)
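
When the gtable is too large for gtable_show_layout() to be readable, one option is to filter the layout data frame by name, as the code above already does for the strips. The following is a rough sketch, not part of the original answer. The "strip-r" pattern is taken from the answer; the "guide-box" pattern for the legend is an assumption about how the installed ggplot2 version names that grob, so check `gt$layout$name` if it returns nothing.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(mpg, wt, colour = factor(vs))) +
  geom_point() +
  facet_grid(cyl ~ gear)

# Convert the plot to a gtable
gt <- ggplotGrob(p)

# Layout rows for the right-hand strips: t/b are row positions, l/r are columns
strips <- gt$layout[grepl("strip-r", gt$layout$name),
                    c("name", "t", "l", "b", "r")]

# Layout rows for the legend (name pattern assumed, see the note above)
legend <- gt$layout[grepl("guide-box", gt$layout$name),
                    c("name", "t", "l", "b", "r")]

print(strips)
print(legend)
```

Comparing the `l` and `r` values of the two tables shows which columns the strips and the legend occupy, which helps when deciding where gtable_add_cols() should insert the new column.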
