Save the y-axis labels as a separate graphical object (grob)?

Question


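I have a very large scatter plot in which each point is a "hit". I want to put histograms on the top and side of the plot to summarize those hits, as shown on this site: http://blog.mckuhn.de/2009/09/learning-ggplot2-2d-plot-with.html. I can arrange the plots in a 2x2 grid, but I run into a problem: the y-axis labels of my main scatter plot are very long (and important for the project), so in the 2x2 grid the top histogram stretches across the full width and no longer lines up with the x-axis. My idea is to build a 3x3 grid and use the leftmost column for the labels. That, however, requires saving the y-axis text as a "grob". In the blog post above, this is done for the legend as follows: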

p <- qplot(data = mtcars, mpg, hp, geom = "point", colour = cyl)
legend <- p + opts(keep= "legend_box")

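This lets the "legend" be placed in the 2x2 grid layout. It would be great if I could use the same logic to create a separate grob for the y-axis labels. I have tried at least the following: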

legend <- p +opts(keep="Yaxis")
legend <- p +opts(keep="axis_text_y")
legend <- p +opts(keep="axis_text")
..... and many others

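Is it possible to make a grob from anything other than the legend box? If so, please let me know. If not, I will take any suggestion on how to arrange the three plots so that they stay aligned and the y-axis labels are preserved.

Thanks

[Image: how the labels affect vertical alignment, and the y-axis text I want to capture]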

- zach

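I'm not sure what you want to do, but would + opts(keep="ylabel") be enough? - kohske

opts(keep="ylabel") keeps the title of the y-axis. I want to keep the text labels for every y value. But since "ylabel" works, I will try a few other permutations and see whether I can capture the y-axis text.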

- zach

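g = ggplotGrob(p) ; gg = editGrob(getGrob(g, gPath("axis_v-3-1"), grep=TRUE), vp=viewport()) ; grid.draw(gg) draws only the y-axis of the plot, if that helps. - baptiste

@zach How much do you still want to do this? The question is fairly old, so I wonder whether you have moved past the need. If you still want to make a graphic like this, though, I can write an example showing how the base "grid" package makes all of this quite simple. If you want that answer, just say so. - Dinre

@dinre, thanks for the offer. The question is indeed old and I have moved past it, but it does get a fair number of views, so if you can write a step-by-step guide I will gladly mark it as the accepted answer. - zach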

1 Answer

- Dinre · Accepted Answer

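This question has been sitting here long enough that it is time to document an answer for posterity.

The short answer is that highly customized data visualizations cannot be done with the function wrappers in the "lattice" and "ggplot2" packages. The purpose of a function wrapper is to take some of the decisions out of your hands, so you will always be limited to what the function's author originally envisioned. I strongly recommend that everyone learn "lattice" or "ggplot2", but those packages are better suited to data exploration than to creative work on data visualizations.

This answer is for those who want to create a custom visualization. The process below may take half a day, but that is far less time than it would take to bend "lattice" or "ggplot2" into the shape you want. This is not a criticism of either package; it is simply a by-product of their purpose. When you need a creative visualization for a publication or a client, 4 or 5 hours of your day are trivial compared to the payoff.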

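Making a custom visualization with the "grid" package is very straightforward, but that does not mean the math is always simple. In fact, most of the work in this example is math rather than graphics.

Preface: there are a few things you should know before building a visualization with the base "grid" package. First, "grid" works on the concept of viewports. These are plotting spaces that let you refer to positions from inside that space and ignore the rest of the graphic. This matters because it lets you draw without having to scale your work to fractions of the whole page. Viewports are much like the layout options in the base plotting functions, except that they can overlap, be rotated, and be transparent.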
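As a minimal sketch of the viewport idea, the snippet below pushes one viewport over the right half of the page, draws in that viewport's own coordinates, and pops back out. The viewport name "demo_panel", the scales, and the point values are made up for illustration, and pdf(NULL) is used only so the sketch runs without a display.

```r
library(grid)

pdf(NULL)       # off-screen device, so no window or file is needed
grid.newpage()

# Push a viewport covering the right half of the page, with its own
# data ("native") scale. The name is a hypothetical label.
pushViewport(viewport(
    x=0.5, y=0, width=0.5, height=1,
    just=c("left", "bottom"),
    xscale=c(0, 10), yscale=c(0, 100),
    name="demo_panel"))

# Everything drawn now is positioned relative to this viewport only.
grid.rect(gp=gpar(col="gray40"))
grid.points(x=c(2, 5, 8), y=c(20, 50, 80), default.units="native", pch=19)
inside_Name <- current.viewport()$name

# Popping returns to the parent (here, the top-level viewport).
popViewport()
outside_Name <- current.viewport()$name

invisible(dev.off())
```

The points are given in data coordinates (0 to 10 and 0 to 100) without any reference to where the viewport sits on the page, which is the property the answer relies on throughout.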

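Units are the other thing to know about. Every viewport offers several kinds of units that you can use to specify positions and sizes. The full list is in the "grid" documentation, but I regularly use only a few: npc, native, strwidth, and lines. npc units run from (0, 0) at the bottom-left corner to (1, 1) at the top-right. native units use "xscale" and "yscale" to create a plotting space that is effectively your data. strwidth units tell you how wide a given string of text will be once printed on the graphic. lines units tell you how tall one line of text will be once printed on the graphic. Because several kinds of units are always available, you should get in the habit of either defining numbers explicitly with the "unit" function or specifying the "default.units" argument inside the drawing functions.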
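A small sketch of how these unit types combine, assuming a made-up label string and an off-screen pdf(NULL) device so it can run anywhere. Mixed units are kept symbolic until they are resolved against the current viewport, here with convertWidth.

```r
library(grid)

pdf(NULL)       # off-screen device, so unit conversion has a page to refer to
grid.newpage()

# Three different kinds of units (the URL is a hypothetical label).
page_Width  <- unit(1, "npc")
line_Size   <- unit(1, "lines")
label_Width <- unit(1, "strwidth", "http://www.example.com")

# Units of different types can be added and subtracted directly.
remaining_Width <- page_Width - label_Width - 2 * line_Size

# Resolve everything to inches in the current viewport.
page_In      <- convertWidth(page_Width, "inches", valueOnly=TRUE)
line_In      <- convertWidth(line_Size, "inches", valueOnly=TRUE)
label_In     <- convertWidth(label_Width, "inches", valueOnly=TRUE)
remaining_In <- convertWidth(remaining_Width, "inches", valueOnly=TRUE)

# A bare number takes its meaning from default.units.
grid.text("centered", x=0.5, y=0.5, default.units="npc")

invisible(dev.off())
```

The same arithmetic is what lets the answer reserve room for the longest URL plus padding and give the rest of the width to the main plot.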

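Finally, you can specify a justification for the position of every object. This is a big deal. It means you can give the position of a shape and then say how you want that shape aligned horizontally and vertically relative to it (center, left, right, bottom, top). By referring to the positions of other objects, you can line things up perfectly.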
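To illustrate justification, here is a minimal sketch in which two rectangles share one anchor point and differ only in their "just" setting, so they meet exactly at that point. The anchor position, sizes, and colors are arbitrary choices for the example, and pdf(NULL) is again used only to run without a display.

```r
library(grid)

pdf(NULL)       # off-screen device
grid.newpage()

# One shared anchor point (arbitrary values for illustration).
anchor_X <- unit(3, "inches")
anchor_Y <- unit(1, "inches")

# Right-justified: the box extends to the LEFT of the anchor.
left_Box <- rectGrob(
    x=anchor_X, y=anchor_Y,
    width=unit(2, "inches"), height=unit(1, "inches"),
    just=c("right", "bottom"),
    gp=gpar(fill="gray90"))

# Left-justified: the box extends to the RIGHT of the same anchor.
right_Box <- rectGrob(
    x=anchor_X, y=anchor_Y,
    width=unit(2, "inches"), height=unit(1, "inches"),
    just=c("left", "bottom"),
    gp=gpar(fill="deepskyblue"))

grid.draw(left_Box)
grid.draw(right_Box)

# Query where the touching edges actually ended up.
left_Box_East  <- convertX(grobX(left_Box, "east"), "inches", valueOnly=TRUE)
right_Box_West <- convertX(grobX(right_Box, "west"), "inches", valueOnly=TRUE)

invisible(dev.off())
```

Because both boxes are positioned from the same anchor, no width bookkeeping is needed to make them abut; the answer uses the same trick to attach the histograms to the edges of the main plot.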

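What we are making: this is not a perfect graphic, since I have to guess what the OP wanted, but it is enough to get us on the way to a perfect one.

[Image: demo graphic]

Step 1: load a few libraries to work with. When you want a highly customized visualization, use the "grid" package. It is the base set of functions that wrappers such as "lattice" and "ggplot2" call. When you want to work with dates, use the "lubridate" package, because it makes your life better. The last one is a personal preference: when I am doing any kind of data summary work, I like to use the "plyr" package. It lets me quickly reshape my data into aggregated form.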

library(grid)
library(lubridate)
library(plyr)

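Sample data generation: you do not need this step if you already have data, but for this example I am creating a set of sample data. You can play with it by changing the user settings for the data generation. The script is flexible and will adapt to the data generated. Feel free to add more websites and to adjust the lambda values.

set.seed(1)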

#############################################
# User settings for the data generation.    #
#############################################

# Set number of hours to generate data for.
time_Periods <- 100

# Set starting datetime in m/d/yyyy hh:mm format.
start_Datetime <- "2/24/2013 00:00"

# Specify a list of websites along with a
# Poisson lambda to represent the average
# number of hits in a given time period.
df_Websites <- read.table(text="
url lambda
http://www.asitenoonereallyvisits.com 1
http://www.asitesomepeoplevisit.com 10
http://www.asitesomemorepeoplevisit.com 20
http://www.asiteevenmorepeoplevisit.com 40
http://www.asiteeveryonevisits.com 80
", header=TRUE, sep=" ")

#############################################
# Generate the data.                        #
#############################################

# Initialize lists to hold hit data and
# website names.
hits <- list()
websites <- list()

# For each time period and for each website,
# flip a coin to see if any visitors come.  If
# visitors come, use a Poisson distribution to
# see how many come.
# Also initialize the list of website names.
for (i in 1:nrow(df_Websites)){
    hits[[i]] <- rbinom(time_Periods, 1, 0.5) * rpois(time_Periods, df_Websites$lambda[i])
    websites[[i]] <- rep(df_Websites$url[i], time_Periods)
}

# Initialize list of time periods.
datetimes <- mdy_hm(start_Datetime) + hours(1:time_Periods)

# Tie the data into a data frame and erase rows with no hits.
# This is what the real data is more likely to look like
# after import and cleaning.
df_Hits <- data.frame(datetime=rep(datetimes, nrow(df_Websites)), hits=unlist(hits), website=unlist(websites))
df_Hits <- df_Hits[df_Hits$hits > 0,]

# Clean up data-generation variables.
rm(list=ls()[ls()!="df_Hits"])

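Step 2: now we need to decide how we want the graphic to work. It is useful to separate things like sizes and colors into their own section of the code, so you can make changes quickly. Here I have chosen some basic settings that should produce a decent graphic. You will notice that some of the size settings use the "unit" function. This is one of the magical things about the "grid" package: you can describe space on the graphic in a variety of units. For example, unit(1, "lines") is the height of one line of text. This makes laying out the graphic much easier.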

#############################################
# User settings for the graphic.            #
#############################################

# Specify the window width and height and
# pixels per inch.
device_Width=12
device_Height=4.5
pixels_Per_Inch <- 100

# Specify the bin width (in hours) of the
# upper histogram.
bin_Width <- 2

# Specify a padding size for separating text
# from other plot elements.
padding <- unit(1, "strwidth", "W")

# Specify the bin cut-off values for the hit
# counts and the corresponding colors.  The
# cutoff should be the maximum value to be
# contained in the bin.
bin_Settings <- read.table(text="
cutoff color
10 'darkblue'
20 'deepskyblue'
40 'purple'
80 'magenta'
160 'red'
", header=TRUE, sep=" ")

# Specify the size of the histogram plots 
# in 'grid' units.  Override only if necessary.
# histogram_Size <- unit(6, "lines")
histogram_Size <- unit(nrow(bin_Settings) + 1, "lines")

# Set the background color for distinguishing
# between rows of data.
row_Background <- "gray90"

# Set the color for the date lines.
date_Color <- "gray40"

# Set the color for marker lines on histograms.
marker_Color <- "gray80"

# Set the fontsize for labels.
label_Size <- 10

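Step 3: time to make the graphic. I have limited room for explanation in an SO answer, so I will summarize and then leave the code comments to explain the details. In short, I calculate how big each plot should be and then build the plots one at a time. For each plot, I first format my data so that I can specify the viewport appropriately. Then I place the labels that need to sit behind the data, and then I draw the data. Finally, I "pop" the viewport to finish it off.

#############################################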
# Make the graphic.                         #
#############################################

# Make sure bin cutoffs are in increasing order.
# This way, we can make assumptions later.
bin_Settings <- bin_Settings[order(bin_Settings$cutoff),]

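# Initialize plot window.
# Make sure you always specify the pixels per
# inch, so you have an appropriately scaled
# graphic for output.
# Note: windows() exists only in R on Windows.  On
# other platforms, open a device with dev.new() instead.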
windows(
    width=device_Width,
    height=device_Height,
    xpinch=pixels_Per_Inch,
    ypinch=pixels_Per_Inch)
grid.newpage()

# Push an initial viewport, so we can set the
# font size to use in calculating label widths.
pushViewport(viewport(gp=gpar(fontsize=label_Size)))

# Find the list of websites in the data.
unique_Urls <- as.character(unique(df_Hits$website))

# Calculate the width of the website
# urls once printed on the screen.
label_Width <- list()
for (i in 1:length(unique_Urls)){
    label_Width[[i]] <- convertWidth(unit(1, "strwidth", unique_Urls[i]), "npc")
}
# Use the maximum url width plus two padding.
x_Label_Margin <- unit(max(unlist(label_Width)), "npc") + padding * 2

# Calculate a height for the date labels plus two padding.
y_Label_Margin <- unit(1, "strwidth", "99/99/9999") + padding * 2

# Calculate size of main plot after making
# room for histogram and label margins.
main_Width <- unit(1, "npc") - histogram_Size - x_Label_Margin
main_Height <- unit(1, "npc") - histogram_Size - y_Label_Margin

# Calculate x values, using the minimum datetime
# as zero, and counting the hours between each
# datetime and the minimum.
x_Values <- as.integer((df_Hits$datetime - min(df_Hits$datetime)))/60^2

# Initialize main plotting area
pushViewport(viewport(
    x=x_Label_Margin,
    y=y_Label_Margin,
    width=main_Width,
    height=main_Height,
    xscale=c(-1, max(x_Values) + 1),
    yscale=c(0, length(unique_Urls) + 1),
    just=c("left", "bottom"),
    gp=gpar(fontsize=label_Size)))

# Put grey background behind every other website
# to make data easier to read, and write urls as
# y-labels.
for (i in 1:length(unique_Urls)){
    if (i%%2==0){
        grid.rect(
            x=unit(-1, "npc"),
            y=i,
            width=unit(2, "npc"),
            height=1,
            default.units="native",
            just=c("left", "center"),
            gp=gpar(col=row_Background, fill=row_Background))
    }

    grid.text(
        unique_Urls[i],
        x=unit(0, "npc") - padding,
        y=i,
        default.units="native",
        just=c("right", "center"))
}

# Find the hour offset of the minimum date value.
time_Offset <- as.integer(format(min(df_Hits$datetime), "%H"))

# Find the dates in the data.
x_Labels <- unique(format(df_Hits$datetime, "%m/%d/%Y"))

# Find where the days begin in the data.
midnight_Locations <- (0:max(x_Values))[(0:max(x_Values)+time_Offset)%%24==0]

# Write the appropriate date labels on the x-axis
# where the days begin.
grid.text(
    x_Labels,
    x=midnight_Locations,
    y=unit(0, "npc") - padding,
    default.units="native",
    just=c("right", "center"),
    rot=90)

# Draw lines to vertically mark when days begin.
grid.polyline(
    x=c(midnight_Locations, midnight_Locations),
    y=unit(c(rep(0, length(midnight_Locations)), rep(1, length(midnight_Locations))), "npc"),
    default.units="native",
    id=rep(midnight_Locations, 2),
    gp=gpar(lty=2, col=date_Color))

# Initialize bin assignment variable.
bin_Assignment <- 1

# Calculate which bin each hit value belongs in.
for (i in 1:nrow(bin_Settings)){
    bin_Assignment <- bin_Assignment + ifelse(df_Hits$hits>bin_Settings$cutoff[i], 1, 0)
}

# Draw points, coloring according to the bin settings.
grid.points(
    x=x_Values,
    y=match(df_Hits$website, unique_Urls),
    pch=19,
    size=unit(1, "native"),
    gp=gpar(col=as.character(bin_Settings$color[bin_Assignment]), alpha=0.5))

# Finalize the main plotting area.
popViewport()

# Create the bins for the upper histogram.
bins <- ddply(
    data.frame(df_Hits, bin_Assignment, mid=floor(x_Values/bin_Width)*bin_Width+bin_Width/2),
    .(bin_Assignment, mid),
    summarize,
    freq=length(hits))

# Initialize upper histogram area
pushViewport(viewport(
    x=x_Label_Margin,
    y=y_Label_Margin + main_Height,
    width=main_Width,
    height=histogram_Size,
    xscale=c(-1, max(x_Values) + 1),
    yscale=c(0, max(bins$freq) * 1.05),
    just=c("left", "bottom"),
    gp=gpar(fontsize=label_Size)))


# Calculate where to put four value markers.
marker_Interval <- floor(max(bins$freq)/4)
digits <- nchar(marker_Interval)
marker_Interval <- round(marker_Interval, -digits+1)

# Draw horizontal lines to mark values.
grid.polyline(
    x=unit(c(rep(0,4), rep(1,4)), "npc"),
    y=c(1:4 * marker_Interval, 1:4 * marker_Interval),
    default.units="native",
    id=rep(1:4, 2),
    gp=gpar(lty=2, col=marker_Color))

# Write value labels for each marker.
grid.text(
    1:4 * marker_Interval,
    x=unit(0, "npc") - padding,
    y=1:4 * marker_Interval,
    default.units="native",
    just=c("right", "center"))

# Finalize upper histogram area, so we
# can turn it back on but with clipping.
popViewport()

# Initialize upper histogram area again,
# but with clipping turned on.
pushViewport(viewport(
    x=x_Label_Margin,
    y=y_Label_Margin + main_Height,
    width=main_Width,
    height=histogram_Size,
    xscale=c(-1, max(x_Values) + 1),
    yscale=c(0, max(bins$freq) * 1.05),
    just=c("left", "bottom"),
    gp=gpar(fontsize=label_Size),
    clip="on"))

# Draw bars for each bin.
for (i in 1:nrow(bin_Settings)){
    active_Bin <- bins[bins$bin_Assignment==i,]
    if (nrow(active_Bin)>0){
        for (j in 1:nrow(active_Bin)){
            grid.rect(
                x=active_Bin$mid[j],
                y=0,
                width=bin_Width,
                height=active_Bin$freq[j],
                default.units="native",
                just=c("center","bottom"),
                gp=gpar(col=as.character(bin_Settings$color[i]), fill=as.character(bin_Settings$color[i]), alpha=1/nrow(bin_Settings)))
        }
    }
}

# Draw x-axis.
grid.lines(x=unit(c(0, 1), "npc"), y=0, default.units="native")

# Finalize upper histogram area.
popViewport()

# Calculate the frequencies for each website and bin.
freq_Data <- ddply(
    data.frame(df_Hits, bin_Assignment),
    .(website, bin_Assignment),
    summarize,
    freq=length(hits))

# Create the line data for the side histogram.
line_Data <- matrix(0, nrow=length(unique_Urls)+2, ncol=nrow(bin_Settings))
for (i in 1:nrow(freq_Data)){
    line_Data[match(freq_Data$website[i], unique_Urls)+1,freq_Data$bin_Assignment[i]] <- freq_Data$freq[i]
}


# Initialize side histogram area
pushViewport(viewport(
    x=x_Label_Margin + main_Width,
    y=y_Label_Margin,
    width=histogram_Size,
    height=main_Height,
    xscale=c(0, max(line_Data) * 1.05),
    yscale=c(0, length(unique_Urls) + 1),
    just=c("left", "bottom"),
    gp=gpar(fontsize=label_Size)))

# Calculate where to put four value markers.
marker_Interval <- floor(max(line_Data)/4)
digits <- nchar(marker_Interval)
marker_Interval <- round(marker_Interval, -digits+1)

# Draw vertical lines to mark values.
grid.polyline(
    x=c(1:4 * marker_Interval, 1:4 * marker_Interval),
    y=unit(c(rep(0,4), rep(1,4)), "npc"),
    default.units="native",
    id=rep(1:4, 2),
    gp=gpar(lty=2, col=marker_Color))

# Write value labels for each marker.
grid.text(
    1:4 * marker_Interval,
    x=1:4 * marker_Interval,
    y=unit(0, "npc") - padding,
    default.units="native",
    just=c("center", "top"))

# Draw lines for each bin setting.
grid.polyline(
    x=array(line_Data),
    y=rep(0:(length(unique_Urls)+1), nrow(bin_Settings)),
    default.units="native",
    id=array(t(matrix(1:nrow(bin_Settings), nrow=nrow(bin_Settings), ncol=length(unique_Urls)+2))),
    gp=gpar(col=as.character(bin_Settings$color)))

# Draw vertical line for the y-axis.
grid.lines(x=0, y=c(0, length(unique_Urls)+1), default.units="native")

# Finalize side histogram area.
popViewport()

# Draw legend.
# Draw box behind legend headers.
grid.rect(
    x=0,
    y=1,
    width=unit(1, "strwidth", names(bin_Settings)[1]) + unit(1, "strwidth", names(bin_Settings)[2]) + 3 * padding,
    height=unit(1, "lines"),
    default.units="npc",
    just=c("left","top"),
    gp=gpar(col=row_Background, fill=row_Background))

# Draw legend headers from bin_Settings variable.
grid.text(
    names(bin_Settings)[1],
    x=padding,
    y=1,
    default.units="npc",
    just=c("left","top"))

grid.text(
    names(bin_Settings)[2],
    x=unit(1, "strwidth", names(bin_Settings)[1]) + 2 * padding,
    y=1,
    default.units="npc",
    just=c("left","top"))

# For each row in the bin_Settings variable,
# write the cutoff values and the color associated.
# Write the color name in the color it specifies.
for (i in 1:nrow(bin_Settings)){
    grid.text(
        bin_Settings$cutoff[i],
        x=unit(1, "strwidth", names(bin_Settings)[1]) + padding,
        y=unit(1, "npc") - i * unit(1, "lines"),
        default.units="npc",
        just=c("right","top"))

    grid.text(
        bin_Settings$color[i],
        x=unit(1, "strwidth", names(bin_Settings)[1]) + 2 * padding,
        y=unit(1, "npc") - i * unit(1, "lines"),
        default.units="npc",
        just=c("left","top"),
        gp=gpar(col=as.character(bin_Settings$color[i])))
}
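
The bin-assignment loop in Step 3 starts every hit at bin 1 and adds 1 for each cutoff the value exceeds, so a value equal to a cutoff stays in that cutoff's bin. As a small standalone sketch of that logic, the snippet below runs the same loop on a few made-up hit counts and compares it with base R's findInterval, which is offered here only as an equivalent alternative, not as what the answer uses. The sample values are hypothetical; the cutoffs are the ones from the settings block.

```r
# Cutoffs as in bin_Settings, sorted in increasing order.
cutoffs <- c(10, 20, 40, 80, 160)

# Hypothetical hit counts chosen to sit on and around the cutoffs.
sample_Hits <- c(1, 10, 11, 20, 80, 81, 161)

# Same logic as the loop in the answer: count the cutoffs exceeded.
bin_Loop <- 1
for (i in seq_along(cutoffs)){
    bin_Loop <- bin_Loop + ifelse(sample_Hits > cutoffs[i], 1, 0)
}

# Equivalent vectorized form: left.open=TRUE makes each interval
# (lower, upper], so a value equal to a cutoff is not counted as exceeding it.
bin_Vectorized <- findInterval(sample_Hits, cutoffs, left.open=TRUE) + 1

print(bin_Loop)
```

Note that a value above the largest cutoff lands in bin 6, one past the last row of the settings table, so it would get no color; raising the top cutoff avoids that.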