How can I reorder the legend keys in a ggplot2 line plot to match the final value of each series?

5
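I am using ggplot2 to produce line charts of price changes over time for multiple instruments. I have succeeded in getting multiple lines on the plot and adding values that show the most recent change in price. What I want to do (and have not yet achieved) is to reorder the legend so that the price series that has risen the most is at the top of the legend, followed by the series that rose second-most, and so on.
In the plot below, the legend shows the keys in alphabetical order. I would like it to show the legend entries in the order of performance, DDD, AAA, CCC then BBB, which is the order as of the most recent date. How can I do this?
Minimal-ish code follows.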
require(ggplot2)
require(scales)
require(gridExtra)
require(lubridate)
require(reshape)

# create fake price data
set.seed(123)
monthsback <- 15
date <- as.Date(paste(year(now()), month(now()),"1", sep="-")) - months(monthsback)
mydf <- data.frame(mydate = seq(as.Date(date), by = "month", length.out = monthsback),
                      aaa = runif(monthsback, min = 600, max = 800),
                      bbb = runif(monthsback, min = 100, max = 200),
                      ccc = runif(monthsback, min = 1400, max = 2000),
                      ddd = runif(monthsback, min = 50, max = 120))

# function to calculate change
change_from_start <- function(x) {
   (x - x[1]) / x[1]
}

# for appropriate columns (i.e. not date), replace fake price data with change in price
mydf[, 2:5] <- lapply(mydf[, 2:5], function(myparam){change_from_start(myparam)})

# get most recent values and reshape
myvals <- mydf[mydf$mydate == mydf$mydate[nrow(mydf)],]
myvals <- melt(myvals, id = c('mydate'))

# plot multiple lines
p <- ggplot(data = mydf) +
    geom_line( aes(x = mydate, y = aaa, colour = "AAA"), size = 1) +
    geom_line( aes(x = mydate, y = bbb, colour = "BBB"), size = 1) +
    geom_line( aes(x = mydate, y = ccc, colour = "CCC"), size = 1) +
    geom_line( aes(x = mydate, y = ddd, colour = "DDD"), size = 1) +
    scale_colour_manual("", values = c("AAA" = "red", "BBB" = "black", "CCC" = "blue", "DDD" = "green")) +
    scale_y_continuous(labels = percent_format()) +
    geom_text(data = myvals, aes(x = mydate + 30, y = value, label = sprintf("%+1.1f%%", value * 100)), size = 4, colour = "grey50") +
    # theme()/element_blank() replace the older opts()/theme_blank() calls
    theme(axis.title.y = element_blank())

# and output
print(p)
3 Answers

10

Try this:

mydf <- melt(mydf,id.var = 1)
mydf$variable <- factor(mydf$variable,levels = rev(myvals$variable[order(myvals$value)]),ordered = TRUE)

# plot multiple lines
p <- ggplot(data = mydf) +
    geom_line(aes(x = mydate,y = value,colour = variable,group = variable),size = 1) +
    scale_colour_manual("", values = c("aaa" = "red", "bbb" = "black", "ccc" = "blue", "ddd" = "green")) +
    scale_y_continuous(labels = percent_format()) +
    geom_text(data = myvals, aes(x = mydate + 30, y = value, label = sprintf("%+1.1f%%", value * 100)),
                size = 4, colour = "grey50") +
    theme(axis.title.y = element_blank())

# and output
print(p)


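I melted your full data set to save a few lines of plotting code. The key is to make sure that variable is a factor whose levels are in the order you want (here, an ordered factor).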
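As a minimal sketch of that idea in base R, the levels can be taken from the values on the latest date. The data frame `toy` and its values are made up for illustration and are not the data from the question. A plain `factor()` is used here on the assumption that only the level order matters to the legend.

```r
# Toy long-format data: two dates, three series (made-up values)
toy <- data.frame(
  mydate   = rep(as.Date(c("2012-01-01", "2012-02-01")), times = 3),
  variable = rep(c("aaa", "bbb", "ccc"), each = 2),
  value    = c(0, 0.10, 0, -0.20, 0, 0.30)
)

# Rows for the most recent date only
latest <- toy[toy$mydate == max(toy$mydate), ]

# Series names sorted by final value, largest first
level_order <- as.character(latest$variable[order(latest$value, decreasing = TRUE)])

# Re-level the grouping column; a discrete legend follows the level order
toy$variable <- factor(toy$variable, levels = level_order)

levels(toy$variable)
# "ccc" "aaa" "bbb"
```

Mapping `colour = variable` on a data frame prepared this way should then list the series in that order in the legend.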

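To address the issue raised in the comments: you can pass whatever labels you like to appear in the legend itself, as long as you get the order right: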

ggplot(data = mydf) +
    geom_line(aes(x = mydate,y = value,colour = variable,group = variable),size = 1) +
    scale_colour_manual("", values = c("aaa" = "red", "bbb" = "black", "ccc" = "blue", "ddd" = "green"),labels = c('Company D','Company A','Company C','Company B')) +
    scale_y_continuous(labels = percent_format()) +
    geom_text(data = myvals, aes(x = mydate + 30, y = value, label = sprintf("%+1.1f%%", value * 100)),
                size = 4, colour = "grey50") +
    theme(axis.title.y = element_blank())


Note: since version 0.9.2, opts has been replaced by theme, for example:

+ theme(axis.title.y = element_blank())

3
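Great! Note that the term "ordered factor" is redundant here. The order of the levels matters, but ggplot2 does not care whether the factor is ordered. - kohske
Thanks joran for the help - I had been avoiding melting the main data frame, but it does not really matter and, as you point out, it saves a lot of boilerplate code. I guess it may be possible to define the "values = c("aaa" = "red")" part separately using a list, so that it is not necessary to dig into the ggplot part of the code every time I want to change the items to be plotted...? Thanks also to kohske for the input. - SlowLearner
@joran - I understand, but unfortunately it is not quite that simple. The problem is that in the real application this simplified example stands for, the labels will be long, contain multiple words, and may sometimes not be in English, so it is highly desirable to use different (shorter) column names in place of the verbose labels. Perhaps this is hard to achieve? - SlowLearner
@joran - Indeed, I have tried scale_colour_manual("", values=c('aaa'="red", 'bbb'="blue", 'ccc'="black"), labels=c("Company A","Company B", "Company C")) but the legend keys are sorted in the order of the labels, not the order of the values, and the colours are also assigned incorrectly. For example, 'Company C' is at -24.6% and should be black, but the legend shows 'Company B' as black. So I am back to square one. - SlowLearner
@SlowLearner Pass the labels in the order you want them to appear. - joran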
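One way to keep the colours and long display names out of the plotting code, as raised in the comments above, might be to hold them in named vectors keyed by column name and look them up in legend order. This is only a sketch: `series_colours`, `series_labels`, and the toy data are hypothetical names and values, not taken from the original post, and the plot is only built if ggplot2 is installed.

```r
# Hypothetical lookup tables, keyed by the short column names
series_colours <- c(aaa = "red", bbb = "black", ccc = "blue")
series_labels  <- c(aaa = "Company A", bbb = "Company B", ccc = "Company C")

# Toy long-format data (made-up values)
toy <- data.frame(
  mydate   = rep(as.Date(c("2012-01-01", "2012-02-01")), times = 3),
  variable = rep(c("aaa", "bbb", "ccc"), each = 2),
  value    = c(0, 0.10, 0, -0.20, 0, 0.30)
)

# Legend order: series sorted by value on the latest date, largest first
latest <- toy[toy$mydate == max(toy$mydate), ]
legend_order <- as.character(latest$variable[order(latest$value, decreasing = TRUE)])

# Labels are looked up by name, so each stays attached to its own series
legend_labels <- unname(series_labels[legend_order])

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy, aes(x = mydate, y = value, colour = variable, group = variable)) +
    geom_line() +
    scale_colour_manual("", values = series_colours,
                        breaks = legend_order, labels = legend_labels)
  print(p)
}
```

Because `breaks` and `labels` are both derived from `legend_order`, changing the set of plotted series only means editing the two lookup vectors.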

3

Try this:

guides(colour = guide_legend(reverse = TRUE))
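For context, here is a minimal sketch of where `guides(colour = guide_legend(reverse = TRUE))` goes in a plot. Reversing only flips whatever order the legend already has, so by itself it gives the desired result only when the factor levels are already sorted by final value in ascending order. The toy data and the level ordering below are assumptions for illustration, and the plot is only built if ggplot2 is installed.

```r
# Toy long-format data (made-up values)
toy <- data.frame(
  mydate   = rep(as.Date(c("2012-01-01", "2012-02-01")), times = 3),
  variable = rep(c("aaa", "bbb", "ccc"), each = 2),
  value    = c(0, 0.10, 0, -0.20, 0, 0.30)
)

# Set the levels in ascending order of the value on the latest date
latest <- toy[toy$mydate == max(toy$mydate), ]
ascending <- as.character(latest$variable[order(latest$value)])
toy$variable <- factor(toy$variable, levels = ascending)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(toy, aes(x = mydate, y = value, colour = variable, group = variable)) +
    geom_line() +
    # Flip the legend so the largest final value is listed first
    guides(colour = guide_legend(reverse = TRUE))
  print(p)
}
```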

0

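I think there is an easier way. Once you have melted the data frame, sort it by date and value, and use the values at the last date to build the legend. Because you sorted by value, the legend will show the lines in that same order (greatest to least, or least to greatest). Here is the code.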

require(ggplot2)
require(scales)
require(gridExtra)
require(lubridate)
require(reshape)

# create fake price data
set.seed(123)
monthsback <- 15
date <- as.Date(paste(year(now()), month(now()),"1", sep="-")) - months(monthsback)
mydf <- data.frame(mydate = seq(as.Date(date), by = "month", length.out = monthsback),
                      aaa = runif(monthsback, min = 600, max = 800),
                      bbb = runif(monthsback, min = 100, max = 200),
                      ccc = runif(monthsback, min = 1400, max = 2000),
                      ddd = runif(monthsback, min = 50, max = 120))

# function to calculate change
change_from_start <- function(x) {
   (x - x[1]) / x[1]
}

# for appropriate columns (i.e. not date), replace fake price data with change in price
mydf[, 2:5] <- lapply(mydf[, 2:5], function(myparam){change_from_start(myparam)})

mydf <- melt(mydf, id.var=1)

#Order by date and value.  Decreasing since want to order greatest to least
mydf <- mydf[order(mydf$mydate, mydf$value, decreasing = TRUE),]

#Create legend breaks: the first rows are the latest date, sorted greatest to least
legend_length <- length(unique(mydf$variable))
legend_breaks <- as.character(mydf$variable[1:legend_length])

#Pass order through scale_colour_discrete
ggplot(data = mydf) +
    geom_line(aes(x = mydate, y = value, colour = variable, group = variable), size = 1) +
    scale_colour_discrete(breaks = legend_breaks)
