ggplot版本的图表.PerformanceSummary

7
我想要基于PerformanceAnalytics软件包中可用的charts.PerformanceSummary 的基本功能创建一个“ggplot版本”,因为我认为ggplot通常更漂亮,并且从理论上讲在编辑图像方面更强大。虽然我已经相当接近目标,但还有一些问题需要解决,希望得到一些帮助,具体来说:
  1. 减少图例所占空间的数量,在超过10行时,它变得十分丑陋...(仅需要线条颜色和名称)。
  2. 增加Daily_Returns分面的大小,使其与PerformanceAnalytics中的charts.PerformanceSummary相匹配。
  3. 有一个选项可以指定在Daily_Returns分面中显示哪个资产的日回报系列,而不总是使用第一列.

如果有更好的方法,可能可以使用gridExtra而不是facets...我不反对人们向我展示这将会更好...

这里的问题在于美学和潜在的易于操作性,因为PerformanceAnalytics已经有了一个良好的工作示例,我只是想让它看起来更漂亮/更专业...

此外,为了额外加分,我想能够在每个资产的图表下面或旁边显示与其相关的一些性能统计信息...不确定何处最好显示或显示此信息。

此外,如果有人对我的代码有建议,以清理它们的部分来改进,请不要犹豫提出建议。

下面是我的可重现示例...

首先生成回报数据:

require(xts)
X.stock.rtns <- xts(rnorm(1000,0.00001,0.0003), Sys.Date()-(1000:1))
Y.stock.rtns <- xts(rnorm(1000,0.00003,0.0004), Sys.Date()-(1000:1))
Z.stock.rtns <- xts(rnorm(1000,0.00005,0.0005), Sys.Date()-(1000:1))
rtn.obj <- merge(X.stock.rtns , Y.stock.rtns, Z.stock.rtns)
colnames(rtn.obj) <- c("x.stock.rtns","y.stock.rtns","z.stock.rtns")

我想要复制以下操作结果中的图片:
require(PerformanceAnalytics)
charts.PerformanceSummary(rtn.obj, geometric=TRUE)

这是我的翻译尝试...

目标

这是我迄今为止的尝试...

gg.charts.PerformanceSummary <- function(rtn.obj, geometric=TRUE, main="",plot=TRUE){

    # load libraries
suppressPackageStartupMessages(require(ggplot2))
suppressPackageStartupMessages(require(scales))
suppressPackageStartupMessages(require(reshape))
suppressPackageStartupMessages(require(PerformanceAnalytics))
    # create function to clean returns if having NAs in data
    clean.rtn.xts <- function(univ.rtn.xts.obj,na.replace=0){
    univ.rtn.xts.obj[is.na(univ.rtn.xts.obj)]<- na.replace
    univ.rtn.xts.obj
}
    # Create cumulative return function
cum.rtn <- function(clean.xts.obj, g=TRUE){
    x <- clean.xts.obj
    if(g==TRUE){y <- cumprod(x+1)-1} else {y <- cumsum(x)}
    y
}
    # Create function to calculate drawdowns
dd.xts <- function(clean.xts.obj, g=TRUE){
    x <- clean.xts.obj
    if(g==TRUE){y <- Drawdowns(x)} else {y <- Drawdowns(x,geometric=FALSE)}
    y
}
    # create a function to create a dataframe to be usable in ggplot to replicate charts.PerformanceSummary
cps.df <- function(xts.obj,geometric){
    x <- clean.rtn.xts(xts.obj)
    series.name <- colnames(xts.obj)[1]
    tmp <- cum.rtn(x,geometric)
    tmp$rtn <- x
    tmp$dd <- dd.xts(x,geometric)
    colnames(tmp) <- c("Cumulative_Return","Daily_Return","Drawdown")
    tmp.df <- as.data.frame(coredata(tmp))
    tmp.df$Date <- as.POSIXct(index(tmp))
    tmp.df.long <- melt(tmp.df,id.var="Date")
    tmp.df.long$asset <- rep(series.name,nrow(tmp.df.long))
    tmp.df.long
}
# A conditional statement altering the plot according to the number of assets
if(ncol(rtn.obj)==1){
            # using the cps.df function
    df <- cps.df(rtn.obj,geometric)
            # adding in a title string if need be
    if(main==""){
        title.string <- paste0(df$asset[1]," Performance")
    } else {
        title.string <- main
    }
            # generating the ggplot output with all the added extras....
    gg.xts <- ggplot(df, aes_string(x="Date",y="value",group="variable"))+
                facet_grid(variable ~ ., scales="free", space="free")+
                geom_line(data=subset(df,variable=="Cumulative_Return"))+
                geom_bar(data=subset(df,variable=="Daily_Return"),stat="identity")+
                geom_line(data=subset(df,variable=="Drawdown"))+
                ylab("")+
                geom_abline(intercept=0,slope=0,alpha=0.3)+
                ggtitle(title.string)+
                theme(axis.text.x = element_text(angle = 45, hjust = 1))+
                scale_x_datetime(breaks = date_breaks("6 months"), labels = date_format("%d/%m/%Y"))

} else {
            # a few extra bits to deal with the added rtn columns
    no.of.assets <- ncol(rtn.obj)
    asset.names <- colnames(rtn.obj)
    df <- do.call(rbind,lapply(1:no.of.assets, function(x){cps.df(rtn.obj[,x],geometric)}))
    df$asset <- ordered(df$asset, levels=asset.names)
    if(main==""){
        title.string <- paste0(df$asset[1]," Performance")
    } else {
        title.string <- main
    }
    if(no.of.assets>5){legend.rows <- 5} else {legend.rows <- no.of.assets}
    gg.xts <- ggplot(df, aes_string(x="Date", y="value",group="asset"))+
      facet_grid(variable~.,scales="free",space="free")+
      geom_line(data=subset(df,variable=="Cumulative_Return"),aes(colour=factor(asset)))+
      geom_bar(data=subset(df,variable=="Daily_Return"),stat="identity",aes(fill=factor(asset),colour=factor(asset)),position="dodge")+
      geom_line(data=subset(df,variable=="Drawdown"),aes(colour=factor(asset)))+
      ylab("")+
      geom_abline(intercept=0,slope=0,alpha=0.3)+
      ggtitle(title.string)+
      theme(legend.title=element_blank(), legend.position=c(0,1), legend.justification=c(0,1),
            axis.text.x = element_text(angle = 45, hjust = 1))+
      guides(col=guide_legend(nrow=legend.rows))+
      scale_x_datetime(breaks = date_breaks("6 months"), labels = date_format("%d/%m/%Y"))

}

assign("gg.xts", gg.xts,envir=.GlobalEnv)
if(plot==TRUE){
    plot(gg.xts)
} else {}

}
# seeing the ggplot equivalent....
gg.charts.PerformanceSummary(rtn.obj, geometric=TRUE)

result

2个回答

14

我正需要这个。你已经接近了。站在你的肩膀上,我能够修复一些问题。

编辑(2015年5月9日):现在可以通过三重冒号运算符PerformanceAnalytics:::Drawdown()调用函数Drawdown()。下面的代码已经修改以反映这个更改。编辑(2018年4月22日):show_guide已被弃用,并被show.legend替代。

require(xts)

X.stock.rtns <- xts(rnorm(1000,0.00001,0.0003), Sys.Date()-(1000:1))
Y.stock.rtns <- xts(rnorm(1000,0.00003,0.0004), Sys.Date()-(1000:1))
Z.stock.rtns <- xts(rnorm(1000,0.00005,0.0005), Sys.Date()-(1000:1))
rtn.obj <- merge(X.stock.rtns , Y.stock.rtns, Z.stock.rtns)
colnames(rtn.obj) <- c("x","y","z")

# advanced charts.PerforanceSummary based on ggplot
gg.charts.PerformanceSummary <- function(rtn.obj, geometric = TRUE, main = "", plot = TRUE)
{

    # load libraries
    suppressPackageStartupMessages(require(ggplot2))
    suppressPackageStartupMessages(require(scales))
    suppressPackageStartupMessages(require(reshape))
    suppressPackageStartupMessages(require(PerformanceAnalytics))

    # create function to clean returns if having NAs in data
    clean.rtn.xts <- function(univ.rtn.xts.obj,na.replace=0){
        univ.rtn.xts.obj[is.na(univ.rtn.xts.obj)]<- na.replace
        univ.rtn.xts.obj  
    }

    # Create cumulative return function
    cum.rtn <- function(clean.xts.obj, g = TRUE)
    {
        x <- clean.xts.obj
        if(g == TRUE){y <- cumprod(x+1)-1} else {y <- cumsum(x)}
        y
    }

    # Create function to calculate drawdowns
    dd.xts <- function(clean.xts.obj, g = TRUE)
    {
        x <- clean.xts.obj
        if(g == TRUE){y <- PerformanceAnalytics:::Drawdowns(x)} else {y <- PerformanceAnalytics:::Drawdowns(x,geometric = FALSE)}
        y
    }

    # create a function to create a dataframe to be usable in ggplot to replicate charts.PerformanceSummary
    cps.df <- function(xts.obj,geometric)
    {
        x <- clean.rtn.xts(xts.obj)
        series.name <- colnames(xts.obj)[1]
        tmp <- cum.rtn(x,geometric)
        tmp$rtn <- x
        tmp$dd <- dd.xts(x,geometric)
        colnames(tmp) <- c("Index","Return","Drawdown") # names with space
        tmp.df <- as.data.frame(coredata(tmp))
        tmp.df$Date <- as.POSIXct(index(tmp))
        tmp.df.long <- melt(tmp.df,id.var="Date")
        tmp.df.long$asset <- rep(series.name,nrow(tmp.df.long))
        tmp.df.long
    }

    # A conditional statement altering the plot according to the number of assets
    if(ncol(rtn.obj)==1)
    {
        # using the cps.df function
        df <- cps.df(rtn.obj,geometric)
        # adding in a title string if need be
        if(main == ""){
            title.string <- paste("Asset Performance")
        } else {
            title.string <- main
        }
    
        gg.xts <- ggplot(df, aes_string( x = "Date", y = "value", group = "variable" )) +
            facet_grid(variable ~ ., scales = "free_y", space = "fixed") +
            geom_line(data = subset(df, variable == "Index")) +
            geom_bar(data = subset(df, variable == "Return"), stat = "identity") +
            geom_line(data = subset(df, variable == "Drawdown")) +
            geom_hline(yintercept = 0, size = 0.5, colour = "black") +
            ggtitle(title.string) +
            theme(axis.text.x = element_text(angle = 0, hjust = 1)) +
            scale_x_datetime(breaks = date_breaks("6 months"), labels = date_format("%m/%Y")) +
            ylab("") +
            xlab("")
    
    } 
    else 
    {
        # a few extra bits to deal with the added rtn columns
        no.of.assets <- ncol(rtn.obj)
        asset.names <- colnames(rtn.obj)
        df <- do.call(rbind,lapply(1:no.of.assets, function(x){cps.df(rtn.obj[,x],geometric)}))
        df$asset <- ordered(df$asset, levels=asset.names)
        if(main == ""){
            title.string <- paste("Asset",asset.names[1],asset.names[2],asset.names[3],"Performance")
        } else {
            title.string <- main
        }
    
        if(no.of.assets>5){legend.rows <- 5} else {legend.rows <- no.of.assets}
    
        gg.xts <- ggplot(df, aes_string(x = "Date", y = "value" )) +
        
            # panel layout
            facet_grid(variable~., scales = "free_y", space = "fixed", shrink = TRUE, drop = TRUE, margin = 
                           , labeller = label_value) + # label_value is default
        
            # display points for Index and Drawdown, but not for Return
            geom_point(data = subset(df, variable == c("Index","Drawdown"))
                       , aes(colour = factor(asset), shape = factor(asset)), size = 1.2, show.legend = TRUE) + 
        
            # manually select shape of geom_point
            scale_shape_manual(values = c(1,2,3)) + 
        
            # line colours for the Index
            geom_line(data = subset(df, variable == "Index"), aes(colour = factor(asset)), show.legend = FALSE) +
        
            # bar colours for the Return
            geom_bar(data = subset(df,variable == "Return"), stat = "identity"
                     , aes(fill = factor(asset), colour = factor(asset)), position = "dodge", show.legend = FALSE) +
        
            # line colours for the Drawdown
            geom_line(data = subset(df, variable == "Drawdown"), aes(colour = factor(asset)), show.legend = FALSE) +
        
            # horizontal line to indicate zero values
            geom_hline(yintercept = 0, size = 0.5, colour = "black") +
        
            # horizontal ticks
            scale_x_datetime(breaks = date_breaks("6 months"), labels = date_format("%m/%Y")) +
        
            # main y-axis title
            ylab("") +
        
            # main x-axis title
            xlab("") +
        
            # main chart title
            ggtitle(title.string)
    
        # legend 
    
        gglegend <- guide_legend(override.aes = list(size = 3))
    
        gg.xts <- gg.xts + guides(colour = gglegend, size = "none") +
        
            # gglegend <- guide_legend(override.aes = list(size = 3), direction = "horizontal") # direction overwritten by legend.box?
            # gg.xts <- gg.xts + guides(colour = gglegend, size = "none", shape = gglegend) + # Warning: "Duplicated override.aes is ignored"
        
            theme( legend.title = element_blank()
                   , legend.position = c(0,1)
                   , legend.justification = c(0,1)
                   , legend.background = element_rect(colour = 'grey')
                   , legend.key = element_rect(fill = "white", colour = "white")
                   , axis.text.x = element_text(angle = 0, hjust = 1)
                   , strip.background = element_rect(fill = "white")
                   , panel.background = element_rect(fill = "white", colour = "white")
                   , panel.grid.major = element_line(colour = "grey", size = 0.5) 
                   , panel.grid.minor = element_line(colour = NA, size = 0.0)
            )
    
    }

    assign("gg.xts", gg.xts,envir=.GlobalEnv)
    if(plot == TRUE){
        plot(gg.xts)
    } else {}

}

# display chart
gg.charts.PerformanceSummary(rtn.obj, geometric = TRUE)

控制面板大小的方式在facet_grid()函数中:facet_grid(variable ~ ., scales = "free_y", space = "fixed")。这些选项的作用在手册中有详细解释,如下所述:

scales:比例尺是在所有图层之间共享(默认为“fixed”),还是在行(“free_x”)、列(“free_y”)或行列都不同(“free”)

space:如果为“fixed”(默认值),则所有面板的大小相同。如果为“free_y”,它们的高度将与y轴刻度的长度成比例;如果为“free_x”,则宽度将与x轴刻度的长度成比例;或者如果为“free”,则高度和宽度都会变化。除非相应的刻度也不同,否则此设置不起作用。

更新:标签

可以使用以下函数获取自定义标签:

# create a function to store fancy axis labels 

    my_labeller <- function(var, value){ # from the R Cookbook
        value <- as.character(value)
        if (var=="variable") 
        {
              value[value=="Index"] <- "Cumulative Returns"
              value[value=="Return"] <- "Daily Returns"
              value[value=="Drawdown"] <- "Drawdown"
        }
        return(value)
    }

并将labeller选项设置为“labeller=my_labeller”

更新:背景

可以从theme()函数内部控制背景、网格线、颜色等的外观:上面的代码已更新以反映这些更改。

enter image description here


@PatrickT,感谢您的所有工作。我有一个问题,我正在尝试弄清楚如何从图形中删除间隙。我已经尝试了geometric=FALSE,但没有效果。请参见屏幕截图http://i.imgur.com/i2sXxZl.png--我想要消除绘图中的过夜间隙。数据中没有这样的间隙,请参见pastebin:http://tny.cz/589efad7。 - ctrlbrk
@user1530260,我不是专家:我想可能有一个包含处理此类问题的函数的软件包,例如xts或zoo?或者PerformanceAnalytics?但您也可以手动完成。第一步,删除所有NAs。第二步,忽略日期/时间并根据实际交易时间绘制图表。第三步,调整标签以显示绘制数据的日期/时间。或者一个快速的解决方法是绘制未连接的线条,您仍然会有间隙,但是线条不会连接...也许您可以提出一个问题,我想看到一个答案。 - PatrickT
@PatrickT,谢谢。我已经沿着这条线尝试过了,但是没有结果。我的数据系列中没有间隔,似乎与等距类型间距有关,我还无法弄清楚,但会继续尝试。当你说“忽略日期/时间并根据实际交易时间进行绘图”时,这就是我想要的-沿着我的实际系列表示的x轴绘图,但它自己插入了间隙。我会继续搜索。 - ctrlbrk
我在多年后看到了你的第二条评论。我想,只要您提供日期对象,绘图程序就会引入“自然时间”间隔。因此,我的猜测是,您可以通过将日期包装为.as.numeric或.as.character或.as.factor或其组合来获得所需结果。但我确信金融软件包已经内置了此类功能。 - PatrickT
1
显然,现在Drawdowns()函数已被隐藏,只能通过三个冒号访问,即通过PerformanceAnalytics:::Drawdowns()调用。也许您想要更新它或修改为使用最新的PerformanceAnalytics函数进行回撤。此外,是否可以为单个资产表现摘要图指定线条颜色? - Frash
@Frash,谢谢。我已经编辑了上面的代码以反映这个更改。我不清楚你的问题。而且我已经有一段时间没有使用xts和PerformanceAnalytics了。一般来说,对于单个系列,您应该能够在“geom_line()”或“geom_point()”中指定颜色,例如“geom_line(color = "red")”。您还可以在“scale_colour_manual”中作为事后修改颜色,例如“scale_colour_manual(name = "variable", values = c("red"))”。 - PatrickT

1
请参见 ?theme 以了解图例的大小。大多数图例的方面可以通过那里进行调整...我想您想要调整的是legend.key.size,以及legend.background以删除每个图例周围的框...
在分面中,每个面板的大小有点更加复杂。我有一个小技巧,可以让您在调用facet_grid时指定每个面板的相对大小,但它需要从源代码安装等操作...更好的解决方案是将您的绘图转换为gtable对象并修改它...假设您的绘图称为p:
require(gtable)
require(grid)

pTable <- ggplot_gtable(ggplot_build(p))
pTable$heights[[4]] <- unit(2, 'null')

grid.newpage()
grid.draw(pTable)

这将使顶部面板的高度是其他面板的两倍...之所以是pTable$heights[[4]]而不是pTable$heights[[1]],是因为分面面板不是绘图中最上层的grobs。
我会避免比这更具体的说明,因为你最好自己探索gtable的属性(而且因为我没有时间)。
最好的祝福,
托马斯

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接