ggseqlogo应该是您在寻找的内容。我希望这可以解决您在使用R绘制序列标志时遇到的一些挫折。
ggplot2
的尝试,与Leipzig/Berry解决方案有些相似。这种格式更接近于标准的形态图。但是我的解决方案,以及我认为所有的ggplot2
解决方案,仍然存在缺陷,因为ggplot2
无法控制绘图符号的纵横比。这是生成序列标志所需的核心功能(我认为),而ggplot2
缺少这个功能。另外请注意:我使用了Jeremy Leipzig的答案中的数据,但我没有对小样本量或不同于50%的GC值进行任何校正。require(ggplot2)
require(reshape2)
freqs<-matrix(data=c(0.25,0.65,0.87,0.92,0.16,0.16,0.04,0.98,0.98,1.00,0.02,0.10,0.10,0.80,0.98,0.91,0.07,0.07,0.11,0.05,0.04,0.00,0.26,0.17,0.00,0.01,0.00,0.00,0.29,0.17,0.01,0.03,0.00,0.00,0.32,0.32,0.53,0.26,0.07,0.02,0.53,0.18,0.96,0.01,0.00,0.00,0.65,0.01,0.89,0.17,0.01,0.09,0.59,0.12,0.11,0.04,0.02,0.06,0.05,0.49,0.00,0.00,0.02,0.00,0.04,0.72,0.00,0.00,0.01,0.00,0.02,0.49),byrow=TRUE,nrow=4,dimnames=list(c('A','C','G','T')))
freqdf <- as.data.frame(t(freqs))
freqdf$pos = as.numeric(as.character(rownames(freqdf)))
freqdf$height <- apply(freqdf[,c('A', 'C','G','T')], MARGIN=1,
FUN=function(x){2-sum(log(x^x,base=2))})
logodf <- data.frame(A=freqdf$A*freqdf$height, C=freqdf$C*freqdf$height,
G=freqdf$G*freqdf$height, T=freqdf$T*freqdf$height,
pos=freqdf$pos)
lmf <- melt(logodf, id.var='pos')
quartz(height=3, width=8)
ggplot(data=lmf, aes(x=as.numeric(as.character(pos)), y=value)) +
geom_bar(aes(fill=variable,order=value), position='stack',
stat='identity', alpha=0.5) +
geom_text(aes(label=variable, size=value, order=value, vjust=value),
position='stack') +
theme_bw()
quartz.save('StackOverflow_5438474.png', type='png')
这将生成以下图表:
我已经实现了 Charles Berry 设计的一种替代方案,它解决了在下面评论区中反复讨论的 seqLogos 的一些弱点。它使用 ggplot2:
library("devtools")
install_github("leipzig/berrylogo")
library("berrylogo")
freqs<-matrix(data=c(0.25,0.65,0.87,0.92,0.16,0.16,0.04,0.98,0.98,1.00,0.02,0.10,0.10,0.80,0.98,0.91,0.07,0.07,0.11,0.05,0.04,0.00,0.26,0.17,0.00,0.01,0.00,0.00,0.29,0.17,0.01,0.03,0.00,0.00,0.32,0.32,0.53,0.26,0.07,0.02,0.53,0.18,0.96,0.01,0.00,0.00,0.65,0.01,0.89,0.17,0.01,0.09,0.59,0.12,0.11,0.04,0.02,0.06,0.05,0.49,0.00,0.00,0.02,0.00,0.04,0.72,0.00,0.00,0.01,0.00,0.02,0.49),byrow=TRUE,nrow=4,dimnames=list(c('A','C','G','T')))
p<-berrylogo(freqs,gc_content=.41)
print(p)
# Load package
library('RWebLogo')
# Sample alignment
aln <- c('CCAACCCAA', 'CCAACCCTA', 'AAAGCCTGA', 'TGAACCGGA')
# Plot logo to file
weblogo(seqs=aln, file.out='logo.pdf')
# Plot logo to R graphics device (uses generated jpeg logo and raster package)
weblogo(seqs=aln, plot=TRUE, open=FALSE, format='jpeg', resolution=600)
更多选项请参见?weblogo
或?plotlogo
gglogo
的包(也在CRAN上,是Heike Hofmann开发的另一个令人惊叹的ggplot2扩展)。library(ggplot2)
library(gglogo)
ggplot(data = ggfortify(sequences, "peptide")) +
geom_logo(aes(x=position, y=bits, group=element,
label=element, fill=interaction(Polarity, Water)),
alpha = 0.6) +
scale_fill_brewer(palette="Paired") +
theme(legend.position = "bottom")
这里有一个替代方案。motiflogo 是由ggplot2实现的序列模体(motif)标志的新表现形式。可以考虑两个方面: