Scatterplot in R with heavy overlap and more than 3000 points


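I am making a scatterplot in R with ggplot2. I am comparing the fraction of votes Hillary and Bernie received in the primaries against education level. There is a lot of overlap and far too many points. I tried using transparency to make the overlap visible, but the result still looks bad.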

My Graph

Code:

library(dplyr)    # for filter()
library(ggplot2)

# mydata and infolookup are the data frames loaded from the asker's CSV files
demanalyze <- function(infocode, n = 1){
    infoname <- filter(infolookup, column_name == infocode)$description
    infocolumn <- mydata[[infocode]]
    ggplot(mydata) +
    aes(x = infocolumn) +
    ggtitle(infoname) +
    xlab(infoname) +
    ylab("Fraction of votes each candidate received") +
    geom_point(aes(y = sanders_vote_fraction, colour = "Bernie Sanders")) +
    stat_smooth(aes(y = sanders_vote_fraction), method = "lm", formula = y ~ poly(x, n), size = 1, color = "darkblue", se = FALSE) +
    geom_point(aes(y = clinton_vote_fraction, colour = "Hillary Clinton")) +
    stat_smooth(aes(y = clinton_vote_fraction), method = "lm", formula = y ~ poly(x, n), size = 1, color = "darkred", se = FALSE) +
    # Nearly transparent points in the plot, fully opaque keys in the legend
    scale_colour_manual("",
        values = c("Bernie Sanders" = alpha("blue", 0.02), "Hillary Clinton" = alpha("red", 0.02))
    ) +
    guides(colour = guide_legend(override.aes = list(alpha = 1)))
}

What should I change so that the overlapping regions look less messy?

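Try a density plot. Two semi-transparent ones should be enough. It would also help to provide a reproducible example or data. - Serban Tanasa
@SerbanTanasa, how do I upload my CSV file to Stack Overflow? I agree that I need a density plot, but I don't know how to make them look good. Points in low-density regions are hard to detect. - michaelmesser
1 Answer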


The standard way to plot a large number of 2D points is a 2D density plot.

With a reproducible example:

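set.seed(123)  # fix the RNG seed so the example is reproducible

# Two simulated groups of 1000 points each; they differ only in their y mean
x1 <- rnorm(1000, mean = 10)
x2 <- rnorm(1000, mean = 10)
y1 <- rnorm(1000, mean = 5)
y2 <- rnorm(1000, mean = 7)

# Long format: one row per point, with a group label in lab
mydat <- data.frame(xaxis = c(x1, x2), yaxis = c(y1, y2), lab = rep(c("H", "B"), each = 1000))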
head(mydat)

library(ggplot2)
## Dots and density plots (kinda messy, but can play with alpha)
p1 <- ggplot(mydat) +
  geom_point(aes(x = xaxis, y = yaxis, color = lab), alpha = 0.4) +
  stat_density2d(aes(x = xaxis, y = yaxis, color = lab))
p1

dots and radii

## just density
p2 <- ggplot(mydat) + stat_density2d(aes(x = xaxis, y = yaxis, color = lab))
p2

Density plot
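To connect this back to the question: the asker's data are in wide format, with one column per candidate, while the example above uses a long format with a grouping column. Below is a rough sketch of that reshaping step, not the asker's actual data. The `votes` data frame and its `education` column are made-up stand-ins, and the simulated values (including the assumption that the two fractions sum to 1) are purely illustrative. Only the column names `sanders_vote_fraction` and `clinton_vote_fraction` come from the question.

```r
library(ggplot2)

set.seed(1)
n <- 3000

# Simulated stand-in for the real CSV (hypothetical values and predictor)
education <- runif(n, min = 0, max = 60)
sanders <- pmin(pmax(0.35 + 0.003 * education + rnorm(n, sd = 0.1), 0), 1)
votes <- data.frame(
  education = education,
  sanders_vote_fraction = sanders,
  clinton_vote_fraction = 1 - sanders  # illustrative two-candidate simplification
)

# Wide -> long: stack the two vote-fraction columns and label each row
votes_long <- data.frame(
  education = rep(votes$education, times = 2),
  vote_fraction = c(votes$sanders_vote_fraction, votes$clinton_vote_fraction),
  candidate = rep(c("Bernie Sanders", "Hillary Clinton"), each = nrow(votes))
)

# Very faint points for context, density contours on top to show the structure
p_votes <- ggplot(votes_long, aes(x = education, y = vote_fraction, colour = candidate)) +
  geom_point(alpha = 0.05, size = 1) +
  stat_density_2d() +
  scale_colour_manual("", values = c("Bernie Sanders" = "blue", "Hillary Clinton" = "red")) +
  guides(colour = guide_legend(override.aes = list(alpha = 1)))
p_votes
```

Setting `alpha` as a fixed argument of `geom_point()` rather than inside the colour values keeps the contour lines fully opaque while the points stay faint.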

There are many parameters you can tweak, so check the ggplot2 documentation for the full details on its plot types.
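As one illustration of that kind of tweaking, here is a minimal sketch that swaps the contour lines for filled contour bands and puts each group in its own panel, so the two densities cannot hide each other. It assumes ggplot2 3.3.0 or later (for `after_stat()`). The data are regenerated along the lines of the example above, and the choices of `bins = 8` and `alpha = 0.5` are arbitrary starting points rather than recommended values.

```r
library(ggplot2)

set.seed(42)  # fixed seed so the simulated data are reproducible
mydat <- data.frame(
  xaxis = rnorm(2000, mean = 10),
  yaxis = c(rnorm(1000, mean = 5), rnorm(1000, mean = 7)),
  lab   = rep(c("H", "B"), each = 1000)
)

# Filled contour bands instead of contour lines, one panel per group
p3 <- ggplot(mydat, aes(x = xaxis, y = yaxis)) +
  stat_density_2d(
    aes(fill = after_stat(level)),  # colour each band by its density level
    geom = "polygon",
    alpha = 0.5,
    bins = 8                        # number of contour levels (arbitrary choice)
  ) +
  facet_wrap(~ lab) +
  scale_fill_viridis_c(name = "Density")
p3
```

Faceting trades direct overlay for readability. If the two groups need to be compared in a single panel, mapping `colour = lab` on plain contour lines, as in `p2`, is probably the less cluttered option.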

