Adding the regression line equation and R2 on SEPARATE LINES of a graph

7
A few years ago, a post asked how to add the regression line equation and R2 value to a ggplot graph; see: Adding Regression Line Equation and R2 on graph. The top solution was:
lm_eqn <- function(df){
    m <- lm(y ~ x, df);
    eq <- substitute(italic(y) == a + b %.% italic(x)*","~~italic(r)^2~"="~r2, 
         list(a = format(coef(m)[1], digits = 2), 
              b = format(coef(m)[2], digits = 2), 
             r2 = format(summary(m)$r.squared, digits = 3)))
    as.character(as.expression(eq));                 
}

p1 <- p + geom_text(x = 25, y = 300, label = lm_eqn(df), parse = TRUE)

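I am using this code and it works well. However, I would like to know whether it is possible to show the R2 value and the regression line equation on separate lines instead of separated by a comma.
That is, rather than a single comma-separated label, I would like the equation on one line and R2 on the next. Thank you very much for your help!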
2 Answers

8

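The ggpmisc package provides stat_poly_eq(), which is designed for exactly this task (and is not limited to linear regression). Using the same data as posted by @Sathish, we can add the equation and R2 as two separate labels by giving label.y.npc a different value for each. label.x.npc can be adjusted as well if needed.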

library(ggplot2)
library(ggpmisc)
#> For news about 'ggpmisc', please, see https://www.r4photobiology.info/

set.seed(21318)
df <- data.frame(x = c(1:100))
df$y <- 2 + 3*df$x + rnorm(100, sd = 40)

formula1 <- y ~ x

ggplot(data = df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = formula1) +
  # equation label
  stat_poly_eq(aes(label = after_stat(eq.label)), 
               label.x.npc = "right", label.y.npc = 0.15,
               eq.with.lhs = "italic(hat(y))~`=`~",
               eq.x.rhs = "~italic(x)",
               formula = formula1, parse = TRUE, size = 5) +
  # R^2 label, placed below the equation via label.y.npc
  stat_poly_eq(aes(label = after_stat(rr.label)), 
               label.x.npc = "right", label.y.npc = "bottom",
               formula = formula1, parse = TRUE, size = 5) +
  theme_bw(base_size = 16)

# using `atop`
ggplot(data = df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = formula1) +
  stat_poly_eq(aes(label = paste0("atop(", after_stat(eq.label), ",", after_stat(rr.label), ")")), 
               formula = formula1, 
               parse = TRUE) +
  theme_bw(base_size = 16)

### bonus: including result table
ggplot(data = df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, formula = formula1) +
  stat_fit_tb(method = "lm",
              method.args = list(formula = formula1),
              tb.vars = c(Parameter = "term", 
                          Estimate = "estimate", 
                          "s.e." = "std.error", 
                          "italic(t)" = "statistic", 
                          "italic(P)" = "p.value"),
              label.y = "bottom", label.x = "right",
              parse = TRUE) +
  stat_poly_eq(aes(label = paste0("atop(", after_stat(eq.label), ",", after_stat(rr.label), ")")), 
               formula = formula1, 
               parse = TRUE) +
  theme_bw(base_size = 16)

Created by the reprex package (v0.3.0)
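
As a quick cross-check, the numbers that stat_poly_eq() prints can be reproduced with a plain lm() fit. This is only a minimal sketch using the same simulated data as above; the variable names fit, intercept, slope and r2 are illustrative and are not part of the original answer.

```r
# Recreate the simulated data used in the answer above
set.seed(21318)
df <- data.frame(x = 1:100)
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)

# Fit the same linear model that is passed to stat_poly_eq() as formula1
fit <- lm(y ~ x, data = df)

intercept <- unname(coef(fit)[1])
slope     <- unname(coef(fit)[2])
r2        <- summary(fit)$r.squared

# Print the quantities that appear in the two labels
cat("intercept:", format(intercept, digits = 3), "\n")
cat("slope:    ", format(slope, digits = 3), "\n")
cat("R^2:      ", format(r2, digits = 2), "\n")
```

If the printed values differ from the labels on the plot, the formula passed to stat_poly_eq() and the one used in lm() are probably not the same.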


5

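Edit:

In addition to inserting the equation, I have fixed the sign of the intercept. Setting the RNG with set.seed(2L) gives a positive intercept. The example below produces a negative intercept.

I also fixed the overlapping text in geom_text.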
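
The sign fix can be shown in isolation. The helper below, format_signed_term, is a hypothetical name used only for this illustration; it mirrors the ifelse() logic inside lm_eqn, so that a negative intercept is rendered as " - 12.34" rather than " + -12.34".

```r
# Hypothetical helper (not from the original answer): format a number as a
# signed term that can be appended to an equation string.
format_signed_term <- function(value, digits = 4) {
  if (value >= 0) {
    paste0(" + ", format(value, digits = digits))
  } else {
    paste0(" - ", format(-value, digits = digits))
  }
}

format_signed_term(2.5)     # " + 2.5"
format_signed_term(-12.34)  # " - 12.34"
```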

set.seed(3L)
library(ggplot2)
df <- data.frame(x = c(1:100))
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)

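lm_eqn <- function(df){
  m <- lm(y ~ x, df)
  a <- coef(m)[1]
  # Build the intercept term with an explicit sign, so that a negative
  # intercept is shown as " - value" instead of " + -value"
  a <- ifelse(sign(a) >= 0, 
              paste0(" + ", format(a, digits = 4)), 
              paste0(" - ", format(-a, digits = 4))  )
  # First label: the regression equation
  eq1 <- substitute( paste( italic(y) == b, italic(x), a ), 
                     list(a = a, 
                          b = format(coef(m)[2], digits = 4)))
  # Second label: R squared
  eq2 <- substitute( paste( italic(R)^2 == r2 ), 
                     list(r2 = format(summary(m)$r.squared, digits = 3)))
  # Return both plotmath strings so each can be drawn on its own line
  c( as.character(as.expression(eq1)), as.character(as.expression(eq2)))
}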

labels <- lm_eqn(df)


p <- ggplot(data = df, aes(x = x, y = y)) +
  geom_smooth(method = "lm", se=FALSE, color="red", formula = y ~ x) +
  geom_point() +
  geom_text(x = 75, y = 90, label = labels[1], parse = TRUE,  check_overlap = TRUE ) +
  geom_text(x = 75, y = 70, label = labels[2], parse = TRUE, check_overlap = TRUE )

print(p)
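
Because geom_text() draws its label once per data row, a fixed label is often placed with annotate() instead, which adds a single text element and needs no check_overlap. The following is a sketch of that alternative, not the answer's own method; the label strings are built with sprintf() for brevity, and the names eq_label, r2_label and p2 are illustrative.

```r
library(ggplot2)

set.seed(3L)
df <- data.frame(x = 1:100)
df$y <- 2 + 3 * df$x + rnorm(100, sd = 40)

m  <- lm(y ~ x, data = df)
a  <- unname(coef(m)[1])
b  <- unname(coef(m)[2])
r2 <- summary(m)$r.squared

# plotmath strings; the sign of the intercept is written explicitly
eq_label <- sprintf("italic(y) == %.3g * italic(x) %s %.3g",
                    b, if (a >= 0) "+" else "-", abs(a))
r2_label <- sprintf("italic(R)^2 == %.3f", r2)

# annotate() adds each label exactly once, independent of the data rows
p2 <- ggplot(df, aes(x = x, y = y)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE, color = "red", formula = y ~ x) +
  annotate("text", x = 75, y = 90, label = eq_label, parse = TRUE) +
  annotate("text", x = 75, y = 70, label = r2_label, parse = TRUE)

print(p2)
```

The x and y positions are in data units, as in the geom_text() version, so they need adjusting for other data sets.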



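You need to set check_overlap = TRUE in geom_text to stop ggplot from drawing the text repeatedly, which makes it look blurry. - Tung
Wow, thank you so much! Your original answer not only solved my problem perfectly, but your suggestion for fixing the sign was going to be my next question! I really cannot express how grateful I am. - Fiala Bumpers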

Content provided by Stack Overflow.