How to generate a glm summary table using map from purrr and mutate from dplyr?


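I am using the purrr and broom packages to fit a series of glms and build a table of model information so that I can compare them.

The code fails when I call purrr's map function. I think the problem is related to combining mutate and map. I want to produce a table with one row per glm and one column for each glm component.

Data and code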

library(broom)
library(tidyverse)

# Produce a dummy dataset
set.seed(123)
dummy <- tibble(ID = 1:50,
                A = sample(x = 1:200, size = 50, replace = T),
                B = as.factor(sample(x = c("day", "night"), size = 50, replace = T)),
                C = as.factor(sample(x = c("blue", "red", "green"), size = 50, replace = T)))

# Nest the data
nested <- dummy %>% select(-ID) %>% nest()

# Define a function for a generalized linear model with a poisson family
mod_f <- function(x, df = nested) {glm(formula = as.formula(x), family = poisson, data = df)}

# Make a list of formulas as a column in a new dataframe
# A is our response variable that we try to predict using B and C
formulas <- c("A ~ 1", "A ~ B", "A ~ C", "A ~ B + C")
tbl <- tibble(forms = formulas)

# Fit the glm's using each of the formulas from the formulas vector
tbl_2 <- tbl %>% mutate(mods = map(formulas, mod_f))
        #gla = mods %>% map(glance),
        #tid = mods %>% map(tidy),
        #aug = mods %>% map(augment),
        #AIC = gla %>% map_dbl("AIC"))

Error

Error in mutate_impl(.data, dots) : Evaluation error: object 'A' not found.


Take a look at my answer. Also, watch the variable names: in some places mods is called mos and glan is called gla. Cheers! - NelsonGon
Thanks @NelsonGon!!! I missed those while preparing the sample. Luckily, the typos were in lines that had been commented out. - Darius
2 Answers


Here is the final answer, provided by another Stack Overflow user:

library(broom)
library(tidyverse)

# Produce a dummy dataset
set.seed(123)
dummy <- tibble(ID = 1:50,
                A = sample(x = 1:200, size = 50, replace = T),
                B = as.factor(sample(x = c("day", "night"), size = 50, replace = T)),
                C = as.factor(sample(x = c("blue", "red", "green"), size = 50, replace = T)))

# Define a function for a generalized linear model with a poisson family
mod_f <- function(x) {glm(formula = as.formula(x), family = poisson, data = dummy)}

# Make a list of formulas as a column in a new dataframe
# A is the response variable we try to predict using B and C
formulas <- c("A ~ 1", "A ~ B", "A ~ C", "A ~ B + C")
tbl <- tibble(forms = formulas)

# Fit the glm using each of the formulas stored in the formulas vector
tbl_2 <- tbl %>% mutate(all = map(formulas, mod_f),
                        gla = all %>% map(glance),
                        tid = all %>% map(tidy),
                        aug = all %>% map(augment),
                        # AIC is a column of the glance() output, so extract it from gla
                        AIC = gla %>% map_dbl("AIC"))

Do you think the last line should be AIC = all%>% map("AIC")) for it to run? - user63230
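
The question in this comment turns on where the AIC value is stored. The following is a minimal sketch, not the answerer's code: it assumes a reduced version of the dummy data, and the names mods, aic_generic, aic_element and aic_glance are illustrative only. A fitted glm keeps the value in an element named aic (lower case), while the one-row tibble returned by broom's glance() has a column named AIC, so extracting "AIC" by name is only expected to work on the glance() output.

```r
library(purrr)
library(broom)

# Reduced, illustrative version of the dummy data from the question
set.seed(123)
dummy <- data.frame(
  A = sample(1:200, size = 50, replace = TRUE),
  B = factor(sample(c("day", "night"), size = 50, replace = TRUE))
)

# Fit one Poisson glm per formula string
mods <- map(c("A ~ 1", "A ~ B"),
            function(f) glm(as.formula(f), family = poisson, data = dummy))

# Three ways to get one AIC value per model
aic_generic <- map_dbl(mods, AIC)                   # stats::AIC() applied to each model
aic_element <- map_dbl(mods, function(m) m$aic)     # element stored on the glm object
aic_glance  <- map_dbl(map(mods, glance), "AIC")    # column of the glance() tibble
```

All three vectors should agree for these Poisson models. By contrast, map(mods, "AIC") applied to the model objects themselves would be expected to give NULL for each model, because a glm object has no element with that exact name.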

There is a mistake in your function: you passed df as the data instead of dummy. I am not sure whether it can be refactored to be more general. Here:

# df is kept from the question but is not used; the model is fitted on dummy
mod_f <- function(x, df = nested) {glm(formula = as.formula(x), family = poisson, data = dummy)}

# Make a list of formulas as a column in a new dataframe
# A is our response variable that we try to predict using B and C
formulas <- c("A ~ 1", "A ~ B", "A ~ C", "A ~ B + C")
tbl <- tibble(forms = formulas)

# Fit the glm's using each of the formulas from the formulas vector
tbl_2 <- tbl %>% mutate(mods = map(formulas, mod_f))

This yields:
  forms     mods
  <chr>     <list>   
1 A ~ 1     <S3: glm>
2 A ~ B     <S3: glm>
3 A ~ C     <S3: glm>
4 A ~ B + C <S3: glm>
Calling Map(mod_f, formulas) yields, among other elements:
$`A ~ 1`

Call:  glm(formula = as.formula(x), family = poisson, data = dummy)

Coefficients:
(Intercept)  
      4.649  

Degrees of Freedom: 49 Total (i.e. Null);  49 Residual
Null Deviance:      1840 
Residual Deviance: 1840     AIC: 2154
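
The $`A ~ 1` label in this output comes from how base R's Map() names its result. A small sketch of the difference from purrr::map() follows; describe is a hypothetical stand-in for mod_f so the example runs without any data, and the result names are illustrative.

```r
library(purrr)

formulas <- c("A ~ 1", "A ~ B")

# Hypothetical stand-in for mod_f, so no model fitting is needed here
describe <- function(x) paste("formula:", x)

base_res  <- Map(describe, formulas)             # named after the character input
purrr_res <- map(formulas, describe)             # unnamed, because formulas has no names
named_res <- map(set_names(formulas), describe)  # set_names() names the vector by itself

names(base_res)
names(purrr_res)
names(named_res)
```

Naming the input with set_names() before mapping is one way to keep the formula strings attached to the fitted models when using purrr.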

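Thank you for your answer, @NelsonGon. I had always assumed that I had to nest the data and then pass the nested data frame to my function as the data input. It looks like using the original data frame inside the function works just as well. - Darius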
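
The point about nesting can be sketched as follows. This is one possible generalization, not the answerer's method: the data frame is passed as an explicit argument df, and the inner data frame is pulled out of the nested tibble before fitting. The call nest(dummy, data = everything()) assumes tidyr 1.0 or later, and the reduced dummy data and the name inner are illustrative.

```r
library(tidyr)
library(purrr)

# Reduced, illustrative version of the dummy data from the question
set.seed(123)
dummy <- data.frame(
  A = sample(1:200, size = 50, replace = TRUE),
  B = factor(sample(c("day", "night"), size = 50, replace = TRUE))
)

# Nesting collapses every column into a single list-column called data,
# so the nested object itself has no column named A
nested <- nest(dummy, data = everything())
names(nested)

# The original columns are still available inside the list-column
inner <- nested$data[[1]]

# Same model function as in the thread, but with the data as a required argument
mod_f <- function(x, df) {
  glm(formula = as.formula(x), family = poisson, data = df)
}

formulas <- c("A ~ 1", "A ~ B")
mods <- map(formulas, function(f) mod_f(f, df = inner))
```

Passing nested directly as the data would leave glm() unable to find A, which is consistent with the error reported in the question.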
