Creating an imputation list for use with svyglm

Question

3

When using the survey package, I am having trouble creating an imputationList that svydesign will accept. Here is a reproducible example:

library(tibble)
library(survey)
library(mitools)


# Data set 1
# Note that I am excluding the "income" variable from the "df"s and creating  
# it separately so that it varies between the data sets. This simulates the 
# variation with multiple imputation. Since I am using the same seed
# (i.e., 123), all the other variables will be the same, the only one that 
# will vary will be "income."

set.seed(123)

df1 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100,  replace   = TRUE))


# Data set 2

set.seed(123)

df2 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100,  replace   = TRUE))


# Data set 3

set.seed(123)

df3 <- tibble(id      = seq(1, 100, by = 1),
              gender  = as.factor(rbinom(n = 100, size = 1, prob = 0.50)),
              working = as.factor(rbinom(n = 100, size = 1, prob = 0.40)),
              pweight = sample(50:500, 100,  replace   = TRUE))


# Create list of imputed data sets

impList <- imputationList(df1,
                          df2,
                          df3)


# Apply NHIS weights

weights <- svydesign(id     = ~id, 
                     weight = ~pweight, 
                     data   = impList)

I received the following error message:

Error in eval(predvars, data, env) : 
  numeric 'envir' arg not of length one

- scottsmith

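Possible duplicate of https://dev59.com/zGox5IYBdhLWcg3wsGaC. - zx8754

The error comes from svydesign. We don't need to see how you obtained the data; try to create a small reproducible data set that produces the same error, e.g. dput(head(impList)). - zx8754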

1

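Yes, the error comes from svydesign, but I don't know why. I am following the example in ?imputationList, which includes imputationList(datasets,...). Normally I would use a small reproducible example, but this case is more complicated (e.g., imputed data, survey weights), and I thought it best to use real-world data because it is hard to recreate exactly the same situation. - scottsmith

@zx8754 This is not a duplicate... this question is specific to library(survey). - Anthony Damico

@AnthonyDamico I said "possible"; based on the error message, the OP can explore whether that linked post helps. - zx8754

Thanks @zx8754. I added a better reproducible example. - scottsmith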

2 Answers

1

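The step-by-step instructions at http://asdfree.com/national-health-interview-survey-nhis.html walk through creating a multiply-imputed NHIS design and include analysis examples with svyglm calls. Please avoid using library(data.table) and library(dplyr) together with library(survey).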

- Anthony Damico

Thanks @AnthonyDamico. Why should library(data.table) and library(dplyr) be avoided with library(survey)? They are my usual data-wrangling libraries. - scottsmith

They do not work on survey design objects. library(srvyr) allows some library(dplyr) commands to be used. - Anthony Damico

Thanks @AnthonyDamico. I have a similar question that might interest you here: https://stackoverflow.com/questions/48506315/marginal-effects-with-survey-weights-and-multiple-imputations - scottsmith

- scottsmith · Accepted Answer

To get it to work, I needed to pass the imputationList directly to svydesign, wrapping the data frames in a list, as follows:

weights <- svydesign(ids     = ~id,
                     weights = ~pweight,
                     data    = imputationList(list(df1,
                                                   df2,
                                                   df3)))
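
A likely reason this works, as far as I can tell from ?imputationList: the first argument, datasets, is expected to be a single list of data frames, so imputationList(df1, df2, df3) treats df1 itself as that list and its columns as the "imputations". The sketch below illustrates this and then carries the design through to svyglm. The income column, the make_imputed helper, and the model formula are made up for illustration and are not part of the question's data.

```r
library(survey)
library(mitools)

set.seed(123)
n <- 100

# Variables shared by every imputed data set (hypothetical values)
base <- data.frame(id      = seq_len(n),
                   gender  = factor(rbinom(n, size = 1, prob = 0.50)),
                   pweight = sample(50:500, n, replace = TRUE))

# Hypothetical helper: add an "income" column that differs per data set,
# standing in for the variation that multiple imputation would produce
make_imputed <- function(seed) {
  set.seed(seed)
  d <- base
  d$income <- round(rnorm(n, mean = 50000, sd = 10000))
  d
}
dfs <- lapply(1:3, make_imputed)

# Passing the data frames as separate arguments: the first one is taken
# as the list of imputations, so each "imputation" is one of its columns
bad <- imputationList(dfs[[1]], dfs[[2]], dfs[[3]])
length(bad$imputations)

# Passing one list of data frames: one imputation per data frame
imp <- imputationList(dfs)
length(imp$imputations)

# The design can be built from a stored imputationList as well
des <- svydesign(ids = ~id, weights = ~pweight, data = imp)

# Fit the model on each imputed design, then pool the estimates
fits   <- with(des, svyglm(income ~ gender))
pooled <- MIcombine(fits)
summary(pooled)
```

If that reading is right, storing the imputationList in a variable first (as in the question) should also be fine, as long as the data frames are wrapped in list().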