ezANOVA in R: checking whether the residuals are normally distributed


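I am using ezANOVA to analyse an experimental design with one within-subjects variable and one between-subjects variable. I have successfully run ezANOVA as follows: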

data <- structure(list(Sub = structure(c(3L, 3L, 3L, 4L, 4L, 4L, 1L, 
1L, 1L, 2L, 2L, 2L), .Label = c("A7011", "A7022", "B13", "B14"
), class = "factor"), Depvariable = c(0.375, 0.066667, 0.15, 
0.275, 0.025, 0.78333, 0.24167, 0.058333, 0.14167, 0.19167, 0.5, 
0), Group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = c("A", "B"), class = "factor"), WithinFactor = c(0.6, 
0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3)), .Names = c("Sub", 
"Depvariable", "Group", "WithinFactor"), row.names = c(NA, 12L
 ), class = "data.frame")


mod.ez<-ezANOVA(data,
          dv = .(Depvariable),
          wid = .(Sub),  # subject
          within = .(WithinFactor),  
          between=.(Group),
          type=3, 
          detailed=TRUE,
          return_aov=TRUE)

I am stuck on checking whether the residuals are normally distributed. I tried the following:

shapiro.test(as.numeric(residuals(mod.ez$aov)))

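But I get the following error:
Error in shapiro.test(as.numeric(residuals(mod.ez$aov))) : sample size must be between 3 and 5000
If I call residuals(mod.ez$aov), the result is NULL.
I switched to lmer, where checking the residuals seems straightforward: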
plot(fitted(model_lmer), residuals(model_lmer))

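However, because ezANOVA also implements the sphericity test and the corresponding corrections, I would like to stick with it and find a way to check the assumption of normally distributed residuals.
Any help is much appreciated.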

Could you provide a small reproducible example? - Daniel
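Looking at the code on GitHub, I don't see how to get what you want directly. The function returns only some "limited" output and nothing else. ezMixed does have a return_models argument, but that only gives the model formulas (which could be used to build the models by hand). - Roman Luštrik
Would something like r = unlist(sapply(1:4, function(x) get("residuals", mod.ez$aov[[x]]))); shapiro.test(r) help? - seth
Thanks @seth, unfortunately your solution didn't work... - Matilde
1 Answer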

Step by step:
Complete example
First, a complete version of your code:
library(ez)

data <- structure(list(Sub = structure(c(3L, 3L, 3L, 4L, 4L, 4L, 1L, 
1L, 1L, 2L, 2L, 2L), .Label = c("A7011", "A7022", "B13", "B14"
), class = "factor"), Depvariable = c(0.375, 0.066667, 0.15, 
0.275, 0.025, 0.78333, 0.24167, 0.058333, 0.14167, 0.19167, 0.5, 
0), Group = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = c("A", "B"), class = "factor"), WithinFactor = c(0.6, 
0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3, 0.6, 0, -0.3)), .Names = c("Sub", 
"Depvariable", "Group", "WithinFactor"), row.names = c(NA, 12L
 ), class = "data.frame")

mod.ez <- ezANOVA(
    data,
    dv = .(Depvariable),
    wid = .(Sub),  # subject
    within = .(WithinFactor),  
    between = .(Group),
    type = 3, 
    detailed = TRUE,
    return_aov = TRUE)

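How to explore complex R structures
Second, if you can't find things like residuals, the question is whether the ezANOVA result actually contains them or has thrown that information away. For this sort of problem I like to use this function: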
wtf_is <- function(x) {
    # For when you have no idea what something is.
    # https://dev59.com/vGox5IYBdhLWcg3w95A9
    cat("1. typeof():\n")
    print(typeof(x))
    cat("\n2. class():\n")
    print(class(x))
    cat("\n3. mode():\n")
    print(mode(x))
    cat("\n4. names():\n")
    print(names(x))
    cat("\n5. slotNames():\n")
    print(slotNames(x))
    cat("\n6. attributes():\n")
    print(attributes(x))
    cat("\n7. str():\n")
    str(x)  # str() prints its own output and returns NULL, so it needs no print()
}

So:
wtf_is(mod.ez)
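
Because that output is long, it can help to trim it and to search it by name. The following is a minimal sketch, not part of ez: `find_element` is a hypothetical helper written for this illustration, and the built-in `npk` data set with a plain `aov()` call stands in for `mod.ez$aov` so that the example runs without extra packages.

```r
# Hypothetical helper: walk a nested list and report the path of every
# element whose name equals `name`.
find_element <- function(x, name, path = "x") {
    hits <- character(0)
    if (is.list(x) && !is.null(names(x))) {
        for (i in seq_along(x)) {
            nm <- names(x)[i]
            child_path <- paste0(path, "[[\"", nm, "\"]]")
            if (identical(nm, name)) {
                hits <- c(hits, child_path)
            }
            # Recurse by position so duplicated or empty names are safe.
            hits <- c(hits, find_element(x[[i]], name, child_path))
        }
    }
    hits
}

# Stand-in for mod.ez$aov: an aov() fit with an Error() term.
fit <- aov(yield ~ N * P * K + Error(block), data = npk)

# Top-level overview only, instead of the full str() dump.
str(fit, max.level = 1, give.attr = FALSE)

# Where does anything called "residuals" live?
hits <- find_element(fit, "residuals")
print(hits)
```

Applied to the real object, the call would be `find_element(mod.ez, "residuals")`.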

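Looking for residuals in the ezANOVA output
The output is long. We are looking for something of length 12 (since you have 12 data points), or anything that resembles residuals or predicted values. Part of the output: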
[...]
7. str():
List of 2
 $ ANOVA:'data.frame':  3 obs. of  9 variables:
 [...]
 $ aov  :List of 4
  ..$ (Intercept)     :List of 9
  [...]
  ..$ Sub             :List of 9
  [...]
  .. ..$ residuals    : Named num [1:3] 0.102 -0.116 0.164
  .. .. ..- attr(*, "names")= chr [1:3] "2" "3" "4"
  [...]
  .. ..$ fitted.values: Named num [1:3] -1.39e-17 1.28e-01 9.03e-02
  .. .. ..- attr(*, "names")= chr [1:3] "2" "3" "4"
  ..$ Sub:WithinFactor:List of 9
  [...]
  .. ..$ residuals    : Named num [1:4] 0.00964 0.00964 0.23081 -0.23081
  .. .. ..- attr(*, "names")= chr [1:4] "5" "6" "7" "8"
  [...]
  .. ..$ fitted.values: Named num [1:4] 0.0804 -0.0804 -0.0444 -0.0444
  .. .. ..- attr(*, "names")= chr [1:4] "5" "6" "7" "8"
  [...]
  ..$ Within          :List of 6
  [...]
  .. ..$ residuals    : num [1:4, 1] 0.3286 0.1098 -0.4969 0.0564
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:4] "9" "10" "11" "12"
  .. .. .. ..$ : NULL
  .. ..$ fitted.values: num [1:4, 1] 0 0 0 0
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:4] "9" "10" "11" "12"
  .. .. .. ..$ : NULL
  [...]
  ..- attr(*, "error.qr")=List of 5
  .. ..$ qr   : num [1:12, 1:8] -3.464 0.289 0.289 0.289 0.289 ...
  .. .. ..- attr(*, "dimnames")=List of 2
  .. .. .. ..$ : chr [1:12] "1" "2" "3" "4" ...
  .. .. .. ..$ : chr [1:8] "(Intercept)" "Sub1" "Sub2" "Sub3" ...
  .. .. ..- attr(*, "assign")= int [1:8] 0 1 1 1 2 2 2 2
  .. .. ..- attr(*, "contrasts")=List of 1
  .. .. .. ..$ Sub: chr "contr.helmert"
  [...]

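...none of which looks very helpful to me. So the answer is probably "they're not there" or "they're not there in any obvious form", and others agree: ggplot2 residuals with ezANOVA.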
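
The NULL from residuals(mod.ez$aov) can be reproduced with base R alone. The sketch below assumes that `mod.ez$aov` behaves like any `aov()` fit with an `Error()` term, which the strata seen in the `str()` output suggest. The built-in `npk` data set is used as a stand-in. Such a fit is a list of per-stratum fits, so `residuals()` finds no top-level `residuals` element, and each stratum holds fewer residuals than there are observations.

```r
# Stand-in for mod.ez$aov: a multi-stratum ANOVA on a built-in data set.
fit <- aov(yield ~ N * P * K + Error(block), data = npk)

class(fit)          # an "aovlist": one fit per error stratum
names(fit)          # the strata

# No residuals() method exists for the list as a whole, so this is NULL,
# and as.numeric(NULL) has length 0, which shapiro.test() rejects.
whole <- residuals(fit)
is.null(whole)

# Each stratum does store residuals, but on projected data, so none of
# them has one value per observation.
stratum_resid <- lapply(fit, function(s) as.numeric(s$residuals))
lengths(stratum_resid)
nrow(npk)
```

This is why pooling the per-stratum residuals, as suggested in the comments, does not give the 12 observation-level residuals one would want for a normality check.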
Use afex::aov_ez instead
So you could instead use:
library(afex)
model2 <- aov_ez(
    id = "Sub",  # subject
    dv = "Depvariable",
    data = data,
    between = c("Group"),
    within = c("WithinFactor"),
    type = "III"  # or 3; type III sums of squares
)
anova(model2)
summary(model2)
residuals(model2$lm)

...which does give you residuals. However, it also gives different F/p values. As for why aov_ez and ezANOVA give different answers here, compare:
> mod.ez
$ANOVA
              Effect DFn DFd         SSn        SSd          F         p p<.05         ges
1              Group   1   2 0.024449088 0.05070517 0.96436277 0.4296328       0.134418588
2       WithinFactor   1   2 0.001296481 0.10673345 0.02429382 0.8904503       0.008167579
3 Group:WithinFactor   1   2 0.015557350 0.10673345 0.29151781 0.6433264       0.089928978

> anova(model2)
Anova Table (Type III tests)

Response: Depvariable
                   num Df den Df      MSE      F     ges Pr(>F)
Group              1.0000 2.0000 0.025353 0.9644 0.07197 0.4296
WithinFactor       1.4681 2.9363 0.090093 0.2322 0.08876 0.7471
Group:WithinFactor 1.4681 2.9363 0.090093 1.5001 0.38628 0.3370

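Different results. Note the warning from mod.ez:
Warning: "WithinFactor" will be treated as numeric

...i.e. as a continuous predictor (covariate) rather than a discrete predictor (factor). So we should look at the covariate and factorize arguments; see ?aov_ez. I must say, I'm struggling a bit to work out how to do a within-subjects ANCOVA here. If I'm reading the documentation correctly, factorize applies only to between-subjects predictors, and likewise covariate applies only to between-subjects covariates.

As a quick check, if you use ezANOVA and force WithinFactor to be treated as a discrete (rather than continuous) predictor, like this:
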
data$WithinCovariate <- data$WithinFactor  # so the name is clearer!
data$WithinFactorDiscrete <- as.factor(data$WithinFactor)
mod.ez.discrete <- ezANOVA(
    data,
    dv = .(Depvariable),
    wid = .(Sub),  # subject
    within = .(WithinFactorDiscrete),  
    between = .(Group),
    type = 3, 
    detailed = TRUE,
    return_aov = TRUE)

...you get F/p values that match aov_ez:
> mod.ez.discrete
$ANOVA
                      Effect DFn DFd        SSn        SSd          F          p p<.05        ges
1                (Intercept)   1   2 0.65723113 0.05070517 25.9236350 0.03647725     * 0.67583504
2                      Group   1   2 0.02444909 0.05070517  0.9643628 0.42963280       0.07197457
3       WithinFactorDiscrete   2   4 0.03070651 0.26453641  0.2321534 0.80280844       0.08876045
4 Group:WithinFactorDiscrete   2   4 0.19841198 0.26453641  1.5000731 0.32651697       0.38627588

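So treating the within-subjects predictor as a factor gives you matching results, Greenhouse-Geisser/Huynh-Feldt corrections, and (via aov_ez) residuals, but none of that covers a within-subjects covariate.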
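
The jump in DFn from 1 (numeric WithinFactor) to 2 (WithinFactorDiscrete) in the tables above follows directly from how R codes the predictor. A minimal base-R sketch, using the three WithinFactor values from the question (the variable names are made up for this illustration):

```r
# The three within-subject values from the question, repeated for two subjects.
x_num <- c(0.6, 0, -0.3, 0.6, 0, -0.3)
x_fac <- as.factor(x_num)

# A numeric predictor contributes a single slope column to the design matrix.
df_num <- ncol(model.matrix(~ x_num)) - 1   # 1

# A three-level factor contributes two contrast columns.
df_fac <- ncol(model.matrix(~ x_fac)) - 1   # 2

levels(x_fac)   # "-0.3" "0" "0.6"
c(numeric = df_num, factor = df_fac)
```

The numeric version therefore tests a linear trend across 0.6, 0 and -0.3, while the factor version tests any difference among the three levels.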
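Finally, what would it even mean to test sphericity with a continuous within-subjects predictor? That is not at all clear to me: sphericity concerns the homogeneity of the variances of the differences between levels of a within-subjects factor. If the predictor is continuous, there are no levels to pair up.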
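
To make "variances of the differences between levels" concrete, here is a rough sketch that computes them for the data in the question, with WithinFactor treated as three discrete levels. It is only an illustration of the quantity involved: it ignores Group, it is not Mauchly's test, and with four subjects the estimates are very noisy.

```r
# Question data in wide form: one row per subject, one column per level
# of WithinFactor (0.6, 0, -0.3).
wide <- matrix(
    c(0.375,   0.066667, 0.15,
      0.275,   0.025,    0.78333,
      0.24167, 0.058333, 0.14167,
      0.19167, 0.5,      0),
    nrow = 4, byrow = TRUE,
    dimnames = list(c("B13", "B14", "A7011", "A7022"),
                    c("0.6", "0", "-0.3"))
)

# All pairs of levels, one pair per column.
level_pairs <- combn(colnames(wide), 2)

# Variance across subjects of each pairwise difference score.
diff_var <- apply(level_pairs, 2, function(p) var(wide[, p[1]] - wide[, p[2]]))
names(diff_var) <- apply(level_pairs, 2, paste, collapse = " vs ")
diff_var
```

Sphericity holds when these variances are equal in the population. With a continuous predictor there is no finite set of columns to difference, which is the difficulty described above.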
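So, at the risk of being wrong, I would either (a) trust ezANOVA and give up on the residuals, or (b) use a tool that does everything except the sphericity test, such as: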
library(lme4)
library(lmerTest)  # upgrades reports from lme4 to include p values! ;)

mod.lmer.wscov_interact <- lmer(
    Depvariable ~
        Group * WithinCovariate
        + (1 | Sub),
    data = data
)
anova(mod.lmer.wscov_interact)
residuals(mod.lmer.wscov_interact)

mod.lmer.wscov_no_interact <- lmer(
    Depvariable ~
        Group + WithinCovariate
        + (1 | Sub),
    data = data
)
anova(mod.lmer.wscov_no_interact)

mod.lmer.wsfac <- lmer(
    Depvariable ~
        Group * WithinFactorDiscrete
        + (1 | Sub),
    data = data
)
anova(mod.lmer.wsfac)

which gives:
> anova(mod.lmer.wscov_interact)
Analysis of Variance Table of type III  with  Satterthwaite 
approximation for degrees of freedom
                        Sum Sq  Mean Sq NumDF DenDF F.value Pr(>F)
Group                 0.033586 0.033586     1     8 0.50936 0.4957
WithinCovariate       0.001296 0.001296     1     8 0.01966 0.8920
Group:WithinCovariate 0.015557 0.015557     1     8 0.23594 0.6402

> residuals(mod.lmer.wscov_interact)
           1            2            3            4            5            6            7            8            9           10           11           12 
 0.130059250 -0.219344250 -0.156546500  0.030059250 -0.261011250  0.476783500 -0.009225679 -0.118156464  0.002383643 -0.059225679  0.323510536 -0.139286357 

> anova(mod.lmer.wscov_no_interact)
Analysis of Variance Table of type III  with  Satterthwaite 
approximation for degrees of freedom
                   Sum Sq   Mean Sq NumDF DenDF F.value Pr(>F)
Group           0.0244491 0.0244491     1     9 0.40519 0.5403
WithinCovariate 0.0012965 0.0012965     1     9 0.02149 0.8867

> anova(mod.lmer.wsfac)
Analysis of Variance Table of type III  with  Satterthwaite 
approximation for degrees of freedom
                             Sum Sq  Mean Sq NumDF DenDF F.value Pr(>F)
Group                      0.024449 0.024449     1     6 0.46534 0.5206
WithinFactorDiscrete       0.030707 0.015353     2     6 0.29222 0.7567
Group:WithinFactorDiscrete 0.198412 0.099206     2     6 1.88819 0.2312
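
Once a model returns one residual per observation (as the afex and lmer fits above do), the check asked for in the question is short. The sketch below uses a hypothetical helper, `check_residual_normality`, and a plain `lm()` on the built-in `PlantGrowth` data so that it runs without extra packages. The helper also makes explicit why the original call failed: `as.numeric(NULL)` has length 0.

```r
# Hypothetical helper (not part of ez, afex, or lme4).
check_residual_normality <- function(res) {
    res <- as.numeric(res)      # also flattens a matrix of residuals
    res <- res[!is.na(res)]
    if (length(res) < 3 || length(res) > 5000) {
        stop("shapiro.test() needs between 3 and 5000 values; got ", length(res))
    }
    shapiro.test(res)
}

# Stand-in model; with the fits above one would pass
# residuals(mod.lmer.wscov_interact) or residuals(model2$lm) instead.
fit <- lm(weight ~ group, data = PlantGrowth)
sw <- check_residual_normality(residuals(fit))
print(sw)

# A Q-Q plot is a useful visual companion to the test.
qqnorm(residuals(fit))
qqline(residuals(fit))
```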
