Effects coding for categorical variables with interactions in lavaan?

Question

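I am interested in translating lm syntax into lavaan. In particular, I want an effects-coded interaction between a factor and a numeric variable when the factor has more than 2 levels. (Reminder: effects coding is an alternative to dummy coding for categorical variables, in which the codes are -1, 1, and 0.)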
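As a small illustration of the reminder above, the sketch below prints the effects-coding (sum-to-zero) matrix that base R's contr.sum() produces for a three-level factor. The variable name codes is introduced here for illustration only; the row labels are taken from the levels of iris$Species.

```r
# Effects (sum-to-zero) coding for a three-level factor, base R only.
codes <- contr.sum(3)
rownames(codes) <- levels(iris$Species)  # setosa, versicolor, virginica

# Each of the first two levels gets a 1 in its own column;
# the last level is coded -1 in every column.
print(codes)

# Every column sums to zero, which is what distinguishes effects coding
# from 0/1 dummy coding.
print(colSums(codes))
```

With this coding, each coefficient is a deviation from the unweighted mean of the level means rather than from a reference level.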

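Below is a minimal (nonsensical) example. You can see the lm (linear regression) syntax, followed by the corresponding lavaan syntax (the regression part). It works for a regression without an interaction, but not for one with an interaction.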

First, consider a regression on an effects-coded factor without an interaction. This works:

library(lavaan)
# Use iris data as minimal example
# 
# 1. Linear regression model
# Change contrasts to effects-coding
contrasts(iris$Species) <- contr.sum(3)
# Linear regression
lmmodel <- Sepal.Length ~ Species # the regression model
lmfit <- lm(lmmodel, iris) # fit it

# 2. SEM
# first, re-code the factors
iris$s1 <- contrasts(iris$Species)[iris$Species, 1] # Numeric and effects-coded
iris$s2 <- contrasts(iris$Species)[iris$Species, 2] #     - " -
semmodel <- 'Sepal.Length ~ s1 + s2' # the SEM model
semfit <- sem(semmodel, iris) # fit it

# 3. Compare the coefficients lm vs. sem, should be equal (and are equal)
cbind(coef(lmfit)[-1], coef(semfit)[-length(coef(semfit))])
#                 [,1]        [,2]
# Species1 -0.83733333 -0.83733330
# Species2  0.09266667  0.09266664

Here is how I did it with the interaction. Am I doing something wrong?

# 1. Linear regression w/ interaction
lmmodel <- Sepal.Length ~ Species + Species:Sepal.Width
lmfit <- lm(lmmodel, iris)

# 2. SEM
iris$s3 <- as.numeric(iris$Species=='virginica') # Code third species
iris$s1_w <- iris$s1 * iris$Sepal.Width # Numeric interaction
iris$s2_w <- iris$s2 * iris$Sepal.Width #      - " -
iris$s3_w <- iris$s3 * iris$Sepal.Width #      - " -
semmodel <- 'Sepal.Length ~ s1 + s2 + s1_w + s2_w + s3_w'
semfit <- sem(semmodel, iris)

# 3. Compare the coefficients lm vs. sem
cbind(coef(lmfit)[-1], coef(semfit)[-length(coef(semfit))])
#                                     [,1]       [,2]
# Species1                      -0.7228562 -0.7228566
# Species2                       0.1778772  0.1778772
# Speciessetosa:Sepal.Width      0.6904897  0.6904899
# Speciesversicolor:Sepal.Width  0.8650777  0.8650779  <----- equal
# Speciesvirginica:Sepal.Width   0.9015345  2.4571023  <----- not equal

- JBJ

What are you trying to do? It is not clear to me. I don't see any lavaan package, and why should length be a latent variable? - Mensch

Hi! Thanks, here you go: sem() is a lavaan function, and semmodel in the code is lavaan syntax. I am trying to fit the regression model "Sepal.Length ~ Species + Species:Sepal.Width" in lavaan. - JBJ

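PS: The code is a minimal example. My question is about setting up the model correctly with effects coding. The coefficients from lm should equal the coefficients from sem (see #2 at https://psu-psychology.github.io/r-bootcamp-2018/talks/lavaan_tutorial.html). In my second example, one coefficient differs. It seems that the correct coefficient (call it `b*`) equals `b-c-d` (where `b` = effect of s3_w, `c` = effect of s2_w, `d` = effect of s1_w). That seems right given that the factor is effects-coded. What I want to know is how to make lavaan produce the correct coefficient (e.g., 0.9015). - JBJ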

Let me see if I understand. You want to know why only this line is not equal: # Speciesvirginica:Sepal.Width 0.9015345 2.4571023 <----- not equal, right? Have you tried asking on the Google group? - Mensch

Yes, that is what I want! - JBJ

1 Answer

- Sinval · Answer 1

The problem is not lavaan; rather, you did not code the contrasts for the virginica species correctly.
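One way to check this is to look at the design matrix that lm() itself builds. The sketch below uses only base R and works on a copy of iris; the names dat and X are introduced here for illustration and are not part of the original post. Because Sepal.Width has no main effect in the formula, R appears to code Species inside the interaction with plain 0/1 indicators rather than with the sum contrasts, so in the virginica rows the setosa and versicolor slope columns are 0, not -Sepal.Width.

```r
# Inspect the design matrix lm() uses for the interaction model (base R only).
dat <- iris                               # work on a copy, leave iris untouched
contrasts(dat$Species) <- contr.sum(3)    # effects coding, as in the question

X <- model.matrix(Sepal.Length ~ Species + Species:Sepal.Width, data = dat)
print(colnames(X))

# Rows 101:150 are virginica.
# The main-effect columns are effects-coded (-1, -1) ...
print(unique(X[101:150, c("Species1", "Species2")]))

# ... but the slope columns of the other two species are exactly 0 there,
print(range(X[101:150, "Speciessetosa:Sepal.Width"]))
print(range(X[101:150, "Speciesversicolor:Sepal.Width"]))

# and the virginica slope column is Sepal.Width itself.
print(all(X[101:150, "Speciesvirginica:Sepal.Width"] == dat$Sepal.Width[101:150]))
```

In the question's hand-built columns, the virginica rows instead carry -Sepal.Width in both s1_w and s2_w, so the virginica slope is spread over three coefficients: 2.4571023 - 0.6904899 - 0.8650779 = 0.9015345, which is the lm value.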

For rows 101 to 150 (the virginica rows), the indicator pattern should be 0, 0, 1, so s1_w and s2_w must be 0 there, i.e.:

iris[101:150,"s2_w"] <- 0
iris[101:150,"s1_w"] <- 0

Re-run the original code:

semmodel <- 'Sepal.Length ~ s1 + s2 + s1_w + s2_w + s3_w'
semfit <- sem(model = semmodel, data = iris, estimator="ml")

# 3. Compare the coefficients lm vs. sem
cbind(coef(lmfit)[-1], coef(semfit)[-length(coef(semfit))])

Check the result:

> # 3. Compare the coefficients lm vs. sem
> cbind(coef(lmfit)[-1], coef(semfit)[-length(coef(semfit))])
                                    [,1]       [,2]
Species1                      -0.7228562 -0.7228563
Species2                       0.1778772  0.1778772
Speciessetosa:Sepal.Width      0.6904897  0.6904898
Speciesversicolor:Sepal.Width  0.8650777  0.8650778
Speciesvirginica:Sepal.Width   0.9015345  0.9015345
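As a possible variation on the fix, the slope columns can be built from species indicators directly, which avoids hard-coding row numbers. The sketch below is an assumption-laden illustration rather than part of the original answer: it works on a copy called dat, and it compares two lm() fits (fit_formula and fit_manual, both names introduced here) instead of calling lavaan, so that it runs with base R alone. The same columns could then be passed to sem() with the model string from the question.

```r
# Build the predictors without row indices and check them against lm() (base R only).
dat <- iris
contrasts(dat$Species) <- contr.sum(3)

# Effects-coded main-effect columns, as in the question.
dat$s1 <- as.numeric(contrasts(dat$Species)[dat$Species, 1])
dat$s2 <- as.numeric(contrasts(dat$Species)[dat$Species, 2])

# Indicator-coded slope columns: Sepal.Width within one species, 0 elsewhere.
dat$s1_w <- (dat$Species == "setosa")     * dat$Sepal.Width
dat$s2_w <- (dat$Species == "versicolor") * dat$Sepal.Width
dat$s3_w <- (dat$Species == "virginica")  * dat$Sepal.Width

fit_formula <- lm(Sepal.Length ~ Species + Species:Sepal.Width, data = dat)
fit_manual  <- lm(Sepal.Length ~ s1 + s2 + s1_w + s2_w + s3_w, data = dat)

# Largest absolute difference between the two coefficient vectors;
# it should be zero up to floating-point error.
max_diff <- max(abs(unname(coef(fit_formula)) - unname(coef(fit_manual))))
print(max_diff)
```

Using logical comparisons on Species keeps the construction valid even if the rows of the data are reordered, which the 101:150 index does not.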