在R和Python中使用广义线性模型得到的结果与MATLAB中不同

3
我是一名有用的助手,可以为您翻译文本。以下是需要翻译的内容:

我正在尝试将处理广义线性模型的函数从MATLAB转录到R和Python。目前,无论给定相同的输入,R和Python都给出了与MATLAB不同的答案。

MATLAB代码本质上是这样的:

coeffs = glmfit(X, [y ones(length(y),1)], 'binomial', 'link', 'logit');

请注意,在Matlab中,glmfit函数已经将常数项添加到X中。该函数的文档在这里:http://www.mathworks.com/help/stats/glmfit.html 在R中,我有如下代码:
model <- glm(cbind(y, array(1, length(y)))~X, family=binomial(link=logit))
coeffs <- model$coefficients

在R中,glm的文档在这里:http://stat.ethz.ch/R-manual/R-patched/library/stats/html/glm.html 而在Python中,我有这个:
import statsmodels.api as sm
import numpy as np

est = sm.GLM(sm.add_constant(y, prepend=False), sm.add_constant(X), family=sm.families.Binomial()).fit()
coeffs = est.params

在Python中,文档在此处:http://statsmodels.sourceforge.net/devel/glm.html#module-reference 根据文档,所有GLM函数都使用"logit"参数作为"link"。如果我不得不猜测,这个问题可能是由于MATLAB的精度引起的,我以前遇到过这个问题。(似乎MATLAB不如R或NumPy精确。)然而,我想确保我没有使用错误的函数。
我认为这是MATLAB的一个问题,还是我缺少一些参数/做错了什么?
另外——如果重要的话,这是我使用的变量(我正在从csv文件读取):
X =
3.00000,9.10000,0.43000,-1.26000,-0.25000,0.05000,-4.01000,7.87000
4.00000,1.70000,-1.10000,0.43000,-1.26000,-17.34000,0.05000,-4.01000
5.00000,3.80000,-0.01000,-1.10000,0.43000,5.35000,-17.34000,0.05000
6.00000,0.70000,0.13000,-0.01000,-1.10000,6.03000,5.35000,-17.34000
7.00000,5.80000,0.38000,0.13000,-0.01000,18.10000,6.03000,5.35000
8.00000,8.90000,0.13000,0.38000,0.13000,-1.88000,18.10000,6.03000
9.00000,7.60000,-0.88000,0.13000,0.38000,-7.59000,-1.88000,18.10000
10.00000,4.50000,0.01000,-0.88000,0.13000,2.67000,-7.59000,-1.88000
11.00000,2.40000,0.37000,0.01000,-0.88000,12.76000,2.67000,-7.59000
12.00000,6.60000,0.60000,0.37000,0.01000,3.25000,12.76000,2.67000
13.00000,2.80000,-1.66000,0.60000,0.37000,18.67000,3.25000,12.76000
14.00000,7.20000,1.91000,-1.66000,0.60000,-0.37000,18.67000,3.25000
15.00000,4.00000,0.78000,1.91000,-1.66000,-8.18000,-0.37000,18.67000
16.00000,7.00000,-0.81000,0.78000,1.91000,8.86000,-8.18000,-0.37000
17.00000,0.50000,0.33000,-0.81000,0.78000,-12.55000,8.86000,-8.18000
18.00000,4.00000,1.17000,0.33000,-0.81000,-10.02000,-12.55000,8.86000
19.00000,7.10000,0.20000,1.17000,0.33000,5.50000,-10.02000,-12.55000
20.00000,3.20000,0.12000,0.20000,1.17000,-20.33000,5.50000,-10.02000
3.00000,0.70000,-1.26000,-0.96000,0.88000,-12.24000,7.18000,0.31000
4.00000,1.60000,-0.70000,-1.26000,-0.96000,-1.32000,-12.24000,7.18000
5.00000,8.00000,1.98000,-0.70000,-1.26000,6.75000,-1.32000,-12.24000
6.00000,3.40000,0.80000,1.98000,-0.70000,17.24000,6.75000,-1.32000
7.00000,8.00000,-1.09000,0.80000,1.98000,-7.76000,17.24000,6.75000
8.00000,7.30000,-0.59000,-1.09000,0.80000,-2.04000,-7.76000,17.24000
9.00000,7.70000,0.69000,-0.59000,-1.09000,7.18000,-2.04000,-7.76000
10.00000,6.60000,0.09000,0.69000,-0.59000,-1.75000,7.18000,-2.04000
11.00000,0.10000,0.10000,0.09000,0.69000,1.37000,-1.75000,7.18000
12.00000,5.80000,0.18000,0.10000,0.09000,14.02000,1.37000,-1.75000
13.00000,9.90000,-1.36000,0.18000,0.10000,-3.16000,14.02000,1.37000
14.00000,1.30000,-0.26000,-1.36000,0.18000,6.47000,-3.16000,14.02000
15.00000,8.40000,-0.52000,-0.26000,-1.36000,-7.12000,6.47000,-3.16000
16.00000,7.40000,-0.66000,-0.52000,-0.26000,-1.73000,-7.12000,6.47000
17.00000,8.30000,-0.63000,-0.66000,-0.52000,11.65000,-1.73000,-7.12000
18.00000,6.70000,-1.04000,-0.63000,-0.66000,13.63000,11.65000,-1.73000
19.00000,5.80000,-1.25000,-1.04000,-0.63000,-14.71000,13.63000,11.65000
20.00000,3.20000,-1.37000,-1.25000,-1.04000,-3.60000,-14.71000,13.63000
3.00000,2.30000,-1.16000,-0.19000,-0.70000,8.41000,0.18000,-4.08000
4.00000,1.90000,1.11000,-1.16000,-0.19000,-0.00000,8.41000,0.18000
5.00000,6.40000,0.12000,1.11000,-1.16000,17.86000,-0.00000,8.41000
6.00000,3.90000,1.30000,0.12000,1.11000,1.12000,17.86000,-0.00000
7.00000,0.80000,-0.63000,1.30000,0.12000,-1.33000,1.12000,17.86000
8.00000,7.60000,-0.21000,-0.63000,1.30000,22.26000,-1.33000,1.12000
9.00000,0.90000,1.47000,-0.21000,-0.63000,-14.10000,22.26000,-1.33000
10.00000,5.10000,0.60000,1.47000,-0.21000,13.36000,-14.10000,22.26000
11.00000,6.40000,-1.41000,0.60000,1.47000,-5.37000,13.36000,-14.10000
12.00000,3.30000,-0.97000,-1.41000,0.60000,6.47000,-5.37000,13.36000
13.00000,8.80000,1.82000,-0.97000,-1.41000,8.82000,6.47000,-5.37000
14.00000,9.90000,1.80000,1.82000,-0.97000,5.63000,8.82000,6.47000
15.00000,3.40000,-0.60000,1.80000,1.82000,5.02000,5.63000,8.82000
16.00000,5.90000,-1.14000,-0.60000,1.80000,10.93000,5.02000,5.63000
17.00000,3.00000,0.77000,-1.14000,-0.60000,-2.12000,10.93000,5.02000
18.00000,1.60000,1.69000,0.77000,-1.14000,-2.89000,-2.12000,10.93000
19.00000,1.10000,-0.07000,1.69000,0.77000,-2.78000,-2.89000,-2.12000
20.00000,0.70000,-0.66000,-0.07000,1.69000,2.82000,-2.78000,-2.89000
3.00000,1.30000,-0.72000,1.12000,1.41000,4.24000,5.35000,-0.62000
4.00000,4.80000,-1.18000,-0.72000,1.12000,8.91000,4.24000,5.35000
5.00000,1.30000,-0.44000,-1.18000,-0.72000,1.15000,8.91000,4.24000
6.00000,8.90000,-0.42000,-0.44000,-1.18000,11.64000,1.15000,8.91000
7.00000,6.20000,0.06000,-0.42000,-0.44000,20.74000,11.64000,1.15000
8.00000,6.60000,1.83000,0.06000,-0.42000,21.14000,20.74000,11.64000
9.00000,2.80000,1.50000,1.83000,0.06000,-16.67000,21.14000,20.74000
10.00000,8.70000,-1.72000,1.50000,1.83000,9.86000,-16.67000,21.14000
11.00000,6.90000,-1.51000,-1.72000,1.50000,14.44000,9.86000,-16.67000
12.00000,1.80000,-1.10000,-1.51000,-1.72000,3.64000,14.44000,9.86000
13.00000,4.30000,-1.34000,-1.10000,-1.51000,13.78000,3.64000,14.44000
14.00000,4.90000,-0.31000,-1.34000,-1.10000,-2.43000,13.78000,3.64000
15.00000,2.50000,1.69000,-0.31000,-1.34000,5.85000,-2.43000,13.78000
16.00000,1.70000,0.67000,1.69000,-0.31000,-0.94000,5.85000,-2.43000
17.00000,8.00000,1.23000,0.67000,1.69000,-2.19000,-0.94000,5.85000
18.00000,1.40000,0.63000,1.23000,0.67000,14.29000,-2.19000,-0.94000
19.00000,5.00000,1.10000,0.63000,1.23000,7.65000,14.29000,-2.19000
20.00000,4.10000,-0.13000,1.10000,0.63000,3.03000,7.65000,14.29000
3.00000,3.70000,-0.79000,-1.32000,-0.62000,-5.73000,0.93000,-0.62000
4.00000,0.90000,1.56000,-0.79000,-1.32000,3.27000,-5.73000,0.93000
5.00000,6.90000,2.39000,1.56000,-0.79000,11.63000,3.27000,-5.73000
6.00000,0.20000,1.17000,2.39000,1.56000,2.61000,11.63000,3.27000
7.00000,2.30000,-0.83000,1.17000,2.39000,18.57000,2.61000,11.63000
8.00000,0.10000,0.08000,-0.83000,1.17000,2.48000,18.57000,2.61000
9.00000,0.80000,-0.52000,0.08000,-0.83000,-2.11000,2.48000,18.57000
10.00000,3.40000,0.47000,-0.52000,0.08000,-5.58000,-2.11000,2.48000
11.00000,3.30000,-1.37000,0.47000,-0.52000,-5.09000,-5.58000,-2.11000
12.00000,7.70000,-0.54000,-1.37000,0.47000,-13.45000,-5.09000,-5.58000
13.00000,7.80000,2.62000,-0.54000,-1.37000,13.67000,-13.45000,-5.09000
14.00000,3.50000,0.91000,2.62000,-0.54000,-10.03000,13.67000,-13.45000
15.00000,4.40000,-0.73000,0.91000,2.62000,7.27000,-10.03000,13.67000
16.00000,0.50000,0.53000,-0.73000,0.91000,-5.44000,7.27000,-10.03000
17.00000,4.50000,-0.80000,0.53000,-0.73000,7.60000,-5.44000,7.27000
18.00000,7.90000,0.01000,-0.80000,0.53000,-11.30000,7.60000,-5.44000
19.00000,6.30000,0.42000,0.01000,-0.80000,9.38000,-11.30000,7.60000
20.00000,2.70000,0.13000,0.42000,0.01000,3.55000,9.38000,-11.30000

并且 y =

0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
1.00000
1.00000
0.00000
1.00000
1.00000
0.00000
0.00000
0.00000
1.00000
0.00000
1.00000
0.00000
0.00000
1.00000
1.00000
0.00000
1.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
1.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
1.00000
1.00000
0.00000
0.00000
1.00000
0.00000
1.00000
1.00000
0.00000
0.00000
0.00000
1.00000
1.00000
0.00000
1.00000
0.00000
1.00000
0.00000
0.00000
0.00000
0.00000
1.00000
0.00000
0.00000
1.00000
1.00000
0.00000
1.00000
1.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
1.00000
0.00000
0.00000
0.00000

感谢您的帮助!

我已经为Python和R添加了GLM的文档。 - James
2
在R中,二项式glm的响应可以是0和1的向量(如果使用伯努利变量且二项式样本大小为1),也可以是成功和失败的矩阵。根据您编码的方式,您的响应是成功和尝试次数的矩阵。尝试将响应更改为ycbind(y, 1-y) - aosmith
我做了那个,它起作用了。我还尝试在Python中做同样的事情,它也解决了问题! - James
1个回答

1
感谢aosmith的帮助,我找到了解决方案——
我的R代码已更改为:
y <- cbind(y, 1-y)
model <- glm(y~X, family=binomial(link=logit))
coeffs <- model$coeffs

我的Python代码已更改为:

est = sm.GLM(np.vstack([y, 1-y]).T, sm.add_constant(X), family=sm.families.Binomial()).fit()
coeffs = est.params

谢谢!

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接