Pandas causal inference

Question

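I want to run a Granger causality test on time series data using Python Pandas, and I have two questions.

(1) I tried using the pandas.stats.var package, but it appears to be deprecated. Are there other recommended options?

(2) I am having trouble interpreting the output of the VAR.granger_causality() function in the pandas.stats.var package. The only reference I could find is a comment in the source code, which says: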

   Returns the f-stats and p-values from the Granger Causality Test.
   If the data consists of columns x1, x2, x3, then we perform the
   following regressions:
   x1 ~ L(x2, x3)
   x1 ~ L(x1, x3)
   x1 ~ L(x1, x2)
   The f-stats of these results are placed in the 'x1' column of the
   returned DataFrame.  We then repeat for x2, x3.
   Returns
   -------
   Dict, where 'f-stat' returns the DataFrame containing the f-stats,
   and 'p-value' returns the DataFrame containing the corresponding
   p-values of the f-stats.

For example, the output of a trial run looks like this:

p-value:
          C         B         A
A   0.472122  0.798261  0.412984
B   0.327602  0.783978  0.494436
C   0.071369  0.385844  0.688292

f-stat:
          C         B         A
A   0.524075  0.065955  0.680298
B   0.975334  0.075878  0.473030
C   3.378231  0.763898  0.162619

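I understand that each cell in the p-value table corresponds to a cell in the f-stat table, but I don't know what the cells in the f-stat table refer to. For example, what does the 0.52 in column C, row A mean?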

- agg212

With pandas you usually need to check statsmodels and scipy (and sometimes numpy for simpler statistics). It looks like statsmodels has something: http://statsmodels.sourceforge.net/0.6.0/generated/statsmodels.tsa.stattools.grangercausalitytests.html - JohnE

Updated link for the one in @JohnE's comment: link - Amani

You can check the following link to see how to interpret the result via the p-value: https://www.machinelearningplus.com/time-series/time-series-analysis-python/ - Debashis Sahoo

2 Answers

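Keep in mind that Granger causality in its simplest form consists of an F-test on the R2 of two regressions: y=const+y[-1]+e vs. y=const+y[-1]+x[-1]+e

in order to see whether the R2 of the second regression is higher. See also: http://www.statisticshowto.com/granger-causality/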

- Niccola Tartaglia
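
The two-regression comparison described above can be sketched with plain NumPy. This is only an illustration under stated assumptions, not the code of any library: the helper name `granger_f_stat`, the simulated series, and the coefficients 0.5 and 0.8 are all made up for the example. It uses residual sums of squares rather than R2; the two forms of the F-statistic are equivalent here because both regressions share the same dependent variable.

```python
import numpy as np


def granger_f_stat(y, x, lag=1):
    """F-statistic for H0: the lags of x add nothing to a regression of y on its own lags.

    Hypothetical helper written for this example.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    n = len(y) - lag                  # observations left after dropping the first `lag` rows
    target = y[lag:]
    # Column k holds the series shifted back by k steps: y[t-k] and x[t-k]
    y_lags = np.column_stack([y[lag - k:len(y) - k] for k in range(1, lag + 1)])
    x_lags = np.column_stack([x[lag - k:len(x) - k] for k in range(1, lag + 1)])
    const = np.ones((n, 1))

    def rss(design):
        # Ordinary least squares fit, then residual sum of squares
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    rss_restricted = rss(np.hstack([const, y_lags]))       # y ~ const + lags of y
    rss_full = rss(np.hstack([const, y_lags, x_lags]))     # y ~ const + lags of y + lags of x
    df_resid = n - (1 + 2 * lag)                           # observations minus fitted parameters
    return ((rss_restricted - rss_full) / lag) / (rss_full / df_resid)


# Simulated data (an assumption of this example): x drives y with a one-step delay
rng = np.random.default_rng(0)
x = rng.normal(size=300)
y = np.zeros(300)
for t in range(1, 300):
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + rng.normal(scale=0.5)

f_xy = granger_f_stat(y, x)   # do lags of x help predict y?
f_yx = granger_f_stat(x, y)   # do lags of y help predict x?
print(f"x -> y: F = {f_xy:.2f}")
print(f"y -> x: F = {f_yx:.2f}")
```

Under H0 the statistic follows an F distribution with `lag` and `df_resid` degrees of freedom, which is where the p-value comes from. A large F means the regression that includes the lags of x fits noticeably better.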

- Debashis Sahoo · Accepted Answer

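(Null hypothesis) H0: Xt does not Granger-cause Yt.
(Alternative hypothesis) H1: Xt Granger-causes Yt.

If the p-value is less than 5% (0.05), we can reject the null hypothesis (H0) and conclude that Xt Granger-causes Yt.

So wherever your p-value is less than 0.05, you can consider those features.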