Specifying the lag in the numpy.correlate function

Question

6

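Matlab's cross-correlation function xcorr(x,y,maxlags) has a maxlags option that returns the cross-correlation sequence over the lag range [-maxlags:maxlags]. NumPy's numpy.correlate(N,M,mode) has three modes, but none of them lets me set a specific lag: the output length is always full (N+M-1), same (max(M,N)), or valid (max(M,N)-min(M,N)+1). For len(N) = 60000 and len(M) = 200, I want to set the lag to 100.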

- Jyotika

So you are asking whether there is a function like correlate that accepts a variable lag argument? - macduff
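The three output lengths mentioned in the question can be checked with a small sketch. The arrays here are short, made-up stand-ins for the 60000-sample and 200-sample series, chosen only to keep the numbers easy to read.

```python
import numpy as np

# Hypothetical stand-ins for the long and short series in the question.
N = np.arange(10.0)   # len(N) = 10
M = np.ones(4)        # len(M) = 4

# Output length of np.correlate for each of its three modes.
lengths = {mode: len(np.correlate(N, M, mode=mode))
           for mode in ('full', 'same', 'valid')}
print(lengths)  # {'full': 13, 'same': 10, 'valid': 7}
```

None of the three modes takes a lag limit, which is why the answers below slice or loop over the lags themselves.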

3 Answers

2

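Here is my implementation of lead-lag correlation. It only works for 1-D data and is not guaranteed to be the most efficient. It uses scipy.stats.pearsonr for the core computation, so it also returns the p-values of the coefficients. Please modify and optimize this initial version as needed.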

def lagcorr(x,y,lag=None,verbose=True):
    '''Compute lead-lag correlations between 2 time series.

    <x>,<y>: 1-D time series.
    <lag>: lag option, which can take several forms:
          if 0 or None, compute the ordinary correlation and p-value;
          if a positive integer, compute lagged correlations for lags
          0 up to <lag>;
          if a negative integer, compute lead correlations for leads
          0 up to <-lag>;
          if a list, tuple or array of integers, compute lead/lag
          correlations at exactly those leads/lags.

    Note: when talking about lead/lag, uses <y> as a reference.
    Therefore positive lag means <x> lags <y> by <lag>, computation is
    done by shifting <x> to the left hand side by <lag> with respect to
    <y>.
    Similarly negative lag means <x> leads <y> by <lag>, computation is
    done by shifting <x> to the right hand side by <lag> with respect to
    <y>.

    Return <result>: an (n*2) array, with the 1st column the correlation
    coefficients and the 2nd column the corresponding p-values.

    Currently only works for 1-D arrays.
    '''

    import numpy
    from scipy.stats import pearsonr

    if len(x)!=len(y):
        raise ValueError('Input variables of different lengths.')

    #--------Unify types of <lag>-------------
    if numpy.isscalar(lag):
        if abs(lag)>=len(x):
            raise ValueError('Maximum lag equal or larger than array.')
        if lag<0:
            lag=-numpy.arange(abs(lag)+1)
        elif lag==0:
            lag=[0,]
        else:
            lag=numpy.arange(lag+1)    
    elif lag is None:
        lag=[0,]
    else:
        lag=numpy.asarray(lag)

    #-------Loop over lags---------------------
    result=[]
    if verbose:
        print('\n#<lagcorr>: Computing lagged-correlations at lags:', lag)

    for ii in lag:
        if ii<0:
            result.append(pearsonr(x[:ii],y[-ii:]))
        elif ii==0:
            result.append(pearsonr(x,y))
        elif ii>0:
            result.append(pearsonr(x[ii:],y[:-ii]))

    result=numpy.asarray(result)

    return result
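
The slicing convention used in the loop above can be checked on synthetic data. The sketch below is not the author's function: `lagged_pearson` is a hypothetical helper that uses `np.corrcoef` instead of `scipy.stats.pearsonr` (so it returns no p-values), and the test series is made up. It assumes, as in the docstring, that a positive lag means x lags y.

```python
import numpy as np

def lagged_pearson(x, y, lag):
    """Pearson r with x lagging y by `lag` samples (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if len(x) != len(y):
        raise ValueError('x and y must have the same length')
    if abs(lag) >= len(x):
        raise ValueError('abs(lag) must be smaller than the series length')
    if lag > 0:
        xs, ys = x[lag:], y[:-lag]    # drop the first `lag` samples of x
    elif lag < 0:
        xs, ys = x[:lag], y[-lag:]    # drop the first `-lag` samples of y
    else:
        xs, ys = x, y
    return np.corrcoef(xs, ys)[0, 1]

rng = np.random.default_rng(0)
y = rng.standard_normal(500)
x = np.roll(y, 3)                     # x is y delayed by 3 samples (wraps around)

r = [lagged_pearson(x, y, k) for k in range(-5, 6)]
best_lag = int(np.argmax(r)) - 5      # lag with the highest correlation
print(best_lag)
```

Because x is a copy of y delayed by 3 samples, the overlapping segments at lag 3 are identical and the correlation there should be 1 up to rounding.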

- Jason

0

I suggest looking at this file to work out how you want to implement the correlation described here.

- macduff

- Leonard AB · Accepted Answer

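matplotlib's xcorr (matplotlib.pyplot.xcorr) has a maxlags parameter. It is actually a wrapper around numpy.correlate, so there is no performance gain. Nevertheless, it gives exactly the same result as Matlab's cross-correlation function. Below I edited the code from matplotlib so that it returns only the correlation. The reason is that matplotlib's xcorr, used as is, also draws the plot. The problem is that if we pass complex data as arguments, we get a "casting complex to real" warning when matplotlib tries to draw the plot.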

<!-- language: python -->

import numpy as np

def xcorr(x, y, maxlags=10):
    Nx = len(x)
    if Nx != len(y):
        raise ValueError('x and y must be equal length')

    c = np.correlate(x, y, mode='full')

    if maxlags is None:
        maxlags = Nx - 1

    if maxlags >= Nx or maxlags < 1:
        raise ValueError('maxlags must be None or strictly positive < %d' % Nx)

    c = c[Nx - 1 - maxlags:Nx + maxlags]

    return c
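
The function above requires x and y to have equal length, whereas the question has series of different lengths. A minimal sketch of the same slicing idea for that case follows. `xcorr_maxlags` is a hypothetical name, not a NumPy or matplotlib function, and the sketch assumes len(x) >= len(y) and maxlags < len(y). In the 'full' output of np.correlate(x, y), lag 0 sits at index len(y) - 1, which reduces to the Nx - 1 used above when the lengths are equal.

```python
import numpy as np

def xcorr_maxlags(x, y, maxlags):
    """Cross-correlation at lags -maxlags..maxlags (hypothetical helper)."""
    x = np.asarray(x)
    y = np.asarray(y)
    if len(x) < len(y):
        raise ValueError('this sketch assumes len(x) >= len(y)')
    if maxlags < 0 or maxlags >= len(y):
        raise ValueError('maxlags must satisfy 0 <= maxlags < len(y)')

    full = np.correlate(x, y, mode='full')   # length len(x) + len(y) - 1
    zero = len(y) - 1                        # index of lag 0 in `full`
    lags = np.arange(-maxlags, maxlags + 1)
    return lags, full[zero - maxlags:zero + maxlags + 1]

# Small made-up example; the question's sizes (60000, 200, maxlags=100)
# would be passed the same way.
x = np.arange(10.0)
y = np.array([1.0, 2.0, 3.0])
lags, c = xcorr_maxlags(x, y, 2)
print(lags)  # [-2 -1  0  1  2]
print(c)     # [ 0.  3.  8. 14. 20.]
```

Like the answer's version, this still computes the full correlation and then discards most of it, so it saves no computation; it only trims the output to the requested lags.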