Generating N random numbers from a skew normal distribution using numpy

18
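I need a Python function that returns N random numbers from a skew normal distribution, with the skewness given as a parameter.
For example, I currently use:
```x = numpy.random.randn(1000)```
The ideal function would be:
```x = randn_skew(1000, skew=0.7)```
The solution needs to work with Python 2.7 and numpy 1.9.
A similar answer is here: skew normal distribution in scipy, but that generates the probability density function rather than random numbers.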

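You have made a request, but this is a Q&A site, so what is your question? We will help you with problems you run into while coding, but we won't write the code for you. - Tadhg McDonald-Jensen
Do you want to generate random numbers that follow a particular distribution? - Srivatsan
3 Answers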

30

I started by generating the PDF curves as a reference:

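import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats

NUM_SAMPLES = 100000
SKEW_PARAMS = [-3, 0]

def skew_norm_pdf(x,e=0,w=1,a=0):
    # adapted from:
    # https://dev59.com/l2025IYBdhLWcg3wpXod
    # e = location, w = scale, a = shape parameter (alpha)
    t = (x-e) / w
    # density is (2/w) * phi(t) * Phi(a*t); dividing by w keeps the area at 1
    return 2.0 / w * stats.norm.pdf(t) * stats.norm.cdf(a*t)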

# generate the skew normal PDF for reference:
location = 0.0
scale = 1.0
x = np.linspace(-5,5,100) 

plt.subplots(figsize=(12,4))
for alpha_skew in SKEW_PARAMS:
    p = skew_norm_pdf(x,location,scale,alpha_skew)
    # n.b. note that alpha is a parameter that controls skew, but the 'skewness'
    # as measured will be different. see the wikipedia page:
    # https://en.wikipedia.org/wiki/Skew_normal_distribution
    plt.plot(x,p)

PDFs of the skew normal distribution
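A quick way to sanity-check a density function like this is to confirm that it integrates to 1 for several scales, which only holds when the prefactor is 2/w. The following is a minimal sketch under stated assumptions: the helper names (`norm_pdf`, `norm_cdf`, `skew_norm_pdf_check`, `pdf_area`) are made up for this illustration, and the normal pdf/cdf are written with `math.erf` so the check needs only numpy.

```python
import math
import numpy as np

def norm_pdf(t):
    # Standard normal density.
    return np.exp(-0.5 * t**2) / np.sqrt(2.0 * np.pi)

def norm_cdf(t):
    # Standard normal CDF via the error function (vectorised over arrays).
    erf = np.vectorize(math.erf)
    return 0.5 * (1.0 + erf(t / np.sqrt(2.0)))

def skew_norm_pdf_check(x, e=0.0, w=1.0, a=0.0):
    # Skew normal density with location e, scale w, shape a,
    # using the 2/w prefactor.
    t = (x - e) / w
    return 2.0 / w * norm_pdf(t) * norm_cdf(a * t)

def pdf_area(e, w, a, half_width=12.0, n=20001):
    # Riemann-sum approximation of the integral over e +/- half_width * w.
    x = np.linspace(e - half_width * w, e + half_width * w, n)
    dx = x[1] - x[0]
    return float(np.sum(skew_norm_pdf_check(x, e, w, a)) * dx)

# Area under the curve for a few scale/shape combinations.
areas = [pdf_area(0.0, w, a) for w in (0.5, 1.0, 2.0) for a in (-3, 0, 3)]
print(areas)
```

Each printed area should be very close to 1, whatever the scale. With a prefactor of 2*w instead, the area would only equal 1 when w is 1.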

Next, I found a VB implementation of random sampling from a skew normal distribution and converted it to Python:

# literal adaption from:
# http://stackoverflow.com/questions/4643285/how-to-generate-random-numbers-that-follow-skew-normal-distribution-in-matlab
# original at:
# http://www.ozgrid.com/forum/showthread.php?t=108175
def rand_skew_norm(fAlpha, fLocation, fScale):
    sigma = fAlpha / np.sqrt(1.0 + fAlpha**2) 

    afRN = np.random.randn(2)
    u0 = afRN[0]
    v = afRN[1]
    u1 = sigma*u0 + np.sqrt(1.0 -sigma**2) * v 

    if u0 >= 0:
        return u1*fScale + fLocation 
    return (-u1)*fScale + fLocation 

def randn_skew(N, skew=0.0):
    return [rand_skew_norm(skew, 0, 1) for x in range(N)]

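# let's check they at least visually match the PDF:
# (sns.distplot is deprecated in recent seaborn releases;
# sns.histplot(p, kde=True) is a close replacement there)
plt.subplots(figsize=(12,4))
for alpha_skew in SKEW_PARAMS:
    p = randn_skew(NUM_SAMPLES, alpha_skew)
    sns.distplot(p)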

Histograms generated from the skew normal distributions

Then I wrote a fast version (not extensively tested) that appears to be correct:

def randn_skew_fast(N, alpha=0.0, loc=0.0, scale=1.0):
    sigma = alpha / np.sqrt(1.0 + alpha**2) 
    u0 = np.random.randn(N)
    v = np.random.randn(N)
    u1 = (sigma*u0 + np.sqrt(1.0 - sigma**2)*v) * scale
    u1[u0 < 0] *= -1
    u1 = u1 + loc
    return u1

# let's check again
plt.subplots(figsize=(12,4))
for alpha_skew in SKEW_PARAMS:
    p = randn_skew_fast(NUM_SAMPLES, alpha_skew)
    sns.distplot(p)

Histograms from skew normal distributions as generated by the faster method
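As the code comment earlier notes, alpha controls the skew but is not the measured skewness itself. The sketch below uses the standard skew normal moment formula (see the Wikipedia page linked above) to convert alpha into the theoretical skewness and compares it with the skewness of drawn samples. The function names here are illustrative and not part of the original answer; the sampler repeats the same construction as `randn_skew_fast`, only with a seeded `RandomState` so the run is reproducible.

```python
import numpy as np

def skewness_from_alpha(alpha):
    # Theoretical skewness of a skew normal with shape parameter alpha:
    # delta = alpha / sqrt(1 + alpha^2), m = delta * sqrt(2/pi),
    # skewness = (4 - pi)/2 * m^3 / (1 - m^2)^(3/2)
    delta = alpha / np.sqrt(1.0 + alpha**2)
    m = delta * np.sqrt(2.0 / np.pi)
    return (4.0 - np.pi) / 2.0 * m**3 / (1.0 - m**2) ** 1.5

def sample_skewness(x):
    # Third standardised moment of the sample.
    x = np.asarray(x, dtype=float)
    c = x - x.mean()
    return float(np.mean(c**3) / np.mean(c**2) ** 1.5)

def draw_skew_normal(n, alpha, rng):
    # Same construction as randn_skew_fast (loc=0, scale=1),
    # drawing from the supplied RandomState.
    delta = alpha / np.sqrt(1.0 + alpha**2)
    u0 = rng.randn(n)
    v = rng.randn(n)
    u1 = delta * u0 + np.sqrt(1.0 - delta**2) * v
    u1[u0 < 0] *= -1
    return u1

rng = np.random.RandomState(0)
for alpha in (-3.0, 0.0, 3.0):
    samples = draw_skew_normal(200000, alpha, rng)
    print(alpha, skewness_from_alpha(alpha), sample_skewness(samples))
```

The formula stays below 1 in absolute value however large alpha becomes, so a call such as `randn_skew(1000, skew=0.7)` needs a fairly large alpha if `skew` is meant as the measured skewness rather than the shape parameter.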


16

This is fine, but the original question was specifically about numpy. - jamesj629

3

This is adapted from the rsnorm function in the fGarch R package.

from __future__ import division  # so 1/xi is true division under Python 2.7
import numpy

def random_snorm(n, mean = 0, sd = 1, xi = 1.5):
    def random_snorm_aux(n, xi):
        weight = xi/(xi + 1/xi)
        z = numpy.random.uniform(-weight,1-weight,n)
        xi_ = xi**numpy.sign(z)
        random = -numpy.absolute(numpy.random.normal(0,1,n))/xi_ * numpy.sign(z)
        m1 = 2/numpy.sqrt(2 * numpy.pi)
        mu = m1 * (xi - 1/xi)
        sigma = numpy.sqrt((1 - m1**2) * (xi**2 + 1/xi**2) + 2 * m1**2 - 1)
        return (random - mu)/sigma

    return random_snorm_aux(n, xi) * sd + mean

When I try p = random_snorm(n, 5.61594709, 3.73888096, 1.62537967) and then x = linspace(-10, 100, n); plt.plot(x, p), I just get noise. So how can I check the result? - Hakaishin
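On the question in the comment above: the returned values are independent draws in random order, so plotting them as a line against `linspace` will look like noise. One way to check them is to compare the sample mean and standard deviation with the requested `mean` and `sd`, and to look at a histogram instead. The following is a minimal sketch, assuming a compact copy of `random_snorm` with an extra `rng` argument (a hypothetical addition, not in the original answer) so the run can be seeded.

```python
import numpy as np

def random_snorm(n, mean=0.0, sd=1.0, xi=1.5, rng=np.random):
    # Compact copy of the function above; rng is a hypothetical extra
    # argument used here only to make the example reproducible.
    xi = float(xi)
    weight = xi / (xi + 1.0 / xi)
    z = rng.uniform(-weight, 1.0 - weight, n)
    xi_ = xi ** np.sign(z)
    draws = -np.absolute(rng.normal(0.0, 1.0, n)) / xi_ * np.sign(z)
    m1 = 2.0 / np.sqrt(2.0 * np.pi)
    mu = m1 * (xi - 1.0 / xi)
    sigma = np.sqrt((1.0 - m1**2) * (xi**2 + 1.0 / xi**2) + 2.0 * m1**2 - 1.0)
    # Standardise, then apply the requested mean and sd.
    return (draws - mu) / sigma * sd + mean

rng = np.random.RandomState(42)
p = random_snorm(200000, 5.61594709, 3.73888096, 1.62537967, rng=rng)

# The sample moments should be close to the requested mean and sd.
print("sample mean: %.3f" % p.mean())
print("sample sd:   %.3f" % p.std())

# Bin the samples; plotting density against centers shows the shape.
density, edges = np.histogram(p, bins=60, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
```

With matplotlib available, `plt.plot(centers, density)` or simply `plt.hist(p, bins=60)` then shows the skewed bell shape rather than noise.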
