用于求解高斯积分的Metropolis-Hasting算法实现

4

我目前在实现Metropolis-Hastings算法时遇到了问题。

我尝试使用该算法来计算形如以下公式的积分:

formula

在使用该算法时,我们可以获得一长串配置(在这种情况下,每个配置只是一个数字),以致于在链的末尾,拥有特定配置的概率将遵循(或趋向于)高斯分布。

我的代码似乎在获取所述高斯分布方面出现了问题。转移概率(选择新候选配置的概率,取决于链中先前的配置)存在奇怪的依赖关系。然而,如果这个转移概率是对称的,它根本不应该影响这个函数(它仅影响探索潜在配置空间的速度以及链收敛到所需分布的速度)!

在我的情况下,我使用了一个正常的分布转移函数(它满足对称性的要求),宽度为d。 对于每个我使用的d,我确实获得了高斯分布,但标准差sigma取决于我选择的d。 结果的高斯分布应该大约有0.701的标准差,但我发现实际得到的值取决于参数d,而不应该这样。

我不确定代码中的错误在哪里,希望能得到任何帮助!

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm


'''
We want to get an exponential decay integral approx using importance sampling.
We will try to integrate x^2exp(-x^2) over the real line.
Metropolis-hasting alg will generate configuartions (in this case, single numbers) such that 
the probablity of a given configuration x^a ~ p(x^a) for p(x) propto exp(-x^2).

Once configs  = {x^a} generated, the apporximation, Q_N, of the integral, I, will be given by 
Q_N = 1/N sum_(configs) x^2
lim (N-> inf) Q_N -> I
'''

'''
Implementing metropolis-hasting algorithm
'''

#Setting up the initial config for our chain, generating first 2 to generate numpy array
x_0 = np.random.uniform(-20,-10,2)

#Defining function that generates the next N steps in the chain, given a starting config x
#Works by iteratively taking the last element in the chain, generating a new candidate configuration from it and accepting/rejecting according to the algorithm
#Success and failures implemented to see roughly the success rate of each step
def next_steps(x,N):
    i = 0
    Success = 0
    Failures = 0
    Data = np.array(x)
    d = 1.5                       #Spread of (normal) transition function
    while i < N:
        r = np.random.uniform(0,1)
        delta = np.random.normal(0,d)
        x_old = Data[-1]
        x_new = x_old + delta
        hasting_ratio = np.exp(-(x_new**2-x_old**2) )
        if hasting_ratio > r:
            i = i+1
            Data = np.append(Data,x_new)
            Success = Success +1
        else:
            Failures = Failures + 1
    print(Success)
    print(Failures)
    return Data

#Number of steps in the chain
N_iteration   = 50000

#Generating the data
Data = next_steps(x_0,N_iteration)

#Plotting data to see convergence of chain to gaussian distribution
plt.plot(Data)
plt.show()

#Obtaining tail end data and obtaining the standard deviation of resulting gaussian distribution
Data = Data[-40000:]
(mu, sigma) = norm.fit(Data)
print(sigma)


#Plotting a histogram to visually see if guassian
plt.hist(Data, bins = 300)
plt.show()

正如David Eisenstat所建议的那样,您始终保存x_{t+1},它将包含x_t或x_proposed(x_new)的值。另一个基准实现可以在这里找到。 - Tim
1个回答

2

即使x没有发生变化,你也需要保存它。否则,中心值会被低估,而随着d值的增加,这种情况会更加明显,从而增加方差。

import numpy as np
from scipy.stats import norm


"""
We want to get an exponential decay integral approx using importance sampling.
We will try to integrate x^2exp(-x^2) over the real line.
Metropolis-hasting alg will generate configuartions (in this case, single numbers) such that 
the probablity of a given configuration x^a ~ p(x^a) for p(x) propto exp(-x^2).

Once configs  = {x^a} generated, the apporximation, Q_N, of the integral, I, will be given by 
Q_N = 1/N sum_(configs) x^2
lim (N-> inf) Q_N -> I
"""

"""
Implementing metropolis-hasting algorithm
"""

# Setting up the initial config for our chain
x_0 = np.random.uniform(-20, -10)

# Defining function that generates the next N steps in the chain, given a starting config x
# Works by iteratively taking the last element in the chain, generating a new candidate configuration from it and accepting/rejecting according to the algorithm
# Success and failures implemented to see roughly the success rate of each step
def next_steps(x, N):
    Success = 0
    Failures = 0
    Data = np.empty((N,))
    d = 1.5  # Spread of (normal) transition function
    for i in range(N):
        r = np.random.uniform(0, 1)
        delta = np.random.normal(0, d)
        x_new = x + delta
        hasting_ratio = np.exp(-(x_new ** 2 - x ** 2))
        if hasting_ratio > r:
            x = x_new
            Success = Success + 1
        else:
            Failures = Failures + 1
        Data[i] = x
    print(Success)
    print(Failures)
    return Data


# Number of steps in the chain
N_iteration = 50000

# Generating the data
Data = next_steps(x_0, N_iteration)

# Obtaining tail end data and obtaining the standard deviation of resulting gaussian distribution
Data = Data[-40000:]
(mu, sigma) = norm.fit(Data)
print(sigma)

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接