单元测试概率

Question

单元测试概率

3

我有一个方法，可以在给定的次数 (math.random * x) 内创建2个不同的实例(M，N)。该方法将创建对象M和剩余次数的对象N。我已经编写了单元测试，并模拟了随机数字，以确保该方法的行为符合预期。但是，我不确定如何（或是否）测试概率是否准确。例如，如果 x = 0.1，我期望有10次中的1次返回实例M。我该如何测试这个功能？

- Eran

2个回答

1

分割测试。第一项测试应该允许您定义随机数生成器返回的内容（我假设您已经有了这个）。这部分测试只是满足"如果随机数生成器返回某个值，我是否获得了预期结果"。

第二项测试应该只是使用一些统计分析函数（比如计算它返回每个值的频率）运行随机数生成器。

我建议用一个包装器来包装真实的生成器，返回"create M"和"create N"（或者可能只是0和1）。这样，您就可以将实现与使用它的地方（创建两个不同实例的代码）分开，后者不需要知道如何初始化生成器以及如何将实际结果转换为"create X"。

- Aaron Digulla

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Russia Must Remove Putin · Accepted Answer

我将用Python编写这个功能。首先描述您的功能：

def binomial_process(x):
    '''
    given a probability, x, return M with that probability, 
    else return N with probability 1-x 
    maybe: return random.random() > x
    '''

然后测试这个功能：

import random
def binom(x):
    return random.random() > x

然后编写测试函数，首先是设置函数，将数据从昂贵的过程中组合起来：

def setUp(x, n):
    counter = dict()
    for _ in range(n):
        result = binom(x)
        counter[result] = counter.get(result, 0) + 1
    return counter

然后是实际测试：

import scipy.stats
trials = 1000000


def test_binomial_process():

    ps = (.01, .1, .33, .5, .66, .9, .99)
    x_01 = setUp(.01, trials)
    x_1 = setUp(.1, trials)
    x_33 = setUp(.1, trials)
    x_5 = setUp(.5, trials)
    x_66 = setUp(.9, trials)
    x_9 = setUp(.9, trials)
    x_99 = setUp(.99, trials)
    x_01_result = scipy.stats.binom_test(x_01.get(True, 0), trials, .01)
    x_1_result = scipy.stats.binom_test(x_1.get(True, 0), trials, .1)
    x_33_result = scipy.stats.binom_test(x_33.get(True, 0), trials, .33)
    x_5_result = scipy.stats.binom_test(x_5.get(True, 0), trials)
    x_66_result = scipy.stats.binom_test(x_66.get(True, 0), trials, .66)
    x_9_result = scipy.stats.binom_test(x_9.get(True, 0), trials, .9)
    x_99_result = scipy.stats.binom_test(x_99.get(True, 0), trials, .99)
    setups = (x_01, x_1, x_33, x_5, x_66,  x_9, x_99)
    results = (x_01_result, x_1_result, x_33_result, x_5_result,
               x_66_result, x_9_result, x_99_result)
    print 'can reject the hypothesis that the following tests are NOT the'
    print 'results of a binomial process (with their given respective'
    print 'probabilities) with probability < .01, {0} trials each'.format(trials)
    for p, setup, result in zip(ps, setups, results):
        print 'p = {0}'.format(p), setup, result, 'reject null' if result < .01 else 'fail to reject'

然后编写你的函数（好的，我们已经写好了）：

def binom(x):
    return random.random() > x

运行您的测试：

test_binomial_process()

最终输出给我：

can reject the hypothesis that the following tests are NOT the
results of a binomial process (with their given respective
probabilities) with probability < .01, 1000000 trials each
p = 0.01 {False: 10084, True: 989916} 4.94065645841e-324 reject null
p = 0.1 {False: 100524, True: 899476} 1.48219693752e-323 reject null
p = 0.33 {False: 100633, True: 899367} 2.96439387505e-323 reject null
p = 0.5 {False: 500369, True: 499631} 0.461122365668 fail to reject
p = 0.66 {False: 900144, True: 99856} 2.96439387505e-323 reject null
p = 0.9 {False: 899988, True: 100012} 1.48219693752e-323 reject null
p = 0.99 {False: 989950, True: 10050} 4.94065645841e-324 reject null

为什么我们无法在p=0.5时拒绝?我们来看一下scipy.stats.binom_test的帮助文档:

Help on function binom_test in module scipy.stats.morestats:

binom_test(x, n=None, p=0.5, alternative='two-sided')
    Perform a test that the probability of success is p.

    This is an exact, two-sided test of the null hypothesis
    that the probability of success in a Bernoulli experiment
    is `p`.

    Parameters
    ----------
    x : integer or array_like
        the number of successes, or if x has length 2, it is the
        number of successes and the number of failures.
    n : integer
        the number of trials.  This is ignored if x gives both the
        number of successes and failures
    p : float, optional
        The hypothesized probability of success.  0 <= p <= 1. The
        default value is p = 0.5
    alternative : {'two-sided', 'greater', 'less'}, optional
        Indicates the alternative hypothesis. The default value is
        'two-sided'.

所以，.5是测试的默认零假设，不拒绝这个零假设在这种情况下是有道理的。