Generating random numbers in a range with different probabilities

Question

Tags: algorithm, math, random, probability

8

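How can I generate a random number between A = 1 and B = 10 where each number has a different probability?

Example: number / probability

1 - 20%

2 - 20%

3 - 10%

4 - 5%

5 - 5%

...and so on.

I'm aware of some hard-coded workarounds, but they are of no help for larger ranges, for example A = 1000 and B = 100000.

Assume we have a

    Rand()

method that returns a random number R with 0 < R < 1. Can anyone post a code sample with a proper way of doing this? Preferably in C# / Java / ActionScript.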

- Dan Dinu

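Could you describe how you plan to specify the probabilities associated with the 99001 values in the range [1000, 100000]? Just to get an idea of what you expect. Also, what are your storage and time constraints? - rici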

7 Answers

5

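There is an elegant algorithm that Knuth attributes to A. J. Walker (Electronics Letters 10, 8 (1974), 127-128; ACM Trans. Math Software 3 (1977), 253-256).

The idea is that if you have a total of k * n balls in n different colors, you can distribute the balls into n containers such that container i holds balls of color i and of at most one other color. The proof is by induction on n. For the induction step, pick the color with the fewest balls: it has at most k balls, so they fit into one container, which is then topped up with balls of the most common color. That leaves k * (n - 1) balls in n - 1 colors.

In your example n = 10. Multiply the probabilities by a suitable m so that they all become integers. So perhaps m = 100, and you have 20 balls of color 0, 20 balls of color 1, 10 balls of color 2, 5 balls of color 3, and so on. So k = 10.

Now generate a table of dimension n in which each entry holds a probability (the fraction of the balls in container i that have color i) and the other color.

To generate a random ball, generate a random floating-point number r in the range [0, n). Let i be the integer part (floor of r) and x the excess (r - i).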

if (x < table[i].probability) output i
else output table[i].other

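The advantage of this algorithm is that each random ball costs only a single comparison.

Let me work out an example (the same one Knuth uses).

Consider simulating the throw of a pair of dice.

So P(2) = 1/36, P(3) = 2/36, P(4) = 3/36, P(5) = 4/36, P(6) = 5/36, P(7) = 6/36, P(8) = 5/36, P(9) = 4/36, P(10) = 3/36, P(11) = 2/36, P(12) = 1/36.

Multiplying by 36 * 11 gives 396 balls: 11 of color 2, 22 of color 3, 33 of color 4, ..., 11 of color 12. We have k = 396 / 11 = 36.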

table[2] = (11/36, color 4)

table[12] = (11/36, color 10)

table[3] = (22/36, color 5)

table[11] = (22/36, color 5)

table[4] = (8/36, color 9)

table[10] = (8/36, color 6)

table[5] = (16/36, color 6)

table[9] = (16/36, color 8)

table[6] = (7/36, color 8)

table[8] = (6/36, color 7)

table[7] = (36/36, color 7)
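
The table construction and the one-comparison draw can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `build_alias_table` and `draw` are made up here, the weights are assumed to be non-negative integers, and the pairing order is the common "small bin / large bin" variant (often credited to Vose) rather than the "fewest balls first" order used in the proof, so the resulting table may differ from the one listed above while describing the same distribution. Index i stands for the dice sum i + 2.

```python
import random
from fractions import Fraction

def build_alias_table(weights):
    """Build (prob, alias) lists for the alias method from integer weights."""
    n = len(weights)
    total = sum(weights)
    # Scale so that the average bin is exactly 1; fractions avoid round-off.
    scaled = [Fraction(w * n, total) for w in weights]
    prob = [Fraction(1)] * n      # share of bin i that keeps outcome i
    alias = list(range(n))        # the one other outcome stored in bin i
    small = [i for i in range(n) if scaled[i] < 1]
    large = [i for i in range(n) if scaled[i] >= 1]
    while small and large:
        s, l = small.pop(), large.pop()
        prob[s], alias[s] = scaled[s], l
        scaled[l] -= 1 - scaled[s]            # l fills the rest of bin s
        (small if scaled[l] < 1 else large).append(l)
    return prob, alias

def draw(prob, alias, rng=random):
    """Draw one outcome index using a single comparison."""
    r = rng.random() * len(prob)   # uniform in [0, n)
    i = int(r)                     # which bin
    x = r - i                      # position inside the bin
    return i if x < prob[i] else alias[i]

# Two dice: index 0 is the sum 2, index 10 is the sum 12.
prob, alias = build_alias_table([1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1])
print(draw(prob, alias) + 2)
```

Building the table takes time linear in n, after which every draw costs one random number and one comparison regardless of n.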

- user515430

2

Assuming you have a function p(n) that gives the desired probability of the number n:

r = rand()  // a random number between 0 and 1
for i in A to B do
    if r < p(i) 
      return i
    r = r - p(i)    
done

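A faster way is to create an array of (B - A) * 100 elements and populate it with the numbers from A to B so that the ratio of each number's count to the array size equals its probability. You can then generate a uniform random number as an index into the array and read your random number directly from it.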
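
The subtraction loop in this answer could look like this in Python. It is a minimal sketch, not a tuned implementation: the name `sample_by_subtraction`, the optional `r` argument (added so a draw can be reproduced) and the three-value distribution are assumptions made for illustration.

```python
import random

def sample_by_subtraction(a, b, p, r=None):
    """Return i in [a, b] with probability p(i) by walking down the range."""
    if r is None:
        r = random.random()        # uniform in [0, 1)
    for i in range(a, b + 1):
        if r < p(i):
            return i
        r -= p(i)                  # move on to the next interval
    return b                       # guard against floating-point round-off

# Hypothetical distribution: 1 and 2 at 20% each, 3 at 60%.
table = {1: 0.2, 2: 0.2, 3: 0.6}
print(sample_by_subtraction(1, 3, table.get))
```

With r = 0.3 the first test fails, r drops to about 0.1, and the function returns 2, which matches the 0.2 < r < 0.4 case discussed in the comments below. Each draw costs time proportional to B - A in the worst case.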

- perreal

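You probably meant `if r > p(i)`. But that isn't correct either: if 1 and 2 both have a 20% probability, you will always return one of them and never the other. - SomeWittyUsername

Actually, when 0.2 < r < 0.4 you get 2. - perreal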

1

Map your uniform random result to the required outputs according to the probabilities.

E.g., for your example:

If `0 <= Rand() <= 0.2`: result = 1.
If `0.2 < Rand() <= 0.4`: result = 2.
If `0.4 < Rand() <= 0.5`: result = 3.
If `0.5 < Rand() <= 0.55`: result = 4.
If `0.55 < Rand() <= 0.6`: result = 5.
...

- SomeWittyUsername

1

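Here is an implementation of Knuth's algorithm. As discussed in some of the other answers, it works by: 1) creating a table of cumulative frequencies, 2) generating a random integer, 3) rounding it up using the ceiling function, 4) finding the "cumulative" range in which the random number falls and outputting the corresponding entry of the original array.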
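
As a rough illustration of these four steps (not the implementation the answer refers to), here is a Python sketch. The names `cumulative_table`, `pick` and `draw` are invented for this example, the frequencies are assumed to be positive integers, and the range lookup uses a binary search from the standard `bisect` module, which the answer itself does not specify.

```python
import bisect
import itertools
import math
import random

def cumulative_table(frequencies):
    """Step 1: running totals, e.g. [50, 30, 20] -> [50, 80, 100]."""
    return list(itertools.accumulate(frequencies))

def pick(values, cumulative, r):
    """Step 4: return the value whose cumulative range contains r (1..total)."""
    return values[bisect.bisect_left(cumulative, r)]

def draw(values, cumulative, rng=random):
    total = cumulative[-1]
    # Steps 2 and 3: 1 - random() lies in (0, 1], so the ceiling of the
    # scaled number is an integer in 1..total.
    r = math.ceil((1.0 - rng.random()) * total)
    return pick(values, cumulative, r)

values = [1, 2, 3]
cumulative = cumulative_table([50, 30, 20])
print(draw(values, cumulative))
```

The table needs one entry per distinct value, and the binary search makes each draw logarithmic in the number of values instead of linear.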

- DivyaJyoti Rajdev

0

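Inverse transform

In probability theory, the cumulative distribution function F(x) returns the probability that a randomly drawn value X is less than or equal to some given value x. In this case, for example, F(4) gives 0.55, because the running sum of the probabilities in your example is {.2, .4, .5, .55, .6, .65, ....}. That is, the probability of randomly getting a value less than or equal to 4 is 0.55. What I actually want to know, however, is the inverse of the cumulative distribution function, call it F_inv: given a cumulative probability, what is the value of x? I want to pass in F_inv(0.55) and get back 4. This is why it is called the inverse transform method.

So in the inverse transform method, we are basically trying to find the interval of the cumulative distribution in which a uniform random number from (0, 1) falls. This is what the algorithms posted by perreal and icepack do. Here is another way to state it in terms of the cumulative distribution function: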

Generate a random number U
for x in A .. B
   if U <= F(x) then return x

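Note that if the smaller probabilities come at the beginning of the distribution, it may be more efficient to loop from B down to A instead, returning the first x for which U > F(x - 1).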
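
Written out, with p(k) denoting the probability of the value k (a symbol introduced here for illustration), the two functions used in this answer are:

$$
F(x) = \sum_{k=A}^{x} p(k), \qquad
F^{-1}(u) = \min\{\, x \in \{A, \dots, B\} : u \le F(x) \,\}
$$

For the probabilities in the question, F(1) = 0.2, F(2) = 0.4 and F(3) = 0.5. A uniform draw of u = 0.45 satisfies F(2) < u \le F(3), so F^{-1}(0.45) = 3, and this happens with probability F(3) - F(2) = 0.1, as required.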

- frankc

0

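Consider, say, the random numbers 1, 2, 3 with probabilities of 50%, 30% and 20%. I find it easiest to think about the problem the way Jim Mischel suggests:

"Build an array of 100 integers and populate it with 50 ones, 30 twos and 20 threes. Then just randomly pick an item from the array."

...but it may be more efficient to implement this with index windows (that is, index ranges) than to allocate and initialize an array.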

#include <assert.h>
#include <stdlib.h> // srand, rand
#include <time.h>   // time

#include <iostream>

// g++ algorithm.cpp -g -o algorithm

int main()
{
    // random number range: 1, 2, 3
    int lowest = 1;
    int highest = 3;

    // probabilities: 50%, 30%, 20%
    int probs[] = { 50, 30, 20 };

    /*

    "Build an array of 100 integers and populate it with 50 ones, 30 twos and 20 threes.
    Then just randomly pick an item from the array."

    */

    const int prob_count = sizeof(probs)/sizeof(probs[0]);

    // must have as many random number possibilities as there are probabilities
    assert(highest - lowest + 1 == prob_count);

    const int random_min = 0;
    const int random_max = 99;

    srand(time(NULL)); // seed
    const int continuous_index = random_min + rand() % ((random_max + 1) - random_min);
    
    std::cout << "continuous index: " << continuous_index << std::endl;

    int window_index = -1;

    /*

    window index 0 (50%): continuous indexes 0 - 49
    window index 1 (30%): continuous indexes 50 - 79
    window index 2 (20%): continuous indexes 80 - 99

    */

    for (int i = 0, index = 0; i < prob_count; i++)
    {
        const int prob = probs[i];
        assert(prob > 0 && prob < 100); // more likely a mistake than an intended arg

        const int continuous_index_from = index;
        const int continuous_index_to = continuous_index_from + prob - 1;

        std::cout << "window index " << i << " (" << prob << "%): continuous indexes " << continuous_index_from << " - " << continuous_index_to << std::endl;

        if (continuous_index >= continuous_index_from && continuous_index <= continuous_index_to)
        {
            window_index = i;
            // no break here: keep looping so every window is printed and the final sum check runs
        }

        if (!(i + 1 < prob_count)) // the last iteration
            assert(continuous_index_to == 99); // sanity: probabilities must add up to 100

        index = continuous_index_to + 1;
    }

    assert(window_index >= 0 && window_index < prob_count);
    std::cout << "window index: " << window_index << std::endl;

    const int random = lowest + window_index;

    assert(random >= lowest && random <= highest);
    std::cout << "random number: " << random << std::endl;
}

Sample output:

continuous index: 89
window index 0 (50%): continuous indexes 0 - 49
window index 1 (30%): continuous indexes 50 - 79
window index 2 (20%): continuous indexes 80 - 99
window index: 2
random number: 3

- Gary


- Jim Mischel · Accepted Answer

Build an array of 100 integers and populate it with 20 1's, 20 2's, 10 3's, 5 4's, 5 5's, etc. Then just randomly pick an item from the array.

int[] numbers = new int[100];
// populate the first 20 with the value '1'
for (int i = 0; i < 20; ++i)
{
    numbers[i] = 1;
}
// populate the rest of the array as desired.

// To get an item:
// Since your Rand() function returns 0 < R < 1
int ix = (int)(Rand() * 100);
int num = numbers[ix];

This works well if the number of items is small and you don't need fine precision. That is, if you want 4.375% 7's, you will need a much larger array.
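
To see how the array size depends on the precision of the probabilities, here is a small Python sketch. It assumes the probabilities are given as exact fractions that sum to 1; the helper name `build_lookup` and the sample distributions are made up for illustration. The array length is taken as the least common multiple of the denominators, so every value gets a whole number of slots.

```python
import random
from fractions import Fraction
from functools import reduce
from math import gcd

def build_lookup(probabilities):
    """probabilities: dict mapping value -> Fraction, summing to exactly 1."""
    assert sum(probabilities.values()) == 1
    denominators = (p.denominator for p in probabilities.values())
    # Smallest array length that gives every value a whole number of slots.
    size = reduce(lambda a, b: a * b // gcd(a, b), denominators, 1)
    table = []
    for value, p in probabilities.items():
        table.extend([value] * int(p * size))
    return table

table = build_lookup({1: Fraction(1, 2), 2: Fraction(3, 10), 3: Fraction(1, 5)})
print(len(table), random.choice(table))
```

For 50% / 30% / 20% ten slots are enough, while a value with probability 4.375% (which is 7/160) forces at least 160 slots, so memory grows quickly as the probabilities get finer.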