哪个更快？正则表达式还是EndsWith？

Question

哪个更快？正则表达式还是EndsWith？

26

哪个更快？

public String Roll()
{
    Random rnd = new Random();
    int roll = rnd.Next(1, 100000);
    if (Regex.IsMatch(roll.ToString(), @"(.)\1{1,}$"))
    {
        return "doubles";
    }
    return "none";
}

或者

public String Roll()
{
    Random rnd = new Random();
    int roll = rnd.Next(1, 100000);
    if (roll.ToString().EndsWith("11") || roll.ToString().EndsWith("22")  || roll.ToString().EndsWith("33")  || roll.ToString().EndsWith("44")  || roll.ToString().EndsWith("55")  || roll.ToString().EndsWith("66")  || roll.ToString().EndsWith("77")  || roll.ToString().EndsWith("88")  || roll.ToString().EndsWith("99")  || roll.ToString().EndsWith("00"))
    {
        return "doubles";
    }
    return "none";
}

我目前正在使用一个非常长的if语句列表，其中包含正则表达式，用于检查一个整数是否以双、三、四、五等数字结尾...所以在我改变所有这些代码之前，我想知道哪个更快。

- Dysanix Official

20

简介它。https://ericlippert.com/2012/12/17/performance-rant/ - Rob

1

您IP地址为143.198.54.68，由于运营成本限制，当前对于免费用户的使用频率限制为每个IP每72小时10次对话，如需解除限制，请点击左下角设置图标按钮（手机用户先点击左上角菜单按钮）。 - Alexei Levenkov

3

你不必给我点踩吧？我有很多 if 语句，我想在改动之前先询问一些有经验的人，这样会更快。 - Dysanix Official

2

@Alexei Levenkov，没有研究的问题？我已经谷歌了很久，寻找更好的方法来做这件事，但总是涉及到正则表达式和字符串比较。如果你知道更好的方法，请分享出来，而不是表现得比我高明，让我陷入不知情的境地。 - Dysanix Official

1

啊，是的，性能测试和基准比较在我看来总是很有趣的。 ;) 我对文本处理的性能尤其感兴趣（正如您可以在我的旧帖子中看到的那样 - 尽管没有像您的帖子那样受到好评）。所以是的，我也为你感到高兴！ :) - Ian

显示剩余4条评论

7个回答

13

@StianStandahls的朋友在这里。这个解决方案是最快的！它与@Ian答案中先前最快的示例相同，但随机生成器在这里进行了优化。

private const string CONST_DOUBLES = "doubles";
private const string CONST_NONE = "none";
public string FastestOptimizedModded()
{
    return (rnd.Next(99999)+1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE;
}

- Gjermund Grøneng

经过内部比赛，这是最快的答案。 :) - Stian Standahl

2

我的天啊............!!! +1 真不敢相信居然还有更快的解决方案！今天我不能再测试答案了，但明天我一定会将这个答案包括在测试中！当然，附上你的名字。就像其他人一样 :) 顺便说一下很高兴认识你... :) - Ian

根据源代码，很明显这个解决方案应该是最快的。https://github.com/dotnet/corefx/blob/master/src/System.Runtime.Extensions/src/System/Random.cs - Stian Standahl

1

我测试了它（在不同的笔记本电脑上，但都一起测试），它显着增加了时间。恭喜！;) - Ian

1

好的回答。我一定会在我的即将到来的模块中使用它。我对这个正则表达式现象有点担心。这会帮助我很多。 - Mukesh Adhvaryu

5

关于最佳表现，我相信@Ian已经很好地解决了这个问题。所有的功劳都归于他。

在问答中有一件事情没有让我满意的是为什么正则表达式在这种情况下优于EndsWith。我感到有必要解释一下区别，让人们意识到哪种解决方案可能在哪种情况下更有效。

EndsWith

EndsWith功能基本上是对字符串的一部分进行连续顺序比较。类似于这样：

bool EndsWith(string haystack, string needle)
{
    bool equal = haystack.Length >= needle.Length;
    for (int i=0; i<needle.Length && equal; ++i)
    {
        equal = s[i] == needle[needle.Length - haystack.Length + i];
    }
    return equal;
}

这段代码非常简单，我们只需要取出第一个字符并进行匹配，接着取下一个字符，以此类推，直到字符串末尾。 正则表达式 正则表达式的工作方式不同。考虑在一个非常大的字符串里查找“foofoo”这个词。显而易见的实现方法是从第一个字符开始检查是否为“f”，然后移动到下一个字符，以此类推，直到字符串末尾。但是，我们可以做得更好：

仔细观察任务。如果我们首先查看字符串的第5个字符，并注意到它不是“o”（最后一个字符），我们可以立即跳过到第11个字符，并再次检查是否为“o”。这样，在最好的情况下，我们可以获得比原始代码快6倍的性能提升，在最坏的情况下，性能相同。

还要注意的是，如果我们只需要查看尾随字符，则使用“或”、“与”等更复杂的正则表达式就不再有太多意义了。

这就是为什么正则表达式通常使用编译为DFA的NFA。这里有一个很棒的在线工具：http://hackingoff.com/compilers/regular-expression-to-nfa-dfa，展示了这个过程是怎样的（对于简单的正则表达式）。

在内部，您可以使用Reflection.Emit来要求.NET编译正则表达式，并且当您使用正则表达式时，实际上会评估这个经过优化、已编译的状态机（RegexOptions.Compiled）。

最终，您可能会得到类似以下的代码：

bool Matches(string haystack)
{
    char _1;
    int index = 0;

    // match (.)
state0:
    if (index >= haystack.Length) 
    {
        goto stateFail;
    }
    _1 = haystack[index]; 
    state = 1;
    ++index;
    goto state1;

    // match \1{1}
state1:
    if (index >= haystack.Length) 
    {
        goto stateFail;
    }
    if (_1 == haystack[index])
    {
        ++index;
        goto state2;
    }
    goto stateFail;

    // match \1{2,*}$ -- usually optimized away because it always succeeds
state1:
    if (index >= haystack.Length) 
    {
        goto stateSuccess;
    }
    if (_1 == haystack[index])
    {
        ++index;
        goto state2;
    }
    goto stateSuccess;

stateSuccess:
    return true;

stateFail:
    return false;
}

那么哪个更快？

这要视情况而定。在确定NFA/DFA表达式、编译程序以及每次调用查找程序并进行评估时都会产生开销。对于非常简单的情况，EndsWith比正则表达式更快。在这种情况下，是EndsWith中的"OR"使其比正则表达式慢。

另一方面，正则表达式通常是您多次使用的东西，这意味着您只需要编译一次，然后每次调用只需查找它即可。

- atlaste

好的解释！ :) 这确实是一个有价值的信息，了解正则表达式如何在内部工作。已点赞。 - Ian

3

由于此时该主题已经转移到了Random方法的微小优化上，因此我将集中讨论LargeLookup实现。

首先，RollLargeLookupParameterlessRandom解决方案除了偏差问题之外还有另一个问题。所有其他实现都在范围[1, 99999] 内检查随机数，即共有99999个数字，而% 100000生成范围为[0, 99999]，即共有100000个数字。

所以让我们纠正这一点，并同时通过删除添加操作来优化RollLargeLookup实现：

private string[] LargeLookup;
private void InitLargeLookup()
{
    LargeLookup = new string[99999];
    for (int i = 0; i < LargeLookup.Length; i++)
    {
        LargeLookup[i] = (i + 1) % 100 % 11 == 0 ? "doubles" : "none";
    }
}

public string RollLargeLookup()
{ 
    return LargeLookup[rnd.Next(99999)];
}

public string RollLargeLookupParameterlessRandom()
{ 
    return LargeLookup[rnd.Next() % 99999];
}

现在，我们可以进一步优化RollLargeLookupParameterlessRandom的实现，并同时解决前面提到的偏差问题，使其与其他实现兼容吗？事实证明我们是可以的。为了再次达到这个目标，我们需要知道Random.Next(maxValue)的实现方式，它大致如下：

return (int)((Next() * (1.0 / int.MaxValue)) * maxValue);

请注意，1.0 / int.MaxValue是在编译时计算的常量。我们的想法是将1.0替换为maxValue（在本例中也是常量99999），从而消除一次乘法。因此，最终的函数如下：

public string RollLargeLookupOptimizedRandom()
{
    return LargeLookup[(int)(rnd.Next() * (99999.0 / int.MaxValue))];
}

有趣的是，这不仅修复了RollLargeLookupParameterlessRandom问题，而且速度也稍微快了一点。

实际上，这种优化可以应用于任何其他解决方案，因此最快的非查找实现将是:

public string FastestOptimizedRandomModded()
{
    return ((int)(rnd.Next() * (99999.0 / int.MaxValue)) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE;
}

在展示性能测试之前，让我们证明该结果与Random.Next(maxValue)的实现是兼容的:

for (int n = 0; n < int.MaxValue; n++)
{
    var n1 = (int)((n * (1.0 / int.MaxValue)) * 99999);
    var n2 = (int)(n * (99999.0 / int.MaxValue));
    Debug.Assert(n1 == n2);
}

最后，我的基准测试：

64位操作系统，发布版本，优先使用32位 = 是

Large Lookup Optimized Random: 149 ms
Large Lookup Parameterless Random: 159 ms
Large Lookup: 179 ms
Lookup Optimized Modded: 231 ms
Fastest Optimized Random Modded: 219 ms
Fastest Optimized Modded: 251 ms
Optimized Modded Const: 412 ms
Optimized Modded: 416 ms
Modded: 419 ms
Simple: 1343 ms
Another simple with HashSet: 1805 ms
Another Simple: 2690 ms
Option (Compiled) Regex: 8538 ms
Regex: 14861 ms
EndsWith: 39117 ms

64位操作系统，发布版本，优先选择32位 = False

Large Lookup Optimized Random: 121 ms
Large Lookup Parameterless Random: 126 ms
Large Lookup: 156 ms
Lookup Optimized Modded: 168 ms
Fastest Optimized Random Modded: 154 ms
Fastest Optimized Modded: 186 ms
Optimized Modded Const: 178 ms
Optimized Modded: 180 ms
Modded: 202 ms
Simple: 795 ms
Another simple with HashSet: 1287 ms
Another Simple: 2178 ms
Option (Compiled) Regex: 7246 ms
Regex: 17090 ms
EndsWith: 36554 ms

- Ivan Stoev

我错过了我的派对？啊，抱歉 - 你的意思是你不能错过我的派对。"又来了" - 啊是的，在C#赏金问题中我会更经常见到你... :p - Ian

话虽如此，您使用的方法是通过引入99999.0作为常量，我想这是一个double常量？如果是这样，由于double实现（根据定义）是不精确的，因此仍然可能存在一些偏差（非常小）- 是吗？但是偏差将比0.0039％少得多-这就是您所说的消除偏差的意思-我对吗？ - Ian

啊，我明白你的意思了。可能是我的问题，我想消除“%”实现引入的偏差。或者更好的说法是使它与“Random”类的实现兼容（等效）。 - Ivan Stoev

1

谢谢！我有一个想法，只是等待比赛结束。我正在考虑授予Gjermund 100个我的声望，无论比赛结果如何，现在还增加了你的悬赏所得。从我在你的答案下读到的评论中，我认为他值得这样做。期待着你的另一次聚会（不一定要有悬赏：），真的很有趣！ - Ivan Stoev

1

酷啊！！:D 哦，为什么不呢？下次聚会再见！;) - Ian

显示剩余8条评论

3

正如其他几位已经指出的那样，数字的字符串比较并不高效。

    public static String RollNumbers()
    {
        int roll = rnd.Next(1, 100000);

        int lastDigit = roll % 10;
        int secondLastDigit = (roll / 10) % 10;

        if( lastDigit == secondLastDigit )
        {
            return "doubles";
        }
        else
        {
            return "none";
        }
    }

在我的设备上，这将在50毫秒内运行，而原始方法需要1200毫秒。大部分时间都用于分配许多小临时对象。如果可以的话，应该首先摆脱字符串。如果这是您的热代码路径，将数据结构转换为创建更昂贵但非常便宜的查询内容可帮助解决问题。本文演示了查找表，这是一个很好的起点。

如果您仔细查看LargeLookup实现，您会发现其良好的性能大多是因为它不使用字符串作为键，而是使用初始随机数进行一些计算作为索引。如果尝试我的解决方案，它很可能会运行得更快，因为查找表往往具有不良的缓存一致性，这使得内存访问更加昂贵。

- Alois Kraus

感谢您的回答。好的，我会在明天到达我的笔记本电脑时测试这个解决方案，这也是我测试所有其他解决方案的地方。请期待赏金在宽限期内给出（根据测试结果）- 我很感激您的回答所付出的努力。您得到了我的点赞。 - Ian

感谢Ian对所有解决方案进行了分析。经过时间测试的基准测试很好，但其他资源指标也可能很有趣。个人而言，我一直坚持使用ETW分析，因为它为我提供了最多的见解。请参阅http://etwcontroler.codeplex.com/获取更多信息。 - Alois Kraus

啊，这确实是一个有趣的性能分析工具（对我来说还很新奇）...当然，我也会去看看它。（虽然就这个问题和悬赏而言，现在知道这个工具可能有点晚了。但是，我仍然会为了自己的理解去研究它。再次感谢） - Ian

好的，测试已经关闭。不再接受任何新的赏金方法。请检查我的答案中的结果和测试方式，并让我知道是否有任何异议/遗漏（在赏金宽限期到期之前）。如果没有，则会将赏金授予速度最快的解决方案（正如我在赏金评论中提到的）。Thomas Ayoub的答案在我看来是有道理的，Evk的FastRandom也是如此。请阅读所给的说明。 - Ian

看起来其他人通过使用模运算而避免昂贵的除法操作仍然更快。如果查找表不太大，可以适应L3高速缓存（约为6 MB）甚至是L1/L2高速缓存，则实际计算中花费的周期变得相关。 - Alois Kraus

确实，取模运算似乎比除法更便宜（尽管它是整数除法 - 而不是双精度/浮点数）。 - Ian

3

如果预先生成所有可能值的查找表，可以挤出更多的性能。这将避免最快方法中的两个模数除法，因此会更快一些：

private string[] LargeLookup;
private void Init() {
    LargeLookup = new string[100000];
    for (int i = 0; i < 100000; i++) {
        LargeLookup[i] = i%100%11 == 0 ? "doubles" : "none";
    }
}

而方法本身就是这样的：

public string RollLargeLookup() {
    return LargeLookup[rnd.Next(99999) + 1];
}

虽然看起来有些不自然，但这种方法经常被使用。例如最快的已知扑克牌手评估器预先生成一个包含数十万条目的巨大数组（使用非常聪明的技巧），然后只需在该数组上进行几个简单的查找，就可以立即评估一张扑克牌的强度是否优于另一张。

通过使用替代的随机数生成器，您甚至可以使速度更快。例如，如果您将System.Random替换为基于xorshift算法实现的FastRandom类（链接） - 它将快两倍。

如果同时实现大型查找表和FastRandom - 在我的电脑上，RollLookupOptimizedModded的时间是100ms对220ms。

以下是我链接中提到的FastRandom类的源代码：

public class FastRandom
{
    // The +1 ensures NextDouble doesn't generate 1.0
    const double REAL_UNIT_INT = 1.0 / ((double)int.MaxValue + 1.0);
    const double REAL_UNIT_UINT = 1.0 / ((double)uint.MaxValue + 1.0);
    const uint Y = 842502087, Z = 3579807591, W = 273326509;

    uint x, y, z, w;

    #region Constructors

    /// <summary>
    /// Initialises a new instance using time dependent seed.
    /// </summary>
    public FastRandom()
    {
        // Initialise using the system tick count.
        Reinitialise((int)Environment.TickCount);
    }

    /// <summary>
    /// Initialises a new instance using an int value as seed.
    /// This constructor signature is provided to maintain compatibility with
    /// System.Random
    /// </summary>
    public FastRandom(int seed)
    {
        Reinitialise(seed);
    }

    #endregion

    #region Public Methods [Reinitialisation]

    /// <summary>
    /// Reinitialises using an int value as a seed.
    /// </summary>
    /// <param name="seed"></param>
    public void Reinitialise(int seed)
    {
        // The only stipulation stated for the xorshift RNG is that at least one of
        // the seeds x,y,z,w is non-zero. We fulfill that requirement by only allowing
        // resetting of the x seed
        x = (uint)seed;
        y = Y;
        z = Z;
        w = W;
    }

    #endregion

    #region Public Methods [System.Random functionally equivalent methods]

    /// <summary>
    /// Generates a random int over the range 0 to int.MaxValue-1.
    /// MaxValue is not generated in order to remain functionally equivalent to System.Random.Next().
    /// This does slightly eat into some of the performance gain over System.Random, but not much.
    /// For better performance see:
    /// 
    /// Call NextInt() for an int over the range 0 to int.MaxValue.
    /// 
    /// Call NextUInt() and cast the result to an int to generate an int over the full Int32 value range
    /// including negative values. 
    /// </summary>
    /// <returns></returns>
    public int Next()
    {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;
        w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

        // Handle the special case where the value int.MaxValue is generated. This is outside of 
        // the range of permitted values, so we therefore call Next() to try again.
        uint rtn = w & 0x7FFFFFFF;
        if (rtn == 0x7FFFFFFF)
            return Next();
        return (int)rtn;
    }

    /// <summary>
    /// Generates a random int over the range 0 to upperBound-1, and not including upperBound.
    /// </summary>
    /// <param name="upperBound"></param>
    /// <returns></returns>
    public int Next(int upperBound)
    {
        if (upperBound < 0)
            throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=0");

        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;

        // The explicit int cast before the first multiplication gives better performance.
        // See comments in NextDouble.
        return (int)((REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * upperBound);
    }

    /// <summary>
    /// Generates a random int over the range lowerBound to upperBound-1, and not including upperBound.
    /// upperBound must be >= lowerBound. lowerBound may be negative.
    /// </summary>
    /// <param name="lowerBound"></param>
    /// <param name="upperBound"></param>
    /// <returns></returns>
    public int Next(int lowerBound, int upperBound)
    {
        if (lowerBound > upperBound)
            throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=lowerBound");

        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;

        // The explicit int cast before the first multiplication gives better performance.
        // See comments in NextDouble.
        int range = upperBound - lowerBound;
        if (range < 0)
        {   // If range is <0 then an overflow has occured and must resort to using long integer arithmetic instead (slower).
            // We also must use all 32 bits of precision, instead of the normal 31, which again is slower.  
            return lowerBound + (int)((REAL_UNIT_UINT * (double)(w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))) * (double)((long)upperBound - (long)lowerBound));
        }

        // 31 bits of precision will suffice if range<=int.MaxValue. This allows us to cast to an int and gain
        // a little more performance.
        return lowerBound + (int)((REAL_UNIT_INT * (double)(int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * (double)range);
    }

    /// <summary>
    /// Generates a random double. Values returned are from 0.0 up to but not including 1.0.
    /// </summary>
    /// <returns></returns>
    public double NextDouble()
    {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;

        // Here we can gain a 2x speed improvement by generating a value that can be cast to 
        // an int instead of the more easily available uint. If we then explicitly cast to an 
        // int the compiler will then cast the int to a double to perform the multiplication, 
        // this final cast is a lot faster than casting from a uint to a double. The extra cast
        // to an int is very fast (the allocated bits remain the same) and so the overall effect 
        // of the extra cast is a significant performance improvement.
        //
        // Also note that the loss of one bit of precision is equivalent to what occurs within 
        // System.Random.
        return (REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))));
    }


    /// <summary>
    /// Fills the provided byte array with random bytes.
    /// This method is functionally equivalent to System.Random.NextBytes(). 
    /// </summary>
    /// <param name="buffer"></param>
    public void NextBytes(byte[] buffer)
    {
        // Fill up the bulk of the buffer in chunks of 4 bytes at a time.
        uint x = this.x, y = this.y, z = this.z, w = this.w;
        int i = 0;
        uint t;
        for (int bound = buffer.Length - 3; i < bound;)
        {
            // Generate 4 bytes. 
            // Increased performance is achieved by generating 4 random bytes per loop.
            // Also note that no mask needs to be applied to zero out the higher order bytes before
            // casting because the cast ignores thos bytes. Thanks to Stefan Troschьtz for pointing this out.
            t = (x ^ (x << 11));
            x = y; y = z; z = w;
            w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

            buffer[i++] = (byte)w;
            buffer[i++] = (byte)(w >> 8);
            buffer[i++] = (byte)(w >> 16);
            buffer[i++] = (byte)(w >> 24);
        }

        // Fill up any remaining bytes in the buffer.
        if (i < buffer.Length)
        {
            // Generate 4 bytes.
            t = (x ^ (x << 11));
            x = y; y = z; z = w;
            w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

            buffer[i++] = (byte)w;
            if (i < buffer.Length)
            {
                buffer[i++] = (byte)(w >> 8);
                if (i < buffer.Length)
                {
                    buffer[i++] = (byte)(w >> 16);
                    if (i < buffer.Length)
                    {
                        buffer[i] = (byte)(w >> 24);
                    }
                }
            }
        }
        this.x = x; this.y = y; this.z = z; this.w = w;
    }


    //      /// <summary>
    //      /// A version of NextBytes that uses a pointer to set 4 bytes of the byte buffer in one operation
    //      /// thus providing a nice speedup. The loop is also partially unrolled to allow out-of-order-execution,
    //      /// this results in about a x2 speedup on an AMD Athlon. Thus performance may vary wildly on different CPUs
    //      /// depending on the number of execution units available.
    //      /// 
    //      /// Another significant speedup is obtained by setting the 4 bytes by indexing pDWord (e.g. pDWord[i++]=w)
    //      /// instead of adjusting it dereferencing it (e.g. *pDWord++=w).
    //      /// 
    //      /// Note that this routine requires the unsafe compilation flag to be specified and so is commented out by default.
    //      /// </summary>
    //      /// <param name="buffer"></param>
    //      public unsafe void NextBytesUnsafe(byte[] buffer)
    //      {
    //          if(buffer.Length % 8 != 0)
    //              throw new ArgumentException("Buffer length must be divisible by 8", "buffer");
    //
    //          uint x=this.x, y=this.y, z=this.z, w=this.w;
    //          
    //          fixed(byte* pByte0 = buffer)
    //          {
    //              uint* pDWord = (uint*)pByte0;
    //              for(int i=0, len=buffer.Length>>2; i < len; i+=2) 
    //              {
    //                  uint t=(x^(x<<11));
    //                  x=y; y=z; z=w;
    //                  pDWord[i] = w = (w^(w>>19))^(t^(t>>8));
    //
    //                  t=(x^(x<<11));
    //                  x=y; y=z; z=w;
    //                  pDWord[i+1] = w = (w^(w>>19))^(t^(t>>8));
    //              }
    //          }
    //
    //          this.x=x; this.y=y; this.z=z; this.w=w;
    //      }

    #endregion

    #region Public Methods [Methods not present on System.Random]

    /// <summary>
    /// Generates a uint. Values returned are over the full range of a uint, 
    /// uint.MinValue to uint.MaxValue, inclusive.
    /// 
    /// This is the fastest method for generating a single random number because the underlying
    /// random number generator algorithm generates 32 random bits that can be cast directly to 
    /// a uint.
    /// </summary>
    /// <returns></returns>
    public uint NextUInt()
    {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;
        return (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)));
    }

    /// <summary>
    /// Generates a random int over the range 0 to int.MaxValue, inclusive. 
    /// This method differs from Next() only in that the range is 0 to int.MaxValue
    /// and not 0 to int.MaxValue-1.
    /// 
    /// The slight difference in range means this method is slightly faster than Next()
    /// but is not functionally equivalent to System.Random.Next().
    /// </summary>
    /// <returns></returns>
    public int NextInt()
    {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;
        return (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))));
    }


    // Buffer 32 bits in bitBuffer, return 1 at a time, keep track of how many have been returned
    // with bitBufferIdx.
    uint bitBuffer;
    uint bitMask = 1;

    /// <summary>
    /// Generates a single random bit.
    /// This method's performance is improved by generating 32 bits in one operation and storing them
    /// ready for future calls.
    /// </summary>
    /// <returns></returns>
    public bool NextBool()
    {
        if (bitMask == 1)
        {
            // Generate 32 more bits.
            uint t = (x ^ (x << 11));
            x = y; y = z; z = w;
            bitBuffer = w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

            // Reset the bitMask that tells us which bit to read next.
            bitMask = 0x80000000;
            return (bitBuffer & bitMask) == 0;
        }

        return (bitBuffer & (bitMask >>= 1)) == 0;
    }

    #endregion
}

然后，您需要与您的Random一起初始化它：

Random rnd = new Random(10000);
FastRandom fastRnd = new FastRandom(10000);

方法变为：

public string RollLargeLookup() {
    return LargeLookup[fastRnd.Next(99999) + 1];
}

- Evk

我测试过了，使用大型查找确实更快。点赞。虽然如果我们使用大型查找，那么算法已经很少了。关于FastRandom，我们可能会想要将其用作基准测试工具。不过不错！利用内存进行性能优化的最佳方式就是使用大型查找！;) - Ian

请查看Thomas的回答中的讨论。如果您想使用FastRandom提供解决方案，并且该方案与Thomas的回答一样有充分的理由，我可以自由地提供帮助。 - Ian

@Ian，你需要我为FastRandom包含测试的东西吗？也许我应该在我的答案中包含那个FastRandom的代码？据我所知，Xorshift算法在生成伪随机分布方面并不比内置的Random差，因此除了依赖内置实现之外，没有什么正当理由不使用它（当然，如果你的目标是性能）。 - Evk

是的，请。如果该答案可以被测试并显示更快的结果，并且在两个方面上是合理的：（1）具有非常小的偏差（即使我们使用Random本身，仍然会有无意识的偏差-但预计会很小），（2）如果在相同的数字序列下进行测试，它仍将更快（正如我在Thomas的回答评论中提到的那样），那么它是合理的 -由于随机性的本质，我们期望如此。 - Ian

@Ian 在我的答案中添加了 FastRandom 类的实现。 - Evk

显示剩余3条评论

2

我能够实现的最快优化方法是使用大型查找方法来优化随机使用。

return LargeLookup[rnd.Next() % 100000];

这个运行比原版快20%因为它避免了除法（查看Next()代码和Next(int maxValue)）。

为了寻求真正的公平性^IMHO，我稍微改变了测试方法。

TL;DR;这里是仪表板:

|-----------------Name---------------|--Avg--|--Min--|---Max---|
|------------------------------------|-------|-------|---------|
|RollLargeLookup                     |    108|    122|    110,2|
|RollLookupOptimizedModded           |    141|    156|    145,5|
|RollOptimizedModdedConst            |    156|    159|    156,7|
|RollOptimizedModded                 |    158|    163|    159,8|
|RollNumbers                         |    197|    214|    200,9|
|RollSimple                          |  1 242|  1 304|  1 260,8|
|RollSimpleHashSet                   |  1 635|  1 774|  1 664,6|
|RollAnotherSimple                   |  2 544|  2 732|  2 603,2|
|RollOptionRegex                     |  9 137|  9 605|  9 300,6|
|RollRegex                           | 17 510| 18 873| 17 959  |
|RollEndsWith                        | 20 725| 22 001| 21 196,1|

我做了一些修改：

预先计算测试数字，以便每种方法都使用相同的数字集进行测试（避免了随机生成和引入的偏差）；
随机运行每种方法10次；
在每个函数中引入了一个参数；
删除了重复的内容。

我创建了一个名为MethodToTest的类：

public class MethodToTest
{
    public delegate string RollDelegate(int number);

    public RollDelegate MethodDelegate { get; set; }
    public List<long> timeSpent { get; set; }

    public MethodToTest()
    {
        timeSpent = new List<long>();
    }

    public string TimeStats()
    {
        return string.Format("Min: {0}ms, Max: {1}ms, Avg: {2}ms", timeSpent.Min(), timeSpent.Max(),
            timeSpent.Average());
    }
}

以下是主要内容：

private static void Test()
{
    List<MethodToTest> methodList = new List<MethodToTest>
    {
        new MethodToTest{ MethodDelegate = RollNumbers},
        new MethodToTest{ MethodDelegate = RollLargeLookup},
        new MethodToTest{ MethodDelegate = RollLookupOptimizedModded},
        new MethodToTest{ MethodDelegate = RollOptimizedModdedConst},
        new MethodToTest{ MethodDelegate = RollOptimizedModded},
        new MethodToTest{ MethodDelegate = RollSimple},
        new MethodToTest{ MethodDelegate = RollSimpleHashSet},
        new MethodToTest{ MethodDelegate = RollAnotherSimple},
        new MethodToTest{ MethodDelegate = RollOptionRegex},
        new MethodToTest{ MethodDelegate = RollRegex},
        new MethodToTest{ MethodDelegate = RollEndsWith},
    };

    InitLargeLookup();
    Stopwatch s = new Stopwatch();
    Random rnd = new Random();
    List<int> Randoms = new List<int>();

    const int trial = 10000000;
    const int numberOfLoop = 10;
    for (int j = 0; j < numberOfLoop; j++)
    {
        Console.Out.WriteLine("Loop: " + j);
        Randoms.Clear();
        for (int i = 0; i < trial; ++i)
            Randoms.Add(rnd.Next(1, 100000));

        // Shuffle order
        foreach (MethodToTest method in methodList.OrderBy(m => new Random().Next()))
        {
            s.Restart();
            for (int i = 0; i < trial; ++i)
                method.MethodDelegate(Randoms[i]);

            method.timeSpent.Add(s.ElapsedMilliseconds);
            Console.Out.WriteLine("\tMethod: " +method.MethodDelegate.Method.Name);
        }
    }

    File.WriteAllLines(@"C:\Users\me\Desktop\out.txt", methodList.OrderBy(m => m.timeSpent.Average()).Select(method => method.MethodDelegate.Method.Name + ": " + method.TimeStats()));
}

以下是相关函数：

//OP's Solution 2
public static String RollRegex(int number)
{
    return Regex.IsMatch(number.ToString(), @"(.)\1{1,}$") ? "doubles" : "none";
}

//Radin Gospodinov's Solution
static readonly Regex OptionRegex = new Regex(@"(.)\1{1,}$", RegexOptions.Compiled);
public static String RollOptionRegex(int number)
{
    return OptionRegex.IsMatch(number.ToString()) ? "doubles" : "none";
}

//OP's Solution 1
public static String RollEndsWith(int number)
{
    if (number.ToString().EndsWith("11") || number.ToString().EndsWith("22") || number.ToString().EndsWith("33") ||
        number.ToString().EndsWith("44") || number.ToString().EndsWith("55") || number.ToString().EndsWith("66") ||
        number.ToString().EndsWith("77") || number.ToString().EndsWith("88") || number.ToString().EndsWith("99") ||
        number.ToString().EndsWith("00"))
    {
        return "doubles";
    }
    return "none";
}

//Ian's Solution   
public static String RollSimple(int number)
{
    string rollString = number.ToString();
    return number > 10 && rollString[rollString.Length - 1] == rollString[rollString.Length - 2] ?
        "doubles" : "none";
}

//Ian's Other Solution
static List<string> doubles = new List<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public static String RollAnotherSimple(int number)
{

    string rollString = number.ToString();
    return rollString.Length > 1 && doubles.Contains(rollString.Substring(rollString.Length - 2)) ?
        "doubles" : "none";
}

//Dandré's Solution
static HashSet<string> doublesHashset = new HashSet<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public static String RollSimpleHashSet(int number)
{
    string rollString = number.ToString();
    return rollString.Length > 1 && doublesHashset.Contains(rollString.Substring(rollString.Length - 2)) ?
        "doubles" : "none";
}


//Stian Standahl optimizes modded solution
public static string RollOptimizedModded(int number) { return number % 100 % 11 == 0 ? "doubles" : "none"; }

//Gjermund Grøneng's method with constant addition
private const string CONST_DOUBLES = "doubles";
private const string CONST_NONE = "none";
public static string RollOptimizedModdedConst(int number) { return number % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; }

//Corak's Solution, added on Gjermund Grøneng's
private static readonly string[] Lookup = { "doubles", "none", "none", "none", "none", "none", "none", "none", "none", "none", "none" };
public static string RollLookupOptimizedModded(int number) { return Lookup[number % 100 % 11]; }

//Evk's Solution, large Lookup
private static string[] LargeLookup;
private static void InitLargeLookup()
{
    LargeLookup = new string[100000];
    for (int i = 0; i < 100000; i++)
    {
        LargeLookup[i] = i % 100 % 11 == 0 ? "doubles" : "none";
    }
}
public static string RollLargeLookup(int number) { return LargeLookup[number]; }

//Alois Kraus's Solution
public static string RollNumbers(int number)
{
    int lastDigit = number % 10;
    int secondLastDigit = (number / 10) % 10;

    return lastDigit == secondLastDigit ? "doubles" : "none";
}

- Thomas Ayoub

好的，你的解决方案似乎非常有效！不过我用了另一台笔记本电脑测试了一下——但是我可以看到性能显著提高了！我会在回到原来的笔记本电脑后立即进行测试，并在相同条件下编辑我的答案并附上性能结果！已点赞。;) - Ian

4

rnd.Next(100000)会返回0到99999之间的值，每个值的概率相等。但是rnd.Next() % 100000 不会返回相同的结果。这是因为 rnd.Next() 返回0到2147483647之间的值，并且每个值的概率相等。所以只取最后五位数意味着0到83647之间的值出现的频率略高于83648至99999之间的值。 - Michael Liu

@MichaelLiu 你可能相信真正的随机性，但我不这样认为。从0到2147400000的数字分布代表了从0到Int32.MaxValue范围的99,9961%。这意味着0到83647之间的数字将比83648到99999之间的值出现多0.0039%的次数。 - Thomas Ayoub

1

这是Eric Lippert对这种偏差的分析，非常棒。如果您打算在实际代码中使用这个东西，请阅读它。 - Lucas Trzesniewski

1

@ThomasAyoub 我想说的是，如果你想使用已知的种子（例如用于单元测试）如 var rnd = new Random(12345)，那么如果在调用 Next 时使用模数（或 +1），最终序列可能不同。为了公平比较，你应该从相同的种子中获得完全相同的序列。 - Phil1970

显示剩余21条评论

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Ian · Accepted Answer

在您的情况下，Regex实际上更快...但这很可能是因为您使用了许多带有OR和多余的ToString()的EndsWith。如果简化逻辑，简单的string操作可能会更快。

这是文本处理的性能摘要 - 从最快到最慢（1000万次循环[优先/非优先32位] - 排名按两者中最快的顺序排列）：

大型查找快速随机UInt（不计入赏金）：219/273毫秒 - 我的，改进自Evk's
大型查找优化随机：228/273毫秒 - Ivan Stoev的备选解决方案
大型查找快速随机：238/294毫秒 - Evk的替代解决方案
大型查找无参数随机：248/287毫秒 - Thomas Ayoub

关于此解决方案，我想提出几点说明（基于下面的评论）：
1. 该解决方案引入了0.0039％的偏差，偏向小数字（<100000）（参考：Eric Lippert的博客文章，由Lucas Trzesniewski链接）
2. 在测试时，它不会生成与其他人相同的数字序列（参考：Michael Liu的评论）-因为它更改了使用Random的方式（从Random.Next(int)到Random.Next()），它用于测试本身。
虽然对于这种方法，测试不能使用完全相同的数字序列（如Phil1970所述），但我有两点要说明：
1. 一些人可能会对Random.Next()的实现与Random.Next(int)感兴趣，以了解为什么即使使用相同的数字序列，该解决方案仍然会更快。。
2. 在真实情况下，Random的使用本身（大多数情况下）不会假设数字序列是相同的（或可预测的）-只有在测试我们的方法时，我们希望Random序列是相同的（为公平的单元测试目的）。该方法的预期更快结果不能仅从测试结果中推导出来，还要看Next() vs Next(int)的实现。
大型查找：320/284毫秒 - Evk
最快优化随机模式：286/333毫秒Ivan Stoev
查找优化模式：315/329毫秒 - Corak
优化的模式：471/330毫秒 - Stian Standahl
优化的模式+常量：472/337 - Gjermund Grøneng
最快优化的模式：345/340毫秒 - Gjermund Grøneng
修改过的：496/370毫秒 - Corak + 可能Alexei Levenkov
数字：537/408毫秒 - Alois Kraus
简单：1668/1176毫秒 - 我的
哈希集合包含：2138/1609毫秒 - Dandré
列表包含：3013/2465毫秒 - 另一个我的
编译的正则表达式：8956/7675毫秒 - Radin Gospodinov
正则表达式：15032/16640毫秒 - OP的解决方案1

End

这是我的简单测试用例：

Random rnd = new Random(10000);
FastRandom fastRnd = new FastRandom(10000);

//OP's Solution 2
public String RollRegex() {
    int roll = rnd.Next(1, 100000);
    if (Regex.IsMatch(roll.ToString(), @"(.)\1{1,}$")) {
        return "doubles";
    } else {
        return "none";
    }
}

//Radin Gospodinov's Solution
Regex optionRegex = new Regex(@"(.)\1{1,}$", RegexOptions.Compiled);
public String RollOptionRegex() {
    int roll = rnd.Next(1, 100000);
    string rollString = roll.ToString();
    if (optionRegex.IsMatch(roll.ToString())) {
        return "doubles";
    } else {
        return "none";
    }
}

//OP's Solution 1
public String RollEndsWith() {
    int roll = rnd.Next(1, 100000);
    if (roll.ToString().EndsWith("11") || roll.ToString().EndsWith("22") || roll.ToString().EndsWith("33") || roll.ToString().EndsWith("44") || roll.ToString().EndsWith("55") || roll.ToString().EndsWith("66") || roll.ToString().EndsWith("77") || roll.ToString().EndsWith("88") || roll.ToString().EndsWith("99") || roll.ToString().EndsWith("00")) {
        return "doubles";
    } else {
        return "none";
    }
}

//My Solution   
public String RollSimple() {
    int roll = rnd.Next(1, 100000);
    string rollString = roll.ToString();
    return roll > 10 && rollString[rollString.Length - 1] == rollString[rollString.Length - 2] ?
        "doubles" : "none";
}

//My Other Solution
List<string> doubles = new List<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public String RollAnotherSimple() {
    int roll = rnd.Next(1, 100000);
    string rollString = roll.ToString();
    return rollString.Length > 1 && doubles.Contains(rollString.Substring(rollString.Length - 2)) ?
        "doubles" : "none";
}

//Dandré's Solution
HashSet<string> doublesHashset = new HashSet<string>() { "00", "11", "22", "33", "44", "55", "66", "77", "88", "99" };
public String RollSimpleHashSet() {
    int roll = rnd.Next(1, 100000);
    string rollString = roll.ToString();
    return rollString.Length > 1 && doublesHashset.Contains(rollString.Substring(rollString.Length - 2)) ?
        "doubles" : "none";
}

//Corak's Solution - hinted by Alexei Levenkov too
public string RollModded() { int roll = rnd.Next(1, 100000); return roll % 100 % 11 == 0 ? "doubles" : "none"; }

//Stian Standahl optimizes modded solution
public string RollOptimizedModded() { return rnd.Next(1, 100000) % 100 % 11 == 0 ? "doubles" : "none"; }

//Gjermund Grøneng's method with constant addition
private const string CONST_DOUBLES = "doubles";
private const string CONST_NONE = "none";
public string RollOptimizedModdedConst() { return rnd.Next(1, 100000) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; }

//Gjermund Grøneng's method after optimizing the Random (The fastest!)
public string FastestOptimizedModded() { return (rnd.Next(99999) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE; }

//Corak's Solution, added on Gjermund Grøneng's
private readonly string[] Lookup = { "doubles", "none", "none", "none", "none", "none", "none", "none", "none", "none", "none" };
public string RollLookupOptimizedModded() { return Lookup[(rnd.Next(99999) + 1) % 100 % 11]; }

//Evk's Solution, large Lookup
private string[] LargeLookup;
private void InitLargeLookup() {
    LargeLookup = new string[100000];
    for (int i = 0; i < 100000; i++) {
        LargeLookup[i] = i % 100 % 11 == 0 ? "doubles" : "none";
    }
}
public string RollLargeLookup() { return LargeLookup[rnd.Next(99999) + 1]; }

//Thomas Ayoub's Solution
public string RollLargeLookupParameterlessRandom() { return LargeLookup[rnd.Next() % 100000]; }

//Alois Kraus's Solution
public string RollNumbers() {
    int roll = rnd.Next(1, 100000);
    int lastDigit = roll % 10;
    int secondLastDigit = (roll / 10) % 10;

    if (lastDigit == secondLastDigit) {
        return "doubles";
    } else {
        return "none";
    }
}

//Ivan Stoev's Solution
public string FastestOptimizedRandomModded() {
    return ((int)(rnd.Next() * (99999.0 / int.MaxValue)) + 1) % 100 % 11 == 0 ? CONST_DOUBLES : CONST_NONE;
}

//Ivan Stoev's Alternate Solution
public string RollLargeLookupOptimizedRandom() {
    return LargeLookup[(int)(rnd.Next() * (99999.0 / int.MaxValue))];
}

//Evk's Solution using FastRandom
public string RollLargeLookupFastRandom() {
    return LargeLookup[fastRnd.Next(99999) + 1];
}

//My Own Test, using FastRandom + NextUInt
public string RollLargeLookupFastRandomUInt() {
    return LargeLookup[fastRnd.NextUInt() % 99999 + 1];
}

额外的FastRandom类：

//FastRandom's part used for the testing
public class FastRandom {
    // The +1 ensures NextDouble doesn't generate 1.0
    const double REAL_UNIT_INT = 1.0 / ((double)int.MaxValue + 1.0);
    const double REAL_UNIT_UINT = 1.0 / ((double)uint.MaxValue + 1.0);
    const uint Y = 842502087, Z = 3579807591, W = 273326509;

    uint x, y, z, w;

    #region Constructors

    /// <summary>
    /// Initialises a new instance using time dependent seed.
    /// </summary>
    public FastRandom() {
        // Initialise using the system tick count.
        Reinitialise((int)Environment.TickCount);
    }

    /// <summary>
    /// Initialises a new instance using an int value as seed.
    /// This constructor signature is provided to maintain compatibility with
    /// System.Random
    /// </summary>
    public FastRandom(int seed) {
        Reinitialise(seed);
    }

    #endregion

    #region Public Methods [Reinitialisation]

    /// <summary>
    /// Reinitialises using an int value as a seed.
    /// </summary>
    /// <param name="seed"></param>
    public void Reinitialise(int seed) {
        // The only stipulation stated for the xorshift RNG is that at least one of
        // the seeds x,y,z,w is non-zero. We fulfill that requirement by only allowing
        // resetting of the x seed
        x = (uint)seed;
        y = Y;
        z = Z;
        w = W;
    }

    #endregion

    #region Public Methods [System.Random functionally equivalent methods]

    /// <summary>
    /// Generates a random int over the range 0 to int.MaxValue-1.
    /// MaxValue is not generated in order to remain functionally equivalent to System.Random.Next().
    /// This does slightly eat into some of the performance gain over System.Random, but not much.
    /// For better performance see:
    /// 
    /// Call NextInt() for an int over the range 0 to int.MaxValue.
    /// 
    /// Call NextUInt() and cast the result to an int to generate an int over the full Int32 value range
    /// including negative values. 
    /// </summary>
    /// <returns></returns>
    public int Next() {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;
        w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

        // Handle the special case where the value int.MaxValue is generated. This is outside of 
        // the range of permitted values, so we therefore call Next() to try again.
        uint rtn = w & 0x7FFFFFFF;
        if (rtn == 0x7FFFFFFF)
            return Next();
        return (int)rtn;
    }

    /// <summary>
    /// Generates a random int over the range 0 to upperBound-1, and not including upperBound.
    /// </summary>
    /// <param name="upperBound"></param>
    /// <returns></returns>
    public int Next(int upperBound) {
        if (upperBound < 0)
            throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=0");

        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;

        // The explicit int cast before the first multiplication gives better performance.
        // See comments in NextDouble.
        return (int)((REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * upperBound);
    }

    /// <summary>
    /// Generates a random int over the range lowerBound to upperBound-1, and not including upperBound.
    /// upperBound must be >= lowerBound. lowerBound may be negative.
    /// </summary>
    /// <param name="lowerBound"></param>
    /// <param name="upperBound"></param>
    /// <returns></returns>
    public int Next(int lowerBound, int upperBound) {
        if (lowerBound > upperBound)
            throw new ArgumentOutOfRangeException("upperBound", upperBound, "upperBound must be >=lowerBound");

        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;

        // The explicit int cast before the first multiplication gives better performance.
        // See comments in NextDouble.
        int range = upperBound - lowerBound;
        if (range < 0) {   // If range is <0 then an overflow has occured and must resort to using long integer arithmetic instead (slower).
            // We also must use all 32 bits of precision, instead of the normal 31, which again is slower.  
            return lowerBound + (int)((REAL_UNIT_UINT * (double)(w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))) * (double)((long)upperBound - (long)lowerBound));
        }

        // 31 bits of precision will suffice if range<=int.MaxValue. This allows us to cast to an int and gain
        // a little more performance.
        return lowerBound + (int)((REAL_UNIT_INT * (double)(int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))))) * (double)range);
    }

    /// <summary>
    /// Generates a random double. Values returned are from 0.0 up to but not including 1.0.
    /// </summary>
    /// <returns></returns>
    public double NextDouble() {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;

        // Here we can gain a 2x speed improvement by generating a value that can be cast to 
        // an int instead of the more easily available uint. If we then explicitly cast to an 
        // int the compiler will then cast the int to a double to perform the multiplication, 
        // this final cast is a lot faster than casting from a uint to a double. The extra cast
        // to an int is very fast (the allocated bits remain the same) and so the overall effect 
        // of the extra cast is a significant performance improvement.
        //
        // Also note that the loss of one bit of precision is equivalent to what occurs within 
        // System.Random.
        return (REAL_UNIT_INT * (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)))));
    }


    /// <summary>
    /// Fills the provided byte array with random bytes.
    /// This method is functionally equivalent to System.Random.NextBytes(). 
    /// </summary>
    /// <param name="buffer"></param>
    public void NextBytes(byte[] buffer) {
        // Fill up the bulk of the buffer in chunks of 4 bytes at a time.
        uint x = this.x, y = this.y, z = this.z, w = this.w;
        int i = 0;
        uint t;
        for (int bound = buffer.Length - 3; i < bound; ) {
            // Generate 4 bytes. 
            // Increased performance is achieved by generating 4 random bytes per loop.
            // Also note that no mask needs to be applied to zero out the higher order bytes before
            // casting because the cast ignores thos bytes. Thanks to Stefan Troschьtz for pointing this out.
            t = (x ^ (x << 11));
            x = y; y = z; z = w;
            w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

            buffer[i++] = (byte)w;
            buffer[i++] = (byte)(w >> 8);
            buffer[i++] = (byte)(w >> 16);
            buffer[i++] = (byte)(w >> 24);
        }

        // Fill up any remaining bytes in the buffer.
        if (i < buffer.Length) {
            // Generate 4 bytes.
            t = (x ^ (x << 11));
            x = y; y = z; z = w;
            w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

            buffer[i++] = (byte)w;
            if (i < buffer.Length) {
                buffer[i++] = (byte)(w >> 8);
                if (i < buffer.Length) {
                    buffer[i++] = (byte)(w >> 16);
                    if (i < buffer.Length) {
                        buffer[i] = (byte)(w >> 24);
                    }
                }
            }
        }
        this.x = x; this.y = y; this.z = z; this.w = w;
    }


    //      /// <summary>
    //      /// A version of NextBytes that uses a pointer to set 4 bytes of the byte buffer in one operation
    //      /// thus providing a nice speedup. The loop is also partially unrolled to allow out-of-order-execution,
    //      /// this results in about a x2 speedup on an AMD Athlon. Thus performance may vary wildly on different CPUs
    //      /// depending on the number of execution units available.
    //      /// 
    //      /// Another significant speedup is obtained by setting the 4 bytes by indexing pDWord (e.g. pDWord[i++]=w)
    //      /// instead of adjusting it dereferencing it (e.g. *pDWord++=w).
    //      /// 
    //      /// Note that this routine requires the unsafe compilation flag to be specified and so is commented out by default.
    //      /// </summary>
    //      /// <param name="buffer"></param>
    //      public unsafe void NextBytesUnsafe(byte[] buffer)
    //      {
    //          if(buffer.Length % 8 != 0)
    //              throw new ArgumentException("Buffer length must be divisible by 8", "buffer");
    //
    //          uint x=this.x, y=this.y, z=this.z, w=this.w;
    //          
    //          fixed(byte* pByte0 = buffer)
    //          {
    //              uint* pDWord = (uint*)pByte0;
    //              for(int i=0, len=buffer.Length>>2; i < len; i+=2) 
    //              {
    //                  uint t=(x^(x<<11));
    //                  x=y; y=z; z=w;
    //                  pDWord[i] = w = (w^(w>>19))^(t^(t>>8));
    //
    //                  t=(x^(x<<11));
    //                  x=y; y=z; z=w;
    //                  pDWord[i+1] = w = (w^(w>>19))^(t^(t>>8));
    //              }
    //          }
    //
    //          this.x=x; this.y=y; this.z=z; this.w=w;
    //      }

    #endregion

    #region Public Methods [Methods not present on System.Random]

    /// <summary>
    /// Generates a uint. Values returned are over the full range of a uint, 
    /// uint.MinValue to uint.MaxValue, inclusive.
    /// 
    /// This is the fastest method for generating a single random number because the underlying
    /// random number generator algorithm generates 32 random bits that can be cast directly to 
    /// a uint.
    /// </summary>
    /// <returns></returns>
    public uint NextUInt() {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;
        return (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8)));
    }

    /// <summary>
    /// Generates a random int over the range 0 to int.MaxValue, inclusive. 
    /// This method differs from Next() only in that the range is 0 to int.MaxValue
    /// and not 0 to int.MaxValue-1.
    /// 
    /// The slight difference in range means this method is slightly faster than Next()
    /// but is not functionally equivalent to System.Random.Next().
    /// </summary>
    /// <returns></returns>
    public int NextInt() {
        uint t = (x ^ (x << 11));
        x = y; y = z; z = w;
        return (int)(0x7FFFFFFF & (w = (w ^ (w >> 19)) ^ (t ^ (t >> 8))));
    }


    // Buffer 32 bits in bitBuffer, return 1 at a time, keep track of how many have been returned
    // with bitBufferIdx.
    uint bitBuffer;
    uint bitMask = 1;

    /// <summary>
    /// Generates a single random bit.
    /// This method's performance is improved by generating 32 bits in one operation and storing them
    /// ready for future calls.
    /// </summary>
    /// <returns></returns>
    public bool NextBool() {
        if (bitMask == 1) {
            // Generate 32 more bits.
            uint t = (x ^ (x << 11));
            x = y; y = z; z = w;
            bitBuffer = w = (w ^ (w >> 19)) ^ (t ^ (t >> 8));

            // Reset the bitMask that tells us which bit to read next.
            bitMask = 0x80000000;
            return (bitBuffer & bitMask) == 0;
        }

        return (bitBuffer & (bitMask >>= 1)) == 0;
    }

    #endregion
}

测试场景：

public delegate string RollDelegate();
private void Test() {
    List<string> rollMethodNames = new List<string>(){
        "Large Lookup Fast Random UInt",
        "Large Lookup Fast Random",
        "Large Lookup Optimized Random",
        "Fastest Optimized Random Modded",
        "Numbers",
        "Large Lookup Parameterless Random",
        "Large Lookup",
        "Lookup Optimized Modded",
        "Fastest Optimized Modded",
        "Optimized Modded Const",
        "Optimized Modded",
        "Modded",
        "Simple",
        "Another simple with HashSet",
        "Another Simple",
        "Option (Compiled) Regex",
        "Regex",
        "EndsWith",
    };
    List<RollDelegate> rollMethods = new List<RollDelegate>{
        RollLargeLookupFastRandomUInt,
        RollLargeLookupFastRandom,
        RollLargeLookupOptimizedRandom,
        FastestOptimizedRandomModded,
        RollNumbers,
        RollLargeLookupParameterlessRandom,
        RollLargeLookup,
        RollLookupOptimizedModded,
        FastestOptimizedModded,
        RollOptimizedModdedConst,
        RollOptimizedModded,
        RollModded,
        RollSimple,
        RollSimpleHashSet,
        RollAnotherSimple,
        RollOptionRegex,
        RollRegex,
        RollEndsWith
    };
    int trial = 10000000;
    InitLargeLookup();
    for (int k = 0; k < rollMethods.Count; ++k) {
        rnd = new Random(10000);
        fastRnd = new FastRandom(10000);
        logBox.GetTimeLapse();
        for (int i = 0; i < trial; ++i)
            rollMethods[k]();
        logBox.WriteTimedLogLine(rollMethodNames[k] + ": " + logBox.GetTimeLapse());
    }
}

结果（优先32位）：

[2016-05-30 08:20:54.056 UTC] Large Lookup Fast Random UInt: 219 ms
[2016-05-30 08:20:54.296 UTC] Large Lookup Fast Random: 238 ms
[2016-05-30 08:20:54.524 UTC] Large Lookup Optimized Random: 228 ms
[2016-05-30 08:20:54.810 UTC] Fastest Optimized Random Modded: 286 ms
[2016-05-30 08:20:55.347 UTC] Numbers: 537 ms
[2016-05-30 08:20:55.596 UTC] Large Lookup Parameterless Random: 248 ms
[2016-05-30 08:20:55.916 UTC] Large Lookup: 320 ms
[2016-05-30 08:20:56.231 UTC] Lookup Optimized Modded: 315 ms
[2016-05-30 08:20:56.577 UTC] Fastest Optimized Modded: 345 ms
[2016-05-30 08:20:57.049 UTC] Optimized Modded Const: 472 ms
[2016-05-30 08:20:57.521 UTC] Optimized Modded: 471 ms
[2016-05-30 08:20:58.017 UTC] Modded: 496 ms
[2016-05-30 08:20:59.685 UTC] Simple: 1668 ms
[2016-05-30 08:21:01.824 UTC] Another simple with HashSet: 2138 ms
[2016-05-30 08:21:04.837 UTC] Another Simple: 3013 ms
[2016-05-30 08:21:13.794 UTC] Option (Compiled) Regex: 8956 ms
[2016-05-30 08:21:28.827 UTC] Regex: 15032 ms
[2016-05-30 08:21:53.589 UTC] EndsWith: 24763 ms

结果（非首选 32 位）：

[2016-05-30 08:16:00.934 UTC] Large Lookup Fast Random UInt: 273 ms
[2016-05-30 08:16:01.230 UTC] Large Lookup Fast Random: 294 ms
[2016-05-30 08:16:01.503 UTC] Large Lookup Optimized Random: 273 ms
[2016-05-30 08:16:01.837 UTC] Fastest Optimized Random Modded: 333 ms
[2016-05-30 08:16:02.245 UTC] Numbers: 408 ms
[2016-05-30 08:16:02.532 UTC] Large Lookup Parameterless Random: 287 ms
[2016-05-30 08:16:02.816 UTC] Large Lookup: 284 ms
[2016-05-30 08:16:03.145 UTC] Lookup Optimized Modded: 329 ms
[2016-05-30 08:16:03.486 UTC] Fastest Optimized Modded: 340 ms
[2016-05-30 08:16:03.824 UTC] Optimized Modded Const: 337 ms
[2016-05-30 08:16:04.154 UTC] Optimized Modded: 330 ms
[2016-05-30 08:16:04.524 UTC] Modded: 370 ms
[2016-05-30 08:16:05.700 UTC] Simple: 1176 ms
[2016-05-30 08:16:07.309 UTC] Another simple with HashSet: 1609 ms
[2016-05-30 08:16:09.774 UTC] Another Simple: 2465 ms
[2016-05-30 08:16:17.450 UTC] Option (Compiled) Regex: 7675 ms
[2016-05-30 08:16:34.090 UTC] Regex: 16640 ms
[2016-05-30 08:16:54.793 UTC] EndsWith: 20702 ms

还有这张图片：