What is the overhead of the C# 'fixed' statement on a managed unsafe struct containing fixed arrays?

17

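I've been trying to determine the true cost of using the fixed statement in C# on managed unsafe structs that contain fixed arrays. Please note that I am not referring to unmanaged structs.

Specifically, is there any reason to avoid the pattern shown by the 'MultipleFixed' class below? Is the cost of simply fixing the data zero, near zero (comparable to setting and clearing a single flag when entering and exiting the fixed scope), or significant enough to avoid whenever possible?

Obviously these classes are contrived to help explain the question. This is for a high-usage data structure in an XNA game where read/write performance of this data is critical, so if I need to fix the array once and pass the pointer around everywhere, I will. But if it makes no difference at all, I'd prefer to keep the fixed() local to each method, to keep the function signatures slightly more portable to platforms that don't support unsafe code. (Yes, it's some extra code, but whatever it takes...)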

unsafe struct ByteArray
{
   public fixed byte Data[1024];
}

class MultipleFixed
{
   unsafe void SetValue(ref ByteArray bytes, int index, byte value)
   {
       fixed(byte* data = bytes.Data)
       {
           data[index] = value;
       }
   }

    unsafe bool Validate(ref ByteArray bytes, int index, byte expectedValue)
    {
       fixed(byte* data = bytes.Data)
       {
           return data[index] == expectedValue;
       }
    }

    void Test(ref ByteArray bytes)
    {
        SetValue(ref bytes, 0, 1);
        Validate(ref bytes, 0, 1);
    }
}

class SingleFixed
{
   unsafe void SetValue(byte* data, int index, byte value)
   {
       data[index] = value;
   }

    unsafe bool Validate(byte* data, int index, byte expectedValue)
    {
       return data[index] == expectedValue;
    }

    unsafe void Test(ref ByteArray bytes)
    {
        fixed(byte* data = bytes.Data)
        {
            SetValue(data, 0, 1);
            Validate(data, 0, 1);
        }
    }
}

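Also, I searched for similar questions. The closest one I found differs in that it is concerned only with pure managed code and the specific costs of using fixed in that context.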


1
Perhaps consider changing the accepted answer. It is clear that the accepted answer is inaccurate. - Engineer
See also https://mattwarren.org/2016/10/26/How-does-the-fixed-keyword-work/. - Jason Sparc
2 Answers

21

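That is a question I found interesting myself.

The results I obtained suggest that the performance loss has somewhat different causes than the 'fixed' statement itself.

You can see the tests I ran and the results below, but these are the observations I draw from them: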

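  • Using 'fixed' with pure pointers (x*) and no IntPtr performs as well as managed code; in Release mode it is even better if fixed is not entered too often. This is the most performant way of accessing multiple array values
  • In Debug mode, using 'fixed' inside a loop has a large negative performance impact, but in Release mode it works nearly as well as normal array access (method FixedAccess)
  • Using 'ref' on a reference-type parameter (float[]) was consistently as fast or faster (both modes)
  • With IntPtr arithmetic (method IntPtrAccess), Debug mode is significantly slower than Release mode, and in both modes it is slower than normal array access
  • If the offset used is not aligned to the offsets of the array's elements, performance is terrible regardless of the mode (it actually takes the same amount of time in both modes). This holds for 'float', but it has no effect for 'int'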
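The misaligned case in the last bullet comes from the difference between pointer arithmetic and IntPtr.Add: adding to a float* is scaled by sizeof(float), while IntPtr.Add takes a plain byte offset. A minimal sketch of that difference follows; the class and method names (OffsetDemo, ElementOffsetBytes, IntPtrOffsetBytes) are made up for illustration, and it assumes compilation with unsafe code enabled.

```csharp
using System;

static class OffsetDemo
{
    // Byte distance from the array start reached by float* arithmetic.
    public static unsafe long ElementOffsetBytes(float[] array, int index)
    {
        fixed (float* p = array)
        {
            float* q = p + index;                    // scaled by sizeof(float)
            return (byte*)q - (byte*)p;
        }
    }

    // Byte distance from the array start reached by IntPtr.Add.
    public static unsafe long IntPtrOffsetBytes(float[] array, int index)
    {
        fixed (float* p = array)
        {
            IntPtr q = IntPtr.Add((IntPtr)p, index); // offset is in bytes, not elements
            return (byte*)q - (byte*)p;
        }
    }

    static void Main()
    {
        var data = new float[16];
        Console.WriteLine(ElementOffsetBytes(data, 3)); // 12
        Console.WriteLine(IntPtrOffsetBytes(data, 3));  // 3
    }
}
```

This is why IntPtrAccess below shifts the index left by 2 before calling IntPtr.Add, and why IntPtrMisalignedAccess, which passes the raw index, reads a float that straddles element boundaries.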

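Running the tests several times gives slightly different but broadly consistent results. I probably should have run many series of tests and averaged the times, but I had no time for that :)

First, the test class: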
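For anyone who does want to run several series and average them, a small helper along these lines could work. It is only a sketch: Bench and Measure are hypothetical names, not part of the tests in this answer, and the workload in Main is a stand-in.

```csharp
using System;
using System.Diagnostics;
using System.Linq;

static class Bench
{
    // Runs `body` once as a warm-up, then `series` timed runs,
    // and returns the elapsed milliseconds of each timed run.
    public static double[] Measure(Action body, int series)
    {
        var times = new double[series];
        body(); // warm-up so JIT compilation is not included in the timings
        for (int s = 0; s < series; s++)
        {
            var sw = Stopwatch.StartNew();
            body();
            sw.Stop();
            times[s] = sw.Elapsed.TotalMilliseconds;
        }
        return times;
    }

    static void Main()
    {
        var buffer = new float[1024];
        double[] t = Measure(() =>
        {
            for (int i = 0; i < 1000000; i++)
                buffer[i % buffer.Length] += 2;
        }, 5);
        Console.WriteLine("mean = {0:N3} ms, min = {1:N3} ms", t.Average(), t.Min());
    }
}
```

Reporting the minimum alongside the mean helps separate the cost of the code from noise caused by other processes.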

using System;

class Test {
    public static void NormalAccess (float[] array, int index) {
        array[index] = array[index] + 2;
    }

    public static void NormalRefAccess (ref float[] array, int index) {
        array[index] = array[index] + 2;
    }

    public static void IntPtrAccess (IntPtr arrayPtr, int index) {
        unsafe {
            var array = (float*) IntPtr.Add (arrayPtr, index << 2);
            (*array) = (*array) + 2;
        }
    }

    public static void IntPtrMisalignedAccess (IntPtr arrayPtr, int index) {
        unsafe {
            var array = (float*) IntPtr.Add (arrayPtr, index); // getting bits of a float
            (*array) = (*array) + 2;
        }
    }

    public static void FixedAccess (float[] array, int index) {
        unsafe {
            fixed (float* ptr = &array[index])
                (*ptr) = (*ptr) + 2;
        }
    }

    public unsafe static void PtrAccess (float* ptr) {
        (*ptr) = (*ptr) + 2;
    }

}

And here are the tests themselves:

    static int runs = 1000*1000*100;
    public static void Print (string name, Stopwatch sw) {
        Console.WriteLine ("{0}, items/sec = {1:N} \t {2}", sw.Elapsed, (runs / sw.ElapsedMilliseconds) * 1000, name);
    }

    static void Main (string[] args) {
        var buffer = new float[1024*1024*100];
        var len = buffer.Length;

        var sw = new Stopwatch();
        for (int i = 0; i < 1000; i++) {
            Test.FixedAccess (buffer, 55);
            Test.NormalAccess (buffer, 66);
        }

        Console.WriteLine ("Starting {0:N0} items", runs);


        sw.Restart ();
        for (int i = 0; i < runs; i++)
            Test.NormalAccess (buffer, i % len);
        sw.Stop ();

        Print ("Normal access", sw);

        sw.Restart ();
        for (int i = 0; i < runs; i++)
            Test.NormalRefAccess (ref buffer, i % len);
        sw.Stop ();

        Print ("Normal Ref access", sw);

        sw.Restart ();
        unsafe {
            fixed (float* ptr = &buffer[0])
                for (int i = 0; i < runs; i++) {
                    Test.IntPtrAccess ((IntPtr) ptr, i % len);
                }
        }
        sw.Stop ();

        Print ("IntPtr access (fixed outside loop)", sw);

        sw.Restart ();
        unsafe {
            fixed (float* ptr = &buffer[0])
                for (int i = 0; i < runs; i++) {
                    Test.IntPtrMisalignedAccess ((IntPtr) ptr, i % len);
                }
        }
        sw.Stop ();

        Print ("IntPtr Misaligned access (fixed outside loop)", sw);

        sw.Restart ();
        for (int i = 0; i < runs; i++)
            Test.FixedAccess (buffer, i % len);
        sw.Stop ();

        Print ("Fixed access (fixed inside loop)", sw);

        sw.Restart ();
        unsafe {
            fixed (float* ptr = &buffer[0]) {
                for (int i = 0; i < runs; i++) {
                    Test.PtrAccess (ptr + (i % len));
                }
            }
        }
        sw.Stop ();

        Print ("float* access (fixed outside loop)", sw);

        sw.Restart ();
        unsafe {
            for (int i = 0; i < runs; i++) {
                fixed (float* ptr = &buffer[i % len]) {
                    Test.PtrAccess (ptr);
                }
            }
        }
        sw.Stop ();

        Print ("float* access (fixed in loop)", sw);
    }

And finally, the results:

Debug mode


Starting 100,000,000 items
00:00:01.0373583, items/sec = 96,432,000.00      Normal access
00:00:00.8582307, items/sec = 116,550,000.00     Normal Ref access
00:00:01.8822085, items/sec = 53,134,000.00      IntPtr access (fixed outside loop)
00:00:10.5356369, items/sec = 9,492,000.00       IntPtr Misaligned access (fixed outside loop)
00:00:01.6860701, items/sec = 59,311,000.00      Fixed access (fixed inside loop)
00:00:00.7577868, items/sec = 132,100,000.00     float* access (fixed outside loop)
00:00:01.0387792, items/sec = 96,339,000.00      float* access (fixed in loop)

Release mode

Starting 100,000,000 items
00:00:00.7454832, items/sec = 134,228,000.00     Normal access
00:00:00.6619090, items/sec = 151,285,000.00     Normal Ref access
00:00:00.9859089, items/sec = 101,522,000.00     IntPtr access (fixed outside loop)
00:00:10.1289018, items/sec = 9,873,000.00       IntPtr Misaligned access (fixed outside loop)
00:00:00.7899355, items/sec = 126,742,000.00     Fixed access (fixed inside loop)
00:00:00.5718507, items/sec = 175,131,000.00     float* access (fixed outside loop)
00:00:00.6842333, items/sec = 146,198,000.00     float* access (fixed in loop)

8
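Empirically, the overhead appears to be, in the best case, about 270% on the 32-bit JIT and about 200% on the 64-bit JIT (and it gets worse the more times you "call" fixed). So if performance is really critical, I'd try to minimise your fixed blocks.
Sorry, I'm not familiar enough with fixed / unsafe code to know why this is the case.
Details:
I also added some TestMore methods, which call your two test methods 10 times instead of 2, to give a more realistic scenario of multiple methods being called on your fixed struct.
The code I used:
using System;
using System.Diagnostics;

class Program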
{
    static void Main(string[] args)
    {
        var someData = new ByteArray();
        int iterations = 1000000000;
        var multiple = new MultipleFixed();
        var single = new SingleFixed();

        // Warmup.
        for (int i = 0; i < 100; i++)
        {
            multiple.Test(ref someData);
            single.Test(ref someData);
            multiple.TestMore(ref someData);
            single.TestMore(ref someData);
        }

        // Environment.
        if (Debugger.IsAttached)
            Console.WriteLine("Debugger is attached!!!!!!!!!! This run is invalid!");
        Console.WriteLine("CLR Version: " + Environment.Version);
        Console.WriteLine("Pointer size: {0} bytes", IntPtr.Size);
        Console.WriteLine("Iterations: " + iterations);

        Console.Write("Starting run for Single... ");
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            single.Test(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);

        Console.Write("Starting run for More Single... ");
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            // Calls Test (not TestMore), so this run repeats the "Single" measurement.
            single.Test(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);


        Console.Write("Starting run for Multiple... ");
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            multiple.Test(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);

        Console.Write("Starting run for More Multiple... ");
        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            multiple.TestMore(ref someData);
        }
        sw.Stop();
        Console.WriteLine("Completed in {0:N3}ms - {1:N2}/sec", sw.Elapsed.TotalMilliseconds, iterations / sw.Elapsed.TotalSeconds);


        Console.ReadLine();
    }
}

unsafe struct ByteArray
{
    public fixed byte Data[1024];
}

class MultipleFixed
{
    unsafe void SetValue(ref ByteArray bytes, int index, byte value)
    {
        fixed (byte* data = bytes.Data)
        {
            data[index] = value;
        }
    }

    unsafe bool Validate(ref ByteArray bytes, int index, byte expectedValue)
    {
        fixed (byte* data = bytes.Data)
        {
            return data[index] == expectedValue;
        }
    }

    public void Test(ref ByteArray bytes)
    {
        SetValue(ref bytes, 0, 1);
        Validate(ref bytes, 0, 1);
    }
    public void TestMore(ref ByteArray bytes)
    {
        SetValue(ref bytes, 0, 1);
        Validate(ref bytes, 0, 1);
        SetValue(ref bytes, 0, 2);
        Validate(ref bytes, 0, 2);
        SetValue(ref bytes, 0, 3);
        Validate(ref bytes, 0, 3);
        SetValue(ref bytes, 0, 4);
        Validate(ref bytes, 0, 4);
        SetValue(ref bytes, 0, 5);
        Validate(ref bytes, 0, 5);
    }
}

class SingleFixed
{
    unsafe void SetValue(byte* data, int index, byte value)
    {
        data[index] = value;
    }

    unsafe bool Validate(byte* data, int index, byte expectedValue)
    {
        return data[index] == expectedValue;
    }

    public unsafe void Test(ref ByteArray bytes)
    {
        fixed (byte* data = bytes.Data)
        {
            SetValue(data, 0, 1);
            Validate(data, 0, 1);
        }
    }
    public unsafe void TestMore(ref ByteArray bytes)
    {
        fixed (byte* data = bytes.Data)
        {
            SetValue(data, 0, 1);
            Validate(data, 0, 1);
            SetValue(data, 0, 2);
            Validate(data, 0, 2);
            SetValue(data, 0, 3);
            Validate(data, 0, 3);
            SetValue(data, 0, 4);
            Validate(data, 0, 4);
            SetValue(data, 0, 5);
            Validate(data, 0, 5);
        }
    }
}

And the results on .NET 4.0, 32-bit JIT:

CLR Version: 4.0.30319.239
Pointer size: 4 bytes
Iterations: 1000000000
Starting run for Single... Completed in 2,092.350ms - 477,931,580.94/sec
Starting run for More Single... Completed in 2,236.767ms - 447,073,934.63/sec
Starting run for Multiple... Completed in 5,775.922ms - 173,132,528.92/sec
Starting run for More Multiple... Completed in 26,637.862ms - 37,540,550.36/sec

And on .NET 4.0, 64-bit JIT:

CLR Version: 4.0.30319.239
Pointer size: 8 bytes
Iterations: 1000000000
Starting run for Single... Completed in 2,907.946ms - 343,885,316.72/sec
Starting run for More Single... Completed in 2,904.903ms - 344,245,585.63/sec
Starting run for Multiple... Completed in 5,754.893ms - 173,765,185.93/sec
Starting run for More Multiple... Completed in 18,679.593ms - 53,534,358.13/sec

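Thanks - good information! I still wonder what the root cause of the overhead is, but getting good performance is the main goal. - Eric Cosky
Yes, I'd defer to Skeet, Lippart or Gravel to explain the "why". But if you try changing the size of the struct, that might tell you whether the runtime is making a copy of the struct at each 'fixed'. My guess is that the pinning operation copies the whole struct. (See also: http://www.dotnetperls.com/fixed-buffer) - ligos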
13
These test results are inaccurate. You are not using 'fixed' in any reasonable way. The proper usage is to fix once, write many times, then unfix. - JakeSays
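The struct-size experiment suggested in the comments above could be sketched roughly as follows. This is only an assumed setup, not something that was run for this answer: SmallBuffer, LargeBuffer and SizeProbe are hypothetical names, the sizes 16 and 4096 are arbitrary, and the code needs unsafe compilation enabled. If the time per call grows with the buffer size, that would hint at a copy being made at each 'fixed'; if it stays flat, the copy guess is less likely.

```csharp
using System;
using System.Diagnostics;

unsafe struct SmallBuffer { public fixed byte Data[16]; }
unsafe struct LargeBuffer { public fixed byte Data[4096]; }

static class SizeProbe
{
    // Same per-call 'fixed' pattern as MultipleFixed.SetValue, for two struct sizes.
    public static unsafe void Set(ref SmallBuffer b, int index, byte value)
    {
        fixed (byte* data = b.Data) { data[index] = value; }
    }

    public static unsafe void Set(ref LargeBuffer b, int index, byte value)
    {
        fixed (byte* data = b.Data) { data[index] = value; }
    }

    public static unsafe byte Get(ref LargeBuffer b, int index)
    {
        fixed (byte* data = b.Data) { return data[index]; }
    }

    static void Main()
    {
        const int iterations = 100000000;
        var small = new SmallBuffer();
        var large = new LargeBuffer();

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) Set(ref small, 0, (byte)i);
        sw.Stop();
        Console.WriteLine("16-byte struct:   {0:N3} ms", sw.Elapsed.TotalMilliseconds);

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) Set(ref large, 0, (byte)i);
        sw.Stop();
        Console.WriteLine("4096-byte struct: {0:N3} ms", sw.Elapsed.TotalMilliseconds);
    }
}
```

As with the other benchmarks in this thread, the numbers only mean something in a Release build run without a debugger attached.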
