Why is it "slower" to use a reference type as a generic type argument in .NET 5?

Question

c# performance generics nested

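Today I ran into this problem: when a reference type is used as the type argument of an outer generic type, other methods in a nested type become about 10 times slower. It doesn't matter which type I use; all reference types seem to "slow down" the code. (Sorry, the title may not be very fitting.)

Tested on .NET 5 in a Release build.

Am I missing something?

Edit 2:

I'll try to explain the problem a bit further and clean up the code. If you still want to see the old version, here is a copy: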

https://gist.github.com/sneusse/1b5ee408dd3fdd74fcf9d369e144b35f

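The new code shows the same problem, hopefully with fewer distractions.

The class WthGeneric<T> is instantiated twice.
The first instance uses an arbitrary reference type as its type argument (here: object).
The second instance uses an arbitrary value type as its type argument (here: long).
Since both are instances of the same class, they both have the same method WhatIsHappeningHere.
Neither instance uses the generic type parameter in any way.

This leads to the question: why is the runtime of the same instance method 10 times higher in one case than in the other?

Output: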

System.Object: 516,8448ms
System.Int64: 50,6958ms

Code:

using System;
using System.Diagnostics;
using System.Linq;

namespace Perf
{
    public interface IWthGeneric
    {
        int WhatIsHappeningHere();
    }
    
    // This is a generic class. Note that the generic
    // type argument 'T' is _NOT_ used at all!
    public class WthGeneric<T> : IWthGeneric
    {
        // This is part of the issue.
        // If this field is not accessed or moved *outside*
        // of the generic 'WthGeneric' class, the code is fast again
        // ** also with reference types **
        public static int StaticVar = 12;

        static class NestedClass
        {
            public static int Add(int value) => StaticVar + value;
        }

        public int WhatIsHappeningHere()
        {
            var x = 0;
            for (int i = 0; i < 100000000; i++)
            {
                x += NestedClass.Add(i);
            }
            return x;
        }
    }
    
    public class RunMe
    {
        public static void Run()
        {
            // The interface is used so nothing could ever get inlined.
            var wthObject  = (IWthGeneric) new WthGeneric<object>();
            var wthValueType = (IWthGeneric) new WthGeneric<long>();

            void Test(IWthGeneric instance)
            {
                var sw = Stopwatch.StartNew();
                var x  = instance.WhatIsHappeningHere();
                Console.WriteLine(
                    $"{instance.GetType().GetGenericArguments().First()}: " +
                    $"{sw.Elapsed.TotalMilliseconds}ms");
            }

            for (int i = 0; i < 10; i++)
            {
                Test(wthObject);
                Test(wthValueType);
            }
        }
    }
}

- sneusse

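Have you tried looking at the IL code? I guess it might explain a lot to you. - GrayCat

@GrayCat The IL won't explain this. The issue is that the generic type parameter is not used. And since it is not used, it is not stored (so this can't be a cache locality issue or a garbage collection issue), and it isn't boxed either. This appears to be an issue with the jitter. - Theraot

I added the IL code, but as @Theraot mentioned, it is probably not the cause of the problem. - sneusse

Isn't the problem here that the two versions do different things? One only performs an addition, while the other calls a function, accesses a field on the class, and then performs the addition? You could look at the IL of WthGeneric<T> instead of RunMe. - GrayCat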

Ah, that's much clearer now, thanks! - GrayCat

2 Answers

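I'm going to say this is a jitter issue. Perhaps "issue" is too strong a word; it is more that the jitter does not optimize for this case.

Look at the JIT assembly for this code using SharpLab: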

using SharpLab.Runtime;

[JitGeneric(typeof(int))]
public class A<T>
{
    public static int X;

    public static class B
    {
        public static int C() => X;
    }
}

Note: the attribute JitGeneric(typeof(int)) tells SharpLab to JIT this code with the generic argument int. Without a generic argument, a generic type cannot be JIT-compiled.

; Core CLR v5.0.321.7212 on x86

A`1[[System.Int32, System.Private.CoreLib]]..ctor()
    L0000: ret

A`1+B[[System.Int32, System.Private.CoreLib]].C()
    L0000: mov ecx, 0x2051c600
    L0005: xor edx, edx
    L0007: call 0x5e646b70
    L000c: mov eax, [eax+4]
    L000f: ret

Meanwhile, for this code:

using SharpLab.Runtime;

[JitGeneric(typeof(object))]
public class A<T>
{
    public static int X;

    public static class B
    {
        public static int C() => X;
    }
}

Note: yes, this is the same class, except that now I'm telling SharpLab to JIT it with the generic argument object.

We get this:

; Core CLR v5.0.321.7212 on x86

A`1[[System.__Canon, System.Private.CoreLib]]..ctor()
    L0000: ret

A`1+B[[System.__Canon, System.Private.CoreLib]].C()
    L0000: push ebp
    L0001: mov ebp, esp
    L0003: push eax
    L0004: mov [ebp-4], ecx
    L0007: mov edx, [ecx+0x20]
    L000a: mov edx, [edx]
    L000c: mov edx, [edx+8]
    L000f: test edx, edx
    L0011: je short L0015
    L0013: jmp short L0021
    L0015: mov edx, 0x2046cec4
    L001a: call 0x5e4e4090
    L001f: mov edx, eax
    L0021: mov ecx, edx
    L0023: call 0x5e4fa760
    L0028: mov eax, [eax+4]
    L002b: mov esp, ebp
    L002d: pop ebp
    L002e: ret

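We observe that for a reference type as the generic argument we get much longer code. Is that code necessary? Well, we are accessing a public static field of a generic class. Let's see what it looks like if the other class is not nested: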

using SharpLab.Runtime;

public static class Bint
{
    public static int C() => A<int>.X;
}

public static class Bobject
{
    public static int C() => A<object>.X;
}

[JitGeneric(typeof(object))]
public class A<T>
{
    public static int X;
}

We get this code:

; Core CLR v5.0.321.7212 on x86

Bint.C()
    L0000: mov ecx, 0x209fc618
    L0005: xor edx, edx
    L0007: call 0x5e646b70
    L000c: mov eax, [eax+4]
    L000f: ret

Bobject.C()
    L0000: mov ecx, 0x209fc618
    L0005: mov edx, 1
    L000a: call 0x5e646b70
    L000f: mov eax, [eax+4]
    L0012: ret

A`1[[System.__Canon, System.Private.CoreLib]]..ctor()
    L0000: ret

So no, we don't need the long version of the code. We have to conclude that the jitter is not properly optimizing this case.

- Theraot

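I don't see the relevance of the second version: there the type is statically known, so the access can be inlined. But the first version uses the System.__Canon placeholder object type, and the type is not known in advance, so it can't be optimized. - Charlieface

@Charlieface The type is known at JIT time, right? We can say the jitter isn't optimizing for this case. To be fair, I don't know much about the internals of the jitter, which is why I didn't try to give a reason. - Theraot

Thanks for the explanation and for the link to SharpLab; I didn't know that tool. Awesome :) - sneusse

As I said, the jitter generates only one version for all reference types. - Charlieface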

- Charlieface · Accepted Answer

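I'm not 100% sure, but I think I know why the JIT does not optimize this:

As I understand it, each generic type normally has just one version of JIT-compiled code for all reference types, named System.__Canon, and the type parameter is passed in as an actual typeref parameter. For value types, by contrast, code is generated separately for each one.

This is because reference types always look the same to the JIT: a pointer to an object whose first field is a pointer to its typeref and method table. Value types all differ, so their code has to be custom-built.

You say you don't use the type parameter, but actually you do. When you access a static field of a generic type, each instantiated generic type needs its own separate copy of that static field.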
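To illustrate the point about separate static copies, here is a minimal sketch. The `Counter<T>` type and `StaticsDemo` class are hypothetical names made up for this example, not part of the question's code. It only demonstrates the language-level behaviour (one static field per constructed type); it says nothing about how the JIT implements the lookup.

```csharp
using System;

// Hypothetical type for illustration only; not from the original post.
public class Counter<T>
{
    // One copy of this field exists per constructed type:
    // Counter<object>.Value and Counter<string>.Value are different variables,
    // even though both type arguments are reference types.
    public static int Value;
}

public static class StaticsDemo
{
    public static void Main()
    {
        Counter<object>.Value = 1;
        Counter<string>.Value = 2;
        Counter<long>.Value = 3;

        Console.WriteLine(Counter<object>.Value); // prints 1
        Console.WriteLine(Counter<string>.Value); // prints 2
        Console.WriteLine(Counter<long>.Value);   // prints 3
    }
}
```

If reference-type instantiations do share one compiled method body as described above, that shared code still has to find the right copy of `Value` at run time, which is consistent with the extra work visible in the System.__Canon disassembly.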

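So the code now has to do a pointer lookup through the type parameter's typeref to get the value of the static field.

In the value type version, however, the typeref is statically known, so it is a direct memory access every time.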
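The question's own code comment notes that the slowdown disappears when the field is moved outside the generic class. A rough sketch of that variant follows, assuming it is acceptable for all instantiations to share a single field (which changes the semantics compared to the original). `SharedState`, `WthGenericMoved<T>` and `MovedFieldDemo` are hypothetical names introduced here.

```csharp
using System;

// Hypothetical non-generic holder: the field no longer depends on T,
// so there is a single copy shared by every WthGenericMoved<T>.
public static class SharedState
{
    public static int StaticVar = 12;
}

public class WthGenericMoved<T>
{
    static class NestedClass
    {
        // Reads a static field of a non-generic class instead of
        // a static field of the enclosing generic class.
        public static int Add(int value) => SharedState.StaticVar + value;
    }

    public int WhatIsHappeningHere()
    {
        var x = 0;
        for (int i = 0; i < 100000000; i++)
        {
            x += NestedClass.Add(i); // wraps on overflow, as in the original
        }
        return x;
    }
}

public static class MovedFieldDemo
{
    public static void Main()
    {
        int fromObject = new WthGenericMoved<object>().WhatIsHappeningHere();
        int fromLong = new WthGenericMoved<long>().WhatIsHappeningHere();

        // Both instantiations compute the same value.
        Console.WriteLine(fromObject == fromLong);
    }
}
```

Whether this actually closes the timing gap on a given runtime version is something to measure with the Stopwatch harness from the question rather than assume.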