为什么LINQ方法Any不检查Count？

Question

为什么LINQ方法Any不检查Count？

4

如果我们查看扩展方法Any的源代码，我们会发现它总是使用枚举器：

public static bool Any<TSource>(this IEnumerable<TSource> source)
{
    if (source == null)
        throw Error.ArgumentNull(nameof (source));
    using (IEnumerator<TSource> enumerator = source.GetEnumerator())
    {
        if (enumerator.MoveNext())
            return true;
    }
    return false;
}

我认为，如果集合是IList类型的，像SingleOrDefault方法一样检查Count属性是否更好（性能方面）:

public static TSource SingleOrDefault<TSource>(this IEnumerable<TSource> source)
{
    if (source == null)
        throw Error.ArgumentNull(nameof(source));
    IList<TSource> sourceList = source as IList<TSource>;
    if (sourceList != null)
    {
        switch (sourceList.Count)
        {
            case 0:
                return default(TSource);
            case 1:
                return sourceList[0];
        }
    }
    else
    {
        //...
    }
    throw Error.MoreThanOneElement();
}

我说，它可以看起来像这样：

private static bool Any<TSource>(IEnumerable<TSource> source)
{
    if (source == null)
        throw new ArgumentNullException(nameof(source));

    IList<TSource> sourceList = source as IList<TSource>;

    if (sourceList != null)
    {
        return sourceList.Count != 0;
    }

    using (IEnumerator<TSource> enumerator = source.GetEnumerator())
    {
        if (enumerator.MoveNext())
            return true;
    }
    return false;
}

我写了一个基准测试来测试它：

namespace AnyTests
{

    class Program
    {
        static void Main(string[] args)
        {
            BenchmarkRunner.Run<Test>();
        }
    }

    public class Test
    {
        private readonly List<int> list1 = new List<int>(new[] { 1, 2, 3, 4, 5 });

        private readonly IEnumerable<int> list2 = GetCollection();

        private static IEnumerable<int> GetCollection()
        {
            yield return 1;
        }

        [Benchmark]
        public void TestLinqAnyList()
        {
            Enumerable.Any(list1);
        }

        [Benchmark]
        public void TestNewAnyList()
        {
            NewAny(list1);
        }

        [Benchmark]
        public void TestLinqAnyEnumerable()
        {
            Enumerable.Any(list2);
        }

        [Benchmark]
        public void TestNewAnyEnumerable()
        {
            NewAny(list2);
        }


        private static bool NewAny<TSource>(IEnumerable<TSource> source)
        {
            if (source == null)
                throw new ArgumentNullException(nameof(source));

            IList<TSource> sourceList = source as IList<TSource>;

            if (sourceList != null)
            {
                return sourceList.Count != 0;
            }

            using (IEnumerator<TSource> enumerator = source.GetEnumerator())
            {
                if (enumerator.MoveNext())
                    return true;
            }
            return false;
        }
    }
}

结果显示，它的表现大约提高了两倍：

// * Summary *

BenchmarkDotNet=v0.10.13, OS=Windows 10 Redstone 3 [1709, Fall Creators Update] (10.0.16299.192)
Intel Core i7-7700 CPU 3.60GHz (Kaby Lake), 1 CPU, 8 logical cores and 4 physical cores
Frequency=3515624 Hz, Resolution=284.4445 ns, Timer=TSC
  [Host]     : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.2600.0
  DefaultJob : .NET Framework 4.7.1 (CLR 4.0.30319.42000), 32bit LegacyJIT-v4.7.2600.0


                Method |     Mean |     Error |    StdDev |
---------------------- |---------:|----------:|----------:|
       TestLinqAnyList | 26.80 ns | 0.1382 ns | 0.1154 ns |
        TestNewAnyList | 12.75 ns | 0.0480 ns | 0.0426 ns |
 TestLinqAnyEnumerable | 18.03 ns | 0.0947 ns | 0.0886 ns |
  TestNewAnyEnumerable | 23.51 ns | 0.0913 ns | 0.0762 ns |

对于 IList 来说，性能大约提高了两倍，对于 IEnumerable 来说，性能下降了约 20%。

那么问题来了：为什么在 SingleOrDefault 方法中使用优化而不在 Any 方法中使用呢？

- Backs

1

在 .net core 存储库中存在一个问题：https://github.com/dotnet/corefx/issues/23700。您可以在那里阅读第一手信息，了解其利弊（特别是 Jon Hannas 的评论）。您会发现一些令人惊讶的事情，例如对于数组 - 当前版本实际上比检查 ICollection.Count 更快（而对于列表则稍慢）。 - Evk

1个回答

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- mjwills · Accepted Answer

你提出问题的假设可能是：

Count方法快速，为什么不使用它？

为什么 Any 没有使用它的一个合理答案是，Count 并非总是快速的。他们选择的实现优点在于其相对稳定且低成本（即大约为 O(1)）。然而，在某些情况下，它可能不如 Count 快速（正如你所指出的一样）。

实现 IList 或 ICollection 接口的类并不能保证拥有一个快速的 Count 属性。例如，ConcurrentDictionary 通常比现有的 Any 实现慢，当 Count > 0 时更是如此。

此外，使用 IList 的代码应该改用 ICollection，因为你的代码不需要 IList 提供的额外功能访问。