Difference between "Possible multiple enumeration of IEnumerable" and "Parameter can be declared with base type"

Question

15

In ReSharper 5, the following code triggers the warning "Parameter can be declared with base type" for list:

public void DoSomething(List<string> list)
{
    if (list.Any())
    {
        // ...
    }
    foreach (var item in list)
    {
        // ...
    }
}

In ReSharper 6 this is no longer the case. However, if I change the method to the following, I still get that warning:

public void DoSomething(List<string> list)
{
    foreach (var item in list)
    {
        // ...
    }
}

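The reason is that in this version the list is enumerated only once, so changing it to IEnumerable<string> does not automatically introduce another warning. Now, if I manually change the first version to use IEnumerable<string> instead of List<string>, I get a warning ("Possible multiple enumeration of IEnumerable") on both occurrences of list in the method body: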

public void DoSomething(IEnumerable<string> list)
{
    if (list.Any()) // <- here
    {
        // ...
    }
    foreach (var item in list) // <- and here
    {
        // ...
    }
}

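I understand why the warning appears, but I wonder how to resolve it, assuming the method really only needs an IEnumerable<T> rather than a List<T>, because I only want to enumerate the items and not modify the list.
Adding list = list.ToList(); at the beginning of the method makes the warning go away: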

public void DoSomething(IEnumerable<string> list)
{
    list = list.ToList();
    if (list.Any())
    {
        // ...
    }
    foreach (var item in list)
    {
        // ...
    }
}

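I understand why that makes the warning go away, but it looks a bit like a hack to me... Any suggestions on how to resolve the warning in a better way while still using the most general type possible in the method signature? A good solution has to satisfy all of the following:

Don't call ToList() inside the method, because it has a performance impact.
Don't use ICollection<T> or an even more specialized interface/class, because they change the semantics of the method as seen by the caller.
Don't iterate the IEnumerable<T> more than once, to avoid hitting a database multiple times or similar.

Note: I am aware that this is not a ReSharper issue, so I don't want to suppress the warning. I want to fix the underlying cause, because the warning is legitimate.

Update: Please don't focus on Any and foreach. I don't need help merging those statements so that the enumerable is enumerated only once. In fact, anything in this method that enumerates the enumerable more than once could cause the problem!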

- Daniel Hilgarth

13 Answers

5

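You should take an IEnumerable<T> and ignore the "multiple enumeration" warning.

The message is warning you that if you pass a lazy enumerable (such as an iterator or an expensive LINQ query) to your method, parts of that enumerable will be executed twice.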
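
To make the warning concrete, here is a minimal sketch of a lazy source being run twice. The names LazyDemo, Numbers and GeneratorRuns are made up for this illustration, and the counter merely stands in for an expensive query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyDemo
{
    // Counts how often the iterator body is started
    // (hypothetical stand-in for an expensive query).
    public static int GeneratorRuns;

    // A lazy sequence: nothing runs until someone enumerates it.
    public static IEnumerable<string> Numbers()
    {
        GeneratorRuns++;
        yield return "one";
        yield return "two";
    }

    public static void DoSomething(IEnumerable<string> list)
    {
        if (list.Any())              // first enumeration: starts the iterator
        {
            // ...
        }
        foreach (var item in list)   // second enumeration: starts it again
        {
            // ...
        }
    }
}
```

Starting from GeneratorRuns == 0, LazyDemo.DoSomething(LazyDemo.Numbers()) leaves the counter at 2, while passing LazyDemo.Numbers().ToList() leaves it at 1, because only the in-memory list is iterated twice.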

- SLaks

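Does this mean that following ReSharper 5's suggestion could leave you in a potentially unsafe state (if you are passing a "lazy enumerable" or the like)? - heisenberg

@kekekela: Not unsafe, just suboptimal. - Daniel Hilgarth

1

@SLaks: Since I only want to enumerate the items and not change the collection, changing the parameter to ICollection<T> is not the right thing to do, because it gives the caller the impression that my method might change the collection. - Daniel Hilgarth

@SLaks: I know that List<T> is even worse, which is why I don't want to use it. I also know that calling ToList() is slow, especially if there is a chain of methods that each take an IEnumerable<T>. Basically, what I want is a solution that doesn't enumerate the IEnumerable<T> multiple times, doesn't use the "wrong" type in the method signature, and doesn't have the performance impact of ToList(). - Daniel Hilgarth

All you can do is document your method carefully and hope the caller understands. - SLaks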

5

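There is no perfect solution; pick one depending on the situation.

Use enumerable.ToList, which you can optimize by first trying "enumerable as List", as long as you don't modify the list
Iterate the IEnumerable twice, but make that clear to the caller (document it)
Split it into two methods
Take a List to avoid the cost of "as" / ToList and the potential cost of double enumeration

The first solution (ToList) is probably the most "correct" one for a public method that has to work with any enumerable.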
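
The first option might look like the sketch below. The helper name AsListOrCopy is hypothetical; it only illustrates the "try as List before ToList" idea and assumes the method never modifies the list it gets back, since that list may be the caller's own instance.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class EnumerableSketch
{
    // Hypothetical helper: reuse the caller's List<T> when there is one,
    // otherwise copy the sequence into a new list exactly once.
    public static List<T> AsListOrCopy<T>(this IEnumerable<T> source)
    {
        return source as List<T> ?? source.ToList();
    }

    public static int DoSomething(IEnumerable<string> list)
    {
        var items = list.AsListOrCopy();   // the source is enumerated at most once, here
        var processed = 0;
        if (items.Count > 0)
        {
            // ...
        }
        foreach (var item in items)        // iterates the in-memory list, not the source
        {
            processed++;
        }
        return processed;
    }
}
```

The cast only saves the copy when the caller really passes a List<T>; arrays and lazy queries still pay for one ToList().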

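You can ignore ReSharper's issue. The warning is legitimate in the general case but may be wrong in your specific situation, especially if the method is intended for internal use and you have full control over the callers.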

- Guillaume

4

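There is a general solution that addresses both ReSharper warnings: the lack of a repeatability guarantee for IEnumerable, and the List base class (or the potentially expensive ToList() workaround).

Create a specialized class, e.g. "RepeatableEnumerable", that implements IEnumerable, with "GetEnumerator()" implemented along the following logic outline:

Yield all items already collected so far from the inner list.
If the wrapped enumerator has more items,
- while the wrapped enumerator can move to the next item,
  1. Get the current item from the inner enumerator.
  2. Add the current item to the inner list.
  3. Yield the current item.
Mark the inner enumerator as having no more items.

Add extension methods and appropriate optimizations for the case where the wrapped parameter is already repeatable. ReSharper then no longer flags the indicated warnings in the following code: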

public void DoSomething(IEnumerable<string> list)
{
    var repeatable = list.ToRepeatableEnumerable();
    if (repeatable.Any()) // <- no warning here anymore.
      // Further, this will read at most one item from list.  A
      // query (SQL LINQ) with a 10,000 items, returning one item per second
      // will pass this block in 1 second, unlike the ToList() solution / hack.
    {
        // ...
    }

    foreach (var item in repeatable) // <- and no warning here anymore, either.
      // Further, this will read in lazy fashion.  In the 10,000 item, one 
      // per second, query scenario, this loop will process the first item immediately
      // (because it was read already for Any() above), and then proceed to
      // process one item every second.
    {
        // ...
    }
}

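With a little work, you can also turn RepeatableEnumerable into a LazyList, a full implementation of IList, but that is beyond the scope of this particular problem. :)

Update: A code implementation was requested in the comments. I'm not sure why the original PDL wasn't sufficient, but in any case, the following faithfully implements the algorithm I suggested (my own implementation implements the full IList interface; that's a bit beyond the scope of what I wanted to post here... :))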

public class RepeatableEnumerable<T> : IEnumerable<T>
{
    readonly List<T> innerList;
    IEnumerator<T> innerEnumerator;

    public RepeatableEnumerable( IEnumerator<T> innerEnumerator )
    {
        this.innerList = new List<T>();
        this.innerEnumerator = innerEnumerator;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // 1. Yield all items already collected so far from the inner list.
        foreach( var item in innerList ) yield return item;

        // 2. If the wrapped enumerator has more items
        if( innerEnumerator != null )
        {
            // 2A. while the wrapped enumerator can move to the next item
            while( innerEnumerator.MoveNext() )
            {
                // 1. Get the current item from the inner enumerator.
                var item = innerEnumerator.Current;
                // 2. Add the current item to the inner list.
                innerList.Add( item );
                // 3. Yield the current item
                yield return item;
            }

            // 3. Mark the inner enumerator as having no more items.
            innerEnumerator.Dispose();
            innerEnumerator = null;
        }
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        return GetEnumerator();
    }
}

// Add extension methods and appropriate optimizations where the wrapped parameter is already repeatable.
public static class RepeatableEnumerableExtensions
{
    public static RepeatableEnumerable<T> ToRepeatableEnumerable<T>( this IEnumerable<T> items )
    {
        var result = ( items as RepeatableEnumerable<T> )
            ?? new RepeatableEnumerable<T>( items.GetEnumerator() );
        return result;
    }
}

- T.Tobler

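Good answer, nice idea. But there is one big drawback: a repeatable enumerator over an OrderBy won't necessarily return the items in the order defined by the OrderBy. Consider the following example:

var items = new List<int> { 5, 7, 3 }; var enumerable = items.OrderBy(x => x); var repeatable = enumerable.AsRepeatable(); repeatable.Any(); items.Add(1); foreach(var item in repeatable) Console.WriteLine(item);

The result would be 3, 1, 5, 7 instead of 1, 3, 5, 7. - Daniel Hilgarth

I added a unit test to my implementation. It works as expected; I replaced your last line here with Assert.IsTrue( new[]{ 3, 5, 7 }.SequenceEqual( repeatable ) );. Did you actually see unexpected behavior in this scenario? Also, note that changing a collection while it is being enumerated is undefined; for example, many collections will throw a "collection was modified" exception. - T.Tobler

No, I didn't see that behavior, but from your description it sounded like it would behave that way. Could you share your implementation? - Daniel Hilgarth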

3

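I know this question is old and already marked as answered, but I was surprised that nobody suggested iterating over the enumerator manually: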

// NOTE: list is of type IEnumerable<T>.
//       The name was taken from the OP's code.
using (var enumerator = list.GetEnumerator()) // dispose the enumerator when done
{
    if (enumerator.MoveNext())
    {
        // Run your list.Any() logic here
        // ...

        do
        {
            var item = enumerator.Current;
            // Run your foreach (var item in list) logic here
            // ...
        } while (enumerator.MoveNext());
    }
}

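This seems simpler and more straightforward than the other answers.

- Jeff G

Converting to an enumerator works, but it breaks much of the functional composition that IEnumerable offers. You cannot apply filters or other LINQ statements in more deeply nested code. Also, your callers have to pass you a list, which means a) giving you access to modify the list and b) tying you to a specific implementation (i.e. List or even IList may be too limiting for future refactoring). - srm

@srm I don't follow the second part of your comment, since this logic does not require the caller to pass me a List<T>. That is the whole point (the variable "list" in the code above is of type IEnumerable<T>, just as in the OP's code). Your first point is correct, though. Once you start walking the list manually, you can't call any LINQ expressions on it without iterating multiple times. The only way to enable that kind of code is to cache the enumerable's results as in T.Tobler's answer, or to use ToList(). However, that is beyond the scope of the question. - Jeff G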

2

Why not:

bool any = false;

foreach (var item in list)
{
    any = true;
    // ...
}
if(any)
{
    //...
}

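Update: Personally, I wouldn't drastically change the code just to get around a warning like this. I would simply disable the warning and move on. The warning suggests changing the general flow of the code to make it better; if you change the code only to get rid of the warning without making it better (possibly even making it worse), then the point of the warning is lost.

For example: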

// ReSharper disable PossibleMultipleEnumeration
public void DoSomething(IEnumerable<string> list)
{
    if (list.Any()) // <- here
    {
        // ...
    }
    foreach (var item in list) // <- and here
    {
        // ...
    }
}
// ReSharper restore PossibleMultipleEnumeration

- Peter Ritchie

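I am not looking for a solution to my specific sample. Please see the update to my question (the bold part at the end)... - Daniel Hilgarth

@Peter Ritchie Where can you find the names used to disable ReSharper warnings? Is it in the ReSharper options, or is there a list to refer to? - RJ Cuthbertson

1

In the Alt+Enter menu there is an option called "Disable once with comment". - Peter Ritchie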

2

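UIMS* - Fundamentally, there is no great solution. IEnumerable<T> used to be the "very basic thing that represents a bunch of things of the same type, so using it in method signatures is correct". It has now also become a "thing that might be evaluated behind the scenes and might take a while, so now you always have to worry about that".

It's as if IDictionary were suddenly extended to support lazy loading of values, via a LazyLoader property of type Func<TKey,TValue>. That would actually be useful, but not a good thing to add to IDictionary, because now every time we receive an IDictionary we have to worry about that property. But that's where we are.

So it would seem that "if a method takes an IEnumerable and evaluates it twice, always force evaluation via ToList()" is the best you can do. And nice work by JetBrains for giving us this warning.

*(Unless I'm missing something... just made it up, but it seems useful)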

- Bean Taxi

2

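In general, what you need is a state object into which you can push the items (from within a foreach loop) and from which you then get your final result.

The downside of the enumerable LINQ operators is that they actively enumerate the source instead of accepting items that are pushed to them, so they don't meet your requirements.

If, for example, you just need the minimum and maximum of a sequence of 1,000,000 integers that cost $1,000 worth of processor time to retrieve, you end up writing something like this: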

public class MinMaxAggregator
{
    private bool _any;
    private int _min;
    private int _max;

    public void OnNext(int value)
    {
        if (!_any)
        {
            _min = _max = value;
            _any = true;
        }
        else
        {
            if (value < _min) _min = value;
            if (value > _max) _max = value;
        }
    }

    public MinMax GetResult()
    {
        if (!_any) throw new InvalidOperationException("Sequence contains no elements.");
        return new MinMax(_min, _max);
    }
}

public static MinMax DoSomething(IEnumerable<int> source)
{
    var aggr = new MinMaxAggregator();
    foreach (var item in source) aggr.OnNext(item);
    return aggr.GetResult();
}
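
The MinMax type returned by GetResult() is not shown in the answer. A minimal sketch, assuming it is nothing more than an immutable pair of ints with the (min, max) constructor that the code above calls:

```csharp
// Assumed shape of MinMax; the answer does not show its definition.
public sealed class MinMax
{
    public readonly int Min;
    public readonly int Max;

    public MinMax(int min, int max)
    {
        Min = min;
        Max = max;
    }

    public override string ToString()
    {
        return "[" + Min + ", " + Max + "]";
    }
}
```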

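In fact, you have just re-implemented the logic of the Min() and Max() operators. Of course that's easy, but they are only examples of arbitrarily complex logic that you could otherwise easily express in a LINQish way.

The solution came to me on yesterday's night walk: we need to PUSH... that's REACTIVE! All the beloved operators also exist in a reactive version built for the push paradigm. They can be chained together at will, just like their enumerable counterparts, to whatever complexity you need.

So the min/max example boils down to: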

public static MinMax DoSomething(IEnumerable<int> source)
{
    // bridge over to the observable world
    var connectable = source.ToObservable(Scheduler.Immediate).Publish();
    // express the desired result there (note: connectable is observed by multiple observers)
    var combined = connectable.Min().CombineLatest(connectable.Max(), (min, max) => new MinMax(min, max));
    // subscribe
    var resultAsync = combined.GetAwaiter();
    // unload the enumerable into connectable
    connectable.Connect();
    // pick up the result
    return resultAsync.GetResult();
}

- tinudu

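OK, the OP (which I now notice is you) asked for a solution that 1./2. doesn't require the enumerable to be materialized and 3. doesn't enumerate it multiple times. - tinudu

Indeed, now I get it. That answer is awesome! :-) - Daniel Hilgarth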

1

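Be careful with methods that accept enumerables. The "warning" about the base type is only a hint; the enumeration warning is a real warning.

However, since you use Any and foreach, your list will be enumerated at least twice. If you add ToList(), your enumerable will be enumerated three times, so remove the ToList().

I would suggest setting ReSharper's base-type warning to a hint. That way you still get a hint (green underline) and the option to quick-fix it (Alt+Enter), with no "warnings" in your file.

You should take care if enumerating the IEnumerable is an expensive operation, such as loading something from a file or database, or if you have a method that calculates values and uses yield return. In that case, do a ToList() or ToArray() first so that all the data is loaded/calculated only once.

- Zebi

One thing nobody has said so far: Any() already iterates in an attempt to find an element. If you call ToList(), that iterates as well, to create the list. The original idea of IEnumerable is only to iterate; any other operation triggers an iteration in order to be performed. - marcelo-ferraz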

0

You could iterate only once:

public void DoSomething(IEnumerable<string> list)
{
    bool isFirstItem = true;
    foreach (var item in list)
    {
        if (isFirstItem)
        {
            isFirstItem = false;
            // ...
        }
        // ...
    }
}

- Guillaume

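In my specific case you are right, but this solution doesn't work in all cases. I want a general approach. - Daniel Hilgarth

I am not looking for a solution to my specific sample. Please see the update to my question (the bold part at the end)... - Daniel Hilgarth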

- srm · Accepted Answer

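This class provides a way to split the first item off an enumeration and then get an IEnumerable for the rest of it, without double enumeration and therefore without the potential performance hit. It is used like this (where T is whatever type you are enumerating):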

var split = new SplitFirstEnumerable<T>(currentIEnumerable);
T firstItem = split.First;
IEnumerable<T> remaining = split.Remaining;

Here is the class itself:

/// <summary>
/// Use this class when you want to pull the first item off of an IEnumerable
/// and then enumerate over the remaining elements and you want to avoid the
/// warning about "possible double iteration of IEnumerable" AND without constructing
/// a list or other duplicate data structure of the enumerable. You construct 
/// this class from your existing IEnumerable and then use its First and 
/// Remaining properties for your algorithm.
/// </summary>
/// <typeparam name="T">The type of item you are iterating over; there are no
/// "where" restrictions on this type.</typeparam>
public class SplitFirstEnumerable<T>
{
    private readonly IEnumerator<T> _enumerator;

    /// <summary>
    /// Constructor
    /// </summary>
    /// <remarks>Will throw an exception if there are zero items in enumerable or 
    /// if the enumerable is already advanced past the last element.</remarks>
    /// <param name="enumerable">The enumerable that you want to split</param>
    public SplitFirstEnumerable(IEnumerable<T> enumerable)
    {
        _enumerator = enumerable.GetEnumerator();
        if (_enumerator.MoveNext())
        {
            First = _enumerator.Current;
        }
        else
        {
            throw new ArgumentException("Parameter 'enumerable' must have at least 1 element to be split.");
        }
    }

    /// <summary>
    /// The first item of the original enumeration, equivalent to calling
    /// enumerable.First().
    /// </summary>
    public T First { get; private set; }

    /// <summary>
    /// The items of the original enumeration minus the first, equivalent to calling
    /// enumerable.Skip(1).
    /// </summary>
    public IEnumerable<T> Remaining
    {
        get
        {
            while (_enumerator.MoveNext())
            {
                yield return _enumerator.Current;
            }
        }
    }
}

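This assumes the IEnumerable has at least one element to start with. If you want a more FirstOrDefault-style setup, you need to catch the exception that the constructor throws for an empty sequence.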
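
If catching the constructor's exception feels heavy, a try-style variant is one alternative. The sketch below is not part of the class above; SplitFirstSketch, TrySplitFirst and Rest are hypothetical names, and the returned remainder is assumed to be enumerated at most once, just like Remaining.

```csharp
using System;
using System.Collections.Generic;

public static class SplitFirstSketch
{
    // Hypothetical helper: returns false for an empty sequence instead of throwing.
    public static bool TrySplitFirst<T>(
        IEnumerable<T> source, out T first, out IEnumerable<T> remaining)
    {
        var enumerator = source.GetEnumerator();
        if (!enumerator.MoveNext())
        {
            enumerator.Dispose();
            first = default(T);
            remaining = new T[0];
            return false;
        }

        first = enumerator.Current;
        remaining = Rest(enumerator);
        return true;
    }

    // Continues the same enumerator after the first item, so the source is
    // never restarted. The enumerator is disposed once this sequence has
    // been enumerated.
    private static IEnumerable<T> Rest<T>(IEnumerator<T> enumerator)
    {
        using (enumerator)
        {
            while (enumerator.MoveNext())
            {
                yield return enumerator.Current;
            }
        }
    }
}
```

A caller would write something like `if (SplitFirstSketch.TrySplitFirst(items, out first, out rest)) { ... }` and handle the empty case in the else branch.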