Split a List into sublists with LINQ

Question

c# linq data-structures

467

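Is there any way to split a List<SomeObject> into several separate lists of SomeObject, using the item index as the delimiter of each split?

Let me give an example:

I have a List<SomeObject> and I need a List<List<SomeObject>> or List<SomeObject>[], so that each of the resulting lists contains a group of 3 consecutive items from the original list.

For example:

Original list: [a, g, e, w, p, s, q, f, x, y, i, m, c]
Resulting lists: [a, g, e], [w, p, s], [q, f, x], [y, i, m], [c]

I also need the size of the resulting lists to be a parameter of this function.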

- felipecsl

34 Answers

395

I just wrote this, and I think it is a little more elegant than the other proposed solutions:

/// <summary>
/// Break a list of items into chunks of a specific size
/// </summary>
public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunksize)
{
    while (source.Any())
    {
        yield return source.Take(chunksize);
        source = source.Skip(chunksize);
    }
}

- CaseyB

16

Love this solution. I'd recommend adding this sanity check to prevent an infinite loop: if (chunksize <= 0) throw new ArgumentException("Chunk size must be greater than zero.", "chunksize"); - mroach

14

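I like this, but it is not super efficient. - Sam Saffron

66

I like this, but its time complexity is O(n²). You can iterate through the list once and get O(n). - hIpPy

10

@hIpPy, why is it n^2? It looks linear to me. - V Maharajh

24

Each iteration replaces source with a wrapped IEnumerable, so pulling an element from source has to pass through layers of Skip. - Lasse Espeholt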
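
The cost discussed in these comments can be made visible with a source that counts how many items it hands out. The following is only a sketch: CountingRange, ChunkBySkipTake and PullsFor are hypothetical names introduced for illustration, and the source is assumed to be a lazy sequence rather than a List<T>. Each chunk restarts the enumeration from the beginning, so the number of pulls grows much faster than the number of items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SkipTakeCost
{
    // Number of items pulled from the underlying source so far.
    public static int Pulled;

    // Hypothetical helper: a lazy source that counts every item it hands out.
    public static IEnumerable<int> CountingRange(int n)
    {
        for (int i = 0; i < n; i++)
        {
            Pulled++;
            yield return i;
        }
    }

    // Same logic as the Skip/Take chunker above, renamed to avoid clashing
    // with the built-in Enumerable.Chunk on .NET 6 and later.
    public static IEnumerable<IEnumerable<T>> ChunkBySkipTake<T>(
        this IEnumerable<T> source, int chunksize)
    {
        while (source.Any())
        {
            yield return source.Take(chunksize);
            source = source.Skip(chunksize);
        }
    }

    // Drains every chunk of an n-item source and reports the total pulls.
    public static int PullsFor(int n, int chunksize)
    {
        Pulled = 0;
        foreach (var chunk in CountingRange(n).ChunkBySkipTake(chunksize))
        {
            foreach (var item in chunk) { /* just consume the chunk */ }
        }
        return Pulled;
    }
}
```

Comparing PullsFor(1000, 10) with PullsFor(2000, 10) should show the count growing by well over a factor of two when the input doubles, which is the quadratic behaviour hIpPy describes.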

123

The commonly suggested CaseyB approach works fine; in fact, if you are passing in a List<T> it is hard to fault it. Perhaps I would change it to:

public static IEnumerable<IEnumerable<T>> ChunkTrivialBetter<T>(this IEnumerable<T> source, int chunksize)
{
   var pos = 0; 
   while (source.Skip(pos).Any())
   {
      yield return source.Skip(pos).Take(chunksize);
      pos += chunksize;
   }
}

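This avoids massive call chains. Nonetheless, the approach has a general flaw: it materializes two enumerations per chunk. To highlight the issue, try running: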

foreach (var item in Enumerable.Range(1, int.MaxValue).Chunk(8).Skip(100000).First())
{
   Console.WriteLine(item);
}
// wait forever

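To overcome this we can try Cameron's approach, which passes the above test because it walks the enumeration only once.

The trouble is that it has a different flaw: it materializes every item in each chunk, so it can consume a lot of memory.

To illustrate that, try running: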

foreach (var item in Enumerable.Range(1, int.MaxValue)
               .Select(x => x + new string('x', 100000))
               .Clump(10000).Skip(100).First())
{
   Console.Write('.');
}
// OutOfMemoryException

Finally, any implementation should be able to handle out-of-order iteration of chunks, for example:

Enumerable.Range(1,3).Chunk(2).Reverse().ToArray()
// should return [3],[1,2]

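Many highly optimized solutions, such as my first revision of this answer, failed there. The same issue can be seen in casperOne's optimized answer.

To address all of these issues, you can use the following: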

namespace ChunkedEnumerator
{
    public static class Extensions 
    {
        class ChunkedEnumerable<T> : IEnumerable<T>
        {
            class ChildEnumerator : IEnumerator<T>
            {
                ChunkedEnumerable<T> parent;
                int position;
                bool done = false;
                T current;


                public ChildEnumerator(ChunkedEnumerable<T> parent)
                {
                    this.parent = parent;
                    position = -1;
                    parent.wrapper.AddRef();
                }

                public T Current
                {
                    get
                    {
                        if (position == -1 || done)
                        {
                            throw new InvalidOperationException();
                        }
                        return current;

                    }
                }

                public void Dispose()
                {
                    if (!done)
                    {
                        done = true;
                        parent.wrapper.RemoveRef();
                    }
                }

                object System.Collections.IEnumerator.Current
                {
                    get { return Current; }
                }

                public bool MoveNext()
                {
                    position++;

                    if (position + 1 > parent.chunkSize)
                    {
                        done = true;
                    }

                    if (!done)
                    {
                        done = !parent.wrapper.Get(position + parent.start, out current);
                    }

                    return !done;

                }

                public void Reset()
                {
                    // per http://msdn.microsoft.com/en-us/library/system.collections.ienumerator.reset.aspx
                    throw new NotSupportedException();
                }
            }

            EnumeratorWrapper<T> wrapper;
            int chunkSize;
            int start;

            public ChunkedEnumerable(EnumeratorWrapper<T> wrapper, int chunkSize, int start)
            {
                this.wrapper = wrapper;
                this.chunkSize = chunkSize;
                this.start = start;
            }

            public IEnumerator<T> GetEnumerator()
            {
                return new ChildEnumerator(this);
            }

            System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
            {
                return GetEnumerator();
            }

        }

        class EnumeratorWrapper<T>
        {
            public EnumeratorWrapper (IEnumerable<T> source)
            {
                SourceEumerable = source;
            }
            IEnumerable<T> SourceEumerable {get; set;}

            Enumeration currentEnumeration;

            class Enumeration
            {
                public IEnumerator<T> Source { get; set; }
                public int Position { get; set; }
                public bool AtEnd { get; set; }
            }

            public bool Get(int pos, out T item) 
            {

                if (currentEnumeration != null && currentEnumeration.Position > pos)
                {
                    currentEnumeration.Source.Dispose();
                    currentEnumeration = null;
                }

                if (currentEnumeration == null)
                {
                    currentEnumeration = new Enumeration { Position = -1, Source = SourceEumerable.GetEnumerator(), AtEnd = false };
                }

                item = default(T);
                if (currentEnumeration.AtEnd)
                {
                    return false;
                }

                while(currentEnumeration.Position < pos) 
                {
                    currentEnumeration.AtEnd = !currentEnumeration.Source.MoveNext();
                    currentEnumeration.Position++;

                    if (currentEnumeration.AtEnd) 
                    {
                        return false;
                    }

                }

                item = currentEnumeration.Source.Current;

                return true;
            }

            int refs = 0;

            // needed for dispose semantics 
            public void AddRef()
            {
                refs++;
            }

            public void RemoveRef()
            {
                refs--;
                if (refs == 0 && currentEnumeration != null)
                {
                    var copy = currentEnumeration;
                    currentEnumeration = null;
                    copy.Source.Dispose();
                }
            }
        }

        public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunksize)
        {
            if (chunksize < 1) throw new InvalidOperationException();

            var wrapper =  new EnumeratorWrapper<T>(source);

            int currentPos = 0;
            T ignore;
            try
            {
                wrapper.AddRef();
                while (wrapper.Get(currentPos, out ignore))
                {
                    yield return new ChunkedEnumerable<T>(wrapper, chunksize, currentPos);
                    currentPos += chunksize;
                }
            }
            finally
            {
                wrapper.RemoveRef();
            }
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            int i = 10;
            foreach (var group in Enumerable.Range(1, int.MaxValue).Skip(10000000).Chunk(3))
            {
                foreach (var n in group)
                {
                    Console.Write(n);
                    Console.Write(" ");
                }
                Console.WriteLine();
                if (i-- == 0) break;
            }


            var stuffs = Enumerable.Range(1, 10).Chunk(2).ToArray();

            foreach (var idx in new [] {3,2,1})
            {
                Console.Write("idx " + idx + " ");
                foreach (var n in stuffs[idx])
                {
                    Console.Write(n);
                    Console.Write(" ");
                }
                Console.WriteLine();
            }

            /*

10000001 10000002 10000003
10000004 10000005 10000006
10000007 10000008 10000009
10000010 10000011 10000012
10000013 10000014 10000015
10000016 10000017 10000018
10000019 10000020 10000021
10000022 10000023 10000024
10000025 10000026 10000027
10000028 10000029 10000030
10000031 10000032 10000033
idx 3 7 8
idx 2 5 6
idx 1 3 4
             */

            Console.ReadKey();


        }

    }
}

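There is a further round of optimizations you could introduce for out-of-order iteration of chunks, but that is out of scope here.

As to which method you should choose? It depends entirely on the problem you are trying to solve. If you are not concerned with the first flaw, the simple answer is incredibly appealing.

Note that, as with most methods, this is not safe for multithreading. If you wish to make it thread safe, you would need to amend EnumeratorWrapper.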

- Sam Saffron

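Would the bug be in Enumerable.Range(0, 100).Chunk(3).Reverse().ToArray(), or would Enumerable.Range(0, 100).ToArray().Chunk(3).Reverse().ToArray() throw an exception? - Cameron MacFarland

What about chunking an IQueryable<>? My guess is that a Take/Skip approach would be optimal if we want to delegate as much of the operation as possible to the provider. - Guillaume86

@Guillaume86 I agree, if you have an IList or IQueryable you can take all sorts of shortcuts that would make this much faster (Linq does this internally for all sorts of other methods). - Sam Saffron

1

This is by far the best answer for efficiency. I had an issue using SqlBulkCopy with an IEnumerable that runs additional processing on each column, so it has to run through efficiently in a single pass. This lets me break the IEnumerable up into manageably sized chunks. (For those wondering, I did enable SqlBulkCopy's streaming mode, but it seems to be broken.) - Brain2000

The simple LINQ approach took just under 6 seconds; with this it is now very fast! - Jerther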

73

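You could use a number of queries built on Take and Skip, but I believe that would add too many iterations over the original list.

Rather, I think you should create an iterator of your own, like so: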

public static IEnumerable<IEnumerable<T>> GetEnumerableOfEnumerables<T>(
  IEnumerable<T> enumerable, int groupSize)
{
   // The list to return.
   List<T> list = new List<T>(groupSize);

   // Cycle through all of the items.
   foreach (T item in enumerable)
   {
     // Add the item.
     list.Add(item);

     // If the list has the number of elements, return that.
     if (list.Count == groupSize)
     {
       // Return the list.
       yield return list;

       // Set the list to a new list.
       list = new List<T>(groupSize);
     }
   }

   // Return the remainder if there is any,
   if (list.Count != 0)
   {
     // Return the list.
     yield return list;
   }
}

You can then call this, and because it is LINQ enabled you can perform other operations on the resulting sequences.
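
As a minimal sketch of what "LINQ enabled" means in practice, the iterator can be fed straight into further operators. GroupingDemo and GroupLetters are hypothetical names used only for this illustration; the iterator body is the one shown above with its comments condensed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    // Buffering iterator from the answer above, comments condensed.
    public static IEnumerable<IEnumerable<T>> GetEnumerableOfEnumerables<T>(
        IEnumerable<T> enumerable, int groupSize)
    {
        List<T> list = new List<T>(groupSize);
        foreach (T item in enumerable)
        {
            list.Add(item);
            if (list.Count == groupSize)
            {
                // A full group: hand it out and start a fresh buffer.
                yield return list;
                list = new List<T>(groupSize);
            }
        }
        // Hand out the shorter remainder, if there is one.
        if (list.Count != 0) yield return list;
    }

    // Hypothetical usage: chain Select/ToList onto the grouped sequence.
    public static List<string> GroupLetters(string letters, int groupSize) =>
        GetEnumerableOfEnumerables(letters, groupSize)
            .Select(g => new string(g.ToArray()))
            .ToList();
}
```

With the letters from the question, GroupingDemo.GroupLetters("agewpsqfxyimc", 3) yields "age", "wps", "qfx", "yim", "c".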

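In light of Sam's answer, I felt there was an easier way to do this without:

Iterating through the list again (which I didn't do originally)
Materializing the items in groups before releasing the chunk (for large chunks of items, there would be memory issues)
All of the code that Sam posted

That said, here is another pass, which I have codified as an extension method on IEnumerable<T> called Chunk: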

public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, 
    int chunkSize)
{
    // Validate parameters.
    if (source == null) throw new ArgumentNullException(nameof(source));
    if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize),
        "The chunkSize parameter must be a positive value.");

    // Call the internal implementation.
    return source.ChunkInternal(chunkSize);
}

Nothing surprising up there, just basic error checking.

Moving on to ChunkInternal:

private static IEnumerable<IEnumerable<T>> ChunkInternal<T>(
    this IEnumerable<T> source, int chunkSize)
{
    // Validate parameters.
    Debug.Assert(source != null);
    Debug.Assert(chunkSize > 0);

    // Get the enumerator.  Dispose of when done.
    using (IEnumerator<T> enumerator = source.GetEnumerator())
    do
    {
        // Move to the next element.  If there's nothing left
        // then get out.
        if (!enumerator.MoveNext()) yield break;

        // Return the chunked sequence.
        yield return ChunkSequence(enumerator, chunkSize);
    } while (true);
}

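Basically, it gets the IEnumerator<T> and manually iterates through each item. It checks whether there are any items left to enumerate. After each chunk has been enumerated, if there are no items left, it breaks out.

Once it detects that there are items in the sequence, it delegates the responsibility for the inner IEnumerable<T> implementation to ChunkSequence: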

private static IEnumerable<T> ChunkSequence<T>(IEnumerator<T> enumerator, 
    int chunkSize)
{
    // Validate parameters.
    Debug.Assert(enumerator != null);
    Debug.Assert(chunkSize > 0);

    // The count.
    int count = 0;

    // There is at least one item.  Yield and then continue.
    do
    {
        // Yield the item.
        yield return enumerator.Current;
    } while (++count < chunkSize && enumerator.MoveNext());
}

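Since MoveNext has already been called on the IEnumerator<T> passed to ChunkSequence, it yields the item returned by Current and then increments the count, making sure never to return more than chunkSize items and moving to the next item in the sequence after every iteration (the MoveNext call is short-circuited once the number of items yielded reaches the chunk size).

If there are no items left, the ChunkInternal method makes another pass through the outer loop, but when MoveNext is called a second time it still returns false, as per the documentation (emphasis mine):

If MoveNext passes the end of the collection, the enumerator is positioned after the last element in the collection and MoveNext returns false. When the enumerator is at this position, subsequent calls to MoveNext also return false until Reset is called.

At this point, the loop breaks and the sequence of sequences terminates.

This is a simple test: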

static void Main()
{
    string s = "agewpsqfxyimc";

    int count = 0;

    // Group by three.
    foreach (IEnumerable<char> g in s.Chunk(3))
    {
        // Print out the group.
        Console.Write("Group: {0} - ", ++count);

        // Print the items.
        foreach (char c in g)
        {
            // Print the item.
            Console.Write(c + ", ");
        }

        // Finish the line.
        Console.WriteLine();
    }
}

Output:

Group: 1 - a, g, e,
Group: 2 - w, p, s,
Group: 3 - q, f, x,
Group: 4 - y, i, m,
Group: 5 - c,

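An important note: this will not work if you do not drain each child sequence completely, or if you break at any point in the parent sequence. This is an important caveat, but if your use case is to consume every element of the sequence of sequences, then this will work for you.

Additionally, it will do strange things if you play with the order, just as Sam's did at one point.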
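
The caveat about draining can be seen with a small experiment. This is a sketch only: StreamingChunk merges the Chunk/ChunkInternal/ChunkSequence trio above into one class under a different name (to avoid clashing with the built-in Enumerable.Chunk on .NET 6 and later), and CountWhenDrained and CountWhenSkipped are hypothetical helpers.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class StreamingChunkDemo
{
    public static IEnumerable<IEnumerable<T>> StreamingChunk<T>(
        this IEnumerable<T> source, int chunkSize)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));
        return Outer(source, chunkSize);
    }

    // Outer iterator: one shared enumerator, one chunk per successful MoveNext.
    private static IEnumerable<IEnumerable<T>> Outer<T>(IEnumerable<T> source, int chunkSize)
    {
        using (IEnumerator<T> enumerator = source.GetEnumerator())
        {
            while (enumerator.MoveNext())
                yield return Inner(enumerator, chunkSize);
        }
    }

    // Inner iterator: yields up to chunkSize items from the shared enumerator.
    private static IEnumerable<T> Inner<T>(IEnumerator<T> enumerator, int chunkSize)
    {
        int count = 0;
        do
        {
            yield return enumerator.Current;
        } while (++count < chunkSize && enumerator.MoveNext());
    }

    // Supported usage: every chunk is drained before the next one is requested.
    public static int CountWhenDrained(string s, int size)
    {
        int chunks = 0;
        foreach (var chunk in s.StreamingChunk(size))
        {
            foreach (var c in chunk) { /* consume the whole chunk */ }
            chunks++;
        }
        return chunks;
    }

    // Unsupported usage: chunks are counted but their contents are never read.
    public static int CountWhenSkipped(string s, int size) =>
        s.StreamingChunk(size).Count();
}
```

For "agewpsqfxyimc" and a size of 3, CountWhenDrained returns 5, while CountWhenSkipped returns 13, because each undrained chunk advances the shared enumerator by only one item.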

- casperOne

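I think this is the best solution... the only problem is that a list doesn't have Length, it has Count. But that is easy to change. We can make this better by not constructing lists at all and instead returning enumerables that hold a reference to the main list with an offset/length combination. That way, if the group size is large, we don't waste memory. Comment if you want me to write it up. - Amir

@Amir I'd like to see that written up - samandmoore

1

This is nice and fast. Cameron posted a very similar one after yours; the only caveat is that it buffers chunks, which can lead to out-of-memory errors if the chunks and items are large. See my answer for an alternative, albeit much hairier, approach. - Sam Saffron

This is the clear winner in the trade-off between efficiency and simplicity. Too many comments, though :) - Ohad Schneider

I just dropped this code (the updated part) into a project to test it, and it returned null... I found that Marc Andre's code and JaredPar's code both seem to work. - Kevin Cook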

63

Update for .NET 6.0

.NET 6.0 added a new native Chunk method to the System.Linq namespace:

public static System.Collections.Generic.IEnumerable<TSource[]> Chunk<TSource> (
   this System.Collections.Generic.IEnumerable<TSource> source, int size);

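With this new method, every chunk except the last one has exactly size elements. The last chunk contains the remaining elements and may be smaller.

Here is an example: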

var list = Enumerable.Range(1, 100);
var chunkSize = 10;

foreach(var chunk in list.Chunk(chunkSize)) //Returns a chunk with the correct size. 
{
    Parallel.ForEach(chunk, (item) =>
    {
        //Do something Parallel here. 
        Console.WriteLine(item);
    });
}

You are probably thinking, why not just use Skip and Take? That would work too; I simply find this a bit more concise and a little more readable.
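
To make the size of the final chunk concrete, here is a small sketch that reports the length of every chunk. It assumes .NET 6 or later; BuiltInChunkDemo and ChunkLengths are hypothetical names introduced for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BuiltInChunkDemo
{
    // Uses the built-in Enumerable.Chunk (requires .NET 6 or later).
    // Each chunk is a TSource[] array, so Length is available directly.
    public static List<int> ChunkLengths(int count, int size) =>
        Enumerable.Range(1, count)
                  .Chunk(size)
                  .Select(chunk => chunk.Length)
                  .ToList();
}
```

BuiltInChunkDemo.ChunkLengths(10, 3) gives 3, 3, 3, 1: three full chunks followed by a shorter one holding the remainder.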

- Majid Shahabfar

1

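If you are stuck on an older framework, the .NET source code can be found here: https://github.com/dotnet/runtime/blob/main/src/libraries/System.Linq/src/System/Linq/Chunk.cs. The implementation is quite concise and appears comparable to Sam Saffron's answer, which is what I had been using before .NET 6. - Gyromite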

59

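OK, here is my take on it:

Completely lazy: works on infinite enumerables
No intermediate copying/buffering
O(n) execution time
Also works when inner sequences are only partially consumed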

public static IEnumerable<IEnumerable<T>> Chunks<T>(this IEnumerable<T> enumerable,
                                                    int chunkSize)
{
    if (chunkSize < 1) throw new ArgumentException("chunkSize must be positive");

    using (var e = enumerable.GetEnumerator())
    while (e.MoveNext())
    {
        var remaining = chunkSize;    // elements remaining in the current chunk
        var innerMoveNext = new Func<bool>(() => --remaining > 0 && e.MoveNext());

        yield return e.GetChunk(innerMoveNext);
        while (innerMoveNext()) {/* discard elements skipped by inner iterator */}
    }
}

private static IEnumerable<T> GetChunk<T>(this IEnumerator<T> e,
                                          Func<bool> innerMoveNext)
{
    do yield return e.Current;
    while (innerMoveNext());
}

Example usage

var src = new [] {1, 2, 3, 4, 5, 6}; 

var c3 = src.Chunks(3);      // {{1, 2, 3}, {4, 5, 6}}; 
var c4 = src.Chunks(4);      // {{1, 2, 3, 4}, {5, 6}}; 

var sum   = c3.Select(c => c.Sum());    // {6, 15}
var count = c3.Count();                 // 2
var take2 = c3.Select(c => c.Take(2));  // {{1, 2}, {4, 5}}

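Explanation

The code works by nesting two yield-based iterators.

The outer iterator must keep track of how many elements have effectively been consumed by the inner (chunk) iterator. This is done by having innerMoveNext() close over remaining. Unconsumed elements of a chunk are discarded before the outer iterator yields the next chunk.

This is necessary because otherwise you get inconsistent results when the inner enumerables are not (completely) consumed (e.g. c3.Count() would return 6).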
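
The effect of discarding unconsumed elements can be checked with two small helpers. This is a sketch: the Chunks and GetChunk methods are reproduced from above, while PartialConsumptionDemo, FirstOfEachChunk and ChunkCount are hypothetical names added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartialConsumptionDemo
{
    public static IEnumerable<IEnumerable<T>> Chunks<T>(this IEnumerable<T> enumerable,
                                                        int chunkSize)
    {
        if (chunkSize < 1) throw new ArgumentException("chunkSize must be positive");

        using (var e = enumerable.GetEnumerator())
        while (e.MoveNext())
        {
            var remaining = chunkSize;    // elements remaining in the current chunk
            var innerMoveNext = new Func<bool>(() => --remaining > 0 && e.MoveNext());

            yield return e.GetChunk(innerMoveNext);
            while (innerMoveNext()) { /* discard elements skipped by inner iterator */ }
        }
    }

    private static IEnumerable<T> GetChunk<T>(this IEnumerator<T> e,
                                              Func<bool> innerMoveNext)
    {
        do yield return e.Current;
        while (innerMoveNext());
    }

    // Reads only the first element of each chunk; the rest is discarded
    // by the outer iterator before it yields the next chunk.
    public static List<int> FirstOfEachChunk(int[] src, int size) =>
        src.Chunks(size).Select(c => c.First()).ToList();

    // Counts chunks without reading any of their elements.
    public static int ChunkCount(int[] src, int size) =>
        src.Chunks(size).Count();
}
```

For the source {1, 2, 3, 4, 5, 6} and a chunk size of 3, FirstOfEachChunk returns 1 and 4, and ChunkCount returns 2, even though no chunk is read to its end.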

Note: the answer has been updated to address the shortcomings pointed out by @aolszowka.

- 3dGrabber

3

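Very nice. My "correct" solution was far more complicated than this. In my opinion, this is the best answer. - CaseyB

1

This suffers from unexpected (from an API standpoint) behavior when ToArray() is called, and it is also not thread-safe. - aolszowka

@aolszowka: could you please elaborate? - 3dGrabber

1

@aolszowka: very valid points. I have added a warning and a usage section. The code assumes that you iterate over the inner enumerable. With your solution, however, you give up the laziness. I think it should be possible to get the best of both worlds with a custom, caching IEnumerator. If I find a solution I will post it here... - 3dGrabber

1

@3dGrabber I am trying to use this (because it is elegant) to split larger collections of complex objects (basically, get and .ToList()), but I cannot seem to get it to return more than the first chunk. No custom enumerator. Realizing this is vague, do you have any idea why that might happen with a straight (non-generic) copy of this? - downwitch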

18

Completely lazy, no counting or copying:

public static class EnumerableExtensions
{

  public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> source, int len)
  {
     if (len == 0)
        throw new ArgumentNullException();

     var enumer = source.GetEnumerator();
     while (enumer.MoveNext())
     {
        yield return Take(enumer.Current, enumer, len);
     }
  }

  private static IEnumerable<T> Take<T>(T head, IEnumerator<T> tail, int len)
  {
     while (true)
     {
        yield return head;
        if (--len == 0)
           break;
        if (tail.MoveNext())
           head = tail.Current;
        else
           break;
     }
  }
}

- xtofs

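This solution is so elegant that I am sorry I can only upvote this answer once. - Mark

3

I don't think this would ever fail outright, but it could certainly show some odd behavior. If you had 100 items and split them into batches of 10, and you enumerated through all the batches without enumerating any items of those batches, you would end up with 100 batches of 1. - CaseyB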
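
The behavior CaseyB describes can be reproduced with a short sketch. The Split and Take methods are copied from the answer above; LazySplitDemo, BatchesWhenDrained and BatchesWhenSkipped are hypothetical names added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazySplitDemo
{
    public static IEnumerable<IEnumerable<T>> Split<T>(this IEnumerable<T> source, int len)
    {
        if (len == 0)
            throw new ArgumentNullException();

        var enumer = source.GetEnumerator();
        while (enumer.MoveNext())
        {
            yield return Take(enumer.Current, enumer, len);
        }
    }

    private static IEnumerable<T> Take<T>(T head, IEnumerator<T> tail, int len)
    {
        while (true)
        {
            yield return head;
            if (--len == 0)
                break;
            if (tail.MoveNext())
                head = tail.Current;
            else
                break;
        }
    }

    // Every batch is fully enumerated before the next one is requested.
    public static int BatchesWhenDrained(int items, int len)
    {
        int batches = 0;
        foreach (var batch in Enumerable.Range(1, items).Split(len))
        {
            foreach (var item in batch) { /* consume the whole batch */ }
            batches++;
        }
        return batches;
    }

    // Batches are counted but never enumerated, so each one advances
    // the shared enumerator by a single item.
    public static int BatchesWhenSkipped(int items, int len) =>
        Enumerable.Range(1, items).Split(len).Count();
}
```

With 100 items and a batch length of 10, BatchesWhenDrained returns 10 and BatchesWhenSkipped returns 100.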

1

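As @CaseyB mentioned, this suffers from the same issue that 3dGrabber addressed here https://dev59.com/fXRC5IYBdhLWcg3wD87r#20953521, but it is very fast! - drzaus

1

This is an excellent solution. It does exactly what it promises. - Rod Hartzell

By far the most elegant and to-the-point solution. The only thing is that you should add a check for negative numbers and replace the ArgumentNullException with an ArgumentException. - Romain Vergnory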

14

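I think the following suggestion would be the fastest. I am sacrificing the laziness of the source Enumerable in order to use Array.Copy and to know the length of each sublist ahead of time.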

public static IEnumerable<T[]> Chunk<T>(this IEnumerable<T> items, int size)
{
    T[] array = items as T[] ?? items.ToArray();
    for (int i = 0; i < array.Length; i+=size)
    {
        T[] chunk = new T[Math.Min(size, array.Length - i)];
        Array.Copy(array, i, chunk, 0, chunk.Length);
        yield return chunk;
    }
}

- Marc-André Bertrand

Not only is it the fastest, it also correctly handles further enumerable operations on the result, e.g. items.Chunk(5).Reverse().SelectMany(x => x). - too

12

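If anyone is interested in a packaged/maintained solution, the MoreLINQ library provides the Batch extension method, which matches the behavior you asked for: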

IEnumerable<char> source = "Example string";
IEnumerable<IEnumerable<char>> chunksOfThreeChars = source.Batch(3);

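The Batch implementation is similar to Cameron MacFarland's answer, with an additional overload for transforming the chunk/batch before returning it, and it performs quite well.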

- Kevinoid

5

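This should be the accepted answer. Instead of reinventing the wheel, MoreLINQ should be used. - Otabek Kholikov

2

Indeed. I checked the source code on GitHub, and it is superior to anything on this page, including my answer :) I did initially check MoreLINQ, but I was looking for something with "Chunk" in its name. - Zar Shardan

This was by far the simplest, easiest and fastest solution for me to implement. It should be the top answer; it seems others got caught up in leetcoding this instead of going for the simplest solution. - J_L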

11

I wrote a Clump extension method several years ago. It works great, and it is the fastest implementation here. :P

/// <summary>
/// Clumps items into same size lots.
/// </summary>
/// <typeparam name="T"></typeparam>
/// <param name="source">The source list of items.</param>
/// <param name="size">The maximum size of the clumps to make.</param>
/// <returns>A list of list of items, where each list of items is no bigger than the size given.</returns>
public static IEnumerable<IEnumerable<T>> Clump<T>(this IEnumerable<T> source, int size)
{
    if (source == null)
        throw new ArgumentNullException("source");
    if (size < 1)
        throw new ArgumentOutOfRangeException("size", "size must be greater than 0");

    return ClumpIterator<T>(source, size);
}

private static IEnumerable<IEnumerable<T>> ClumpIterator<T>(IEnumerable<T> source, int size)
{
    Debug.Assert(source != null, "source is null.");

    T[] items = new T[size];
    int count = 0;
    foreach (var item in source)
    {
        items[count] = item;
        count++;

        if (count == size)
        {
            yield return items;
            items = new T[size];
            count = 0;
        }
    }
    if (count > 0)
    {
        if (count == size)
            yield return items;
        else
        {
            T[] tempItems = new T[count];
            Array.Copy(items, tempItems, count);
            yield return tempItems;
        }
    }
}

- Cameron MacFarland

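It should work, but it buffers 100% of each chunk. I was trying to avoid that... but it turns out to be incredibly hairy. - Sam Saffron

@SamSaffron Yep. Especially if you throw things like PLINQ into the mix, which is what my implementation was originally for. - Cameron MacFarland

I expanded my answer, let me know what you think. - Sam Saffron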

- JaredPar · Accepted Answer

444

Try the following code.

public static List<List<T>> Split<T>(IList<T> source)
{
    return  source
        .Select((x, i) => new { Index = i, Value = x })
        .GroupBy(x => x.Index / 3)
        .Select(x => x.Select(v => v.Value).ToList())
        .ToList();
}

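The idea is to first group the elements by index. Dividing by three has the effect of grouping them into groups of 3. Each group is then converted to a list, and the IEnumerable of List is converted to a List of Lists.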
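
Since the question asks for the group size to be a parameter, the same idea can be sketched with the divisor passed in. This is an adaptation for illustration, not the answer's original code: the size parameter, its range check, and the GroupBySplit and Describe names are assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupBySplit
{
    // Same GroupBy-on-index idea, with the group size as a parameter.
    public static List<List<T>> Split<T>(IList<T> source, int size)
    {
        if (size <= 0) throw new ArgumentOutOfRangeException(nameof(size));

        return source
            .Select((x, i) => new { Index = i, Value = x })
            .GroupBy(x => x.Index / size)          // integer division: 0,0,0,1,1,1,...
            .Select(g => g.Select(v => v.Value).ToList())
            .ToList();
    }

    // Hypothetical helper: renders the groups as "age|wps|..." for inspection.
    public static string Describe(string letters, int size) =>
        string.Join("|", Split(letters.ToList(), size)
                            .Select(g => new string(g.ToArray())));
}
```

GroupBySplit.Describe("agewpsqfxyimc", 3) returns "age|wps|qfx|yim|c", matching the result requested in the question. As the comments below point out, GroupBy reads the whole source before producing the first group.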

- JaredPar

33

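GroupBy does an implicit sort, which can hurt performance. What we need is some sort of inverse of SelectMany. - yfeldblum

6

@Justice, GroupBy might be implemented by hashing. How do you know that GroupBy's implementation "can hurt performance"? - Amy B

7

GroupBy does not return anything until it has enumerated all elements. That is why it is slow. The lists the OP wants are contiguous, so a better method could yield the first sublist [a,g,e] before enumerating any more of the original list. - Colonel Panic

11

Take the extreme example of an infinite IEnumerable. GroupBy(x=>f(x)).First() will never yield a group. The OP asked about lists, but if we write the code to work with IEnumerable and make only a single pass, we get the performance benefit. - Colonel Panic

12

Order isn't preserved your way, though. It is still good to know, but you would be grouping them into (0,3,6,9,...), (1,4,7,10,...), (2,5,8,11,...). If order doesn't matter that is fine, but in this case it sounds like it does. - Reafexus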
