How can "RemoveAll" in LINQ be so much faster than iteration?

Question

22

The following code:

List<Interval> intervals = new List<Interval>();
List<int> points = new List<int>();

//Initialization of the two lists
// [...]

foreach (var point in points)
{
    intervals.RemoveAll (x => x.Intersects (point));
}

When the lists hold around 10,000 items each, it is at least 100 times faster than this:

List<Interval> intervals = new List<Interval>();
List<int> points = new List<int>();

//Initialization of the two lists
// [...]

foreach (var point in points)
{
    for (int i = 0; i < intervals.Count;)
    {
        if (intervals[i].Intersects(point))
        {
            intervals.Remove(intervals[i]);
        }
        else
        {
            i++;
        }
    }
}

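How is this possible? What does "RemoveAll" do under the hood? According to MSDN, "RemoveAll" performs a linear search and is therefore an O(n) operation, so I expected the two versions to perform similarly.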

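When I replace "Remove" with "RemoveAt", the iteration becomes much faster, comparable to "RemoveAll". But both "Remove" and "RemoveAt" are O(n), so why is the performance difference between them so large? Is it only because "Remove(item)" compares the list elements with "item", while "RemoveAt" performs no comparison at all?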

- Brainless

20

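The "RemoveAll" method does not use LINQ; it is a standard method on "List<T>". That is because "RemoveAll" modifies the collection in place, whereas LINQ never modifies a collection. - Dan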

7

I think you can speed up the second code sample by using intervals.RemoveAt(i); instead of intervals.Remove(intervals[i]);. - ASh

3

RemoveAll and Remove are both O(n), so it is easy to believe that the version with an extra for loop around it would be n times slower. - vgru

3

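@Brainless RemoveAt performs no comparison; it just removes the item at the given position. Remove, on the other hand, has to search for an item equal to its argument. - Panagiotis Kanavos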

2

@Brainless: in terms of both readability and performance, I think the best approach is to combine RemoveAll with LINQ: intervals.RemoveAll(i => points.Any(p => i.Intersects(p))); - Tim Schmelter

4 Answers

12

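RemoveAll can be done in O(n) by checking the condition for n elements and moving at most n elements.

Your loop is O(n^2), because each Remove has to check up to n elements. Even if you change it to RemoveAt, it still has to move up to n elements per call.

This is probably the fastest solution: intervals.RemoveAll(x => points.Any(x.Intersects));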
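As a minimal sketch of that single-pass form: the question does not show the Interval type, so the tuple (Start, End) and the Intersects helper below are hypothetical stand-ins that treat an interval as a closed range. The element shifting happens once, although the predicate may still be evaluated against several points per interval.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's Interval type: a closed range [Start, End].
static bool Intersects((int Start, int End) interval, int point) =>
    interval.Start <= point && point <= interval.End;

var intervals = new List<(int Start, int End)> { (0, 2), (3, 5), (6, 8), (9, 11) };
var points = new List<int> { 4, 10 };

// One pass over intervals; each interval is tested against the points
// until one of them matches (Any stops at the first hit).
int removed = intervals.RemoveAll(x => points.Any(p => Intersects(x, p)));

Console.WriteLine($"removed {removed}, kept {string.Join(", ", intervals)}");
```

Here the two intervals containing 4 and 10 are dropped in one call, and the list is compacted once rather than once per point.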

- Henrik

3

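A List is backed by an array, and removing one element from an array requires moving every element after it to the previous index, so a[i] is moved to a[i-1].

If you do this repeatedly, many moves are needed, even when more elements further along would also qualify for deletion. RemoveAll can optimize this: as it traverses the list and finds more elements that match the removal condition, it moves the surviving elements forward by more than one index at a time.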
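The idea described above can be sketched with a read index and a write index. This is an illustration written against the public List<T> API, not the actual List<T>.RemoveAll source; the name RemoveAllSketch is made up for this example.

```csharp
using System;
using System.Collections.Generic;

// Illustrative single-pass compaction (hypothetical helper, not the BCL implementation).
// Each surviving element is copied at most once, directly to its final position.
static int RemoveAllSketch<T>(List<T> list, Predicate<T> match)
{
    int write = 0;                            // next slot for an element we keep
    for (int read = 0; read < list.Count; read++)
    {
        if (!match(list[read]))
        {
            list[write] = list[read];         // may jump back by many indices at once
            write++;
        }
    }

    int removedCount = list.Count - write;
    list.RemoveRange(write, removedCount);    // drop the leftover tail in one call
    return removedCount;
}

var data = new List<int> { 1, 2, 3, 4, 5, 6 };
int removedEvens = RemoveAllSketch(data, x => x % 2 == 0);

Console.WriteLine($"removed {removedEvens}, left {string.Join(", ", data)}");
```

With this shape the total work is one predicate call and at most one copy per element, which is where the O(n) bound for a single RemoveAll call comes from.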

- Alex

0

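The difference is that Remove is itself O(n), so you end up with O(n^2).

Replace the for loop with a new collection and an assignment.

items = items.Where(i => ...).ToList();

This approach has the same algorithmic time complexity as RemoveAll, but it uses O(n) extra memory.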
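A small sketch of the difference between the two, using an arbitrary predicate (multiples of 3) chosen only for illustration: Where(...).ToList() builds a new list and leaves the old one untouched, while RemoveAll edits the existing list in place.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var items = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var alias = items;   // a second reference to the same list object

// Rebuild: the elements that are NOT multiples of 3 go into a brand-new list.
// The list that alias refers to still holds all nine elements after this line.
items = items.Where(i => i % 3 != 0).ToList();

// In place: remove the multiples of 3 from the original list object.
int removed = alias.RemoveAll(i => i % 3 == 0);

Console.WriteLine(items.SequenceEqual(alias));      // the contents now agree
Console.WriteLine(ReferenceEquals(items, alias));   // but they are separate lists
```

If other code holds a reference to the original list, only the RemoveAll form changes what that code sees.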

- George Polevoy

1

I doubt that RemoveAll does that, i.e. that it uses O(n) extra memory. - Mark Sowul

http://referencesource.microsoft.com/#mscorlib/system/collections/generic/list.cs,82567b42bbfc416e - Mark Sowul

Updated the answer to make that part more precise. - George Polevoy

- Dennis_E · Accepted Answer

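If you remove an item from a List<T>, every item after it is shifted down by one position. So if you remove n items one at a time, a lot of items get moved n times.
RemoveAll does the moving only once, which you can see in the source code of List<T>: source
Another thing is that Remove(T item) searches the list for the item, so that is up to another n operations per call.
This has nothing to do with your question, but I would like to point out one more thing:
If you use a for loop to remove items from a List, it is easier to start from the end: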

for (int i = intervals.Count - 1; i >= 0; i--)
{
    if (intervals[i].Intersects(point))
    {
        intervals.RemoveAt(i);
    }
}

This way, you don't need that ugly else clause.
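For anyone who wants to compare the three variants discussed in this thread, here is a rough timing sketch. The data is invented for the purpose (5,000 tuple intervals of width 20 and 500 points, with a hypothetical Intersects helper standing in for the question's Interval type), and Stopwatch timings like these are only indicative; actual numbers depend on the machine, the runtime, and the data.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Hypothetical test data: intervals [i, i + 20] for i = 0..4999 (sizes are arbitrary).
static List<(int Start, int End)> MakeIntervals() =>
    Enumerable.Range(0, 5000).Select(i => (Start: i, End: i + 20)).ToList();

// Hypothetical stand-in for Interval.Intersects: closed-range containment.
static bool Intersects((int Start, int End) interval, int point) =>
    interval.Start <= point && point <= interval.End;

static TimeSpan Time(Action action)
{
    var sw = Stopwatch.StartNew();
    action();
    sw.Stop();
    return sw.Elapsed;
}

var points = Enumerable.Range(0, 500).Select(i => i * 10).ToList();

// Variant 1: RemoveAll per point, as in the question's first snippet.
var a = MakeIntervals();
var tRemoveAll = Time(() =>
{
    foreach (var p in points) a.RemoveAll(x => Intersects(x, p));
});

// Variant 2: forward loop with Remove(item), which searches and then shifts.
var b = MakeIntervals();
var tRemove = Time(() =>
{
    foreach (var p in points)
        for (int i = 0; i < b.Count;)
            if (Intersects(b[i], p)) b.Remove(b[i]); else i++;
});

// Variant 3: backward loop with RemoveAt(i), which only shifts.
var c = MakeIntervals();
var tRemoveAt = Time(() =>
{
    foreach (var p in points)
        for (int i = c.Count - 1; i >= 0; i--)
            if (Intersects(c[i], p)) c.RemoveAt(i);
});

Console.WriteLine($"RemoveAll: {tRemoveAll.TotalMilliseconds} ms");
Console.WriteLine($"Remove:    {tRemove.TotalMilliseconds} ms");
Console.WriteLine($"RemoveAt:  {tRemoveAt.TotalMilliseconds} ms");
```

All three variants leave the same intervals behind; only the amount of searching and shifting differs.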