在比较两个列表中的项时,我有哪些选项?我遇到了一些性能问题,想知道是否有更快的替代方法:
int[] foo = { 1, 2, 3, 4, 5 };
int[] bar = { 6, 7, 8, 9, 1 };
var result = foo.Any(x => bar.Contains(x));
无论我使用lambda方法还是自己使用foreach
,我都认为性能损失仍然是O(N^2)
。 我能做些什么来影响它吗?
在比较两个列表中的项时,我有哪些选项?我遇到了一些性能问题,想知道是否有更快的替代方法:
int[] foo = { 1, 2, 3, 4, 5 };
int[] bar = { 6, 7, 8, 9, 1 };
var result = foo.Any(x => bar.Contains(x));
无论我使用lambda方法还是自己使用foreach
,我都认为性能损失仍然是O(N^2)
。 我能做些什么来影响它吗?
您可以使用Enumerable.Intersect:
var result = foo.Intersect(bar).Any();
这将从bar
项创建Set<T>
,然后枚举foo
直到找到第一个匹配项。内部实现如下:
Set<int> set = new Set<int>();
foreach (int local in bar) // M times
set.Add(local); // O(1)
foreach (int value in foo) // N times max
{
if (!set.Remove(value)) // O(1)
continue;
yield return value;
}
您可以使用哈希集:
int[] foo = { 1, 2, 3, 4, 5 };
int[] bar = { 6, 7, 8, 9, 1 };
var hashSet = new Hashset<int>(bar);
var result = foo.Any(x => hashSet.Contains(x));
或者你可以像这样使用 Except
和 Any
:
var result = !foo.Except(bar).Any();
我敢打赌这是与Sergey的解决方案比赛:p
HashSet
上的比较应该是 O(1),但这是否只是将“性能损失”转移到创建 HashSet
上呢? - Thorsten Dittmarvar hashSet = new HashSet<int>(bar);
(否则会编译错误)。 - Matthew WatsonO(N+M)
。如果您正在为Any
内部的第二个集合元素重新创建集合(并且集合创建是O(N)
),那么它将使解决方案恢复为O(N*M)
。 - Patryk Ćwiek为了完整起见,这里有一个基准程序来测试本主题中各种答案。
看起来这个程序表明HashSet方法略微更快:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
namespace Demo
{
internal class Program
{
private void run()
{
var foo = Enumerable.Range( 0, 100000).ToArray();
var bar = Enumerable.Range(100000, 100000).ToArray();
int trials = 4;
Stopwatch sw = new Stopwatch();
for (int i = 0; i < trials; ++i)
{
sw.Restart();
method1(foo, bar);
Console.WriteLine("method1() took " +sw.Elapsed);
sw.Restart();
for (int j = 0; j < 100; ++j)
method2(foo, bar);
Console.WriteLine("method2()*100 took " +sw.Elapsed);
sw.Restart();
for (int j = 0; j < 100; ++j)
method3(foo, bar);
Console.WriteLine("method3()*100 took " +sw.Elapsed);
Console.WriteLine();
}
}
private static bool method1(int[] foo, int[] bar)
{
return foo.Any(bar.Contains);
}
private static bool method2(int[] foo, int[] bar)
{
var hashSet = new HashSet<int>(bar);
return foo.Any(hashSet.Contains);
}
private static bool method3(int[] foo, int[] bar)
{
return foo.Intersect(bar).Any();
}
private static void Main()
{
new Program().run();
}
}
}
我的电脑上的结果(发布版本)如下。请注意,我运行了method2()和method3()各100次,因为它们比method1()快得多:
method1() took 00:00:12.2781951
method2()*100 took 00:00:00.4920760
method3()*100 took 00:00:00.7045298
method1() took 00:00:11.9267980
method2()*100 took 00:00:00.4688330
method3()*100 took 00:00:00.6886865
method1() took 00:00:11.8959856
method2()*100 took 00:00:00.4736563
method3()*100 took 00:00:00.6875508
method1() took 00:00:11.9083229
method2()*100 took 00:00:00.4572404
method3()*100 took 00:00:00.6838919
йҷӨеӨ–
,并йӣҶ
,еҺ»йҮҚ
- Patryk Ćwiek