Remove duplicates from a List<T> in C#

Question


c# list generics duplicates

629

Is there a quick way to de-duplicate a generic List in C#?

- JC Grubbs

5

Do you care about the order of the elements in the result? That may rule out some solutions. - Colonel Panic

3

A one-line solution: ICollection<MyClass> withoutDuplicates = new HashSet<MyClass>(inputList); This uses a HashSet to remove the duplicates from inputList and stores the result in withoutDuplicates, a collection of MyClass objects that contains no duplicates. - Harald Coppoolse

Where would this method be used? - kimiahdri

32 answers

Content provided by Stack Overflow.

- Alfred Udah · Answer 1

This takes the distinct elements and converts them back into a list:

List<type> myNoneDuplicateValue = listValueWithDuplicate.Distinct().ToList();
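As a minimal sketch of the line above, assuming a list of integers (the sample values are made up for illustration): Distinct() is evaluated lazily, and ToList() copies the result into a new list, so the source list is not modified.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data; any List<T> whose T has sensible equality works.
var listValueWithDuplicate = new List<int> { 3, 1, 3, 2, 1 };

// Distinct() is lazy; ToList() materializes the distinct elements into a new list.
List<int> myNoneDuplicateValue = listValueWithDuplicate.Distinct().ToList();

Console.WriteLine(myNoneDuplicateValue.Count);   // 3 distinct values
Console.WriteLine(listValueWithDuplicate.Count); // 5, the source list is untouched
```

Note that the documentation describes the result of Distinct() as an unordered sequence, so it is safer not to depend on the order of the output.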

- WonderWorker · Answer 2

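Use Linq's Union method. Note: this solution requires no knowledge of Linq beyond the fact that it exists. Start by adding the following to the top of your class file: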

using System.Linq;

Now you can use the following code to remove duplicates from an object called obj1:

obj1 = obj1.Union(obj1).ToList();

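Note: rename obj1 to the name of your object.

How it works

The Union command lists one of each entry from two source objects. Since obj1 is both source objects, this reduces obj1 to one of each entry.
ToList() returns a new List. This is necessary because Linq commands like Union return their result as an IEnumerable instead of modifying the original List or returning a new List.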
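The explanation above can be checked with a small sketch. The sample strings are assumptions for illustration; the point is only that a union of a list with itself contains the same elements as Distinct().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample data.
var obj1 = new List<string> { "a", "b", "a", "c", "b" };

// Union yields each distinct element of both inputs once.
// Because both inputs are obj1, the result is obj1 without duplicates.
List<string> viaUnion = obj1.Union(obj1).ToList();
List<string> viaDistinct = obj1.Distinct().ToList();

// Compare as sets so the check does not depend on element order.
bool sameElements = new HashSet<string>(viaUnion).SetEquals(viaDistinct);
Console.WriteLine(viaUnion.Count);  // 3
Console.WriteLine(sameElements);    // True
```

Since both calls produce the same set of elements, Distinct() states the intent more directly; Union is mainly useful when merging two different lists.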

- Grant · Answer 3

As a helper method (without Linq):

public static List<T> Distinct<T>(this List<T> list)
{
    // Copy through a HashSet<T>; the List<T> constructor avoids Linq's ToList().
    return new List<T>(new HashSet<T>(list));
}
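The same HashSet idea also works when "duplicate" needs a custom definition. The following is a sketch, assuming case-insensitive string matching is what you want; the names are made-up sample data. The HashSet<T> constructor accepts an IEqualityComparer<T>, and no Linq is involved.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data with case-only duplicates.
var names = new List<string> { "Alice", "alice", "Bob", "BOB", "Carol" };

// The comparer decides what counts as a duplicate.
var set = new HashSet<string>(names, StringComparer.OrdinalIgnoreCase);

// Copy the set back into a list without using Linq.
var deduped = new List<string>(set);

Console.WriteLine(deduped.Count);         // 3
Console.WriteLine(set.Contains("ALICE")); // True
```

HashSet<T> makes no promise about enumeration order, so this fits best when the order of the result does not matter.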

- gary · Answer 4

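Here is an extension method that removes adjacent duplicates in place. Call Sort() first, passing the same IComparer. This should be more efficient than Lasse V. Karlsen's version, which calls RemoveAt repeatedly (resulting in multiple block memory moves).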

public static void RemoveAdjacentDuplicates<T>(this List<T> List, IComparer<T> Comparer)
{
    int NumUnique = 0;
    for (int i = 0; i < List.Count; i++)
        if ((i == 0) || (Comparer.Compare(List[NumUnique - 1], List[i]) != 0))
            List[NumUnique++] = List[i];
    List.RemoveRange(NumUnique, List.Count - NumUnique);
}
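A usage sketch for the sort-then-compact approach follows. To keep it runnable as a single file it repeats the same logic as a local function rather than an extension method, and the integer data and Comparer<int>.Default are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data.
var values = new List<int> { 5, 3, 5, 1, 3, 1, 1 };
var comparer = Comparer<int>.Default;

// Sort first so that equal elements become adjacent, then compact in place.
values.Sort(comparer);
RemoveAdjacentDuplicates(values, comparer);

Console.WriteLine(string.Join(", ", values)); // 1, 3, 5

// Same algorithm as above: copy each new value forward, then trim the tail once.
static void RemoveAdjacentDuplicates<T>(List<T> list, IComparer<T> comparer)
{
    int numUnique = 0;
    for (int i = 0; i < list.Count; i++)
    {
        if (i == 0 || comparer.Compare(list[numUnique - 1], list[i]) != 0)
            list[numUnique++] = list[i];
    }
    list.RemoveRange(numUnique, list.Count - numUnique);
}
```

The single RemoveRange call at the end is what avoids the repeated element shifting that RemoveAt inside the loop would cause.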

- dush88c · Answer 5

Install the MoreLINQ package via NuGet and you can easily make a list of objects distinct by a property.

IEnumerable<Catalogue> distinctCatalogues = catalogues.DistinctBy(c => c.CatalogueCode);
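On .NET 6 and later, System.Linq ships its own Enumerable.DistinctBy, so the extra package may not be needed there. A sketch under that assumption, with the catalogue entries modeled as tuples (the Title field and the values are made up) to keep it self-contained:

```csharp
using System;
using System.Linq;

// Hypothetical catalogue entries; tuples stand in for the Catalogue class.
var catalogues = new[]
{
    (CatalogueCode: "A1", Title: "Spring"),
    (CatalogueCode: "B2", Title: "Summer"),
    (CatalogueCode: "A1", Title: "Spring (reprint)"),
};

// Built-in DistinctBy (.NET 6+): one element is kept per distinct key.
var distinctCatalogues = catalogues.DistinctBy(c => c.CatalogueCode).ToList();

Console.WriteLine(distinctCatalogues.Count); // 2
```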

- Motti · Answer 6

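If you don't care about the order you can simply put the items into a HashSet. If you do want to preserve the order, you can do something like this: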

var unique = new List<T>();
var hs = new HashSet<T>();
foreach (T t in list)
    if (hs.Add(t))
        unique.Add(t);

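Or the Linq way:

var hs = new HashSet<T>();
// Where (not All) is needed here: All stops at the first duplicate it sees.
var unique = list.Where(x => hs.Add(x)).ToList();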

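Edit: The HashSet approach takes O(N) time and O(N) space, whereas sorting and then removing duplicates (as suggested by @lassevk and others) takes O(N*lgN) time and O(1) space, so it is not clear to me (as it seemed at first glance) that the sorting approach is inferior.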
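If the list should be de-duplicated in place while keeping first-occurrence order, the HashSet trade-off above (O(N) time, O(N) extra space) can be combined with List<T>.RemoveAll. This is a sketch of a different technique, not part of the answer above. It assumes RemoveAll invokes the predicate once per element in index order, which matches the current implementation but is not spelled out as a guarantee.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sample data.
var list = new List<string> { "b", "a", "b", "c", "a" };
var seen = new HashSet<string>();

// HashSet<T>.Add returns false for items already present,
// so every later occurrence is removed and the first one stays.
int removed = list.RemoveAll(item => !seen.Add(item));

Console.WriteLine(string.Join(", ", list)); // b, a, c
Console.WriteLine(removed);                 // 2
```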

- Reza Jenabi · Answer 7

Suppose you have two classes, Product and Customer, and you want to remove duplicates from lists of them.

public class Product
{
    public int Id { get; set; }
    public string ProductName { get; set; }
}

public class Customer
{
    public int Id { get; set; }
    public string CustomerName { get; set; }

}

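You need to define a generic comparer class like the following:

using System.Collections.Generic;
using System.Reflection;

public class ItemEqualityComparer<T> : IEqualityComparer<T> where T : class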
{
    private readonly PropertyInfo _propertyInfo;

    public ItemEqualityComparer(string keyItem)
    {
        _propertyInfo = typeof(T).GetProperty(keyItem, BindingFlags.GetProperty | BindingFlags.Instance | BindingFlags.Public);
    }

    public bool Equals(T x, T y)
    {
        var xValue = _propertyInfo?.GetValue(x, null);
        var yValue = _propertyInfo?.GetValue(y, null);
        return xValue != null && yValue != null && xValue.Equals(yValue);
    }

    public int GetHashCode(T obj)
    {
        var propertyValue = _propertyInfo.GetValue(obj, null);
        return propertyValue == null ? 0 : propertyValue.GetHashCode();
    }
}

Then you can remove the duplicates from your lists.

var products = new List<Product>
            {
                new Product{ProductName = "product 1" ,Id = 1,},
                new Product{ProductName = "product 2" ,Id = 2,},
                new Product{ProductName = "product 2" ,Id = 4,},
                new Product{ProductName = "product 2" ,Id = 4,},
            };
var productList = products.Distinct(new ItemEqualityComparer<Product>(nameof(Product.Id))).ToList();

var customers = new List<Customer>
            {
                new Customer{CustomerName = "Customer 1" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
                new Customer{CustomerName = "Customer 2" ,Id = 5,},
            };
var customerList = customers.Distinct(new ItemEqualityComparer<Customer>(nameof(Customer.Id))).ToList();

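This code removes duplicates by Id. To remove duplicates by another property, change nameof(YourClass.DuplicateProperty) accordingly, for example to nameof(Customer.CustomerName), and duplicates are then removed by the CustomerName property.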
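For comparison, de-duplicating by a property can also be done without reflection by grouping on the key and keeping the first element of each group. This is a different technique from the comparer above, shown only as a sketch; tuples stand in for the Product class so the example is self-contained.

```csharp
using System;
using System.Linq;

// Same sample values as the product list above, modeled as tuples.
var products = new[]
{
    (Id: 1, ProductName: "product 1"),
    (Id: 2, ProductName: "product 2"),
    (Id: 4, ProductName: "product 2"),
    (Id: 4, ProductName: "product 2"),
};

// Group by the key and keep the first element of each group.
var byId = products.GroupBy(p => p.Id).Select(g => g.First()).ToList();
var byName = products.GroupBy(p => p.ProductName).Select(g => g.First()).ToList();

Console.WriteLine(byId.Count);   // 3 (Ids 1, 2, 4)
Console.WriteLine(byName.Count); // 2 ("product 1", "product 2")
```

The key is selected with a lambda, so a misspelled property name is caught at compile time rather than surfacing as a null PropertyInfo at run time.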

- Chris · Answer 8

4

Perhaps an easier way is to make sure duplicates are never added to the list in the first place.

if (items.IndexOf(new_item) < 0)
    items.Add(new_item);

- Chris

1

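That's what I'm currently doing, but the more entries you have, the longer the check for duplicates takes. - Robert Strauch

I have the same problem here. I'm using the List<T>.Contains method every time, but with more than 1,000,000 entries. This process slows down my application. I use List<T>.Distinct().ToList<T>() first. - RPDeshaies

This method is very slow. - Darkgaze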
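The slowdown described in these comments comes from IndexOf and Contains scanning the whole list on every insert. One possible remedy, sketched here under the assumption that the items are hashable and that a second collection is acceptable, is to keep a HashSet alongside the list and use it for the membership check. The helper name AddIfNew and the sample values are hypothetical.

```csharp
using System;
using System.Collections.Generic;

var items = new List<int>();   // keeps insertion order
var seen = new HashSet<int>(); // answers "already added?" without scanning the list

// Hypothetical helper: adds the item only if it has not been seen before.
bool AddIfNew(int newItem)
{
    if (!seen.Add(newItem))
        return false;          // duplicate, list left unchanged
    items.Add(newItem);
    return true;
}

foreach (int value in new[] { 4, 8, 4, 15, 8 })
    AddIfNew(value);

Console.WriteLine(string.Join(", ", items)); // 4, 8, 15
```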

- Moctar Haiz · Answer 9

A simple, intuitive implementation:

public static List<PointF> RemoveDuplicates(List<PointF> listPoints)
{
    List<PointF> result = new List<PointF>();

    for (int i = 0; i < listPoints.Count; i++)
    {
        if (!result.Contains(listPoints[i]))
            result.Add(listPoints[i]);
    }

    return result;
}

- flagamba · Answer 10

You can use Union

obj2 = obj1.Union(obj1).ToList();