I'm following an earlier Stack Overflow post about removing duplicates from a List<T> in C#.
What if <T> is a user-defined type, for example:
class Contact
{
    public string firstname;
    public string lastname;
    public string phonenum;
}
The suggested approach (HashMap) doesn't remove the duplicates. I think I need to redefine some method for comparing two objects, is that right?
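A HashSet<T> does remove duplicates, because it's a set... but only when your type defines equality appropriately.
I suspect that by "duplicate" you mean "an object whose field values are equal to those of another object". For that to work you need to override Equals/GetHashCode, and/or implement IEquatable<Contact>... or you could pass an IEqualityComparer<Contact> to the HashSet<T> constructor.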
Instead of using a HashSet<T>, you could just call the Distinct LINQ extension method. For example:
list = list.Distinct().ToList();
But again, you'll need to provide an appropriate definition of equality, one way or another.
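To illustrate why the equality definition matters, here is a minimal sketch. The `PlainContact` type and the `ReferenceEqualityDemo` class are hypothetical names made up for this example; the type deliberately does not override Equals/GetHashCode, so Distinct falls back to reference equality.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type for illustration: no Equals/GetHashCode override,
// so the default (reference) equality applies.
class PlainContact
{
    public string FirstName;
    public string LastName;
}

static class ReferenceEqualityDemo
{
    // Returns how many items survive Distinct() when the list holds
    // two separate objects with identical field values.
    public static int CountDistinct()
    {
        var list = new List<PlainContact>
        {
            new PlainContact { FirstName = "John", LastName = "Doe" },
            new PlainContact { FirstName = "John", LastName = "Doe" }
        };
        return list.Distinct().Count();
    }

    public static void Main()
    {
        // Both objects are kept, because they are different instances.
        Console.WriteLine(CountDistinct()); // prints 2
    }
}
```

Nothing is removed here, which matches the behaviour described in the question.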
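Here's a sample implementation. Note that I've made the type immutable (equality is odd for mutable types, because two objects can be equal one minute and unequal the next) and made the fields private, with public properties. Finally, I've sealed the class: immutable types should generally be sealed, and it makes equality easier to reason about.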
using System;
using System.Collections.Generic;

public sealed class Contact : IEquatable<Contact>
{
    private readonly string firstName;
    public string FirstName { get { return firstName; } }

    private readonly string lastName;
    public string LastName { get { return lastName; } }

    private readonly string phoneNumber;
    public string PhoneNumber { get { return phoneNumber; } }

    public Contact(string firstName, string lastName, string phoneNumber)
    {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phoneNumber = phoneNumber;
    }

    public override bool Equals(object other)
    {
        return Equals(other as Contact);
    }

    public bool Equals(Contact other)
    {
        if (object.ReferenceEquals(other, null))
        {
            return false;
        }
        if (object.ReferenceEquals(other, this))
        {
            return true;
        }
        return FirstName == other.FirstName &&
               LastName == other.LastName &&
               PhoneNumber == other.PhoneNumber;
    }

    public override int GetHashCode()
    {
        // Note: *not* StringComparer; EqualityComparer<T>
        // copes with null; StringComparer doesn't.
        var comparer = EqualityComparer<string>.Default;

        // Unchecked to allow overflow, which is fine
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + comparer.GetHashCode(FirstName);
            hash = hash * 31 + comparer.GetHashCode(LastName);
            hash = hash * 31 + comparer.GetHashCode(PhoneNumber);
            return hash;
        }
    }
}
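As a rough usage sketch, once a type defines equality this way both Distinct() and HashSet<T> drop the duplicates without any extra arguments. To keep the example short and compilable on its own, it uses `Person`, a hypothetical two-field stand-in for the class above, and assumes C# 6 or later for the getter-only properties; `DedupDemo` is likewise an invented name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical trimmed-down stand-in for the Contact class (two fields only).
public sealed class Person : IEquatable<Person>
{
    public string FirstName { get; }
    public string LastName { get; }

    public Person(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public override bool Equals(object other)
    {
        return Equals(other as Person);
    }

    public bool Equals(Person other)
    {
        if (ReferenceEquals(other, null)) return false;
        return FirstName == other.FirstName && LastName == other.LastName;
    }

    public override int GetHashCode()
    {
        var comparer = EqualityComparer<string>.Default;
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + comparer.GetHashCode(FirstName);
            hash = hash * 31 + comparer.GetHashCode(LastName);
            return hash;
        }
    }
}

public static class DedupDemo
{
    // Three entries, two of which have equal field values.
    public static List<Person> Sample()
    {
        return new List<Person>
        {
            new Person("John", "Doe"),
            new Person("Jane", "Roe"),
            new Person("John", "Doe")
        };
    }

    public static void Main()
    {
        var people = Sample();
        Console.WriteLine(people.Distinct().Count());         // prints 2
        Console.WriteLine(new HashSet<Person>(people).Count); // prints 2
    }
}
```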
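EDIT: Okay, in response to requests for an explanation of the GetHashCode() implementation:
EqualityComparer<T>.Default always copes with null values, which is nice... so I use it to get the hash code of each field. Incidentally, here are two alternative ways of handling nulls: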
public override int GetHashCode()
{
    // Unchecked to allow overflow, which is fine
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + (FirstName ?? "").GetHashCode();
        hash = hash * 31 + (LastName ?? "").GetHashCode();
        hash = hash * 31 + (PhoneNumber ?? "").GetHashCode();
        return hash;
    }
}
Or:
public override int GetHashCode()
{
    // Unchecked to allow overflow, which is fine
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + (FirstName == null ? 0 : FirstName.GetHashCode());
        hash = hash * 31 + (LastName == null ? 0 : LastName.GetHashCode());
        hash = hash * 31 + (PhoneNumber == null ? 0 : PhoneNumber.GetHashCode());
        return hash;
    }
}
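On frameworks that ship System.HashCode (.NET Core 2.1 and later, or .NET Standard 2.1), the manual multiply-and-add could probably be replaced by HashCode.Combine, which also tolerates null arguments. This is only a sketch of that alternative, not part of the implementation above; `HashedContact` is a hypothetical name, and getter-only properties assume C# 6 or later.

```csharp
using System;

// Hypothetical variant of the Contact class that delegates hashing to System.HashCode.
public sealed class HashedContact : IEquatable<HashedContact>
{
    public string FirstName { get; }
    public string LastName { get; }
    public string PhoneNumber { get; }

    public HashedContact(string firstName, string lastName, string phoneNumber)
    {
        FirstName = firstName;
        LastName = lastName;
        PhoneNumber = phoneNumber;
    }

    public override bool Equals(object other)
    {
        return Equals(other as HashedContact);
    }

    public bool Equals(HashedContact other)
    {
        if (ReferenceEquals(other, null)) return false;
        return FirstName == other.FirstName &&
               LastName == other.LastName &&
               PhoneNumber == other.PhoneNumber;
    }

    public override int GetHashCode()
    {
        // HashCode.Combine treats a null argument as hash 0, so no explicit null checks are needed.
        return HashCode.Combine(FirstName, LastName, PhoneNumber);
    }
}
```

The values produced this way are not meant to be stable across processes, so they should not be persisted or compared between runs.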
using System;
using System.Collections.Generic;
using System.Linq;

class Contact {
    public int Id { get; set; }
    public string Name { get; set; }

    public override string ToString()
    {
        return string.Format("{0}:{1}", Id, Name);
    }

    // Lazily created, shared comparer instance
    static private IEqualityComparer<Contact> comparer;
    static public IEqualityComparer<Contact> Comparer {
        get { return comparer ?? (comparer = new EqualityComparer()); }
    }

    class EqualityComparer : IEqualityComparer<Contact> {
        bool IEqualityComparer<Contact>.Equals(Contact x, Contact y)
        {
            if (x == y)
                return true;
            if (x == null || y == null)
                return false;
            return x.Name == y.Name; // let's compare by Name
        }

        int IEqualityComparer<Contact>.GetHashCode(Contact c)
        {
            // Hash by Name, consistent with Equals; a null Name hashes to 0
            return c.Name == null ? 0 : c.Name.GetHashCode();
        }
    }
}

class Program {
    public static void Main()
    {
        var list = new List<Contact> {
            new Contact { Id = 1, Name = "John" },
            new Contact { Id = 2, Name = "Sylvia" },
            new Contact { Id = 3, Name = "John" }
        };
        var distinctNames = list.Distinct(Contact.Comparer).ToList();
        foreach (var contact in distinctNames)
            Console.WriteLine(contact);
    }
}
This gives:
1:John
2:Sylvia
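sealed class ContactFirstNameLastNameComparer : IEqualityComparer<Contact>
{
    public bool Equals(Contact x, Contact y)
    {
        // Same instance (or both null) is equal; a single null is not
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.firstname == y.firstname && x.lastname == y.lastname;
    }

    public int GetHashCode(Contact obj)
    {
        // A null field contributes 0 instead of throwing
        return (obj.firstname == null ? 0 : obj.firstname.GetHashCode())
             ^ (obj.lastname == null ? 0 : obj.lastname.GetHashCode());
    }
}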
Then use System.Linq.Enumerable.Distinct (assuming you are using at least .NET 3.5):
var unique = contacts.Distinct (new ContactFirstNameLastNameComparer ()).ToArray ();
PS. Speaking of HashSet<>, note that HashSet<> accepts an IEqualityComparer<> as a constructor parameter.
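A minimal sketch of that constructor in use follows. It repeats the Contact type from the question and a comparer like the one above so that it compiles on its own; the `HashSetComparerDemo` class and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// The Contact type from the question, repeated so the sketch is self-contained.
class Contact
{
    public string firstname;
    public string lastname;
    public string phonenum;
}

// Compares contacts by first and last name only.
sealed class ContactFirstNameLastNameComparer : IEqualityComparer<Contact>
{
    public bool Equals(Contact x, Contact y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.firstname == y.firstname && x.lastname == y.lastname;
    }

    public int GetHashCode(Contact obj)
    {
        return (obj.firstname == null ? 0 : obj.firstname.GetHashCode())
             ^ (obj.lastname == null ? 0 : obj.lastname.GetHashCode());
    }
}

static class HashSetComparerDemo
{
    // Builds a set from sample data (invented for this example) using the custom comparer.
    public static HashSet<Contact> BuildSet()
    {
        var contacts = new List<Contact>
        {
            new Contact { firstname = "John", lastname = "Doe", phonenum = "111" },
            new Contact { firstname = "John", lastname = "Doe", phonenum = "222" },
            new Contact { firstname = "Jane", lastname = "Roe", phonenum = "333" }
        };
        // The comparer passed here decides what counts as a duplicate.
        return new HashSet<Contact>(contacts, new ContactFirstNameLastNameComparer());
    }

    public static void Main()
    {
        Console.WriteLine(BuildSet().Count); // prints 2
    }
}
```

Because the comparer ignores phonenum, the two "John Doe" entries collapse into one even though their phone numbers differ.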