我偶然发现了关于“生日悖论”以及在重写“GetHashCode”方法时的影响的一篇文章,这让我陷入了困境。
在测试中,我们发现在调用
我想我明白了为什么会这样,因为
如果你需要一个例子来说明我的意思:
C#
我可以通过重写
在测试中,我们发现在调用
ToLookup()
扩展方法时,只使用了GetHashCode
,尽管我们已经提供了Equals的实现。我想我明白了为什么会这样,因为
ToLookup
、HashSet
、Dictionary
等内部工作原理都是使用HashCodes来存储和/或索引它们的元素?
有没有办法以某种方式提供功能,使得相等比较实际上是使用equals方法执行的?还是我不应该担心碰撞问题?我自己没有做过数学计算,但根据我链接的第一篇文章,只需要在列表中有77,163个元素,就有50%的碰撞几率。
如果我理解正确,一个比较属性的Equals()重写方法,例如
Return (a.Property1 == b.Property1 && a.Property2 == b.Property2 && ...)
应该有零碰撞的机会吗?那么我怎样才能让我的 ToLookup()
进行这种等值比较呢?
如果你需要一个例子来说明我的意思:
C#
class Program
{
static void Main(string[] args)
{
DoStuff();
Console.ReadKey();
}
public class AnEntity
{
public int KeyProperty1 { get; set; }
public int KeyProperty2 { get; set; }
public int KeyProperty3 { get; set; }
public string OtherProperty1 { get; set; }
public List<string> OtherProperty2 { get; set; }
}
public class KeyEntity
{
public int KeyProperty1 { get; set; }
public int KeyProperty2 { get; set; }
public int KeyProperty3 { get; set; }
}
public static void DoStuff()
{
var a = new AnEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3, OtherProperty1 = "foo"};
var b = new AnEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3, OtherProperty1 = "bar"};
var c = new AnEntity {KeyProperty1 = 999, KeyProperty2 = 999, KeyProperty3 = 999, OtherProperty1 = "yada"};
var entityList = new List<AnEntity> { a, b, c };
var lookup = entityList.ToLookup(n => new KeyEntity {KeyProperty1 = n.KeyProperty1, KeyProperty2 = n.KeyProperty2, KeyProperty3 = n.KeyProperty3});
// I want these to all return true
Debug.Assert(lookup.Count == 2);
Debug.Assert(lookup[new KeyEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3}].First().OtherProperty1 == "foo");
Debug.Assert(lookup[new KeyEntity {KeyProperty1 = 1, KeyProperty2 = 2, KeyProperty3 = 3}].Last().OtherProperty1 == "bar");
Debug.Assert(lookup[new KeyEntity {KeyProperty1 = 999, KeyProperty2 = 999, KeyProperty3 = 999}].Single().OtherProperty1 == "yada");
}
}
VB
Module Program
Public Sub Main(args As String())
DoStuff()
Console.ReadKey()
End Sub
Public Class AnEntity
Public Property KeyProperty1 As Integer
Public Property KeyProperty2 As Integer
Public Property KeyProperty3 As Integer
Public Property OtherProperty1 As String
Public Property OtherProperty2 As List(Of String)
End Class
Public Class KeyEntity
Public Property KeyProperty1 As Integer
Public Property KeyProperty2 As Integer
Public Property KeyProperty3 As Integer
End Class
Public Sub DoStuff()
Dim a = New AnEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3, .OtherProperty1 = "foo"}
Dim b = New AnEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3, .OtherProperty1 = "bar"}
Dim c = New AnEntity With {.KeyProperty1 = 999, .KeyProperty2 = 999, .KeyProperty3 = 999, .OtherProperty1 = "yada"}
Dim entityList = New List(Of AnEntity) From {a, b, c}
Dim lookup = entityList.ToLookup(Function(n) New KeyEntity With {.KeyProperty1 = n.KeyProperty1, .KeyProperty2 = n.KeyProperty2, .KeyProperty3 = n.KeyProperty3})
' I want these to all return true
Debug.Assert(lookup.Count = 2)
Debug.Assert(lookup(New KeyEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3}).First().OtherProperty1 = "foo")
Debug.Assert(lookup(New KeyEntity With {.KeyProperty1 = 1, .KeyProperty2 = 2, .KeyProperty3 = 3}).Last().OtherProperty1 = "bar")
Debug.Assert(lookup(New KeyEntity With {.KeyProperty1 = 999, .KeyProperty2 = 999, .KeyProperty3 = 999}).Single().OtherProperty1 = "yada")
End Sub
End Module
我可以通过重写
GetHashcode()
来使其工作,没有问题。但是我不想使用GetHashcode
,因为如果我的列表中有109,125个元素,显然我已经有75%的碰撞几率了?如果使用上述的Equals()
重写,我认为碰撞几率将为0%。
GetHashCode()
的对象存储在一个字典中。通过自定义的Equals()
方法来确认传递的对象是否已经存在于字典中。如果不存在,则增加计数器,将当前计数器索引的对象存储在字典中,并返回当前计数器/索引。如果对象已经存在,则从字典中返回其键。然而,这种方法非常耗费资源,特别是当你只关心特别大的列表中的冲突时。有没有更好的方法呢? - Smudge202GetHashCode
重写中犯了个错误。创建了一个新的实现以匹配帖子中的代码后,它按预期工作并在哈希码匹配时调用Equals()
... 我会和镜子里的自己好好聊一下,找出问题所在。谢谢! - Smudge202