How is hits@k calculated, and what does it mean in the context of link prediction in knowledge bases?

Question

entity · operations-research · knowledge-graph · entity-linking

7

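I am studying papers on link prediction in knowledge networks. The authors usually report "Hits@k". I would like to know how hits@k is calculated, and what it says about the model and the results.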

- SilentFlame

1 Answer

- lukostaz · Accepted Answer

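In a nutshell, it counts how many positive triples are ranked within the top n positions when each is scored against a set of synthetic negatives (the n here is the k in Hits@k).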

In the following example, pretend the test set contains only two ground-truth positives:

Jack   born_in   Italy
Jack   friend_with   Thomas

Let's assume each of these positive triples (marked with * below) is ranked against four synthetic negatives.

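Now, use your pre-trained embedding model to assign a score to each positive and to its synthetic negatives. Then sort the triples by score in descending order. In the example below, the first triple ranks second and the other ranks first (each against its own synthetic negatives):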

s        p         o            score   rank
Jack   born_in   Ireland        0.789      1
Jack   born_in   Italy          0.753      2  *
Jack   born_in   Germany        0.695      3
Jack   born_in   China          0.456      4
Jack   born_in   Thomas         0.234      5

s        p         o            score   rank
Jack   friend_with   Thomas     0.901      1  *
Jack   friend_with   China      0.345      2
Jack   friend_with   Italy      0.293      3
Jack   friend_with   Ireland    0.201      4
Jack   friend_with   Germany    0.156      5
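The ranking step above can be sketched in a few lines of Python. This is only an illustration: `rank_of_positive` is a hypothetical helper (not part of any library), the scores are copied from the two tables, and ties are resolved in favour of the positive, which is an assumption that real evaluation code may handle differently.

```python
from typing import Sequence


def rank_of_positive(positive_score: float, negative_scores: Sequence[float]) -> int:
    """Return the 1-based rank of a positive triple among its negatives.

    Higher scores are better. Ties favour the positive (an assumption).
    """
    # Every negative that strictly outscores the positive pushes it down one place.
    outscoring = sum(1 for score in negative_scores if score > positive_score)
    return outscoring + 1


# Scores taken from the two tables above.
born_in_rank = rank_of_positive(0.753, [0.789, 0.695, 0.456, 0.234])
friend_with_rank = rank_of_positive(0.901, [0.345, 0.293, 0.201, 0.156])

print(born_in_rank, friend_with_rank)  # 2 1
```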

Then count how many positives land in the top-1 or top-3 positions, and divide by the number of triples in the test set (2 in this example):

Hits@3= 2/2 = 1.0
Hits@1= 1/2 = 0.5
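The same arithmetic, as a minimal Python sketch. `hits_at_k` here is a hypothetical helper written for this example, not the AmpliGraph function; it takes the 1-based rank of each positive (2 and 1 in the tables above) and returns the fraction that fall within the top k.

```python
from typing import Sequence


def hits_at_k(ranks: Sequence[int], k: int) -> float:
    """Fraction of positives whose 1-based rank is at most k."""
    if not ranks:
        raise ValueError("ranks must contain at least one entry")
    # A positive counts as a hit when it appears in the top k positions.
    hits = sum(1 for rank in ranks if rank <= k)
    return hits / len(ranks)


# Ranks of the two positives from the example: born_in -> 2, friend_with -> 1.
ranks = [2, 1]

print(hits_at_k(ranks, 3))  # 1.0
print(hits_at_k(ranks, 1))  # 0.5
```

A higher value means the model places true triples near the top of the ranking more often; Hits@1 is the strictest setting, since only a positive ranked first counts.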

AmpliGraph has an API to compute Hits@n; see its documentation for details.