如何计算图的熵?

17

我有一组随机生成的正式图形,想计算每个图形的熵。换句话说,我有几个网络,想计算每个网络的信息量。

以下是两个包含图形熵正式定义的来源:
http://www.cs.washington.edu/homes/anuprao/pubs/CSE533Autumn2010/lecture4.pdf (PDF格式) http://arxiv.org/abs/0711.4175v1

我需要的代码以图形作为输入(作为边列表或邻接矩阵),并输出比特数或其他某种信息内容度量。

由于我找不到任何实现此功能的代码,因此我打算根据正式定义从头开始编写代码。如果有人已经解决了这个问题并愿意分享代码,那将非常感激。

3个回答

8

我最终使用了两种不同的论文来定义图熵:
复杂网络信息理论:演化与结构约束
R.V. Sole 和 S. Valverde(2004)

基于拓扑配置的网络熵及其在随机网络中的计算
B.H. Wang、W.X. Wang 和 T. Zhou

下面是计算每种方法的代码。代码假定您有一个无向无权图,没有自环。它以邻接矩阵作为输入,并返回以比特为单位的熵量。它是用R实现的,并利用了sna包

graphEntropy <- function(adj, type="SoleValverde") {
  if (type == "SoleValverde") {
    return(graphEntropySoleValverde(adj))
  }
  else {
    return(graphEntropyWang(adj))
  }
}

graphEntropySoleValverde <- function(adj) {
  # Calculate Sole & Valverde, 2004 graph entropy
  # Uses Equations 1 and 4
  # First we need the denominator of q(k)
  # To get it we need the probability of each degree
  # First get the number of nodes with each degree
  existingDegrees = degree(adj)/2
  maxDegree = nrow(adj) - 1
  allDegrees = 0:maxDegree
  degreeDist = matrix(0, 3, length(allDegrees)+1) # Need an extra zero prob degree for later calculations
  degreeDist[1,] = 0:(maxDegree+1)
  for(aDegree in allDegrees) {
    degreeDist[2,aDegree+1] = sum(existingDegrees == aDegree)
  }
  # Calculate probability of each degree
  for(aDegree in allDegrees) {
    degreeDist[3,aDegree+1] = degreeDist[2,aDegree+1]/sum(degreeDist[2,])
  }
  # Sum of all degrees mult by their probability
  sumkPk = 0
  for(aDegree in allDegrees) {
    sumkPk = sumkPk + degreeDist[2,aDegree+1] * degreeDist[3,aDegree+1]
  }
  # Equivalent is sum(degreeDist[2,] * degreeDist[3,])
  # Now we have all the pieces we need to calculate graph entropy
  graphEntropy = 0
  for(aDegree in 1:maxDegree) {
    q.of.k = ((aDegree + 1)*degreeDist[3,aDegree+2])/sumkPk
    # 0 log2(0) is defined as zero
    if (q.of.k != 0) {
      graphEntropy = graphEntropy + -1 * q.of.k * log2(q.of.k)
    }
  }
  return(graphEntropy)
}

graphEntropyWang <- function(adj) {
  # Calculate Wang, 2008 graph entropy
  # Uses Equation 14
  # bigN is simply the number of nodes
  # littleP is the link probability.  That is the same as graph density calculated by sna with gden().
  bigN = nrow(adj)
  littleP = gden(adj)
  graphEntropy = 0
  if (littleP != 1 && littleP != 0) {
    graphEntropy = -1 * .5 * bigN * (bigN - 1) * (littleP * log2(littleP) + (1-littleP) * log2(1-littleP))
  }
  return(graphEntropy)
}

6
顺便说一下,我曾经实施了这些功能并计算了真实图的熵,但是我对这些度量感到失望。Wang度量仅取决于图的大小和密度,完全不考虑图的结构,它主要是密度的度量。Sole度量则反映了节点度数计数之间剩余差异的多样性。它更多地是相似性的度量而不是其他任何东西。我仍然不知道如何量化一个图的复杂程度。 - shotgun_approach

1
你可以使用Koerner's entropy(应用于图形的Shannon熵)。有关文献的良好参考是这里。但请注意,计算通常是NP-hard的(因为需要搜索所有顶点子集,这是愚蠢的原因)。

1
如果您有一个带权图,一个好的开始是对所有权重进行排序和计数。然后,您可以使用公式-log(p)+log(2)(http://en.wikipedia.org/wiki/Binary_entropy_function)来确定所需代码的位数。也许这不起作用是因为它是二进熵函数?

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接