Clustering network nodes by attribute

I have a data frame similar to d, shown below.

d <- structure(list(ID = c("KP1009", "GP3040", "KP1757", "GP2243", 
                           "KP682", "KP1789", "KP1933", "KP1662", "KP1718", "GP3339", "GP4007", 
                           "GP3398", "GP6720", "KP808", "KP1154", "KP748", "GP4263", "GP1132", 
                           "GP5881", "GP6291", "KP1004", "KP1998", "GP4123", "GP5930", "KP1070", 
                           "KP905", "KP579", "KP1100", "KP587", "GP913", "GP4864", "KP1513", 
                           "GP5979", "KP730", "KP1412", "KP615", "KP1315", "KP993", "GP1521", 
                           "KP1034", "KP651", "GP2876", "GP4715", "GP5056", "GP555", "GP408", 
                           "GP4217", "GP641"),
                    Type = c("B", "A", "B", "A", "B", "B", "B", 
                             "B", "B", "A", "A", "A", "A", "B", "B", "B", "A", "A", "A", "A", 
                             "B", "B", "A", "A", "B", "B", "B", "B", "B", "A", "A", "B", "A", 
                             "B", "B", "B", "B", "B", "A", "B", "B", "A", "A", "A", "A", "A", 
                             "A", "A"),
                    Set = c(15L, 1L, 10L, 21L, 5L, 9L, 12L, 15L, 16L, 
                            19L, 22L, 3L, 12L, 22L, 15L, 25L, 10L, 25L, 12L, 3L, 10L, 8L, 
                            8L, 20L, 20L, 19L, 25L, 15L, 6L, 21L, 9L, 5L, 24L, 9L, 20L, 5L, 
                            2L, 2L, 11L, 9L, 16L, 10L, 21L, 4L, 1L, 8L, 5L, 11L), Loc = c(3L, 
                            2L, 3L, 1L, 3L, 3L, 3L, 1L, 2L, 1L, 3L, 1L, 1L, 2L, 2L, 1L, 3L, 
                            2L, 2L, 2L, 3L, 2L, 3L, 2L, 1L, 3L, 3L, 3L, 2L, 3L, 1L, 3L, 3L, 
                            1L, 3L, 2L, 3L, 1L, 1L, 1L, 2L, 3L, 3L, 3L, 2L, 2L, 3L, 3L)),
               .Names = c("ID", "Type", "Set", "Loc"), class = "data.frame",
               row.names = c(NA, -48L))

I am trying to visualize the sets in the data frame (d$Set) as a network graph.

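# Sets with at least two members (only these produce edges)
sets <- unique(d$Set[duplicated(d$Set)])
rel <- vector("list", length(sets))
for (i in seq_along(sets)) {
  # Every pair of IDs within the same set becomes an edge
  rel[[i]] <- as.data.frame(t(combn(subset(d, d$Set == sets[i])$ID, 2)))
}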
library(data.table)
rel <- rbindlist(rel)

library(igraph)
g <- graph.data.frame(rel, directed=FALSE, vertices=d)

V(g)$color <- ifelse(V(g)$Type == "A", "red", "green")

layout <- layout.fruchterman.reingold(g, niter = 500)

plot.igraph(g, vertex.size=8,
            vertex.label.cex=0.9, layout = layout)
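
For reference, the edge list built above links every pair of IDs that share a Set. A minimal sketch of that single step on made-up IDs (x1, x2 and x3 are hypothetical and do not appear in d):

```r
# Hypothetical members of one set (illustration only, not taken from d)
ids <- c("x1", "x2", "x3")

# combn() returns one column per pair; transposing gives one row per edge
pairs <- as.data.frame(t(combn(ids, 2)), stringsAsFactors = FALSE)
print(pairs)
```

A set with n members therefore contributes choose(n, 2) edges, and a set with a single member contributes none, which is why only duplicated Set values are looped over.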

I have coloured the nodes according to V(g)$Type (that is, d$Type).

[image: the resulting network plot]

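Now, in the final plot, sets of all types are clumped together. I would like to plot sets whose members are of the same type as separate groups, so that there end up being three groups of sets:

  1. Sets whose members are all of type A
  2. Sets whose members are all of type B
  3. Sets with members of both type A and type B

Something like this: [image: the desired plot, with the sets separated into three groups]

How can I achieve this kind of clustering with the igraph package?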


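I think you left out where layout is defined. It is probably layout.fruchterman.reingold with some non-default parameters. Also, you seem to have already drawn the clustered graph somehow; what tool did you use to make the final image? - hrbrmstr
@hrbrmstr Yes, the layout is layout.fruchterman.reingold. I have updated the code. - Crops
@hrbrmstr The last image was not generated with igraph. It shows the result I ultimately want. To make the desired result clear, I created it by editing the first image in image-editing software. - Crops
Could you subset the pairwise data and then use par(mfcol=c(nrows=1, ncols=3)) to draw three separate graphs side by side? - hrbrmstr
@hrbrmstr I guess that is feasible with only two types of nodes (A = red and B = green). But the data I am working with may have up to 6 types. - Crops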
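1 Answer

Here is how I solved it, using your code above to create the g object. It is trickier than it looks at first glance, because you want multi-colour membership at the group/connectivity/cluster level.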
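##  Find cluster membership (named cl and memb_df so that base::c and the
##  question's data frame d are not overwritten):
cl <- clusters(g)
memb_df <- data.frame(membership=cl$membership, color=V(g)$color, id=1:length(V(g)))
##  Count the red and green vertices in each cluster:
cl$red_members <- aggregate(memb_df$color=="red", by=list(memb_df$membership), FUN=sum)[,2]
cl$green_members <- aggregate(memb_df$color=="green", by=list(memb_df$membership), FUN=sum)[,2]
##  Flag each vertex by whether its cluster contains red / green vertices:
V(g)$group_has_red <- (cl$red_members[ cl$membership ] > 0)
V(g)$group_has_green <- (cl$green_members[ cl$membership ] > 0)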


##  Create sub-graphs containing the appropriate membership:
g_mixed <- delete.vertices(g, !(V(g)$group_has_red & V(g)$group_has_green))
g_red <- delete.vertices(g, !(V(g)$group_has_red & !(V(g)$group_has_green)))
g_green <- delete.vertices(g, !(V(g)$group_has_green & !(V(g)$group_has_red)))

par(mfrow=c(1,3))
plot(g_green, vertex.size=8, vertex.label=NA)
plot(g_mixed, vertex.size=8, vertex.label=NA)
plot(g_red, vertex.size=8, vertex.label=NA)

[image: the three resulting plots, side by side]
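
The code above is written for two colours. As a rough sketch of how the same idea might extend to more types (the comments mention up to six) and to a single plot, one could label each connected component by the set of types it contains and then lay the groups out side by side. The toy data and the names nodes, edges, g2, label, combo and lay are illustrative assumptions and are not part of the original question or answer.

```r
library(igraph)

# Hypothetical toy data (not the d from the question): three types,
# three connected components.
nodes <- data.frame(name = c("a1", "a2", "b1", "b2", "c1", "a3"),
                    Type = c("A", "A", "B", "B", "C", "A"),
                    stringsAsFactors = FALSE)
edges <- data.frame(from = c("a1", "b1", "c1"),
                    to   = c("a2", "b2", "a3"),
                    stringsAsFactors = FALSE)
g2 <- graph_from_data_frame(edges, directed = FALSE, vertices = nodes)

# Label each connected component with the sorted set of types it contains,
# e.g. "A", "B" or "A+C", and copy that label onto its vertices.
memb  <- components(g2)$membership
label <- tapply(V(g2)$Type, memb,
                function(x) paste(sort(unique(x)), collapse = "+"))
V(g2)$combo <- as.vector(label[as.character(memb)])

# Lay out each group on its own, rescale it to the unit square, then shift
# it along the x axis so the groups sit next to each other in one plot.
groups <- sort(unique(V(g2)$combo))
lay <- matrix(NA_real_, nrow = vcount(g2), ncol = 2)
for (i in seq_along(groups)) {
  sub <- induced_subgraph(g2, which(V(g2)$combo == groups[i]))
  xy  <- norm_coords(layout_with_fr(sub),
                     xmin = 0, xmax = 1, ymin = 0, ymax = 1)
  xy[, 1] <- xy[, 1] + 1.5 * (i - 1)           # horizontal offset per group
  lay[match(V(sub)$name, V(g2)$name), ] <- xy  # map rows back by vertex name
}

plot(g2, layout = lay, vertex.size = 8, vertex.label = V(g2)$combo)
```

Because the grouping key is built from whatever types occur in a component, no per-colour columns such as group_has_red are needed. The 1.5 spacing is an arbitrary choice that leaves a gap between unit-wide groups.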

