Network chord diagram woes in R

50

I have some data similar to the data.frame d, as follows.

d <- structure(list(ID = c("KP1009", "GP3040", "KP1757", "GP2243", 
                           "KP682", "KP1789", "KP1933", "KP1662", "KP1718", "GP3339", "GP4007", 
                           "GP3398", "GP6720", "KP808", "KP1154", "KP748", "GP4263", "GP1132", 
                           "GP5881", "GP6291", "KP1004", "KP1998", "GP4123", "GP5930", "KP1070", 
                           "KP905", "KP579", "KP1100", "KP587", "GP913", "GP4864", "KP1513", 
                           "GP5979", "KP730", "KP1412", "KP615", "KP1315", "KP993", "GP1521", 
                           "KP1034", "KP651", "GP2876", "GP4715", "GP5056", "GP555", "GP408", 
                           "GP4217", "GP641"),
                    Type = c("B", "A", "B", "A", "B", "B", "B", 
                             "B", "B", "A", "A", "A", "A", "B", "B", "B", "A", "A", "A", "A", 
                             "B", "B", "A", "A", "B", "B", "B", "B", "B", "A", "A", "B", "A", 
                             "B", "B", "B", "B", "B", "A", "B", "B", "A", "A", "A", "A", "A", 
                             "A", "A"),
                    Set = c(15L, 1L, 10L, 21L, 5L, 9L, 12L, 15L, 16L, 
                            19L, 22L, 3L, 12L, 22L, 15L, 25L, 10L, 25L, 12L, 3L, 10L, 8L, 
                            8L, 20L, 20L, 19L, 25L, 15L, 6L, 21L, 9L, 5L, 24L, 9L, 20L, 5L, 
                            2L, 2L, 11L, 9L, 16L, 10L, 21L, 4L, 1L, 8L, 5L, 11L), Loc = c(3L, 
                                                                                          2L, 3L, 1L, 3L, 3L, 3L, 1L, 2L, 1L, 3L, 1L, 1L, 2L, 2L, 1L, 3L, 
                                                                                          2L, 2L, 2L, 3L, 2L, 3L, 2L, 1L, 3L, 3L, 3L, 2L, 3L, 1L, 3L, 3L, 
                                                                                          1L, 3L, 2L, 3L, 1L, 1L, 1L, 2L, 3L, 3L, 3L, 2L, 2L, 3L, 3L)),
               .Names = c("ID", "Type", "Set", "Loc"), class = "data.frame",
               row.names = c(NA, -48L))

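I want to explore the relationships between the members of d$ID using a chord diagram similar to the one below.

[Image: example chord diagram]

There seem to be several ways to do this in R (see "Chord diagram in R").
In my data, the relationships are defined by d$Set (undirected) and the grouping by d$Loc. Below are my attempts to map these relationships as a chord diagram.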

Attempt 1: Using igraph

I have tried igraph, with the node size scaled by degree.
# Load the packages used below
library(igraph)
library(data.table)

# Get vertex relationships: one row per pair of IDs that share a Set
sets <- unique(d$Set[duplicated(d$Set)])
rel <- vector("list", length(sets))
for (i in seq_along(sets)) {
  rel[[i]] <- as.data.frame(t(combn(subset(d, d$Set == sets[i])$ID, 2)))
}
rel <- rbindlist(rel)

# Get the graph and colour the vertices by Loc
g <- graph.data.frame(rel, directed = FALSE, vertices = d)
clr <- as.factor(V(g)$Loc)
levels(clr) <- c("salmon", "wheat", "lightskyblue")
V(g)$color <- as.character(clr)

# Plot on a circle, with vertex size proportional to degree
plot(g, layout = layout.circle, vertex.size = degree(g) * 5, vertex.label = NA)

[Image: igraph plot from Attempt 1]

How can I modify the plot so that it looks like the first image? There seems to be no option to modify igraph's layout.circle.
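One possible direction, sketched here on a small toy data frame rather than on d: layout_in_circle() in igraph accepts an order argument, so the vertices can be placed around the circle in Loc order, and edge.curved bends the edges. The toy values below are hypothetical, and this is only a sketch in the same spirit as the adjacency-ordering suggestion in the comments, not a full reproduction of the target image.

```r
library(igraph)

# Toy stand-in for d (hypothetical values; same columns as in the question)
toy <- data.frame(
  ID  = c("a", "b", "c", "d", "e", "f"),
  Set = c(1L, 1L, 2L, 2L, 2L, 3L),
  Loc = c(2L, 1L, 3L, 1L, 2L, 3L),
  stringsAsFactors = FALSE
)

# One undirected edge per pair of IDs sharing a Set (singleton sets give no edge)
pairs <- lapply(split(toy$ID, toy$Set), function(ids) {
  if (length(ids) < 2) return(NULL)
  as.data.frame(t(combn(ids, 2)), stringsAsFactors = FALSE)
})
rel <- do.call(rbind, pairs)

g <- graph_from_data_frame(rel, directed = FALSE, vertices = toy)
V(g)$color <- c("salmon", "wheat", "lightskyblue")[V(g)$Loc]

# Place the vertices around the circle in Loc order so each group is contiguous
lay <- layout_in_circle(g, order = order(V(g)$Loc))

# The "+ 5" keeps isolated vertices (degree 0) visible
plot(g, layout = lay, vertex.size = degree(g) * 5 + 5,
     vertex.label = NA, edge.curved = 0.3)
```

With the real data, the same two arguments (layout and edge.curved) can be passed to the plot() call from Attempt 1.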

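Attempt 2: Using circlize

With circlize, smoother Bezier curves and grouping seem possible. But here I am unable to group the nodes and scale their size by degree, because they are plotted as sectors.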

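library(circlize)

par(mar = c(1, 1, 1, 1), lwd = 0.1, cex = 0.7)
ids <- as.factor(d$ID)
# Sectors are drawn in factor-level order, so align the vertex colours to it
sector_col <- V(g)$color[match(levels(ids), V(g)$name)]
circos.initialize(factors = ids, xlim = c(0, 10))
circos.trackPlotRegion(factors = ids, ylim = c(0, 0.5), bg.col = sector_col,
                       bg.border = NA, track.height = 0.05)
for (i in seq_len(nrow(rel))) {
  # Index the columns by name so that single values are passed to circos.link()
  circos.link(as.character(rel$V1[i]), 0, as.character(rel$V2[i]), 0, h = 0.4)
}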

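[Image: circlize plot from Attempt 2]

However, there seems to be no option to modify the nodes here. Can they really only be plotted as sectors? If so, is there any way to turn the sectors into circular nodes sized by degree?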
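A partial workaround, sketched under assumptions: circlize still draws sectors rather than circles, but the sector width can encode the degree (by giving each sector its own xlim row) and the sectors can be grouped by Loc (by ordering the factor levels and widening the gap between groups). The toy data, the gap sizes and the "degree + 1" width rule are all illustrative choices, not something from the question.

```r
library(circlize)

# Toy stand-in for d (hypothetical values; same columns as in the question)
toy <- data.frame(
  ID  = c("a", "b", "c", "d", "e", "f"),
  Set = c(1L, 1L, 2L, 2L, 2L, 3L),
  Loc = c(2L, 1L, 3L, 1L, 2L, 3L),
  stringsAsFactors = FALSE
)

# Each ID belongs to one Set and is linked to every other member of it,
# so its degree is the Set size minus one
toy$deg <- ave(toy$Set, toy$Set, FUN = length) - 1L

# Sort by Loc so that the members of a group sit next to each other
toy <- toy[order(toy$Loc, toy$ID), ]
ids <- factor(toy$ID, levels = toy$ID)

# Sector width follows the x range: degree + 1 keeps isolated IDs visible
xl <- cbind(0, toy$deg + 1)
# Wider gap (in degrees) after the last sector of each Loc group
gaps <- ifelse(c(diff(toy$Loc) != 0, TRUE), 10, 2)

circos.clear()
circos.par(gap.after = gaps, cell.padding = c(0, 0, 0, 0))
circos.initialize(ids, xlim = xl)
circos.trackPlotRegion(ylim = c(0, 1), track.height = 0.05, bg.border = NA,
                       bg.col = c("salmon", "wheat", "lightskyblue")[toy$Loc])

# One link per pair of IDs sharing a Set, anchored at the sector midpoints
mid <- setNames((toy$deg + 1) / 2, toy$ID)
for (ids_in_set in split(toy$ID, toy$Set)) {
  if (length(ids_in_set) < 2) next
  pairs <- combn(ids_in_set, 2)
  for (j in seq_len(ncol(pairs))) {
    circos.link(pairs[1, j], mid[[pairs[1, j]]],
                pairs[2, j], mid[[pairs[2, j]]], h = 0.4)
  }
}
circos.clear()
```

This does not answer the question of circular nodes; it only shows that degree and grouping can be expressed with sectors.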

Attempt 3: Using edgebundleR (https://github.com/garthtarr/edgebundleR)

require(edgebundleR)
edgebundle(g,tension = 0.1,cutoff = 0.5, fontsize = 18,padding=40)

[Image: edgebundleR plot from Attempt 3]

The options for modifying the aesthetics seem limited here.


2
How about http://christophergandrud.github.io/d3Network/? - Roman Luštrik
1
You can group the variables by ordering the adjacency matrix, and add some curve to the edges with the edge.curved argument. Apologies for the code dump: m <- tcrossprod(table(d[c(1,3)])) ; grp <- d[order(d$ID), "Loc"] ; m2 <- m[order(grp), order(grp) ] ; diag(m2) <- 0 ; g <- graph.adjacency(m2, mode="undirected"); clr <- as.factor(sort(grp)); levels(clr) <- c("salmon", "wheat", "lightskyblue"); V(g)$color <- as.character(clr); par(mar=rep(0,4)); plot(g, layout = layout.circle, vertex.size=degree(g)*5, vertex.label=NA, edge.curved=seq(-0.5, 0.5, length = ecount(g))) - user20650
1
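Hi Crops; yes, it is almost there, but not quite solved. Because the question was closed as a duplicate (hence the code dump above), I could not post an answer. - user20650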
2
I know you are using R, but why not try circos (http://circos.ca/)? An alternative using R + circos is http://www.bioconductor.org/packages/release/bioc/html/OmicCircos.html. - AndreiR
1
I would be happy to add aesthetics to edgebundleR. Do you have the source code for the image you referenced? - timelyportfolio
2 Answers

30

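I made a lot of changes to edgebundleR, and they are now in the main repository. The following code should get you close to the desired result. (Live example)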

# devtools::install_github("garthtarr/edgebundleR")

library(edgebundleR)
library(igraph)
library(data.table)

d <- structure(list(ID = c("KP1009", "GP3040", "KP1757", "GP2243", 
                           "KP682", "KP1789", "KP1933", "KP1662", "KP1718", "GP3339", "GP4007", 
                           "GP3398", "GP6720", "KP808", "KP1154", "KP748", "GP4263", "GP1132", 
                           "GP5881", "GP6291", "KP1004", "KP1998", "GP4123", "GP5930", "KP1070", 
                           "KP905", "KP579", "KP1100", "KP587", "GP913", "GP4864", "KP1513", 
                           "GP5979", "KP730", "KP1412", "KP615", "KP1315", "KP993", "GP1521", 
                           "KP1034", "KP651", "GP2876", "GP4715", "GP5056", "GP555", "GP408", 
                           "GP4217", "GP641"),
                    Type = c("B", "A", "B", "A", "B", "B", "B", 
                             "B", "B", "A", "A", "A", "A", "B", "B", "B", "A", "A", "A", "A", 
                             "B", "B", "A", "A", "B", "B", "B", "B", "B", "A", "A", "B", "A", 
                             "B", "B", "B", "B", "B", "A", "B", "B", "A", "A", "A", "A", "A", 
                             "A", "A"),
                    Set = c(15L, 1L, 10L, 21L, 5L, 9L, 12L, 15L, 16L, 
                            19L, 22L, 3L, 12L, 22L, 15L, 25L, 10L, 25L, 12L, 3L, 10L, 8L, 
                            8L, 20L, 20L, 19L, 25L, 15L, 6L, 21L, 9L, 5L, 24L, 9L, 20L, 5L, 
                            2L, 2L, 11L, 9L, 16L, 10L, 21L, 4L, 1L, 8L, 5L, 11L), Loc = c(3L, 
                                                                                          2L, 3L, 1L, 3L, 3L, 3L, 1L, 2L, 1L, 3L, 1L, 1L, 2L, 2L, 1L, 3L, 
                                                                                          2L, 2L, 2L, 3L, 2L, 3L, 2L, 1L, 3L, 3L, 3L, 2L, 3L, 1L, 3L, 3L, 
                                                                                          1L, 3L, 2L, 3L, 1L, 1L, 1L, 2L, 3L, 3L, 3L, 2L, 2L, 3L, 3L)),
               .Names = c("ID", "Type", "Set", "Loc"), class = "data.frame",
               row.names = c(NA, -48L))

# let's add Loc to our ID
d$key <- d$ID
d$ID <- paste0(d$Loc,".",d$ID)

# Get vertex relationships
sets <- unique(d$Set[duplicated(d$Set)])
rel <-  vector("list", length(sets))
for (i in 1:length(sets)) {
  rel[[i]] <- as.data.frame(t(combn(subset(d, d$Set ==sets[i])$ID, 2)))
}

rel <- rbindlist(rel)

# Get the graph
g <- graph.data.frame(rel, directed=F, vertices=d)
clr <- as.factor(V(g)$Loc)
levels(clr) <- c("salmon", "wheat", "lightskyblue")
V(g)$color <- as.character(clr)
V(g)$size <- degree(g) * 5
# Plot
plot(g, layout = layout.circle, vertex.label = NA)

# Build the edge bundle widget and display it
eb <- edgebundle(g)
eb

[Image: edgebundleR output]


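How do I change the color of the edges? - rrs
These three lines define the colors: clr <- as.factor(V(g)$Loc) levels(clr) <- c("salmon", "wheat", "lightskyblue") V(g)$color <- as.character(clr) There are other ways too. For example, just doing V(g)$color <- "red" will make everything red. - timelyportfolio
That approach does not work if you want to color each edge by some other parameter. For example, in igraph you can color edges through E(g)$color, but the edgebundleR package only uses the color of the source node to color the edges. So all outgoing edges of a node have to be the same color. - rrs
I see, that should have been obvious. Sorry it took me a while to understand. These lines https://github.com/garthtarr/edgebundleR/blob/master/inst/htmlwidgets/edgebundleR.js#L80 show the problem you mention. Let me experiment a bit and try to come up with an answer. - timelyportfolio
Thanks @timelyportfolio. I have been looking at that line too, but I am not sure how to get the edge attributes from the graph. I am comfortable with R but do not know JavaScript, so I could not even figure out how the source node gets passed! - rrs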
1
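After re-familiarizing myself with the code, I realized this will require either rewriting most of the code or a hack. I will post the hack in an answer below. - timelyportfolio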

8

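I am sorry to add another answer to this question, but I did not know how else to handle the additional question raised in the comments. The comment asks how to color the edges. Normally the answer would be easy, but in this case it requires either rewriting most of the code in edgebundleR or a hack. I will use the hack below.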

library(edgebundleR)
library(igraph)
library(data.table)

d <- structure(list(ID = c("KP1009", "GP3040", "KP1757", "GP2243", 
                           "KP682", "KP1789", "KP1933", "KP1662", "KP1718", "GP3339", "GP4007", 
                           "GP3398", "GP6720", "KP808", "KP1154", "KP748", "GP4263", "GP1132", 
                           "GP5881", "GP6291", "KP1004", "KP1998", "GP4123", "GP5930", "KP1070", 
                           "KP905", "KP579", "KP1100", "KP587", "GP913", "GP4864", "KP1513", 
                           "GP5979", "KP730", "KP1412", "KP615", "KP1315", "KP993", "GP1521", 
                           "KP1034", "KP651", "GP2876", "GP4715", "GP5056", "GP555", "GP408", 
                           "GP4217", "GP641"),
                    Type = c("B", "A", "B", "A", "B", "B", "B", 
                             "B", "B", "A", "A", "A", "A", "B", "B", "B", "A", "A", "A", "A", 
                             "B", "B", "A", "A", "B", "B", "B", "B", "B", "A", "A", "B", "A", 
                             "B", "B", "B", "B", "B", "A", "B", "B", "A", "A", "A", "A", "A", 
                             "A", "A"),
                    Set = c(15L, 1L, 10L, 21L, 5L, 9L, 12L, 15L, 16L, 
                            19L, 22L, 3L, 12L, 22L, 15L, 25L, 10L, 25L, 12L, 3L, 10L, 8L, 
                            8L, 20L, 20L, 19L, 25L, 15L, 6L, 21L, 9L, 5L, 24L, 9L, 20L, 5L, 
                            2L, 2L, 11L, 9L, 16L, 10L, 21L, 4L, 1L, 8L, 5L, 11L), Loc = c(3L, 
                                                                                          2L, 3L, 1L, 3L, 3L, 3L, 1L, 2L, 1L, 3L, 1L, 1L, 2L, 2L, 1L, 3L, 
                                                                                          2L, 2L, 2L, 3L, 2L, 3L, 2L, 1L, 3L, 3L, 3L, 2L, 3L, 1L, 3L, 3L, 
                                                                                          1L, 3L, 2L, 3L, 1L, 1L, 1L, 2L, 3L, 3L, 3L, 2L, 2L, 3L, 3L)),
               .Names = c("ID", "Type", "Set", "Loc"), class = "data.frame",
               row.names = c(NA, -48L))

# let's add Loc to our ID
d$key <- d$ID
d$ID <- paste0(d$Loc,".",d$ID)

# Get vertex relationships
sets <- unique(d$Set[duplicated(d$Set)])
rel <-  vector("list", length(sets))
for (i in 1:length(sets)) {
  rel[[i]] <- as.data.frame(t(combn(subset(d, d$Set ==sets[i])$ID, 2)))
}

rel <- rbindlist(rel)

# Get the graph
g <- graph.data.frame(rel, directed=F, vertices=d)
clr <- as.factor(V(g)$Loc)
levels(clr) <- c("salmon", "wheat", "lightskyblue")
V(g)$color <- as.character(clr)

# Plot
plot(g, layout = layout.circle, vertex.size=degree(g)*5, vertex.label=NA)


eb <- edgebundle(g)

eb

# temporary hack to accomplish edge coloring
# requires newest Github version of htmlwidgets
# devtools::install_github("ramnathv/htmlwidgets")

# add some imaginary colors
E(g)$color <- c("purple","green","black")[floor(runif(length(E(g)),1,4))]
# now append these edge attributes to our htmlwidget x
eb$x$edges <- jsonlite::toJSON(get.data.frame(g,what="edges"))

eb <- htmlwidgets::onRender(
  eb,
'
function(el,x){
  // loop through each of our edges supplied
  //  and change the color
  x.edges.map(function(edge){
    var source = edge.from.split(".")[1];
    var target = edge.to.split(".")[1];
    d3.select(el).select(".link.source-" + source + ".target-" + target)
      .style("stroke",edge.color);
  })
}
'
)
eb
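The hack above assigns random colors to the edges. As a sketch of how E(g)$color could instead be derived from the data, the example below colors an edge by whether its two endpoints share the same Loc. The toy graph, the rule and the two color names are assumptions chosen for illustration; the resulting E(g)$color would then be passed to the widget in the same way as the random colors.

```r
library(igraph)

# Toy graph (hypothetical); IDs carry the "Loc." prefix used in the answer
toy_nodes <- data.frame(
  ID  = c("1.a", "1.b", "2.c", "3.d"),
  Loc = c(1L, 1L, 2L, 3L),
  stringsAsFactors = FALSE
)
toy_edges <- data.frame(
  from = c("1.a", "1.a", "2.c"),
  to   = c("1.b", "2.c", "3.d"),
  stringsAsFactors = FALSE
)
g <- graph_from_data_frame(toy_edges, directed = FALSE, vertices = toy_nodes)

# Look up the Loc of both endpoints of every edge by vertex name
loc_of <- setNames(V(g)$Loc, V(g)$name)
ep <- ends(g, E(g))  # two-column character matrix of endpoint names
same_loc <- unname(loc_of[ep[, 1]] == loc_of[ep[, 2]])

# Within-group edges in one color, between-group edges in another
E(g)$color <- ifelse(same_loc, "purple", "green")

# Edge table including the new color column
igraph::as_data_frame(g, what = "edges")
```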

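For some reason this does not work. I can update the edges and see e.g. "color":"green" in the big JSON mess, but when I run the code from onRender onward, the graph still looks the same. - rrs
Is there any way you could use saveWidget and post to a gist? Have you installed the latest htmlwidgets from Github? - timelyportfolio
I just tried this too, after installing the latest htmlwidgets and edgebundleR. This is the error message: Error: 'onRender' is not an exported object from 'namespace:htmlwidgets' - jalapic
The functionality used here was introduced in https://github.com/ramnathv/htmlwidgets/pull/172. I am not sure why you get the error after installing from Github. Hmm... I could add "tasks" functionality, but this should remove the need for that. - timelyportfolio
@timelyportfolio It mostly works. It turns out the node names/IDs cannot contain spaces. But the colors do not look right; the new colors seem to be blended with the original edge colors. - rrs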
