How to plot networks over a map with the least overlap

28

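I have some authors with their city or country of affiliation. I would like to know whether it is possible to plot the coauthor networks (figure 1) on a map, given the coordinates of the countries. Please consider multiple authors from the same country. [EDIT: Several networks may be generated, as in the example, and they should not show avoidable overlaps.] This is intended for dozens of authors. A zooming option is desirable. Bounty promise of +100 for a future better answer.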

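library(RefManageR)  # as.BibEntry
library(network)     # network, %v%
library(GGally)      # ggnet2 (size = "degree" also needs the sna package installed)
library(ggplot2)     # theme

refs5 <- read.table(text="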
                    row          bibtype year volume   number    pages      title          journal                          author
                    Bennett_1995 article 1995     76    <NA> 113--176 angiosperms.  \"Annals of Botany\"           \"Bennett Md, Leitch Ij\"
                    Bennett_1997 article 1997     80       2 169--196 estimates.  \"Annals of Botany\"           \"Bennett MD, Leitch IJ\"
                    Bennett_1998 article 1998     82 SUPPL.A 121--134 weeds.  \"Annals of Botany\" \"Bennett MD, Leitch IJ, Hanson L\"
                    Bennett_2000 article 2000     82 SUPPL.A 121--134 weeds.  \"Annals of Botany\" \"Bennett MD, Someone IJ\"
                    Leitch_2001 article 2001     83 SUPPL.A 121--134 weeds.  \"Annals of Botany\" \"Leitch IJ, Someone IJ\"
                    New_2002 article 2002     84 SUPPL.A 121--134 weeds.  \"Annals of Botany\" \"New IJ, Else IJ\"" , header=TRUE,stringsAsFactors=FALSE)

rownames(refs5) <- refs5[,1]
refs5<-refs5[,2:9]
citations <- as.BibEntry(refs5)

authorsl <- lapply(citations, function(x) as.character(toupper(x$author)))
unique.authorsl<-unique(unlist(authorsl))
coauth.table <- matrix(nrow=length(unique.authorsl),
                       ncol = length(unique.authorsl),
                       dimnames = list(unique.authorsl, unique.authorsl), 0)
for(i in 1:length(citations)){
  paper.auth <- unlist(authorsl[[i]])
  coauth.table[paper.auth,paper.auth] <- coauth.table[paper.auth,paper.auth] + 1
}
coauth.table <- coauth.table[rowSums(coauth.table)>0, colSums(coauth.table)>0]
diag(coauth.table) <- 0
coauthors<-coauth.table
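
The loop above can also be written as a matrix product, which may be easier to check by hand. This is a minimal sketch in base R, assuming a hypothetical list `papers` with one character vector of authors per paper (a stand-in for `authorsl`, not the `citations` object itself):

```r
# Hypothetical input: one character vector of authors per paper
papers <- list(
  c("BENNETT MD", "LEITCH IJ"),
  c("BENNETT MD", "LEITCH IJ"),
  c("BENNETT MD", "LEITCH IJ", "HANSON L"),
  c("NEW IJ", "ELSE IJ")
)
authors_all <- unique(unlist(papers))

# Paper-by-author incidence matrix: 1 if the author signed the paper
incidence <- t(sapply(papers, function(p) as.integer(authors_all %in% p)))
colnames(incidence) <- authors_all

# crossprod(M) is t(M) %*% M: shared-paper counts for every author pair
coauth_sketch <- crossprod(incidence)
diag(coauth_sketch) <- 0  # drop self-links, as the loop version does
coauth_sketch
```

The off-diagonal cells match what the loop accumulates: each cell counts the papers two authors share.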

bip = network(coauthors,
              matrix.type = "adjacency",
              ignore.eval = FALSE,
              names.eval = "weights")

authorcountry <- read.table(text="
 author country
1    \"LEITCH IJ\"     Argentina
2    \"HANSON L\"          USA
3    \"BENNETT MD\"       Brazil
4    \"SOMEONE IJ\"       Brazil
5    \"NEW IJ\"           Brazil
6    \"ELSE IJ\"          Brazil",header=TRUE,fill=TRUE,stringsAsFactors=FALSE)


matched<-   authorcountry$country[match(unique.authorsl, authorcountry$author)]

bip %v% "Country" = matched
colorsmanual<-c("red","darkgray","gainsboro")
names(colorsmanual) <- unique(matched)

gdata<- ggnet2(bip, color = "Country", palette = colorsmanual, legend.position = "right",label = TRUE,  
               alpha = 0.9, label.size = 3, edge.size="weights", 
               size="degree", size.legend="Degree Centrality") + theme(legend.box = "horizontal")
gdata
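Add the author names, lines, and bubbles to the map. Note that several authors may be from the same city or country, and they should not overlap.

Figure 1: Network

EDIT: The current answer by JanLauGe overlaps two unrelated networks. Authors "ELSE" and "NEW" need to be kept apart from the others, as in figure 1.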
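
To keep unrelated networks apart, it helps to know first which authors belong to which network. The following is a minimal sketch in base R that labels the connected components of a coauthorship adjacency matrix. The helper `component_labels` and the toy matrix are hypothetical, written only for illustration; they are not part of the question's code.

```r
# Label the connected components of an undirected graph given as a
# square adjacency matrix with row names (breadth-first search).
component_labels <- function(adj) {
  n <- nrow(adj)
  labels <- rep(NA_integer_, n)
  current <- 0L
  for (start in seq_len(n)) {
    if (!is.na(labels[start])) next   # already assigned to a component
    current <- current + 1L
    labels[start] <- current
    queue <- start
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      # neighbours of the node that have not been visited yet
      nb <- which(adj[node, ] > 0 & is.na(labels))
      labels[nb] <- current
      queue <- c(queue, nb)
    }
  }
  setNames(labels, rownames(adj))
}

# Toy matrix with the same authors as the question's example
nm <- c("BENNETT MD", "LEITCH IJ", "HANSON L", "SOMEONE IJ", "NEW IJ", "ELSE IJ")
adj <- matrix(0, 6, 6, dimnames = list(nm, nm))
adj["BENNETT MD", c("LEITCH IJ", "HANSON L", "SOMEONE IJ")] <- 1
adj["LEITCH IJ", c("HANSON L", "SOMEONE IJ")] <- 1
adj["NEW IJ", "ELSE IJ"] <- 1
adj <- adj + t(adj)  # make the matrix symmetric

comp <- component_labels(adj)
comp
```

Authors NEW and ELSE receive a different label from the other four, so a per-component offset could be applied before plotting.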

Which package does the as.BibEntry function belong to? - and-bri
1
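This post about plotting Uber trips might give you some inspiration: https://drsimonj.svbtle.com/plotting-my-trips-with-uber - Simon Jackson
as.BibEntry is from RefManageR. - Ferroao
Can you clarify what you mean by "should not overlap"? Which aspects should not overlap: the points, the lines, or the networks on the map? - SymbolixAU
Dots and lines (networks) may overlap only if the position of the countries demands it. The overlap in figure 2 is avoidable, as figure 1 shows for authors ELSE and NEW. - Ferroao
2 Answers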

25

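Are you looking for a solution that uses exactly the packages you already use, or are you happy to use other packages? Below is my approach, in which I extract the graph properties from the network object and plot them on a map using the ggplot2 and maps packages.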


First, I recreate the example data you provided.

library(tidyverse)
library(sna)      # also attaches the network package
library(GGally)   # provides ggnet2
library(maps)
library(ggrepel)
set.seed(1)

coauthors <- matrix(
  c(0,3,1,1,3,0,1,0,1,1,0,0,1,0,0,0),
  nrow = 4, ncol = 4, 
  dimnames = list(c('BENNETT MD', 'LEITCH IJ', 'HANSON L', 'SOMEONE ELSE'),
                  c('BENNETT MD', 'LEITCH IJ', 'HANSON L', 'SOMEONE ELSE')))

coords <- tibble(
  country = c('Argentina', 'Brazil', 'USA'),
  coord_lon = c(-63.61667, -51.92528, -95.71289),
  coord_lat = c(-38.41610, -14.23500, 37.09024))

authorcountry <- tibble(
  author = c('LEITCH IJ', 'HANSON L', 'BENNETT MD', 'SOMEONE ELSE'),
  country = c('Argentina', 'USA', 'Brazil', 'Brazil'))

Now I generate the graph object with the network function (from the network package, which sna attaches).
# Generate network
bip <- network(coauthors,
               matrix.type = "adjacency",
               ignore.eval = FALSE,
               names.eval = "weights")

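# ggnet2 colours nodes by a vertex attribute, so set "Country" first
bip %v% "Country" <- authorcountry$country[
  match(network.vertex.names(bip), authorcountry$author)]

# Graph with ggnet2 for centrality
gdata <- ggnet2(bip, color = "Country", legend.position = "right",label = TRUE,  
               alpha = 0.9, label.size = 3, edge.size="weights", 
               size="degree", size.legend="Degree Centrality") + theme(legend.box = "horizontal")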

We can extract the value of each edge from the network object and the degree centrality of each node from the ggnet2 object, as follows:

# Combine data
authors <- 
  # Get author numbers
  tibble(
    id = seq(1, nrow(coauthors)),
    author = sapply(bip$val, function(x) x$vertex.names)) %>%
  left_join(
    authorcountry,
    by = 'author') %>%
  left_join(
    coords,
    by = 'country') %>%
  # Jittering points to avoid overlap between two authors
  mutate(
    coord_lon = jitter(coord_lon, factor = 1),
    coord_lat = jitter(coord_lat, factor = 1))

# Get edges from network
networkdata <- sapply(bip$mel, function(x) 
  c('id_inl' = x$inl, 'id_outl' = x$outl, 'weight' = x$atl$weights)) %>%
  t %>% as_tibble

dt <- networkdata %>%
  left_join(authors, by = c('id_inl' = 'id')) %>%
  left_join(authors, by = c('id_outl' = 'id'), suffix = c('.from', '.to')) %>%
  left_join(gdata$data %>% select(label, size), by = c('author.from' = 'label')) %>%
  mutate(edge_id = seq(1, nrow(.)),
         from_author = author.from,
         from_coord_lon = coord_lon.from,
         from_coord_lat = coord_lat.from,
         from_country = country.from,
         from_size = size,
         to_author = author.to,
         to_coord_lon = coord_lon.to,
         to_coord_lat = coord_lat.to,
         to_country = country.to) %>%
  select(edge_id, starts_with('from'), starts_with('to'), weight)
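
The jitter step above is random, so two authors from the same country can still land close together. A deterministic alternative is to place the authors of each country evenly on a small circle around the country's coordinates. This is a sketch in base R under stated assumptions: the helper `spread_on_circle` and the `radius` value (in degrees) are hypothetical and not part of this answer's pipeline.

```r
# Spread authors that share a country evenly on a small circle around
# the country's coordinates (a deterministic alternative to jitter()).
spread_on_circle <- function(df, radius = 3) {
  idx <- seq_along(df$country)
  k <- ave(idx, df$country, FUN = seq_along)  # position within the country
  n <- ave(idx, df$country, FUN = length)     # number of authors in the country
  angle <- 2 * pi * (k - 1) / n
  r <- ifelse(n > 1, radius, 0)               # single authors stay in place
  df$coord_lon <- df$coord_lon + r * cos(angle)
  df$coord_lat <- df$coord_lat + r * sin(angle)
  df
}

# Toy table with the coordinates used in this answer
authors_toy <- data.frame(
  author    = c("LEITCH IJ", "HANSON L", "BENNETT MD", "SOMEONE ELSE"),
  country   = c("Argentina", "USA", "Brazil", "Brazil"),
  coord_lon = c(-63.61667, -95.71289, -51.92528, -51.92528),
  coord_lat = c(-38.41610, 37.09024, -14.23500, -14.23500),
  stringsAsFactors = FALSE
)

spread_toy <- spread_on_circle(authors_toy, radius = 3)
spread_toy
```

The two Brazilian authors end up on opposite sides of the Brazilian coordinates, while the single authors from Argentina and the USA are not moved. For small countries the radius would need to be reduced.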

The data should now look like this:
dt
# A tibble: 8 × 11
  edge_id  from_author from_coord_lon from_coord_lat from_country from_size    to_author to_coord_lon
    <int>        <chr>          <dbl>          <dbl>        <chr>     <dbl>        <chr>        <dbl>
1       1   BENNETT MD      -51.12756     -16.992729       Brazil         6    LEITCH IJ    -65.02949
2       2   BENNETT MD      -51.12756     -16.992729       Brazil         6     HANSON L    -96.37907
3       3   BENNETT MD      -51.12756     -16.992729       Brazil         6 SOMEONE ELSE    -52.54160
4       4    LEITCH IJ      -65.02949     -35.214117    Argentina         4   BENNETT MD    -51.12756
5       5    LEITCH IJ      -65.02949     -35.214117    Argentina         4     HANSON L    -96.37907
6       6     HANSON L      -96.37907      36.252312          USA         4   BENNETT MD    -51.12756
7       7     HANSON L      -96.37907      36.252312          USA         4    LEITCH IJ    -65.02949
8       8 SOMEONE ELSE      -52.54160      -9.551913       Brazil         2   BENNETT MD    -51.12756
# ... with 3 more variables: to_coord_lat <dbl>, to_country <chr>, weight <dbl>

Now plot these data onto a map:
world_map <- map_data('world') 
myMap <- ggplot() +
  # Plot map
  geom_map(data = world_map, map = world_map, aes(map_id = region),
           color = 'gray85',
           fill = 'gray93') +
  xlim(c(-120, -20)) + ylim(c(-50, 50)) + 
  # Plot edges
  geom_segment(data = dt, 
               alpha = 0.5,
               color = "dodgerblue1",
               aes(x = from_coord_lon, y = from_coord_lat,
                   xend = to_coord_lon, yend = to_coord_lat,
                   size = weight)) +
  scale_size(range = c(1,3)) +
  # Plot nodes
  geom_point(data = dt,
             aes(x = from_coord_lon,
                 y = from_coord_lat,
                 size = from_size,
                 colour = from_country)) +
  # Plot names
  geom_text_repel(data = dt %>% 
                    select(from_author, 
                           from_coord_lon, 
                           from_coord_lat) %>% 
                    unique,
                  colour = 'dodgerblue1',
                  aes(x = from_coord_lon, y = from_coord_lat, label = from_author)) + 
  coord_equal() +
  theme_bw()
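
Regarding the zoom option requested in the question: `xlim()` and `ylim()` set scale limits, so any edge with an endpoint outside the window is dropped before drawing. Passing the limits to the coordinate system only crops the view. The sketch below uses a toy edge and window values chosen for illustration; in the plot above, this call would take the place of `coord_equal()`, since a plot holds a single coordinate system.

```r
library(ggplot2)

# Toy edge: one endpoint lies outside the zoom window
edges_toy <- data.frame(x = -95.7, y = 37.1, xend = -51.9, yend = -14.2)

p_zoom <- ggplot(edges_toy) +
  geom_segment(aes(x = x, y = y, xend = xend, yend = yend)) +
  # Coordinate limits crop the view; the segment is still drawn
  # up to the border of the panel instead of being removed
  coord_fixed(xlim = c(-80, -30), ylim = c(-45, 10)) +
  theme_bw()
p_zoom
```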

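You can change the colours and design in the usual way with ggplot2 grammar. Note that you could also use geom_curve with the arrow argument to get a plot similar to the one in the Uber post linked in the comments above.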
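
A minimal sketch of the curved-edge variant, assuming toy edge coordinates (the data frame `edges_curve` is hypothetical and stands in for `dt`). `arrow()` and `unit()` come from grid and are re-exported by ggplot2.

```r
library(ggplot2)

# Toy edges from one author to two coauthors
edges_curve <- data.frame(
  from_lon = c(-51.9, -51.9), from_lat = c(-14.2, -14.2),
  to_lon   = c(-63.6, -95.7), to_lat   = c(-38.4, 37.1)
)

p_curve <- ggplot(edges_curve) +
  geom_curve(
    aes(x = from_lon, y = from_lat, xend = to_lon, yend = to_lat),
    curvature = 0.2,                         # bend of the curve
    arrow = arrow(length = unit(0.03, "npc")),
    colour = "dodgerblue1", alpha = 0.5
  ) +
  coord_equal() +
  theme_bw()
p_curve
```

Curved edges also make two edges between the same pair of points easier to tell apart, because edges drawn in opposite directions bend to opposite sides.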



1
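You also said that "several authors may be from the same city or country and should not overlap". Do you intend to sum them up, or should each author have a separate node? That was not clear to me from your question. If you can provide more detail, I am happy to add suggestions to my answer. - JanLauGe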
1
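Your main aim seems to be to visualise the spatial component of the author network. I would therefore suggest simply using "jitter" to offset authors from the same country. For smaller countries this may place points outside the country's borders, but the resulting confusion can be avoided by colouring nodes by country or by adding country labels. I will add an example to my answer. - JanLauGe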
2
@Ferroao use library(ggplot2); world_map <- map_data("world") - tatxif
1
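Thanks for the reminder about world_map. @Ferroao, as I said, you can use all the usual ggplot syntax to change the look of the plot. Line thickness can be modified with scale_size. I have added an example to the code and updated the plot accordingly. Does this meet your requirements? - JanLauGe
Thanks. Is it possible to remove the lines from the legend while keeping the dots, or to separate lines and dots into two legends? The dots in your example have sizes 2, 4, and 6, but the legend shows more sizes. How can the extra sizes be removed? - Ferroao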

0
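To avoid overlapping the two networks, I reused the x and y coordinates that ggnet2 assigns to the nodes, since its default layout does not overlap the networks (see figure 1 in the question), and shifted them to each country's position.
# get centroid positions for countries
# add coordinates to authorcountry table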

# download and unzip
# https://worldmap.harvard.edu/data/geonode:country_centroids_az8
setwd("~/country_centroids_az8")
library(rgdal)  # note: rgdal was retired from CRAN in 2023
cent <- readOGR('.', "country_centroids_az8", stringsAsFactors = F)
countrycentdf<-cent@data[,c("name","Longitude","Latitude")]
countrycentdf$name[which(countrycentdf$name=="United States")]<-"USA"
colnames(countrycentdf)[names(countrycentdf)=="name"]<-"country"

authorcountry$Longitude<-countrycentdf$Longitude[match(authorcountry$country,countrycentdf$country)]
authorcountry$Latitude <-countrycentdf$Latitude [match(authorcountry$country,countrycentdf$country)]
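
Because `match()` returns `NA` silently when a country name differs between the two tables (as with "United States" and "USA" above), it may be worth listing the unmatched names before plotting. A minimal sketch in base R, using hypothetical toy tables rather than the real centroid file:

```r
# Toy stand-ins for the centroid table and the author table
centroids_toy <- data.frame(
  country   = c("Argentina", "Brazil", "United States"),
  Longitude = c(-63.61667, -51.92528, -95.71289),
  Latitude  = c(-38.41610, -14.23500, 37.09024),
  stringsAsFactors = FALSE
)
authors_chk <- data.frame(
  author  = c("LEITCH IJ", "HANSON L", "BENNETT MD"),
  country = c("Argentina", "USA", "Brazil"),
  stringsAsFactors = FALSE
)

# Countries in the author table with no centroid under the same name
idx <- match(authors_chk$country, centroids_toy$country)
unmatched <- unique(authors_chk$country[is.na(idx)])
unmatched
```

Any name listed here would otherwise produce `NA` coordinates, and the corresponding author would be dropped from the map.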

# original coordinates of the ggnet2 plot and their transformation
ggnetbuild<-ggplot_build(gdata)
allcoord<-ggnetbuild$data[[3]][,c("x","y","label")]
allcoord$Latitude<-authorcountry$Latitude [match(allcoord$label,authorcountry$author)]
allcoord$Longitude<-authorcountry$Longitude [match(allcoord$label,authorcountry$author)]
allcoord$country<-authorcountry$country [match(allcoord$label,authorcountry$author)]

# scale the layout offsets by a factor to increase the distance among dots
factor<-7
allcoord$coord_lat<-allcoord$y*factor+allcoord$Latitude
allcoord$coord_lon<-allcoord$x*factor+allcoord$Longitude
allcoord$author<-allcoord$label
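
The arithmetic of this transformation can be checked on a small example. The sketch below assumes that the layout coordinates lie in [0, 1], so each node is shifted by at most `scale_factor` degrees east and north of its country's position. The toy values and the centred variant in the last two lines are illustrative assumptions, not part of the code above.

```r
# Hypothetical layout coordinates for two authors from Brazil
layout_toy <- data.frame(
  label     = c("NEW IJ", "ELSE IJ"),
  x         = c(0.9, 1.0),
  y         = c(0.1, 0.0),
  Longitude = -51.92528,
  Latitude  = -14.23500
)
scale_factor <- 7

# Same formula as above: layout offset times factor, plus country position
layout_toy$coord_lon <- layout_toy$x * scale_factor + layout_toy$Longitude
layout_toy$coord_lat <- layout_toy$y * scale_factor + layout_toy$Latitude

# Variant: subtract 0.5 first so offsets are symmetric around the country
layout_toy$centred_lon <- (layout_toy$x - 0.5) * scale_factor + layout_toy$Longitude
layout_toy$centred_lat <- (layout_toy$y - 0.5) * scale_factor + layout_toy$Latitude
layout_toy
```

A larger factor separates the networks more clearly but moves nodes farther from their countries, so the value is a trade-off to tune per map.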

# plot as in answer of JanLauGe, without jitter
library(tidyverse)
library(ggrepel)
  authors <- 
    # Get author numbers
    tibble(
      id = seq(1, nrow(coauthors)),
      author = sapply(bip$val, function(x) x$vertex.names)) %>%
    left_join(
      allcoord,
      by = 'author') 

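  # Continue as in JanLauGe's answer: rebuild networkdata, dt and myMap
  # with the same code, using the `authors` table built here
  # (the jitter step is skipped), then print the map.
  # networkdata <- ##
  # dt <- ##
  world_map <- map_data('world') 
  # myMap <- ##
  # myMap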

networks not overlapped


Page content provided by Stack Overflow.