Collapse a nested list into square brackets

3
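I want to write a function that converts a nested list of arbitrary length and depth into a string that can be used as input for a tree from the LaTeX package forest. Here is my progress so far: I have managed to wrap every node without children in square brackets, but how do I retrieve the names of the intermediate nodes and join everything into a single string? The string in the forest environment shows what I want the example list to be converted to.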
\documentclass[a4paper]{article}
\usepackage{forest}
\begin{document}
<<list>>=
library("tidyverse")
nestedlist <- list("A"=list("B"=45:50, "C"=LETTERS[21:26],
    "D"=list("E"=7:10, "F"=list("G","H"))))                   
squarebrackets <- function(x){
    if(class(x) == "list")
        map(x, squarebrackets)
    else
        paste0("[",x,"]") %>%
            paste0(., collapse="")
}

squarebrackets(nestedlist)
@
\begin{forest}
[A[B[45][46][47][48][49][50]][C[U][V][W][X][Y][Z]][D[E[7][8][9][10]][F[G][H]]]]
\end{forest}
\end{document}

2
Could you provide the expected output for the sample data? - CL.
[A[B[45][46][47][48][49][50]][C[U][V][W][X][Y][Z]][D[E[7][8][9][10]][F[G][H]]]] - Crocodopolis
Judging by your output, shouldn't "F" in nestedlist be defined as F=c("G", "H") rather than F=list("G", "H")? - henrik_ibsen
That should produce the same output, because that list is at the bottom level. - Crocodopolis
1 Answer

1

One approach is to use the name hierarchy that unlist() creates automatically. This also means that F=c("G", "H") and F=list("G", "H") are treated the same way.
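To make the name hierarchy concrete, here is a small sketch of what unlist() does to the names of a nested list. The list `nested` is a made-up example, not taken from the question.

```r
# Hypothetical toy list, used only to illustrate the naming scheme.
nested <- list(A = list(B = 1:2, C = "x"))

# Parent and child names are joined with "."; elements of a vector
# longer than one get a running index appended to the name.
flatNames <- names(unlist(nested))
print(flatNames)

# A character vector and an unnamed list under the same parent
# end up with identical names after unlist().
sameNames <- identical(names(unlist(list(F = c("G", "H")))),
                       names(unlist(list(F = list("G", "H")))))
print(sameNames)
```

The appended running index is why the function below strips digits from the names before looking for node boundaries.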

The example below does not allow digits in node names, and node names must be unique. This could be improved by using an rapply()-based approach.
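As a point of comparison, both restrictions can be avoided by building the string with plain recursion instead of parsing the names produced by unlist(). This is only a sketch of a different technique, not the rapply() variant mentioned above and not part of the original answer; the function name `toForest` is hypothetical.

```r
# Hypothetical helper: recursively builds the forest bracket string.
toForest <- function(x, name = NULL) {
  if (is.list(x)) {
    nms <- names(x)
    # an unnamed list has NULL names; treat every child as nameless
    if (is.null(nms)) nms <- rep("", length(x))
    # recurse into each child together with its own name
    inner <- paste0(mapply(toForest, x, nms, USE.NAMES = FALSE), collapse = "")
  } else {
    # leaf vector: wrap every element in its own pair of brackets
    inner <- paste0("[", x, "]", collapse = "")
  }
  # nameless nodes contribute only their children
  if (is.null(name) || !nzchar(name)) inner else paste0("[", name, inner, "]")
}

nestedlist <- list("A"=list("B"=45:50, "C"=LETTERS[21:26],
                            "D"=list("E"=7:10, "F"=list("G","H"))))
cat(toForest(nestedlist))
```

Because node names are never parsed, names containing digits or repeated names are passed through unchanged, for example toForest(list("100A"=list("4B"=1:2))) gives [100A[4B[1][2]]].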

Define the alternative square-brackets function:

squarebracketsAlt <- function(inlist){

  #create the name hierarchy
  storeList <- unlist(inlist)

  #get unique names which represents levels in hierarchy
  uniqueNames <- unique(unlist(strsplit(gsub("[0-9]", "", names(storeList)), "\\.")))

  #keep the names to search for length of node brackets
  vecNames <- names(storeList)
  storeVec <- paste0("[", storeList, "]")
  names(storeVec) <- vecNames

  for(i in  uniqueNames){

    #determine the two positions of the node brackets
    whereBrack <- grep(paste0("\\.",i, "\\."),
                       paste0(".", gsub("[0-9]", "", names(storeVec)), "."))

    #add the start bracket and node name to vector
    storeVec <- append(storeVec, paste0("[", i), after=(whereBrack[1]-1))
    #add the end bracket to vector
    storeVec <- append(storeVec, paste0("]") , after=(whereBrack[length(whereBrack)]+1))


  }
  #collapse and output
  cat(paste(storeVec, collapse=""))

}

Try it on your nested list:

nestedlist <- list("A"=list("B"=45:50, "C"=LETTERS[21:26],
                            "D"=list("E"=7:10, "F"=list("G","H")))) 

squarebracketsAlt(nestedlist)

Output:

[A[B[45][46][47][48][49][50]][C[U][V][W][X][Y][Z]][D[E[7][8][9][10]][F[G][H]]]]


Example with a larger hierarchy:

nestedlist1 <- list("Ad fe"=list("B"=45:50, "C"=list("U"=letters[1:10],LETTERS[22:26]),
                                "D"=list("E"=7:10, "F"=list("G"=list("ZZ foo"=list("AA bar"=c(1:10),2,3,4,5)),"H", "C"))))  

squarebracketsAlt(nestedlist1)

Output:

[Ad fe[B[45][46][47][48][49][50]][C[U[a][b][c][d][e][f][g][h][i][j]][V][W][X][Y][Z]][D[E[7][8][9][10]][F[G[ZZ foo[AA bar[1][2][3][4][5][6][7][8][9][10]][2][3][4][5]]][H][C]]]]


Real-life example:

nestedlist2 <- list("Main Area"=
                     list("Fishing vessel"=c("trawler", "line", "skipper"), "Oil tanker"=c("Large", "Small", "Medium size"=
                          list("Barents Sea", "Norwegian Sea", "Kara Sea", "Greenland"))))
squarebracketsAlt(nestedlist2)

Output:

[Main Area[Fishing vessel[trawler][line][skipper]][Oil tanker[Large][Small][Medium size[Barents Sea][Norwegian Sea][Kara Sea][Greenland]]]]



Example with leading digits:

squarebracketsAltNum <- function(inlist){

  #create the name hierarchy
  storeList <- unlist(inlist)

  #get unique names which represents levels in hierarchy
  uniqueNames <- unique(paste(gsub("[A-Za-z].*", "", unlist(strsplit(names(storeList), "\\."))),
                              unlist(strsplit(gsub("[0-9]", "", names(storeList)), "\\.")), sep=""))

  #keep the names to search for length of node brackets
  vecNames <- names(storeList)
  storeVec <- paste0("[", storeList, "]")
  names(storeVec) <- vecNames
  k <- 1

  for(i in uniqueNames){
    #determine the two positions of the node brackets
    whereBrack <- grep(paste0("\\.",i),
                       paste0(".", names(storeVec)))

    #change position of number and character
    namePaster <- unique(paste(unlist(strsplit(gsub("[0-9]", "", names(storeList)), "\\.")),
                               gsub("[A-Za-z].*", "", unlist(strsplit(names(storeList), "\\."))), sep=""))[k]


    #add the start bracket and node name to vector
    storeVec <- append(storeVec, paste0("[", namePaster), after=(whereBrack[1]-1))
    #add the end bracket to vector
    storeVec <- append(storeVec, paste0("]") , after=(whereBrack[length(whereBrack)]+1))
    k <- k+1

  }
  #collapse and output
  cat(paste(storeVec, collapse=""))

}

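There must be no space between the digits and the word or sentence. Some tweaking of the regular expressions would be needed to lift that restriction: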

nestedlist <- list("100A"=list("4B"=45:50, "3C"=LETTERS[21:26],
                               "D"=list("E"=7:10, "78F"=c("G","H"))))          
squarebracketsAltNum(nestedlist)

Output:

[A100[B4[45][46][47][48][49][50]][C3[U][V][W][X][Y][Z]][D[E[7][8][9][10]][F78[G][H]]]]
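The digits appear after the letters in this output because the function splits each name piece into its leading digits and its letters, then pastes them back in swapped order. A small sketch of that step, using made-up name pieces of the kind unlist() produces (the variable names `piece`, `lead`, `core` and `swapped` are only for illustration):

```r
# Hypothetical name pieces: leading digits belong to the node name,
# trailing digits are the index that unlist() appended.
piece <- c("100A", "4B1", "E1", "78F2")

# everything from the first letter onwards is dropped: leading digits remain
lead <- gsub("[A-Za-z].*", "", piece)

# all digits are dropped: the letters remain
core <- gsub("[0-9]", "", piece)

# letters first, then the leading digits, as in the output above
swapped <- paste0(core, lead)
print(swapped)
```

Since every digit is stripped to obtain the letters, a name with digits in the middle or at the end would still lose them under this scheme.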

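If you make sure there are no bottom-level lists, then unlist should not produce any digits that are not found in the node names, right? Does that mean this function could be rewritten as one that takes a list guaranteed to have no bottom-level lists as its argument, but allows digits in node names? - Crocodopolis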
1
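The problem is not necessarily the bottom level. As far as I know, unlist() generates numbers based on your nested list, and so does rapply(). A solution could be to put the digits at the front of the names, then remove the trailing digits while changing the position of the leading digits. Some regex tweaking should allow more flexible names. See the example in the edit. - henrik_ibsen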
