Convert a "/"-separated column into a nested list

6

I have a column in which each row's value is a path separated by "/":

data.frame(column=c("a","a/air","a/aero/breath","b","b/boy","b/bag/band/brand"))

How can I convert it into a nested list, with a new nesting level after each "/"? The goal is to obtain:

list(a=list("air"=1,aero=list("breath"=1)),b=list("boy"=1,bag=list(band=list("brand"=1)))) 

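I need this to build a tree from the column with the shinyTree package.

I appended "=1" to the last element of each hierarchy because a leaf needs a value in order to show up in the shinyTree output. The list can then be put into the code below to get the tree: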

library(shiny)
library(shinyTree)

tree <- list(a=list("air"=1,aero=list("breath"=1)),b=list("boy"=1,bag=list(band=list("brand"=1)))) 


typeof(tree)

ui <- fluidPage(
  fluidPage(
    sidebarLayout(
      sidebarPanel(
        actionButton('reset', 'Reset nodes')
      ),
      mainPanel(
        shinyTree("tree"),
        hr(),
        "Selected nodes:",
        verbatimTextOutput("idSelected")
      )
    )
  )
)

server <- function(input, output, session) {
  
  treeSelection <- reactiveVal(list())
  
  output$tree = renderTree({
    tree
  })
  
  observeEvent(input$reset, {
    updateTree(session, "tree", data = tree)
    treeSelection(list())
  })
  
  observeEvent(input$tree, {
    treeSelection(get_selected(input$tree, format = "classid"))
  })
  
  output$idSelected <- renderPrint({
    treeSelection()
  })
  
}

shinyApp(ui, server)

1
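Please add the input data and the expected output in a reproducible format that is easy to copy. Please read here about how to provide a reproducible example: https://dev59.com/eG025IYBdhLWcg3whGSx - Jakub.Novotny
I have edited the question. In short, I want to convert the data frame column into the list shown above. I hope it is clearer now. @Jakub.Novotny - Sahib
2 Answers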

3

Since the values look like paths, I created the sample data as a character vector:

paths <- c(
  "a",
  "a/air",
  "a/aero/breath",
  "b",
  "b/boy",
  "b/bag/band/brand"
)
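If the paths still live in the data frame from the question, they can be pulled out as a character vector first. This is a minimal sketch; the name `df` is an assumption (the question does not name the data frame), and `stringsAsFactors = TRUE` is set only to mimic the default of R versions before 4.0, where the column would be a factor and `strsplit()` would reject it.

```r
# Assumption: the data frame is called `df`; the question leaves it unnamed.
df <- data.frame(
  column = c("a", "a/air", "a/aero/breath", "b", "b/boy", "b/bag/band/brand"),
  stringsAsFactors = TRUE  # mimic the pre-4.0 default to show the factor case
)

# strsplit() only accepts character vectors, so convert explicitly.
paths <- as.character(df$column)
print(paths)
```

With R 4.0 or later and the default settings the column is already character, so the conversion is a harmless no-op.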

Next, you can use the following function to get the nested list. I hope the variable names are self-explanatory.

pathsToNestedList <- function(x) {
  pathSplit <- strsplit(x,"/")
  pathStarts <- sapply(pathSplit,"[[",1)
  uniquePathStarts <- unique(pathStarts)
  
  pathEnds <- sapply(pathSplit, function(pathParts) {
    if(length(pathParts) <= 1) return("")
    paste0(pathParts[2:length(pathParts)],collapse="/")
  })
  
  splitLengths <- sapply(pathSplit,length)
  stillToParse <- unique(pathStarts[splitLengths > 1])
  
  endedIndices <- pathEnds == ""
  endedHere <- pathStarts[endedIndices]
  endedHere <- setdiff(endedHere,stillToParse)
  
  if(length(endedHere)) {
    pathEnds <- pathEnds[!endedIndices]
    pathStarts <- pathStarts[!endedIndices]
    uniquePathStarts <- unique(pathStarts)
    return(c(
      setNames(as.list(rep(1,length(endedHere))),endedHere),
      setNames(lapply(uniquePathStarts, function(ps) {
        pathsToNestedList(pathEnds[pathStarts == ps])
      }),uniquePathStarts)
    ))
  } else {
    return(
      setNames(lapply(uniquePathStarts, function(ps) {
        pathsToNestedList(pathEnds[!endedIndices & (pathStarts == ps)])
      }),uniquePathStarts))
  }
}

Note: I updated my answer to match your updated question.
Update: the function can be simplified to:
pathsToNestedList <- function(x) {
  nonNaIndices <- !is.na(x)
  nonEmptyIndices <- x != ""
  x <- x[nonNaIndices & nonEmptyIndices]
  if(!length(x)) return()
  
  pathSplit <- strsplit(x,"/")
  pathStarts <- sapply(pathSplit,"[[",1)
  
  pathEnds <- sapply(pathSplit, function(pathParts) {
    if(length(pathParts) <= 1) return("")
    paste0(pathParts[2:length(pathParts)],collapse="/")
  })
  
  splitLengths <- sapply(pathSplit,length)
  stillToParse <- unique(pathStarts[splitLengths > 1])
  
  endedIndices <- pathEnds == ""
  endedHere <- pathStarts[endedIndices]
  endedHere <- setdiff(endedHere,stillToParse)
  
  pathEnds <- pathEnds[!endedIndices]
  pathStarts <- pathStarts[!endedIndices]
  uniquePathStarts <- unique(pathStarts)
  
  #Concatenate the list of paths that ended with a list that is parsed again.
  #If one of those lists is empty, the concatenation behaves like
  #one would expect: It does nothing.
  return(
    c(setNames(as.list(rep(1,length(endedHere))),endedHere),
      setNames(lapply(uniquePathStarts, function(ps) {
        pathsToNestedList(pathEnds[pathStarts == ps])
      }),uniquePathStarts)
    )
  )
}

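In addition, I found that the earlier version crashed on NA values and empty strings, so I added a step at the start of the function that removes them.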
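For comparison, the same result can probably also be reached by inserting one path at a time into a growing list. The sketch below is an alternative, not the function above; `insert_path` and `paths_to_tree` are hypothetical names chosen for illustration, and the final check compares the result with the target list from the question.

```r
# Hypothetical helper: insert one split path into an existing nested list.
insert_path <- function(tree, parts) {
  head <- parts[1]
  if (length(parts) == 1) {
    # Mark as a leaf (value 1) only if nothing is stored under this name yet,
    # so an existing branch is never overwritten.
    if (is.null(tree[[head]])) tree[[head]] <- 1
    return(tree)
  }
  child <- tree[[head]]
  # A former leaf (value 1) or a missing entry becomes a fresh branch.
  if (!is.list(child)) child <- list()
  tree[[head]] <- insert_path(child, parts[-1])
  tree
}

# Hypothetical wrapper: drop NA and empty strings, then insert path by path.
paths_to_tree <- function(paths) {
  paths <- paths[!is.na(paths) & paths != ""]
  tree <- list()
  for (p in paths) {
    tree <- insert_path(tree, strsplit(p, "/", fixed = TRUE)[[1]])
  }
  tree
}

paths <- c("a", "a/air", "a/aero/breath", "b", "b/boy", "b/bag/band/brand")
tree <- paths_to_tree(paths)

# Target structure copied from the question.
expected <- list(
  a = list(air = 1, aero = list(breath = 1)),
  b = list(boy = 1, bag = list(band = list(brand = 1)))
)
print(identical(tree, expected))
```

Because a name only becomes a leaf when nothing is stored under it yet, the order in which parent paths such as "a" and child paths such as "a/air" appear should not matter.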

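The code creates the list perfectly, but is there a way to give the last element of each branch (when it is not itself a list) the value "=1", so that shinyTree can read it? Thanks @Jonas - Sahib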
1
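I have updated the code; it now produces the output you expect. Ending elements are no longer stored as character elements of the list; they are now the names of list entries whose value is 1. - Jonas
It is even possible to get rid of the if statement. The final version should now be easier to understand. - Jonas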

3

Another option is rrapply() from the rrapply package, which has a dedicated option how = "unmelt" that converts a melted data.frame back into a nested list:

library(rrapply)
library(data.table)

paths <- c("a","a/air","a/aero/breath","b","b/boy","b/bag/band/brand")

## create data.frame/data.table with node paths
paths_melt <- as.data.table(tstrsplit(paths[grepl("/", paths)], split = "/"))
paths_melt[, value := 1L]
paths_melt
#>    V1   V2     V3    V4 value
#> 1:  a  air   <NA>  <NA>     1
#> 2:  a aero breath  <NA>     1
#> 3:  b  boy   <NA>  <NA>     1
#> 4:  b  bag   band brand     1

## unmelt to nested list
rrapply(paths_melt, how = "unmelt")
#> $a
#> $a$air
#> [1] 1
#> 
#> $a$aero
#> $a$aero$breath
#> [1] 1
#> 
#> 
#> 
#> $b
#> $b$boy
#> [1] 1
#> 
#> $b$bag
#> $b$bag$band
#> $b$bag$band$brand
#> [1] 1
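If data.table is not available, a table of the same shape can presumably be built in base R. The sketch below pads each split path with NA to a common depth; the default column names (X1, X2, ...) differ from the V1, V2, ... shown above, and I am assuming that rrapply() treats a plain data.frame the same way as the data.table, so that call is guarded and only runs when the package is installed.

```r
paths <- c("a", "a/air", "a/aero/breath", "b", "b/boy", "b/bag/band/brand")

# Keep only paths with at least one "/", as in the data.table version above.
leaf_paths <- paths[grepl("/", paths, fixed = TRUE)]
parts <- strsplit(leaf_paths, "/", fixed = TRUE)
depth <- max(lengths(parts))

# Pad every path with NA up to the maximum depth; one row per path.
padded <- t(vapply(
  parts,
  function(p) c(p, rep(NA_character_, depth - length(p))),
  character(depth)
))
paths_melt <- data.frame(padded, value = 1L, stringsAsFactors = FALSE)
print(paths_melt)

# Assumption: how = "unmelt" accepts this plain data.frame unchanged.
if (requireNamespace("rrapply", quietly = TRUE)) {
  str(rrapply::rrapply(paths_melt, how = "unmelt"))
}
```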

1
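I just had a look at the package, and it seems you are its author, so thank you very much. I work in R and have written several solutions that "extend" the base rapply function; now that I know about your package, it looks like it will be a big time saver! - Jonas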
