Get values from XML by attribute using R

Question

I am trying to extract values from XML like the following:

<data>
    <result name="r">
        <item>
            <str name="id">123</str>
            <str name="xxx">aaa</str>
        </item>
        <item>
            <str name="id">456</str>
            <str name="xxx">aaa</str>
        </item>
    </result>
</data>

Currently, I get the id values like this:

library(XML)
xmlfile <- xmlParse(url)
data <- xmlRoot(xmlfile)
result <- data[["result"]]
# Loop over every <item> child, printing the value of its first child element
for (i in seq_len(xmlSize(result))) {
  print(xmlValue(result[[i]][[1]]))
}

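This approach is inefficient, and it only works when "id" is stored in the first child element. Is there a way to get the element values (123, 456) by searching on the attribute (name) and its value (id)?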

- cubil

1 Answer

- Dave2e · Accepted Answer

The xml2 package is well suited to this kind of problem.

library(xml2)
page<-read_xml('<data>
    <result name="r">
               <item>
               <str name="id">123</str>
               <str name="xxx">aaa</str>
               </item>
               <item>
               <str name="id">456</str>
               <str name="xxx">aaa</str>
               </item>
               </result>
               </data>')

# Find all str nodes
nodes <- xml_find_all(page, ".//str")
# Keep only the nodes whose name attribute equals "id"
nodes <- nodes[xml_attr(nodes, "name") == "id"]
# Get the values (as character strings)
xml_text(nodes)
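
The filtering above returns one flat vector. If the values need to stay aligned with their parent item, one possible variation is to search relative to each item node instead. The sketch below is only an illustration: the second item deliberately has no "id" entry (an assumption that is not in the original data), and the names `doc`, `items`, `ids`, `xxx`, and `df` are made up for the example.

```r
library(xml2)

# Hypothetical input: the second <item> has no "id" entry on purpose
doc <- read_xml('<data>
  <result name="r">
    <item><str name="id">123</str><str name="xxx">aaa</str></item>
    <item><str name="xxx">bbb</str></item>
  </result>
</data>')

# One node per <item>
items <- xml_find_all(doc, ".//item")

# xml_find_first() returns one result per input node, so the vectors
# stay aligned with the items; an item without a match yields NA
ids <- xml_text(xml_find_first(items, "./str[@name='id']"))
xxx <- xml_text(xml_find_first(items, "./str[@name='xxx']"))

df <- data.frame(id = ids, xxx = xxx, stringsAsFactors = FALSE)
print(df)
```

Because each lookup is relative to its own item (the leading `./`), the position of the "id" element inside the item no longer matters.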

Update

With an XPath selector, everything can be done in one line.

# Requires R >= 4.1 (native pipe |>)
xml_find_all(page, ".//str[@name='id']") |> xml_text()
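
The same XPath expression should also work with the XML package used in the question, so switching packages is optional. The following is a minimal sketch, assuming the document is available as a string (the variable names `xml_string`, `doc`, and `ids` are chosen only for illustration); with a URL or file path, `xmlParse()` would be called on that instead.

```r
library(XML)

# Sample document from the question, inlined as text for a self-contained example
xml_string <- '<data>
  <result name="r">
    <item><str name="id">123</str><str name="xxx">aaa</str></item>
    <item><str name="id">456</str><str name="xxx">aaa</str></item>
  </result>
</data>'

doc <- xmlParse(xml_string, asText = TRUE)

# Select every <str> whose name attribute is "id" and extract its text
ids <- xpathSApply(doc, "//str[@name='id']", xmlValue)
print(ids)
```

This replaces the explicit loop and does not depend on "id" being the first child of each item.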

Here is a link to a useful XPath cheat sheet: https://www.red-gate.com/simple-talk/development/dotnet-development/xpath-css-dom-and-selenium-the-rosetta-stone/