Get the children of an element, excluding text nodes

Question

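I am using Nokogiri with Ruby to parse the contents of an XML file. I would like to get an array (or something similar) of all elements that are direct children of <where> in my example. However, I am also getting various text nodes (e.g. "\n\t\t\t"), which I don't want. Is there any way to remove or ignore them?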

require 'nokogiri'

@body = "
<xml>
  <request>
    <where>
      <username compare='e'>Admin</username>
      <rank compare='gt'>5</rank>
    </where>
  </request>
</xml>" #in my code, the XML contains tab-indentation, rather than spaces. It is edited here for display purposes.

@noko = Nokogiri::XML(@body)
xml_request = @noko.xpath("//xml/request")
where = xml_request.xpath("where")
c = where.children
p c

The Ruby script above outputs:

[#<Nokogiri::XML::Text:0x100344c "\n\t\t\t">, #<Nokogiri::XML::Element:0x1003350 name="username" attributes=[#<Nokogiri::XML::Attr:0x10032fc name="compare" value="e">] children=[#<Nokogiri::XML::Text:0x1007580 "Admin">]>, #<Nokogiri::XML::Text:0x100734c "\n\t\t\t">, #<Nokogiri::XML::Element:0x100722c name="rank" attributes=[#<Nokogiri::XML::Attr:0x10071d8 name="compare" value="gt">] children=[#<Nokogiri::XML::Text:0x1006cec "5">]>, #<Nokogiri::XML::Text:0x10068a8 "\n\t\t">]

I would like to somehow obtain the following object instead:

[#<Nokogiri::XML::Element:0x1003350 name="username" attributes=[#<Nokogiri::XML::Attr:0x10032fc name="compare" value="e">] children=[#<Nokogiri::XML::Text:0x1007580 "Admin">]>, #<Nokogiri::XML::Element:0x100722c name="rank" attributes=[#<Nokogiri::XML::Attr:0x10071d8 name="compare" value="gt">] children=[#<Nokogiri::XML::Text:0x1006cec "5">]>]

Currently, I can work around this with:

c.each{|child|
  if !child.text?
    ...
  end
}

However, c.length == 5. If someone can suggest how to exclude the direct child text nodes from c, so that c.length == 2, that would make my life much easier.

- SimonMayer

1 Answer

- Phrogz · Accepted Answer

You have at least three options to choose from:

1. Use c = where.element_children instead of c = where.children.
2. Select only the direct child elements:
   c = xml_request.xpath('./where/*') or
   c = where.xpath('./*')
3. Filter the list of children so that it includes only the elements:
   c = where.children.select(&:element?)