Parsing HTML with Nokogiri in Ruby

Question

6

Given the following HTML:

<div class="one">
  .....
</div>
<div class="one">
  .....
</div>
<div class="one">
  .....
</div>
<div class="one">
  .....
</div>

How can I use Nokogiri to select the second or third div whose class is one?

- Ozil Maq

2 Answers

5

page.css('div.one')[1] # For the second
page.css('div.one')[2] # For the third

- Ismael

2

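Originally, this answer used the CSS selector div#one. That finds a div with an id of one, but the HTML uses a class named one, which is why I changed the selector to div.one. # selects an ID, . selects a class. - Rory O'Kane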

- Phrogz · Accepted Answer

You can use Ruby to pare down a large result set to specific items:

page.css('div.one')[1,2]  # Two items starting at index 1 (2nd item)
page.css('div.one')[1..2] # Items with indices between 1 and 2, inclusive

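Because Ruby indexing starts at zero, you have to take care when picking the items you want: index 1 is the second item, not the first.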

Alternatively, you can use CSS selectors to find the nth item:

# Second and third items from the set, jQuery-style
page.css('div.one:eq(2),div.one:eq(3)')

# Second and third children, CSS3-style
page.css('div.one:nth-child(2),div.one:nth-child(3)')

Or you can use XPath to get specific matches:

# Second and third children
page.xpath("//div[@class='one'][position()=2 or position()=3]")

# Second and third items in the result set
page.xpath("(//div[@class='one'])[position()=2 or position()=3]")

Note that for both the CSS and the XPath alternatives:

Numbering starts at 1, not 0.

You can use at_css and at_xpath instead to get back the first matching element rather than a NodeSet.

# A NodeSet with a single element in it:
page.css('div.one:eq(2)')

# The second div element
page.at_css('div.one:eq(2)')

Finally, note that if you are selecting a single element by index with XPath, you can use a shorter form:

# First div.one seen that is the second child of its parent
page.at_xpath('//div[@class="one"][2]')

# Second div.one in the entire document
page.at_xpath('(//div[@class="one"])[2]')