How can I generate full URLs from relative paths using XPath?

Question


Suppose there is a website, for example:

  http://example.com

with a page like this:

 <div id="topnews">
      <a href="/news/topnews1.html"> Top news1 </a>
      <a href="/news/topnews2.html"> Top news2 </a>
      <a href="http://sport.example.com/news/topnews3.html"> Top news complex </a>
 </div>

Is it possible, using XPath alone, to get these three URLs?

 http://example.com/news/topnews1.html
 http://example.com/news/topnews2.html
 http://sport.example.com/news/topnews3.html

To extract the relative URLs, we can use:

   //div/a/@href

However,

  concat('http://example.com',  //div/a/@href)

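returns only one result (for the first link), not three separate values.

I also don't know how to elegantly detect and handle the last URL, which is already absolute.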

- GML-VS

Not sure this is possible in XPath 1.0. Would you accept an XPath 2.0 solution? - alecxe

Yes, I think I can try that. - GML-VS

1 Answer


- kjhughes · Accepted Answer

XPath 1.0

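Not possible with XPath alone. XPath 1.0 cannot return a sequence of computed strings, and a function such as concat() converts a node-set argument to the string value of its first node only, which is why the attempt in the question yields a single value.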
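If only XPath 1.0 is available, one workaround is to select the href values with XPath and resolve them in the host language. The following is a minimal sketch in Python, not part of the original answer. It assumes the page fragment parses as well-formed XML and that the base URL is known in advance; the names PAGE, BASE_URL, and absolute_links are hypothetical and used only for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# Assumed base URL of the site, taken from the question.
BASE_URL = "http://example.com"

# The fragment from the question, assumed to be well-formed XML.
PAGE = """
<div id="topnews">
  <a href="/news/topnews1.html"> Top news1 </a>
  <a href="/news/topnews2.html"> Top news2 </a>
  <a href="http://sport.example.com/news/topnews3.html"> Top news complex </a>
</div>
"""


def absolute_links(markup, base_url):
    """Return every href in the markup as an absolute URL."""
    root = ET.fromstring(markup)
    # ElementTree supports a limited XPath subset; this selects every
    # descendant <a> element that carries an href attribute.
    anchors = root.findall(".//a[@href]")
    # urljoin leaves absolute URLs untouched and resolves relative ones
    # against base_url, so no explicit starts-with check is needed.
    return [urljoin(base_url, a.get("href")) for a in anchors]


if __name__ == "__main__":
    for url in absolute_links(PAGE, BASE_URL):
        print(url)
```

Because urljoin handles the already-absolute link itself, the same code should also cover https links without a separate prefix test.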

XPath 2.0

This XPath 2.0 expression,

for $h in //a/@href return
    if (starts-with($h, 'http:/'))
    then $h
    else concat('http://example.com',$h)

returns

http://example.com/news/topnews1.html
http://example.com/news/topnews2.html
http://sport.example.com/news/topnews3.html

as requested.