Delphi XPath XML query

5

I am using an XPath query to try to find the value of <Link role="self"> in the following XML file:

<?xml version="1.0" encoding="utf-8"?>
<Response xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xmlns:xsd="http://www.w3.org/2001/XMLSchema"
          xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1">
    <Copyright>Copyright © 2011 Microsoft and its suppliers. All rights reserved. This API cannot be accessed and the content and any results may not be used, reproduced or transmitted in any manner without express written permission from Microsoft Corporation.</Copyright>
    <BrandLogoUri>http://spatial.virtualearth.net/Branding/logo_powered_by.png</BrandLogoUri>
    <StatusCode>201</StatusCode>
    <StatusDescription>Created</StatusDescription>
    <AuthenticationResultCode>ValidCredentials</AuthenticationResultCode>
    <TraceId>ID|02.00.82.2300|</TraceId>
    <ResourceSets>
        <ResourceSet>
            <EstimatedTotal>1</EstimatedTotal>
            <Resources>
                <DataflowJob>
                    <Id>ID</Id>
                    <Link role="self">https://spatial.virtualearth.net/REST/v1/dataflows/Geocode/ID</Link>
                    <Status>Pending</Status>
                    <CreatedDate>2011-03-30T08:03:09.3551157-07:00</CreatedDate>
                    <CompletedDate xsi:nil="true" />
                    <TotalEntityCount>0</TotalEntityCount>
                    <ProcessedEntityCount>0</ProcessedEntityCount>
                    <FailedEntityCount>0</FailedEntityCount>
                </DataflowJob>
            </Resources>
        </ResourceSet>
    </ResourceSets>
</Response>

In an earlier post I saw an XPath query, but with the code below I keep getting an unassigned iNode:

function TForm1.QueryXMLData(XMLFilename, XMLQuery: string): string;
var
  iNode : IDOMNode;
  Sel: IDOMNodeSelect;
begin
  try
    XMLDoc.Active := False;
    XMLDoc.FileName := XMLFilename;
    XMLDoc.Active := True;

    Sel := XMLDoc.DOMDocument as IDomNodeSelect;

    Result := '';
    iNode := Sel.selectNode('Link[@role = "self"]');
    if Assigned(iNode) then
      if (not VarisNull(iNode.NodeValue)) then
        Result := iNode.NodeValue;

    XMLDoc.Active := False;

  Except on E: Exception do
    begin
      MessageDlg(E.ClassName + ': ' + E.Message, mtError, [mbOK], 0);
      LogEvent(E.Message);
    end;
  end;
end;

What should I try to fix this?

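The problems are all related to the default namespace; remove xmlns="http://schemas.microsoft.com/search/local/ws/rest/v1" and this works fine: //Link[@role="self"][1]/node(). I don't know why your example has [3], since the document contains only one Link node. I don't like the solution of removing the default namespace declaration, it feels unprofessional... - Cosmin Prund
I have removed the [3] from the query, since that was obviously my mistake. - Pieter van Wyk
Possible duplicate of Delphi/MSXML: XPath queries fail. - user357812
4 Answers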

12
If you want to locate the Link anywhere in the document, you need to prefix it with //, like this:
iNode := Sel.selectNode('//Link[@role = "self"][3]');

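This starts searching at the document root and traverses the entire document, looking for nodes named Link that match the given condition.

For more operators, see: http://msdn.microsoft.com/en-us/library/ms256122.aspx

Note that, as Runner suggests, you can also query the full XML path. This is faster than using the // operator, because it does not have to blindly search every node.


Edit: why are you asking for the third matching node (the [3] part)? As far as I can see there is only one; if your real document does have more and you are sure you need the third, that is fine. Otherwise, remove [3] from the query.


Also, depending on the XML implementation you use (vendor and version), you may need to specify the XML namespace as well. In MSXML 4 through 6 (if I remember correctly), you need to use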

XMLDoc.setProperty('SelectionNamespaces', 'xmlns:ns="http://schemas.microsoft.com/search/local/ws/rest/v1"');
This means you must also use that prefix in your query:
iNode := Sel.selectNode('//ns:Link[@role = "self"][3]');
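
A minimal sketch of how the two lines above might fit together with TXMLDocument, assuming the MSXML DOM vendor is active. The unit name, the function name ReadSelfLink and the constant BingNamespace are made up for illustration, and older Delphi versions may need different unit names. It reads the element's text rather than NodeValue, because the DOM defines NodeValue as null for element nodes, so the check in the question would yield an empty string even when the Link element is found. The [3] predicate is left out because the sample document contains a single Link.

```delphi
unit SelfLinkQuery;

interface

function ReadSelfLink(const XMLFilename: string): string;

implementation

uses
  XMLDoc, XMLIntf, xmldom, msxmldom, msxml;

// Returns the text of <Link role="self">, or '' when nothing matches.
// Assumption: COM is initialized (a VCL application normally does this).
function ReadSelfLink(const XMLFilename: string): string;
const
  // Hypothetical constant name; the URI is the default namespace of the sample XML.
  BingNamespace = 'http://schemas.microsoft.com/search/local/ws/rest/v1';
var
  Doc: IXMLDocument;
  MsDoc: IXMLDOMDocument2;
  Node: IXMLDOMNode;
begin
  Result := '';
  Doc := LoadXMLDocument(XMLFilename);

  // Unwrap the MSXML document behind Delphi's vendor-neutral IDOMDocument.
  // This cast fails if a non-MSXML vendor is in use.
  MsDoc := (Doc.DOMDocument as IXMLDOMNodeRef).GetXMLDOMNode as IXMLDOMDocument2;

  // MSXML 3 defaults to XSLPattern, so ask for XPath explicitly.
  MsDoc.setProperty('SelectionLanguage', 'XPath');
  // Bind the prefix "ns" to the document's default namespace.
  MsDoc.setProperty('SelectionNamespaces', 'xmlns:ns="' + BingNamespace + '"');

  Node := MsDoc.selectSingleNode('//ns:Link[@role="self"]');
  if Assigned(Node) then
    Result := Node.text; // text of the element, not its (null) nodeValue
end;

end.
```

Once the properties are set on the underlying document, the same prefixed query should also work through IDOMNodeSelect.selectNode, since both operate on the same MSXML document.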

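Is there a way to determine which XML implementation is in use? - Pieter van Wyk
If I remember correctly, TXMLDocument has a Vendor property, or an XMLImplementation property that contains a Vendor property. I don't have Delphi at hand right now, so I will have to check later. - Martijn
I also see the merit of setProperty, but I would really like to see a working example, because I tried it myself and could not get it to work. I imported the MSXML 3.0 type library so that I could use setProperty, but I can't seem to find the right syntax. - Cosmin Prund
Not much time right now; maybe this link helps: https://dev59.com/DnVC5IYBdhLWcg3wihqv - Martijn
An AV, i.e. selectSingleNode returns nil. - Cosmin Prund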

6
You should write it like this:
iNode := Sel.selectNode('//Link[@role = "self"]');

This gets the first Link node in the document that has the attribute role="self" (even if there are several Link nodes).

Or you can use an absolute path:

iNode := Sel.selectNode('/Response/ResourceSets/ResourceSet/Resources/DataflowJob/Link[@role = "self"]');

Or even something in between:

iNode := Sel.selectNode('//Resources/DataflowJob/Link[@role = "self"]');
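
If the parser is namespace-aware and registering a prefix is not an option, one possible workaround (not part of the original answer) is to match on local-name(), which ignores the namespace of the element. The sketch below assumes the MSXML vendor with XPath as the selection language; the unit and function names are made up for illustration. It selects the text node below Link, so that NodeValue holds the URL rather than null.

```delphi
unit SelfLinkLocalName;

interface

function ReadSelfLinkAnyNamespace(const XMLFilename: string): string;

implementation

uses
  Variants, XMLDoc, XMLIntf, xmldom, msxmldom;

// Returns the text of the first element whose local name is Link and whose
// role attribute is "self", regardless of the namespace the element is in.
function ReadSelfLinkAnyNamespace(const XMLFilename: string): string;
var
  Doc: IXMLDocument;
  Sel: IDOMNodeSelect;
  Node: IDOMNode;
begin
  Result := '';
  Doc := LoadXMLDocument(XMLFilename);

  // Raises an interface cast error if the active DOM vendor has no XPath support.
  Sel := Doc.DOMDocument as IDOMNodeSelect;

  // text() selects the text child, whose nodeValue is the URL string.
  Node := Sel.selectNode('//*[local-name()="Link" and @role="self"]/text()');
  if Assigned(Node) and not VarIsNull(Node.nodeValue) then
    Result := Node.nodeValue;
end;

end.
```

This trades precision for convenience: an element named Link in any other namespace would match as well, so registering the namespace is the stricter choice when the document mixes vocabularies.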

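A full XPath does perform better, because it does not have to search the document "blindly". - Martijn
You did not address the default namespace problem. - Cosmin Prund
@Cosmin Prund, indeed, I only provided the correct XPath syntax. I see Martijn already has an answer that includes the correct namespace. - Runner
Runner, even the asker used correct syntax in the question, yet the namespace problem remains. The root cause is the namespace declaration; remove it from the sample XML and the asker's code works. - Cosmin Prund
No, he did not. iNode := Sel.selectNode('Link[@role = "self"]'); cannot select the right node. It searches for Link nodes among the children of the root node. - Runner
Besides, it depends on the XML parser he uses. In OmniXML, for example, my answer works, because OmniXML does not support namespaces and the XPath selects the correct node. But of course I agree that the namespace is a problem here. As I said, though, Martijn has already covered it and I am not going to repeat it. - Runner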

2
In the end I used OmniXML and the following code.
uses
    OmniXML, OmniXMLUtils, OmniXMLXPath;

  ...

    function GetResultsURL(Filename: string): string;
    var
      FXMLDocument: IXMLDocument;
      XMLElementList: IXMLNodeList;
      XMLNode: IXMLNode;
      XMLElement: IXMLElement;
      i: integer;
    begin
      //Return an empty string when no matching Link is found
      Result := '';
      //Create and load the XML document
      FXMLDocument := CreateXMLDoc;
      FXMLDocument.Load(Filename);

      //We are looking for: <Link role="output" name="failed">
      XMLElementList := FXMLDocument.GetElementsByTagName('Link');
      for i := 0 to Pred(XMLElementList.Length) do
        begin
          //Check each node and element
          XMLNode := XMLElementList.Item[i];
          XMLElement := XMLNode as IXMLElement;
          if XMLElement.GetAttribute('role') = 'output' then
            if Pos('failed', XMLNode.Text) > 0 then
                Result := XMLNode.Text;
        end;
    end;

The XML received looks like this...
...

<DataflowJob>
  <Id>12345</Id>
  <Link role="self">https://spatial.virtualearth.net/REST/v1/dataflows/Geocode/12345</Link>
  <Link role="output" name="failed">https://spatial.virtualearth.net/REST/v1/dataflows/Geocode/12345/output/failed</Link>
  <Status>Completed</Status>
  <CreatedDate>2011-04-04T03:57:49.0534147-07:00</CreatedDate>
  <CompletedDate>2011-04-04T03:58:43.709725-07:00</CompletedDate>
  <TotalEntityCount>1</TotalEntityCount>
  <ProcessedEntityCount>1</ProcessedEntityCount>
  <FailedEntityCount>1</FailedEntityCount>
</DataflowJob>

...

1

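Martijn mentioned a Vendor property in a comment on his answer.

The property is actually called DOMVendor.

Below is some sample code that shows how it works.
The sample code depends on a few helper classes, which you can find at bo.codeplex.com.

Note that DOMVendor cannot tell you which MSXML version is in use, but you can ask it whether it supports XPath.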
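
For readers who only want the XPath check without the helper classes, a stripped-down sketch could look like the one below. It uses only DOMVendors, Description and hasFeature, all of which also appear in the full sample further down; the unit and procedure names are made up for illustration, and COM must be initialized before the MSXML vendor is queried.

```delphi
unit VendorReport;

interface

uses
  Classes;

procedure ListDomVendors(Lines: TStrings);

implementation

uses
  SysUtils, xmldom, msxmldom;

// Adds one line per registered DOM vendor, stating whether the vendor
// reports support for the XPath 1.0 feature.
procedure ListDomVendors(Lines: TStrings);
var
  Index: Integer;
  Vendor: TDOMVendor;
  XPathSupport: string;
begin
  for Index := 0 to DOMVendors.Count - 1 do
  begin
    Vendor := DOMVendors.Vendors[Index];
    if Vendor.DOMImplementation.hasFeature('XPath', '1.0') then
      XPathSupport := 'XPath 1.0 supported'
    else
      XPathSupport := 'XPath 1.0 not reported';
    Lines.Add(Format('%s: %s', [Vendor.Description, XPathSupport]));
  end;
end;

end.
```

In a form with a memo (Memo1 here is hypothetical), calling ListDomVendors(Memo1.Lines) would list every vendor that has been registered by the units in the uses clause.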

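Old MSXML versions (for instance those still present on a plain Windows 2003 Server installation) do not support XPath, but do support XSLPattern.
They can execute your query, but sometimes return different results or fail.

There are also some subtle bugs in various sub-versions of MSXML6.
You need 6.30, 6.20.1103.0, 6.20.2003.0 or higher. 6.3 is only available on Windows 7 / Windows 2008 Server; on Windows XP and Windows 2003 Server, use 6.20.
It took me quite a while to find out which versions are actually available :-)

This shows the MSXML I have installed, in my case msxml6.dll: 6.20.1103.0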

procedure TMainForm.ShowMsxml6VersionClick(Sender: TObject);
begin
{
Windows 2003 with MSXML 3: msxml3.dll: 8.100.1050.0

windows XP with MSXML 4: msxml4.dll: 4.20.9818.0

Windows XP with MSXML 6 SP1: msxml6.dll: 6.10.1129.0

windows XP with MSXML 6 SP2 (latest):
------------------------
msxml6.dll: 6.20.1103.0

Windows 7 with MSXML 6 SP3:
--------------------------
msxml6.dll: 6.30.7600.16385
}
  try
    Logger.Log(TmsxmlFactory.msxmlBestFileVersion.ToString());
    TmsxmlFactory.AssertCompatibleMsxml6Version();
  except
    on E: Exception do
    begin
      Logger.Log('Error');
      Logger.Log(E);
    end;
  end;
end;

This shows the DOMVendor code; it uses a few helper classes, which you can find at the bo.codeplex.com location mentioned above.

procedure TMainForm.FillDomVendorComboBox;
var
  DomVendorComboBoxItemsCount: Integer;
  Index: Integer;
  CurrentDomVendor: TDOMVendor;
  DefaultDomVendorIndex: Integer;
  CurrentDomVendorDescription: string;
const
  NoSelection = -1;
begin
  DomVendorComboBox.Clear;
  DefaultDomVendorIndex := NoSelection;
  for Index := 0 to DOMVendors.Count - 1 do
  begin
    CurrentDomVendor := DOMVendors.Vendors[Index];
    LogDomVendor(CurrentDomVendor);
    CurrentDomVendorDescription := CurrentDomVendor.Description;
    DomVendorComboBox.Items.Add(CurrentDomVendorDescription);
    if DefaultDOMVendor = CurrentDomVendorDescription then
      DefaultDomVendorIndex := DomVendorComboBox.Items.Count - 1;
  end;
  DomVendorComboBoxItemsCount := DomVendorComboBox.Items.Count;
  if (DefaultDomVendorIndex = NoSelection) then
  begin
    if DefaultDOMVendor = NullAsStringValue then
    begin
      if DomVendorComboBoxItemsCount > 0 then
        DefaultDomVendorIndex := 0;
    end
    else
      DefaultDomVendorIndex := DomVendorComboBoxItemsCount - 1;
  end;
  DomVendorComboBox.ItemIndex := DefaultDomVendorIndex;
end;

procedure TMainForm.LogDomVendor(const CurrentDomVendor: TDOMVendor);
var
  CurrentDomVendorDescription: string;
  DocumentElement: IDOMElement;
  DomDocument: IDOMDocument; // xmldom.IDOMDocument is the plain XML DOM
  XmlDocument: IXMLDocument; // XMLIntf.IXMLDocument is the enriched XML interface to the TComponent wrapper, which has a DOMDocument: IDOMDocument property, and allows obtaining XML from different sources (text, file, stream, etc)
  XmlDocumentInstance: TXMLDocument; // unit XMLDoc

  DOMNodeEx: IDOMNodeEx;
  XMLDOMDocument2: IXMLDOMDocument2;
begin
  CurrentDomVendorDescription := CurrentDomVendor.Description;
  Logger.Log('DOMVendor', CurrentDomVendorDescription);

  XmlDocumentInstance := TXMLDocument.Create(nil);
  XmlDocumentInstance.DOMVendor := CurrentDomVendor;
  XmlDocument := XmlDocumentInstance;

  DomDocument := CurrentDomVendor.DOMImplementation.createDocument(NullAsStringValue, NullAsStringValue, nil);

  XmlDocument.DOMDocument := DomDocument;
  XmlDocument.LoadFromXML('<document/>');
  DomDocument := XmlDocument.DOMDocument; // we get another reference here, since we loaded some XML now

  DocumentElement := DomDocument.DocumentElement;
  if Assigned(DocumentElement) then
  begin
    DOMNodeEx := DocumentElement as IDOMNodeEx;
    Logger.Log(DOMNodeEx.xml);
  end;

  if IDomNodeHelper.GetXmlDomDocument2(DomDocument, XMLDOMDocument2) then
  begin
    // XSLPattern versus XPath
    // see https://dev59.com/m3RA5IYBdhLWcg3w6SRH
    // XSLPattern is 0 based, but XPath is 1 based.
    Logger.Log(IDomNodeHelper.SelectionLanguage, string(XMLDOMDocument2.getProperty(IDomNodeHelper.SelectionLanguage)));
    Logger.Log(IDomNodeHelper.SelectionNamespaces, string(XMLDOMDocument2.getProperty(IDomNodeHelper.SelectionNamespaces)));
  end;


  LogDomVendorFeatures(CurrentDomVendor,
    ['', '1.0', '2.0', '3.0'],
    // http://www.w3.org/TR/DOM-Level-3-Core/introduction.html#ID-Conformance
    // http://reference.sitepoint.com/javascript/DOMImplementation/hasFeature
    ['Core',
     'XML',
     'Events',
     'UIEvents',
     'MouseEvents',
     'TextEvents',
     'KeyboardEvents',
     'MutationEvents',
     'MutationNameEvents',
     'HTMLEvents',
     'LS',
     'LS-Async',
     'Validation',
     'XPath']);
end;


procedure TMainForm.LogDomVendorFeatures(const CurrentDomVendor: TDOMVendor; const Versions, Features: array of string);
var
  AllVersions: string;
  Feature: string;
  Line: string;
  Supported: Boolean;
  SupportedAll: Boolean;
  SupportedNone: Boolean;
  SupportedVersions: IStringListWrapper;
  Version: string;
begin
  SupportedVersions := TStringListWrapper.Create();
  for Version in Versions do
    AddSupportedVersion(Version, SupportedVersions);
  AllVersions := Format('All: %s', [SupportedVersions.CommaText]);
  for Feature in Features do
  begin
    SupportedAll := True;
    SupportedNone := True;
    SupportedVersions.Clear();
    for Version in Versions do
    begin
      Supported := CurrentDomVendor.DOMImplementation.hasFeature(Feature, Version);
      if Supported then
        AddSupportedVersion(Version, SupportedVersions);
      SupportedAll := SupportedAll and Supported;
      SupportedNone := SupportedNone and not Supported;
    end;
    if SupportedNone then
      Line := Format('None', [])
    else
    if SupportedAll then
      Line := Format('%s', [AllVersions])
    else
      Line := Format('%s', [SupportedVersions.CommaText]);
    Logger.Log('  ' + Feature, Line);
  end;
end;

Delphi XE shows this:

DOMVendor:MSXML
<document/>
SelectionLanguage:XPath
SelectionNamespaces:
  Core:None
  XML:Any,1.0
  Events:None
  UIEvents:None
  MouseEvents:None
  TextEvents:None
  KeyboardEvents:None
  MutationEvents:None
  MutationNameEvents:None
  HTMLEvents:None
  LS:None
  LS-Async:None
  Validation:None
  XPath:Any,1.0
DOMVendor:ADOM XML v4
?<document></document>

  Core:None
  XML:None
  Events:None
  UIEvents:None
  MouseEvents:None
  TextEvents:None
  KeyboardEvents:None
  MutationEvents:None
  MutationNameEvents:None
  HTMLEvents:None
  LS:None
  LS-Async:None
  Validation:None
  XPath:None
