How to resolve an internally declared XML entity reference using NSXMLParser

8

I have an XML file that uses internally declared entities. For example:

<?xml version="1.0" encoding="UTF-8"?>
...
<!ENTITY my_symbol "my symbol value">
...
<my_element>
    <my_next_element>foo&my_symbol;bar</my_next_element>
</my_element>
...

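Using the NSXMLParser class, how do I resolve the my_symbol entity reference?
From experimenting, the parser:foundInternalEntityDeclarationWithName:value: delegate method is called for the my_symbol entity declaration, with the value "my symbol value". Then, when the my_next_element element is reached, NSXMLParser calls the parser:didStartElement:namespaceURI:qualifiedName:attributes: delegate method.
Before parser:didEndElement:namespaceURI:qualifiedName: is called, the parser:foundCharacters: delegate method is called twice, with the strings:
1. "foo"
2. "bar"
The my_symbol entity reference is ignored. What is required for the entity reference to be resolved?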
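EDIT:
Removing the ENTITY declaration for my_symbol from the DTD results in an NSXMLParserUndeclaredEntityError. This suggests that when the entity declaration is present and the entity is then referenced in my_next_element, the reference is noticed. For some reason, it just isn't resolved to the string it represents.
Also, if &amp; is used within an element, the parser correctly resolves it to "&" and passes that string when the parser:foundCharacters: delegate method is called.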
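
For anyone who wants to reproduce the behaviour described above, here is a minimal sketch of a test harness. It is not the code from the question: the `EntityProbe` class and its `entities` and `chunks` properties are hypothetical names, and the `<!DOCTYPE my_element [ ... ]>` wrapper is an assumption, since the question elides what surrounds the ENTITY declaration. It assumes ARC and a Foundation SDK recent enough to declare the `NSXMLParserDelegate` protocol.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate used only for this reproduction; not from the original post.
@interface EntityProbe : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableDictionary<NSString *, NSString *> *entities;
@property (nonatomic, strong) NSMutableArray<NSString *> *chunks;
@end

@implementation EntityProbe

- (instancetype)init {
    if ((self = [super init])) {
        _entities = [NSMutableDictionary dictionary];
        _chunks = [NSMutableArray array];
    }
    return self;
}

// Records each internal entity declaration reported by the parser.
- (void)parser:(NSXMLParser *)parser foundInternalEntityDeclarationWithName:(NSString *)name
         value:(NSString *)value {
    if (name != nil && value != nil) {
        self.entities[name] = value;
    }
}

// Records every character run, in the order the parser delivers them.
- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string {
    [self.chunks addObject:string];
}

- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError {
    NSLog(@"parse error: %@", parseError);
}

@end

int main(void) {
    @autoreleasepool {
        // Assumption: the ENTITY declaration lives in an internal DTD subset like this.
        NSString *xml =
            @"<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            @"<!DOCTYPE my_element [\n"
            @"  <!ENTITY my_symbol \"my symbol value\">\n"
            @"]>\n"
            @"<my_element>\n"
            @"    <my_next_element>foo&my_symbol;bar</my_next_element>\n"
            @"</my_element>\n";

        NSData *data = [xml dataUsingEncoding:NSUTF8StringEncoding];
        NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
        EntityProbe *probe = [[EntityProbe alloc] init];
        parser.delegate = probe;

        BOOL ok = [parser parse];
        NSLog(@"parse succeeded: %d", ok);
        NSLog(@"declared entities: %@", probe.entities);
        NSLog(@"character runs: %@", probe.chunks);
    }
    return 0;
}
```

Comparing the logged character runs against the logged entity declarations shows whether the entity value ever reaches parser:foundCharacters:. The runs will also include the whitespace between elements, so look for the ones delivered inside my_next_element.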

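Can you find it with XPath? - Kaiser Advisor
I ran an XPath query on 'my_element', but neither "foo" nor "bar" was resolved. Is it correct to reference an internally declared entity the way I am? - Ben Lever
To be honest, I don't know whether it's correct, but it is certainly unusual. I'm not entirely clear why you don't just create two child entities, "foo" and "bar". Then you could use XPath. - Kaiser Advisor
The XML above is an example I put together to test my problem. If "foo" and "bar" were separate child entities, XPath would find them. The problem is that "&my_symbol;" is not being resolved. XPath can't find it. - Ben Lever
I'm running into this problem now. Did you solve it? - Nico Prananta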
1 Answer

3

I looked at NSXMLParser.h, which lists the following methods that a delegate is expected to support:

@interface NSObject (NSXMLParserDelegateEventAdditions)
// Document handling methods
- (void)parserDidStartDocument:(NSXMLParser *)parser;
    // sent when the parser begins parsing of the document.
- (void)parserDidEndDocument:(NSXMLParser *)parser;
    // sent when the parser has completed parsing. If this is encountered, the parse was successful.

// DTD handling methods for various declarations.
- (void)parser:(NSXMLParser *)parser foundNotationDeclarationWithName:(NSString *)name publicID:(NSString *)publicID systemID:(NSString *)systemID;

- (void)parser:(NSXMLParser *)parser foundUnparsedEntityDeclarationWithName:(NSString *)name publicID:(NSString *)publicID systemID:(NSString *)systemID notationName:(NSString *)notationName;

- (void)parser:(NSXMLParser *)parser foundAttributeDeclarationWithName:(NSString *)attributeName forElement:(NSString *)elementName type:(NSString *)type defaultValue:(NSString *)defaultValue;

- (void)parser:(NSXMLParser *)parser foundElementDeclarationWithName:(NSString *)elementName model:(NSString *)model;

- (void)parser:(NSXMLParser *)parser foundInternalEntityDeclarationWithName:(NSString *)name value:(NSString *)value;

- (void)parser:(NSXMLParser *)parser foundExternalEntityDeclarationWithName:(NSString *)name publicID:(NSString *)publicID systemID:(NSString *)systemID;

- (void)parser:(NSXMLParser *)parser didStartElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName attributes:(NSDictionary *)attributeDict;
    // sent when the parser finds an element start tag.
    // In the case of the cvslog tag, the following is what the delegate receives:
    //   elementName == cvslog, namespaceURI == http://xml.apple.com/cvslog, qualifiedName == cvslog
    // In the case of the radar tag, the following is what's passed in:
    //    elementName == radar, namespaceURI == http://xml.apple.com/radar, qualifiedName == radar:radar
    // If namespace processing >isn't< on, the xmlns:radar="http://xml.apple.com/radar" is returned as an attribute pair, the elementName is 'radar:radar' and there is no qualifiedName.

- (void)parser:(NSXMLParser *)parser didEndElement:(NSString *)elementName namespaceURI:(NSString *)namespaceURI qualifiedName:(NSString *)qName;
    // sent when an end tag is encountered. The various parameters are supplied as above.

- (void)parser:(NSXMLParser *)parser didStartMappingPrefix:(NSString *)prefix toURI:(NSString *)namespaceURI;
    // sent when the parser first sees a namespace attribute.
    // In the case of the cvslog tag, before the didStartElement:, you'd get one of these with prefix == @"" and namespaceURI == @"http://xml.apple.com/cvslog" (i.e. the default namespace)
    // In the case of the radar:radar tag, before the didStartElement: you'd get one of these with prefix == @"radar" and namespaceURI == @"http://xml.apple.com/radar"

- (void)parser:(NSXMLParser *)parser didEndMappingPrefix:(NSString *)prefix;
    // sent when the namespace prefix in question goes out of scope.

- (void)parser:(NSXMLParser *)parser foundCharacters:(NSString *)string;
    // This returns the string of the characters encountered thus far. You may not necessarily get the longest character run. The parser reserves the right to hand these to the delegate as potentially many calls in a row to -parser:foundCharacters:

- (void)parser:(NSXMLParser *)parser foundIgnorableWhitespace:(NSString *)whitespaceString;
    // The parser reports ignorable whitespace in the same way as characters it's found.

- (void)parser:(NSXMLParser *)parser foundProcessingInstructionWithTarget:(NSString *)target data:(NSString *)data;
    // The parser reports a processing instruction to you using this method. In the case above, target == @"xml-stylesheet" and data == @"type='text/css' href='cvslog.css'"

- (void)parser:(NSXMLParser *)parser foundComment:(NSString *)comment;
    // A comment (Text in a <!-- --> block) is reported to the delegate as a single string

- (void)parser:(NSXMLParser *)parser foundCDATA:(NSData *)CDATABlock;
    // this reports a CDATA block to the delegate as an NSData.

- (NSData *)parser:(NSXMLParser *)parser resolveExternalEntityName:(NSString *)name systemID:(NSString *)systemID;
    // this gives the delegate an opportunity to resolve an external entity itself and reply with the resulting data.

- (void)parser:(NSXMLParser *)parser parseErrorOccurred:(NSError *)parseError;
    // ...and this reports a fatal error to the delegate. The parser will stop parsing.

- (void)parser:(NSXMLParser *)parser validationErrorOccurred:(NSError *)validationError;
    // If validation is on, this will report a fatal validation error to the delegate. The parser will stop parsing.
@end

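Based on the order of the entries in the file, the found-declaration methods should be called before any elements are found (as you discovered). I'd suggest implementing all of these methods to see whether any of them fire, but they all appear to be designed for other purposes.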

I wonder if there's a way to log all unhandled messages sent to your delegate, in case the documentation/interface is incomplete.
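
One way to approach that idea is sketched below. It assumes NSXMLParser discovers optional delegate methods through respondsToSelector:, which is the usual Cocoa convention but is not something I have confirmed for this class. The `SelectorLoggingDelegate` class and its `probedSelectors` property are hypothetical names, the sample document is a placeholder, and the sketch assumes ARC plus an SDK that declares `NSXMLParserDelegate`.

```objc
#import <Foundation/Foundation.h>

// Hypothetical delegate that records every selector it is asked about.
@interface SelectorLoggingDelegate : NSObject <NSXMLParserDelegate>
@property (nonatomic, strong) NSMutableSet<NSString *> *probedSelectors;
@end

@implementation SelectorLoggingDelegate

- (instancetype)init {
    if ((self = [super init])) {
        _probedSelectors = [NSMutableSet set];
    }
    return self;
}

// Invoked by any caller that checks whether an optional method is implemented.
// The ivar is used directly so this override has no side effects of its own.
- (BOOL)respondsToSelector:(SEL)aSelector {
    [_probedSelectors addObject:NSStringFromSelector(aSelector)];
    return [super respondsToSelector:aSelector];
}

@end

int main(void) {
    @autoreleasepool {
        // Placeholder document; substitute the XML that shows the problem.
        NSData *data = [@"<root>text</root>" dataUsingEncoding:NSUTF8StringEncoding];
        NSXMLParser *parser = [[NSXMLParser alloc] initWithData:data];
        SelectorLoggingDelegate *logger = [[SelectorLoggingDelegate alloc] init];
        parser.delegate = logger;
        [parser parse];

        // Any selector listed here that is missing from NSXMLParser.h would be
        // a candidate for an undocumented delegate message.
        NSArray<NSString *> *names = [logger.probedSelectors.allObjects
            sortedArrayUsingSelector:@selector(compare:)];
        NSLog(@"selectors probed on the delegate: %@", names);
    }
    return 0;
}
```

The log will also contain selectors probed by the runtime and other framework code, so it needs to be compared by hand against the header. If the parser sends a message without checking first, this approach will miss it and a forwarding proxy would be needed instead.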


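I implemented all of the delegate methods as you suggested and re-ran. However, no delegate method is called when the parser reaches "&my_symbol;". As mentioned above, removing the ENTITY declaration causes the "resolveExternalEntityName" method to be called, which indicates that it is recognised as an entity reference. For some reason, when the ENTITY declaration is present (and recognised), the reference simply isn't resolved to the entity's value. - Ben Lever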
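Continuing my earlier hope that this is just an undocumented method call, I found this page on how to create a proxy object. You could put it in front of your NSXMLParser delegate and look for anything that goes unhandled: http://borkware.com/rants/agentm/elegant-delegation/ - Epsilon Prime
But even if it turns out to be an undocumented message that you need to handle, it looks like you need to file a bug with Apple. Either it needs to be documented, or it needs to be implemented and documented. - Epsilon Prime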
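I ran nm and strings on /Developer/Platforms/iPhoneOS.platform/Developer/SDKs/iPhoneOS3.1.2.sdk/System/Library/Frameworks/Foundation.framework/Foundation but couldn't find anything different regarding the delegate, and I didn't see any new parsing options (or settings) beyond what is documented. The only thing I found is that a similar library has two calls, one that processes entities and one that skips them. I'm sure you've already tried setting all of the setShould* values to YES, to no avail. - Epsilon Prime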
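I implemented the suggestion of using a proxy object as the delegate to verify that NSXMLParser is not sending any undocumented messages. Unfortunately, only the expected delegate methods were called. I'll now report this to Apple and see what they say. - Ben Lever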
