What is the best way to parse an XML string in Java?

Question

4

I am parsing a string in Java using javax.xml.parsers.DocumentBuilder. However, there is no function that parses a String directly, so I am doing this instead:

public static Document parseText(String zText) {
    try
    {
        DocumentBuilderFactory dbFactory = DocumentBuilderFactory.newInstance();
        DocumentBuilder dBuilder = dbFactory.newDocumentBuilder();
        Document doc = dBuilder.parse(new InputSource(new StringReader(zText)));
        doc.getDocumentElement().normalize();
        return doc;
    }
    catch (Exception e) {
            e.printStackTrace();
    }
    return null;
}

Is this the best way to do it? I feel like there must be a simpler way... Thanks!

- Soren Johnson

5 Answers

2

I personally prefer dom4j. Check out their quick start; it's quite simple.

- javamonkey79

1

I agree with aperkins. Here is my javax helper:

/**
 * Returns a {@code Document} from the specified XML {@code String}.
 * 
 * @param xmlDocumentString a well-formed XML {@code String}
 * @return a {@code org.w3c.dom.Document}
 */
public static Document getDomDocument(String xmlDocumentString)
{
    if(StringUtility.isNullOrEmpty(xmlDocumentString)) return null;

    InputStream s = null;

    try
    {
        s = new ByteArrayInputStream(xmlDocumentString.getBytes("UTF-8"));
    }
    catch(UnsupportedEncodingException e)
    {
        throw new RuntimeException("UnsupportedEncodingException: " + e.getMessage());
    }

    return XmlDomUtility.getDomDocument(s);
}

This helper depends on another helper:

/**
 * Returns a {@code Document} from the specified {@code InputStream}.
 * 
 * @param input the {@code java.io.InputStream}
 * @return a {@code org.w3c.dom.Document}
 */
public static Document getDomDocument(InputStream input)
{
    Document document = null;
    try
    {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        document = builder.parse(input);
    }
    catch(ParserConfigurationException e)
    {
        throw new RuntimeException("ParserConfigurationException: " + e.getMessage());
    }
    catch(SAXException e)
    {
        throw new RuntimeException("SAXException: " + e.getMessage());
    }
    catch(IOException e)
    {
        throw new RuntimeException("IOException: " + e.getMessage());
    }

    return document;
}
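
For readers who do not have `StringUtility` and `XmlDomUtility` (they are not JDK classes, and their source is not shown in this thread), the two helpers above can be folded into a single method that relies only on Java SE. This is a minimal sketch, not rasx's actual code: the class name `XmlStrings` and the method name `parse` are made up for illustration, and the null-or-empty check is an assumption about what `StringUtility.isNullOrEmpty` does.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical class name; it does not appear in the original answer.
final class XmlStrings {

    /** Parses a well-formed XML string; returns null for null or empty input. */
    public static Document parse(String xml) {
        // Assumed equivalent of StringUtility.isNullOrEmpty(...)
        if (xml == null || xml.isEmpty()) {
            return null;
        }
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException
            InputStream in = new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8));
            return factory.newDocumentBuilder().parse(in);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            // Keep the original exception as the cause instead of only its message
            throw new RuntimeException("Could not parse XML string", e);
        }
    }
}
```

One caveat with the byte-stream route: if the string carries an XML declaration that names an encoding other than UTF-8, the parser could misread non-ASCII characters. Wrapping a `StringReader` in an `InputSource`, as in the question, sidesteps that because the parser receives characters rather than bytes.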

Update: here are my imports:

import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UnsupportedEncodingException;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.SAXException;

- rasx

Rasx: where are StringUtility and XmlDomUtility imported from? - Jim Ferrans

I am using the standard Java SE javax libraries: import java.io.ByteArrayInputStream; import java.io.File; import java.io.IOException; import java.io.InputStream; import java.io.UnsupportedEncodingException; import javax.xml.parsers.DocumentBuilder; import javax.xml.parsers.DocumentBuilderFactory; import javax.xml.parsers.ParserConfigurationException; import org.w3c.dom.Document; import org.w3c.dom.Node; import org.xml.sax.SAXException; - rasx

1

I would not normalize if I am in a hurry or if I don't care. You can normalize just the nodes you need, when you need to.
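
To illustrate what normalizing a single node does, here is a small sketch (an editorial example, not code from this answer). It builds an element with two adjacent text nodes and calls `normalize()` on that element only, which merges them into one. The class and method names are hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical class name, used only for this illustration.
final class NormalizeDemo {

    /** Returns the child count of an element before and after normalize(). */
    public static int[] childCountsBeforeAndAfter() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);

            // Two adjacent text nodes, as can happen after programmatic edits
            root.appendChild(doc.createTextNode("Hello, "));
            root.appendChild(doc.createTextNode("world"));
            int before = root.getChildNodes().getLength();

            // Normalize only this subtree rather than the whole document
            root.normalize();
            int after = root.getChildNodes().getLength();

            return new int[] { before, after };
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

`Node.normalize()` works on the subtree under the node it is called on, so the cost is limited to the part of the document you actually care about.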

- srini.venigalla

0

Another option you could try is Castor; I think it makes things a lot simpler:

http://www.castor.org/

- Aitor

- aperkins · Accepted Answer

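To answer your question directly: to my knowledge, there is no better way. The input source is used because it is more generic and can handle input from a file, a string, or across the network.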
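
The generality of `InputSource` can be sketched as follows: one parsing method accepts an `InputSource`, and thin wrappers feed it either a string or a byte stream. This is an illustrative sketch rather than code from the answer; the class name `InputSourceDemo` and its method names are assumptions.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;

import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical class name, used only for this illustration.
final class InputSourceDemo {

    /** Single parsing path shared by every kind of input. */
    public static Document parse(InputSource source) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(source);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    /** A string is wrapped in a character stream. */
    public static Document fromString(String xml) {
        return parse(new InputSource(new StringReader(xml)));
    }

    /** A file or network stream is wrapped as a byte stream. */
    public static Document fromStream(InputStream in) {
        return parse(new InputSource(in));
    }
}
```

A call such as `InputSourceDemo.fromString("<r/>")` and a call to `fromStream` with the same bytes go through the same `parse` method, which is the point the answer makes about `InputSource`.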

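You could also try the SAX XML parser. It is fairly basic and uses the visitor pattern, but it gets the job done, and for small data sets and simple XML schemas it is pretty easy to use. SAX is also included in the core JRE.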
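
As a rough idea of what the SAX route looks like for a string, the sketch below registers a handler whose `startElement` callback is invoked for each element, and collects the element names in document order. It is an editorial example under stated assumptions: the class name `SaxStringDemo` and the method `elementNames` are hypothetical, while `SAXParserFactory`, `SAXParser`, and `DefaultHandler` are standard JDK classes.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical class name, used only for this illustration.
final class SaxStringDemo {

    /** Returns the qualified name of every element, in document order. */
    public static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<>();

        // The parser calls back into this handler as it reads the input
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attributes) {
                names.add(qName);
            }
        };

        try {
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            // Same InputSource + StringReader wrapping as in the DOM version
            parser.parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return names;
    }
}
```

Unlike the DOM approach, nothing is kept in memory beyond what the handler chooses to store, which is why SAX tends to suit simple extraction tasks.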