ORACLE解析XML文件

5

在我的一个过程中,我使用REST调用(在APEX中)解析远程存储的XML文件,并尝试找到包含特定术语的节点。

这是文件的简化示例结构。此示例中的搜索词为“cloud”:

 <map id="12343">
      <topic id="23498">
        <title>Topic title</title>
        <p id="24334"> some sample text with term 'cloud' </p>
        <ul id = "34334">
          <li id="38743">List item without the term </li>
          <li id="38438">List item with term 'Cloud'</li>
        </ul>
      </topic>
      <topic id="23498">
        <title>Title for this topic</title>
        <p id="24334"> some sample text with term 'cloud' </p>
        <ul id = "34334">
          <li id="38743">List item without the term </li>
          <li id="38438">List item without term'</li>
        </ul>
      </topic>
      <topic id="23498">
        <title>Title for this topic with term 'CLOUD' in caps</title>
        <p id="24334"> some sample text with term 'Cloud' </p>
        <ul id = "34334">
          <li id="38743">List item without the term </li>
          <li id="38438">List item without term'</li>
        </ul>
      </topic>
    </map>

这段代码的作用是解析该文件并找出包含词语“cloud”的节点ID,无论该节点中的文本在何处。

我正在使用existnode来查找,但是结果不正确:

declare
sourceXML clob;
begin
delete from result_table;
for f in (select file_id, files_path from my_table)
  loop
  /*Get the contents of the file in the sourceXML*/
    sourceXML := APEX_WEB_SERVICE.MAKE_REST_REQUEST(
    p_url => f.file_path,
    p_http_method => 'GET');

    if instr(sourceXML,'<?xml version') != 0 then /* verify if it's valid xml file */
      for t in (select topic_id
                      FROM xmltable('//map/topic' passing XMLTYPE(sourceXML)
                          columns topic_id VARCHAR2(10) PATH './@id')
                      where XMLExists('//text()[ora:contains(.,"sales cloud")]' passing XMLTYPE(sourceXML)))
      loop
         insert into result_table (file,topic) values (f.file_id, t.topic_id);
      end loop;
    end if;
  end loop;
end;

我无法弄清楚我的错误在哪里。


“不得到正确的结果”是什么意思? - Rohit
有时我会得到带有术语的主题ID,但有时我会得到主题ID,而这些术语在任何地方都不存在。 - Sejal Parikh
你的问题非常有趣,但如果您能提供一个测试用例,我们可以运行并模拟出错误结果,同时提供相同样本数据的预期结果样本,那将非常有帮助。 - Francisco Sitja
1个回答

1
我将简单标签与列表标签分开,并在两个循环中搜索每一个标签:
DECLARE
   V_XML         VARCHAR2 (4096) := '<map id="12343">
<topic id="23498">
<title>Topic title</title>
<p id="24334"> some sample text with term ''cloud''</p>
<ul id = "34334">
<li id="38743">List item without the term</li>
<li id="38438">List item with term ''Cloud''</li>
</ul>
</topic>
<topic id="23498">
<title>Title for this topic</title>
<p id="24334"> some sample text with term ''cloud''</p>
<ul id = "34334">
<li id="38743">List item without the term</li>
<li id="38438">List item without term''</li>
</ul>
</topic>
<topic id="23498">
<title>Title for this topic with term ''CLOUD'' in caps</title>
<p id="24334"> some sample text with term ''Cloud''</p>
<ul id = "34334">
<li id="38743">List item without the term</li>
<li id="38438">List item without term''</li>
</ul>
</topic>
</map>';
   V_XML_CHILD   VARCHAR2 (4096);
   V_TEXT        VARCHAR2 (4096);
   V_ID          VARCHAR2 (4096);
   V_NAME        VARCHAR2 (4096);
   V_PARENT_ID   VARCHAR2 (4096);
   V_CNT         NUMBER;
BEGIN
   DBMS_OUTPUT.PUT_LINE (
      '-------Looking in simple tags for each topic--------------');

   FOR REC IN (SELECT COLUMN_VALUE VAL
               FROM XMLTABLE (
                              '//map/topic'
                              PASSING XMLTYPE (V_XML)
                             ))
   LOOP
      V_CNT := 0;
      V_XML_CHILD := REC.VAL.GETSTRINGVAL ();

      SELECT TAG_ID
      INTO V_PARENT_ID
      FROM XMLTABLE (
              '*'
              PASSING XMLTYPE (V_XML_CHILD)
              COLUMNS TAG_NAME VARCHAR2 (100) PATH 'name()',
                      TAG_ID VARCHAR2 (100) PATH '@id');

      FOR R_LINE
         IN (SELECT TAG_NAME, TAG_ID, TAG_VALUE
             FROM XMLTABLE (
                     'topic/*'
                     PASSING XMLTYPE (V_XML_CHILD)
                     COLUMNS TAG_NAME VARCHAR2 (100) PATH 'name()',
                             TAG_VALUE VARCHAR2 (100) PATH 'text()',
                             TAG_ID VARCHAR2 (100) PATH '@id'))
      LOOP
         V_CNT := V_CNT + 1;
         V_ID := NVL (R_LINE.TAG_ID, V_PARENT_ID);--nvl here 
         V_NAME := R_LINE.TAG_NAME;               
         V_TEXT := R_LINE.TAG_VALUE;              

         --DBMS_OUTPUT.PUT_LINE (V_CNT || '- id['||V_ID||'] - Name['||V_NAME||'] Text:' || V_TEXT);

         IF V_ID <> 'ul' AND INSTR (UPPER (V_TEXT), 'CLOUD') > 1
         THEN
            DBMS_OUTPUT.PUT_LINE (
                  'Found: Tag Id['
               || V_ID
               || '] - Tag Name['
               || V_NAME
               || '] Text:'
               || V_TEXT);
         END IF;
      END LOOP;
   END LOOP;

   DBMS_OUTPUT.PUT_LINE ('---------------------');
   DBMS_OUTPUT.PUT_LINE (
      '-------Looking in list tags for each topic--------------');

   FOR REC
      IN (SELECT CHILDS VAL
          FROM XMLTABLE (
                         '//map/topic'
                         PASSING XMLTYPE (V_XML)
                         COLUMNS CHILDS XMLTYPE PATH 'ul'
                        ))
   LOOP
      V_CNT := 0;

      FOR LINE
         IN (SELECT *
             FROM XMLTABLE (
                     'ul/*'
                     PASSING XMLTYPE (REC.VAL.GETSTRINGVAL ())
                     COLUMNS TAG_NAME VARCHAR2 (100) PATH 'name()',
                             TAG_VALUE VARCHAR2 (100) PATH 'text()',
                             TAG_ID VARCHAR2 (100) PATH '@id'))
      LOOP
         V_CNT := V_CNT + 1;
         V_ID := LINE.TAG_ID;                     
         V_NAME := LINE.TAG_NAME;                 
         V_TEXT := LINE.TAG_VALUE;                

         --DBMS_OUTPUT.PUT_LINE (V_CNT || '- id['||V_ID||'] - Name['||V_NAME||'] Text:' || V_TEXT);

         IF V_ID <> 'ul' AND INSTR (UPPER (V_TEXT), 'CLOUD') > 1
         THEN
            DBMS_OUTPUT.PUT_LINE (
                  'Found: Tag Id['
               || V_ID
               || '] - Tag Name['
               || V_NAME
               || '] Text:'
               || V_TEXT);
         END IF;
      END LOOP;
   END LOOP;

   DBMS_OUTPUT.PUT_LINE ('---------------------');
END;

输出结果为:
-------Looking in simple tags for each topic--------------
Found: Tag Id[24334] - Tag Name[p] Text: some sample text with term 'cloud'
Found: Tag Id[24334] - Tag Name[p] Text: some sample text with term 'cloud'
Found: Tag Id[23498] - Tag Name[title] Text:Title for this topic with term 'CLOUD' in caps
Found: Tag Id[24334] - Tag Name[p] Text: some sample text with term 'Cloud'
---------------------
-------Looking in list tags for each topic--------------
Found: Tag Id[38438] - Tag Name[li] Text:List item with term 'Cloud'
---------------------

网页内容由stack overflow 提供, 点击上面的
可以查看英文原文,
原文链接