PHP SimpleXML: selecting elements

3
I am trying to extract information from an NCX file with SimpleXML. The file looks like this:
<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
<head>
    <meta name="dtb:uid" content="http://www.hxa7241.org/articles/content/epup-guide_hxa7241_2007_1.epub"/>
</head>
<docTitle>
    <text>Der Weg der Könige</text>
</docTitle>
<navMap>
    <navPoint id="toc1" playOrder="1">
        <navLabel>
            <text>Widmung</text>
        </navLabel>
        <content src="e9783641059446_ded01.html"/>
    </navPoint>
    <navPoint id="toc2" playOrder="2">
        <navLabel>
            <text>Inhaltsverzeichnis</text>
        </navLabel>
        <content src="e9783641059446_toc01.html"/>
    </navPoint>
    <navPoint id="toc3" playOrder="3">
        <navLabel>
            <text>PRÄLUDIUM</text>
        </navLabel>
        <content src="e9783641059446_fm02.html"/>
    </navPoint>
    <navPoint id="toc4" playOrder="4">
        <navLabel>
            <text>4500 JAHRE SPÄTER</text>
        </navLabel>
        <content src="e9783641059446_fm03.html"/>
    </navPoint>
    <navPoint id="toc5" playOrder="5">
        <navLabel>
            <text>PROLOG - TÖTEN</text>
        </navLabel>
        <content src="e9783641059446_fm04.html"/>
    </navPoint>
    <navPoint id="toc6" playOrder="6">
        <navLabel>
            <text>ERSTER TEIL - Über dem Schweigen</text>
        </navLabel>
        <content src="e9783641059446_p01.html"/>
        <navPoint id="toc7" playOrder="7">
            <navLabel>
                <text>1 - STURMGESEGNET</text>
            </navLabel>
            <content src="e9783641059446_c01.html"/>
        </navPoint>
        <navPoint id="toc8" playOrder="8">
            <navLabel>
                <text>2 - DIE EHRE IST TOT</text>
            </navLabel>
            <content src="e9783641059446_c02.html"/>
        </navPoint>
        <navPoint id="toc9" playOrder="9">
            <navLabel>
                <text>3 - DIE STADT DER GLOCKEN</text>
            </navLabel>
            <content src="e9783641059446_c03.html"/>
        </navPoint>
        <navPoint id="toc10" playOrder="10">
            <navLabel>
                <text>4 - DIE ZERBROCHENE. EBENE</text>
            </navLabel>
            <content src="e9783641059446_c04.html"/>
        </navPoint>
        <navPoint id="toc11" playOrder="11">
            <navLabel>
                <text>5 - HÄRETISCH</text>
            </navLabel>
            <content src="e9783641059446_c05.html"/>
        </navPoint>
        <navPoint id="toc12" playOrder="12">
            <navLabel>
                <text>6 - BRÜCKE VIER</text>
            </navLabel>
            <content src="e9783641059446_c06.html"/>
        </navPoint>
        <navPoint id="toc13" playOrder="13">
            <navLabel>
                <text>7 - ALLES, WAS VERNÜNFTIG IST</text>
            </navLabel>
            <content src="e9783641059446_c07.html"/>
        </navPoint>
        <navPoint id="toc14" playOrder="14">
            <navLabel>
                <text>8 - NÄHER ZUR FLAMME</text>
            </navLabel>
            <content src="e9783641059446_c08.html"/>
        </navPoint>
        <navPoint id="toc15" playOrder="15">
            <navLabel>
                <text>9 - VERDAMMNIS</text>
            </navLabel>
            <content src="e9783641059446_c09.html"/>
        </navPoint>
        <navPoint id="toc16" playOrder="16">
            <navLabel>
                <text>10 - GESCHICHTEN ÜBER CHIRURGEN</text>
            </navLabel>
            <content src="e9783641059446_c10.html"/>
        </navPoint>
        <navPoint id="toc17" playOrder="17">
            <navLabel>
                <text>11 - TROPFEN</text>
            </navLabel>
            <content src="e9783641059446_c11.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc18" playOrder="18">
        <navLabel>
            <text>ZWISCHENSPIELE</text>
        </navLabel>
        <content src="e9783641059446_p02.html"/>
        <navPoint id="toc19" playOrder="19">
            <navLabel>
                <text>Z-1 - ISCHIKK</text>
            </navLabel>
            <content src="e9783641059446_c12.html"/>
        </navPoint>
        <navPoint id="toc20" playOrder="20">
            <navLabel>
                <text>Z-2 - NAN BALAT</text>
            </navLabel>
            <content src="e9783641059446_c13.html"/>
        </navPoint>
        <navPoint id="toc21" playOrder="21">
            <navLabel>
                <text>Z-3 - DER SEGEN DER UNWISSENHEIT</text>
            </navLabel>
            <content src="e9783641059446_c14.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc22" playOrder="22">
        <navLabel>
            <text>ZWEITER TEIL - Die leuchtenden Stürme</text>
        </navLabel>
        <content src="e9783641059446_p03.html"/>
        <navPoint id="toc23" playOrder="23">
            <navLabel>
                <text>12 - EINHEIT</text>
            </navLabel>
            <content src="e9783641059446_c15.html"/>
        </navPoint>
        <navPoint id="toc24" playOrder="24">
            <navLabel>
                <text>13 - ZEHN HERZSCHLÄGE</text>
            </navLabel>
            <content src="e9783641059446_c16.html"/>
        </navPoint>
        <navPoint id="toc25" playOrder="25">
            <navLabel>
                <text>14 - ZAHLTAG</text>
            </navLabel>
            <content src="e9783641059446_c17.html"/>
        </navPoint>
        <navPoint id="toc26" playOrder="26">
            <navLabel>
                <text>15 - DER KÖDER</text>
            </navLabel>
            <content src="e9783641059446_c18.html"/>
        </navPoint>
        <navPoint id="toc27" playOrder="27">
            <navLabel>
                <text>16 - KOKONS</text>
            </navLabel>
            <content src="e9783641059446_c19.html"/>
        </navPoint>
        <navPoint id="toc28" playOrder="28">
            <navLabel>
                <text>17 - EIN BLUTROTER SONNENUNTERGANG</text>
            </navLabel>
            <content src="e9783641059446_c20.html"/>
        </navPoint>
        <navPoint id="toc29" playOrder="29">
            <navLabel>
                <text>18 - DER GROSSPRINZ DES KRIEGES</text>
            </navLabel>
            <content src="e9783641059446_c21.html"/>
        </navPoint>
        <navPoint id="toc30" playOrder="30">
            <navLabel>
                <text>19 - DER STURZ DER STERNE</text>
            </navLabel>
            <content src="e9783641059446_c22.html"/>
        </navPoint>
        <navPoint id="toc31" playOrder="31">
            <navLabel>
                <text>20 - SCHARLACHROT</text>
            </navLabel>
            <content src="e9783641059446_c23.html"/>
        </navPoint>
        <navPoint id="toc32" playOrder="32">
            <navLabel>
                <text>21 - WARUM MENSCHEN LÜGEN</text>
            </navLabel>
            <content src="e9783641059446_c24.html"/>
        </navPoint>
        <navPoint id="toc33" playOrder="33">
            <navLabel>
                <text>22 - AUGEN, HÄNDE ODER KUGELN?</text>
            </navLabel>
            <content src="e9783641059446_c25.html"/>
        </navPoint>
        <navPoint id="toc34" playOrder="34">
            <navLabel>
                <text>23 - VIELSEITIG</text>
            </navLabel>
            <content src="e9783641059446_c26.html"/>
        </navPoint>
        <navPoint id="toc35" playOrder="35">
            <navLabel>
                <text>24 - DIE GALERIE DER LANDKARTEN</text>
            </navLabel>
            <content src="e9783641059446_c27.html"/>
        </navPoint>
        <navPoint id="toc36" playOrder="36">
            <navLabel>
                <text>25 - DER SCHLÄCHTER</text>
            </navLabel>
            <content src="e9783641059446_c28.html"/>
        </navPoint>
        <navPoint id="toc37" playOrder="37">
            <navLabel>
                <text>26 - STILLE</text>
            </navLabel>
            <content src="e9783641059446_c29.html"/>
        </navPoint>
        <navPoint id="toc38" playOrder="38">
            <navLabel>
                <text>27 - KLUFTDIENST</text>
            </navLabel>
            <content src="e9783641059446_c30.html"/>
        </navPoint>
        <navPoint id="toc39" playOrder="39">
            <navLabel>
                <text>28 - ENTSCHEIDUNG</text>
            </navLabel>
            <content src="e9783641059446_c31.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc40" playOrder="40">
        <navLabel>
            <text>ZWISCHENSPIELE</text>
        </navLabel>
        <content src="e9783641059446_p04.html"/>
        <navPoint id="toc41" playOrder="41">
            <navLabel>
                <text>Z-4 - RYSN</text>
            </navLabel>
            <content src="e9783641059446_c32.html"/>
        </navPoint>
        <navPoint id="toc42" playOrder="42">
            <navLabel>
                <text>Z-5 - DER SAMMLER AXIES</text>
            </navLabel>
            <content src="e9783641059446_c33.html"/>
        </navPoint>
        <navPoint id="toc43" playOrder="43">
            <navLabel>
                <text>Z-6 - EIN KUNSTWERK</text>
            </navLabel>
            <content src="e9783641059446_c34.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc44" playOrder="44">
        <navLabel>
            <text>DRITTER TEIL - Sterben</text>
        </navLabel>
        <content src="e9783641059446_p05.html"/>
        <navPoint id="toc45" playOrder="45">
            <navLabel>
                <text>29 - IRRMASSUNG</text>
            </navLabel>
            <content src="e9783641059446_c35.html"/>
        </navPoint>
        <navPoint id="toc46" playOrder="46">
            <navLabel>
                <text>30 - UNSICHTBARE FINSTERNIS</text>
            </navLabel>
            <content src="e9783641059446_c36.html"/>
        </navPoint>
        <navPoint id="toc47" playOrder="47">
            <navLabel>
                <text>31 - UNTER DER HAUT</text>
            </navLabel>
            <content src="e9783641059446_c37.html"/>
        </navPoint>
        <navPoint id="toc48" playOrder="48">
            <navLabel>
                <text>32 - SEITENTRAGEN</text>
            </navLabel>
            <content src="e9783641059446_c38.html"/>
        </navPoint>
        <navPoint id="toc49" playOrder="49">
            <navLabel>
                <text>33 - CYMATIK</text>
            </navLabel>
            <content src="e9783641059446_c39.html"/>
        </navPoint>
        <navPoint id="toc50" playOrder="50">
            <navLabel>
                <text>34 - STURMWAND</text>
            </navLabel>
            <content src="e9783641059446_c40.html"/>
        </navPoint>
        <navPoint id="toc51" playOrder="51">
            <navLabel>
                <text>35 - EIN LICHT ZU SEHEN</text>
            </navLabel>
            <content src="e9783641059446_c41.html"/>
        </navPoint>
        <navPoint id="toc52" playOrder="52">
            <navLabel>
                <text>36 - DIE LEKTION</text>
            </navLabel>
            <content src="e9783641059446_c42.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc53" playOrder="53">
        <navLabel>
            <text>SCHLUSSBEMERKUNG</text>
        </navLabel>
        <content src="e9783641059446_bm01.html"/>
    </navPoint>
    <navPoint id="toc54" playOrder="54">
        <navLabel>
            <text>ARS ARCANUM</text>
        </navLabel>
        <content src="e9783641059446_bm02.html"/>
    </navPoint>
    <navPoint id="toc55" playOrder="55">
        <navLabel>
            <text>DANKSAGUNG</text>
        </navLabel>
        <content src="e9783641059446_ack01.html"/>
    </navPoint>
    <navPoint id="toc56" playOrder="56">
        <navLabel>
            <text>Die Sturmlicht-Chroniken werden fortgesetzt in:</text>
        </navLabel>
        <content src="e9783641059446_tea01.html"/>
    </navPoint>
    <navPoint id="toc57" playOrder="57">
        <navLabel>
            <text>Copyright</text>
        </navLabel>
        <content src="e9783641059446_cop01.html"/>
    </navPoint>
</navMap>
</ncx>

I want to get all the HTML files referenced by elements like this one:

<content src="e9783641059446_cop01.html"/>

At the moment I am trying this approach:

$ncx = simplexml_load_file($file);
$items = $ncx->navMap->children();
foreach ($items as $it) {
    echo $it->content['src'];
}

The problem is that, as you may have noticed, the content nodes are not all at the same depth. Does anyone know how to solve this?


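I don't quite understand your question. Do you want to get all the tags in the file e9783641059446_cop01.html? - Mohammad
I want to get every <content /> tag so that I can read its src="" attribute. The tag I am looking for always looks like this: <content src="{some HTML file}"/> - Curunir
I tried it this way, but the array is empty: $ncx = simplexml_load_file($file); $items = $ncx->xpath("//content"); echo count($items); It prints 0. - Curunir
Ah, that's because it has a namespace. - Gordon
@Gordon, how do I include the namespace in the query? - Curunir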
2 Answers

1

The XML has a namespace. Try

$ncx = simplexml_load_file('test.xml');
$ncx->registerXPathNamespace('x', 'http://www.daisy.org/z3986/2005/ncx/');
foreach ($ncx->xpath('//x:content/@src') as $src) {
    echo $src, PHP_EOL;
}
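
To see why the unprefixed query from the comments finds nothing while the prefixed one works, here is a minimal self-contained sketch. The inline sample document and its file names are made up for illustration, and the prefix x is an arbitrary choice; only the namespace URI has to match the one declared on the ncx element.

```php
<?php
// Hypothetical, trimmed-down NCX sample (not the asker's real file).
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
<navMap>
    <navPoint id="toc1" playOrder="1">
        <navLabel><text>Part</text></navLabel>
        <content src="p01.html"/>
        <navPoint id="toc2" playOrder="2">
            <navLabel><text>Chapter</text></navLabel>
            <content src="c01.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc3" playOrder="3">
        <navLabel><text>Copyright</text></navLabel>
        <content src="cop01.html"/>
    </navPoint>
</navMap>
</ncx>
XML;

$ncx = simplexml_load_string($xml);

// In XPath 1.0 an unprefixed name means "no namespace", so nothing matches.
$plain = $ncx->xpath('//content');

// Bind a prefix of our choosing to the document's default namespace.
$ncx->registerXPathNamespace('x', 'http://www.daisy.org/z3986/2005/ncx/');

// '//' searches at every depth, so nested navPoint entries are included.
$sources = [];
foreach ($ncx->xpath('//x:content/@src') as $src) {
    $sources[] = (string) $src;
}

echo count($plain), PHP_EOL;
print_r($sources);
```

Because // matches descendants at any level, this form returns the nested chapter entries as well as the top-level ones, in document order.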

Without XPath:

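// The 4th argument is the namespace URI; the 5th (false) says it is a URI, not a prefix
$ncx = simplexml_load_file('test.xml', "SimpleXmlElement", 0, 'http://www.daisy.org/z3986/2005/ncx/', false);
// Note: this loop only visits the top-level navPoint elements, not nested ones
foreach ($ncx->navMap->navPoint as $np) {
    echo $np->content->attributes()->src, PHP_EOL;
}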
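
Since the navPoint elements in the question are nested, a plain property loop needs to recurse to reach every content node. The following is only a sketch of one way to do that: collectSources() is a hypothetical helper name, and the inline sample document is invented for the example.

```php
<?php
// Hypothetical, trimmed-down NCX sample with one nested navPoint.
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
<navMap>
    <navPoint id="toc1" playOrder="1">
        <navLabel><text>Part</text></navLabel>
        <content src="p01.html"/>
        <navPoint id="toc2" playOrder="2">
            <navLabel><text>Chapter</text></navLabel>
            <content src="c01.html"/>
        </navPoint>
    </navPoint>
    <navPoint id="toc3" playOrder="3">
        <navLabel><text>Copyright</text></navLabel>
        <content src="cop01.html"/>
    </navPoint>
</navMap>
</ncx>
XML;

/**
 * Hypothetical helper: collect the src attribute of the <content> child of
 * every <navPoint> below $parent, descending into nested navPoints.
 */
function collectSources(SimpleXMLElement $parent): array
{
    $sources = [];
    foreach ($parent->navPoint as $np) {
        $sources[] = (string) $np->content['src'];
        // Recurse into the navPoint children of this navPoint, if any.
        $sources = array_merge($sources, collectSources($np));
    }
    return $sources;
}

$ncx = simplexml_load_string(
    $xml,
    SimpleXMLElement::class,
    0,
    'http://www.daisy.org/z3986/2005/ncx/',
    false
);
$sources = collectSources($ncx->navMap);
print_r($sources);
```

The recursion also makes it straightforward to keep the hierarchy (for example by returning nested arrays instead of merging), which the flat XPath result does not give you directly.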

0

"the content nodes are not at the same depth level"

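Alternatively, you can simply use the DOMDocument class to find the target tags. Note that getElementsByTagName() returns all elements with the given tag name, regardless of depth, and getAttribute() returns the value of an attribute.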

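// load() is an instance method, so create the document first
$dom = new DOMDocument();
$dom->load($file);
$contents = $dom->getElementsByTagName("content");
// Print the src attribute of every <content> element, at any depth
foreach ($contents as $content) {
    echo $content->getAttribute("src"), PHP_EOL;
}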
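
If the file might also contain elements from other namespaces, a namespace-aware lookup is a slightly stricter variant of the same idea. This is a sketch under that assumption, using getElementsByTagNameNS() with the namespace URI from the question; the inline sample document is made up for illustration.

```php
<?php
// Hypothetical, trimmed-down NCX sample (not the asker's real file).
$xml = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">
<navMap>
    <navPoint id="toc1" playOrder="1">
        <navLabel><text>Part</text></navLabel>
        <content src="p01.html"/>
        <navPoint id="toc2" playOrder="2">
            <navLabel><text>Chapter</text></navLabel>
            <content src="c01.html"/>
        </navPoint>
    </navPoint>
</navMap>
</ncx>
XML;

$ns = 'http://www.daisy.org/z3986/2005/ncx/';

$dom = new DOMDocument();
$dom->loadXML($xml);

// Match only <content> elements that belong to the NCX namespace.
$sources = [];
foreach ($dom->getElementsByTagNameNS($ns, 'content') as $content) {
    $sources[] = $content->getAttribute('src');
}

print_r($sources);
```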
