Reading a Turtle/N3 RDF File with Python

Question

python debugging semantic-web rdflib turtle-rdf

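I'm trying to encode some botanical data in Turtle format and read it from Python using RDFLib. However, I'm having trouble, and I'm not sure whether it's because my Turtle is malformed or because I'm misusing RDFLib.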

My test data is:

@PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@PREFIX p: <http://www.myplantdomain.com/plant/description> .
p:description a rdfs:Property .
p:name a rdfs:Property .
p:language a rdfs:Property .
p:value a rdfs:Property .
p:gender a rdfs:Property .
p:inforescence a rdfs:Property .
p:color a rdfs:Property .
p:sense a rdfs:Property .
p:type a rdfs:Property .
p:fruit a rdfs:Property .
p:flower a rdfs:Property .
p:dataSource a rdfs:Property .
p:degree a rdfs:Property .
p:date a rdfs:Property .
p:person a rdfs:Property .
p:c2a7b9a3-c54a-41f5-a3b2-155351b3590f
    p:description [
        p:name [
            p:kingdom "Plantae" ;
            p:division "Pinophyta" ;
            p:class "Pinopsida" ;
            p:order "Pinales" ;
            p:family "Pinaceae" ;
            p:genus "Abies" ;
            p:species "A. alba" ;
            p:language "latin" ;
            p:given_by [
                p:person p:source/Philip_Miller ;
                p:start_date "1923-1-2"^^<http://www.w3.org/2001/XMLSchema#date>
            ]
        ] ;
        p:name [
            p:language "english" ;
            p:value "silver fir"
        ] ;
        p:flower [
            p:gender "male"@en ;
            p:inflorescence "catkin"@en ;
            p:color "brown"@en ;
            p:color "yellow"@en ;
            p:sense "straight"@en
        ] ;
        p:flower [
            p:gender "female"@en ;
            p:inflorescence "catkin"@en ;
            p:color "pink"@en ;
            p:color "yellow"@en ;
            p:sense "straight"@en
        ] ;
        p:fruit [
            p:type "cone"@en ;
            p:color "brown"@en
        ]
    ] .

My Python code is:

import rdflib
g = rdflib.Graph()
#result = g.parse('trees.ttl') 
#result = g.parse('trees.ttl', format='ttl')
result = g.parse('trees.ttl', format='n3')
print(len(g))
for stmt in g:
    print(stmt)

This gives me the error:

ValueError: Found @PREFIX when expecting a http://www.w3.org/2000/10/swap/grammar/n3#document . todoStack=[['http://www.w3.org/2000/10/swap/grammar/n3#document', []]]

I've tried varying the parse() parameters, but everything gives me an error. I've found almost no examples of how to parse Turtle. What am I doing wrong?

- Cerin

1 Answer

- Alex Martelli · Accepted Answer

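I think the first problem is the uppercase PREFIX: if you lowercase those directives, parsing gets past that point. I'm not sure whether it's a bug in rdflib or in the Turtle .ttl, but the Turtle Validator online demo seems to agree that the problem is in the .ttl (it says "Validation failed: The @PREFIX directive is not supported, line 1 col 0.", and that error also goes away if you lowercase them).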
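
The lowercase fix described above could be applied programmatically before parsing. The following is a minimal sketch, not a definitive implementation: the helper names `lowercase_directives` and `load_graph` are hypothetical, and the sketch assumes the directives start at the beginning of a line, as they do in the posted file. The rdflib import sits inside `load_graph` so that the text helper can be used on its own.

```python
import re

# Matches "@prefix" or "@base" in any letter case at the start of a line.
_DIRECTIVE = re.compile(r"^([ \t]*)@(prefix|base)\b", re.IGNORECASE | re.MULTILINE)


def lowercase_directives(turtle_text):
    """Return turtle_text with its @PREFIX/@BASE keywords lowercased."""
    return _DIRECTIVE.sub(
        lambda m: m.group(1) + "@" + m.group(2).lower(), turtle_text
    )


def load_graph(path):
    """Read a Turtle file, normalize its directives, and parse it with rdflib.

    Hypothetical helper: requires the third-party rdflib package.
    """
    import rdflib

    with open(path, encoding="utf-8") as handle:
        text = lowercase_directives(handle.read())
    graph = rdflib.Graph()
    graph.parse(data=text, format="turtle")
    return graph
```

Only the keyword is changed; prefix labels, IRIs, and any "@PREFIX" text that appears elsewhere in a line are left alone. With the file from the question, `load_graph('trees.ttl')` would still fail until the remaining syntax errors are fixed.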

Once you're past that hurdle, both parsers dislike the part around p:given_by [: rdflib says "Bad syntax (']' expected) at ^ in:"..., while Turtle Validator says

Validation failed: Expecting a period, semicolon, comma, close-bracket, or close-brace but found '/', line 31 col 33.

So it specifically dislikes the p:source/Philip_Miller part.
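
One way around the '/' is to write such a term as a full IRI in angle brackets instead of as a prefixed name. The sketch below illustrates the expansion. The function name `expand_pname` is hypothetical, and the namespace ending in '/' is an assumption chosen for illustration (the p: prefix in the question does not end in '/' or '#').

```python
def expand_pname(pname, prefixes):
    """Expand 'prefix:local' into a full IRI in Turtle's <...> form."""
    prefix, sep, local = pname.partition(":")
    if not sep:
        raise ValueError("not a prefixed name: %r" % pname)
    if prefix not in prefixes:
        raise KeyError("unknown prefix: %r" % prefix)
    # Prefix expansion is plain concatenation of the namespace and local part.
    return "<" + prefixes[prefix] + local + ">"


# Assumed namespace, for illustration only; not the one declared in the question.
prefixes = {"p": "http://www.myplantdomain.com/plant/"}
print(expand_pname("p:source/Philip_Miller", prefixes))
```

Because the expansion is plain concatenation, the namespace IRI usually needs to end in a separator such as '/' or '#' for the resulting IRIs to come out as intended.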

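From these two issues (and who knows whether there are more!), I think you can conclude that this N3 source (the .ttl file you posted) is broken, and turn your attention to whatever system produced the file in the first place, and why it produces output that is broken in so many different ways.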
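
Before handing a generated file to a parser, a rough pre-check for the two problems above can help narrow down what the producing system gets wrong. This is a heuristic sketch using only the standard library, not a Turtle parser: the name `lint_turtle` is hypothetical, and the sketch assumes string literals and IRIs do not span lines.

```python
import re

# String literals and <IRIs>; these are blanked out so their contents are ignored.
_SKIP = re.compile(r'"(?:[^"\\]|\\.)*"|<[^<>\s]*>')
# An @prefix/@base keyword in any letter case at the start of a line.
_DIRECTIVE = re.compile(r"^\s*@(prefix|base)\b", re.IGNORECASE)
# A prefixed name whose local part contains an unescaped '/'.
_SLASH_PNAME = re.compile(r"(?<![\w:-])[A-Za-z][\w-]*:[\w.-]*/[\w./-]*")


def lint_turtle(text):
    """Return (line_number, message) pairs for the two problems described above."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        keyword = _DIRECTIVE.match(line)
        if keyword and keyword.group(1) != keyword.group(1).lower():
            problems.append(
                (number, "directive @%s should be lowercase" % keyword.group(1))
            )
        # Blank out literals and IRIs, then drop any trailing comment.
        code = _SKIP.sub(" ", line).split("#", 1)[0]
        for name in _SLASH_PNAME.findall(code):
            problems.append(
                (number, "unescaped '/' in prefixed name %s" % name)
            )
    return problems
```

Calling `lint_turtle(open('trees.ttl').read())` returns a list that can be printed line by line. An empty list does not mean the file is valid, only that neither of these two patterns was found.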