Reading nested RDF triples with Jena


I have the following query:

CONSTRUCT { ?highValForeignTran ?hvFTPred ?hvFTObj . }
WHERE { ?highValForeignTran vocab:accounttransactions_transactionCurrency "USD" .
?highValForeignTran vocab:accounttransactions_transactionValue ?tranValue .
?highValForeignTran vocab:accounttransactions_transactionDate ?tranDate .
?highValForeignTran ?hvFTPred ?hvFTObj .
FILTER ( ?tranValue > 10000) .
FILTER (  ?tranDate >= "2013-11-23"^^xsd:date  && ?tranDate <= "2013-11-23"^^xsd:date) .
}

which returns this result:

<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:vocab="http://localhost:2020/resource/vocab/"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xmlns:db="http://localhost:2020/resource/"
xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:map="http://localhost:2020/resource/#">
<vocab:accounttransactions rdf:about="http://localhost:2020/resource/accounttransactions/1">
<vocab:accounttransactions_id rdf:datatype="http://www.w3.org/2001/XMLSchema#integer"
>1</vocab:accounttransactions_id>
<vocab:accounttransactions_transactionCurrency>USD</vocab:accounttransactions_transactionCurrency>
<vocab:accounttransactions_originAccountNumber>DB48939239</vocab:accounttransactions_originAccountNumber>
<vocab:accounttransactions_transactionType>Cr</vocab:accounttransactions_transactionType>
    <vocab:accounttransactions_transactionDate rdf:datatype="http://www.w3.org/2001/XMLSchema#date"
>2013-11-23</vocab:accounttransactions_transactionDate>
<vocab:accounttransactions_destinationAccountId rdf:resource="http://localhost:2020/resource/bankaccounts/1"/>
<vocab:accounttransactions_transactionValue rdf:datatype=
"http://www.w3.org/2001/XMLSchema#decimal">12000</vocab:accounttransactions_transactionValue>
<rdfs:label>accounttransactions #1</rdfs:label>
<vocab:accounttransactions_destinationAccountNumber>47321896544567</vocab:accounttransactions_destinationAccountNumber>
</vocab:accounttransactions>
</rdf:RDF>

When I try to parse it with Jena, I get only one triple, the one that represents the outer accounttransactions element:
{"http://localhost:2020/resource/accounttransactions/1":
 {"subject":"http://localhost:2020/resource/accounttransactions/1",
  "predicate":"http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
  "object":"http://localhost:2020/resource/vocab/accounttransactions"}
}

I don't know why the other triples are nested inside it, but I really need to be able to parse them and send them out as JSON. Here is my code:

try {
    Model result = qexec.execConstruct();

    JSONObject jsonShell = new JSONObject();

    // Iterate over the statements of the CONSTRUCT result.
    StmtIterator stmtIter = result.listStatements();
    while ( stmtIter.hasNext() ) {
        Statement stmt = stmtIter.nextStatement();
        JSONObject innerJson = new JSONObject();
        innerJson.put("subject", stmt.getSubject().visitWith(rdfVisitor));
        innerJson.put("predicate", stmt.getPredicate().visitWith(rdfVisitor));
        innerJson.put("object", stmt.getObject().visitWith(rdfVisitor));

        jsonShell.put(String.valueOf(stmt.getSubject().visitWith(rdfVisitor)), innerJson);
    }
    System.out.println(jsonShell.toString());
}
finally {
    qexec.close();
}

RDFVisitor rdfVisitor = new RDFVisitor() {

    @Override
    public Object visitURI(Resource r, String uri) {
        return uri;
    }

    @Override
    public Object visitLiteral(Literal l) {
        return l.getLexicalForm();
    }

    @Override
    public Object visitBlank(Resource r, AnonId id) {
        return id.getLabelString();
    }
};

I wondered whether Statement.getProperty() would solve the problem, but I couldn't find a way to create a Property instance.


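The other triples aren't "nested inside". That's just one way of writing RDF/XML. In Turtle, the RDF looks like http://pastebin.com/yU80Kqf8. You're getting good data. - Joshua Taylor
I haven't used JSONObjects before, but if I call jsonShell.put("X", innerJson), does it add many "X" elements to the JSON, or does it just overwrite the previous one? If all the triples in this model have the same subject, and jsonShell.put(<subject-uri>, innerJson) overwrites an existing entry, you will end up with only one. - Joshua Taylor
It would also help to mention which JSON library you're using. I have a project with Jena loaded, but I get no autocompletion for JSONObject... - Joshua Taylor
@JoshuaTaylor: It's Douglas Crockford's JSON-java library. You're probably right that put overwrites. I just assumed my code couldn't reach the triples because they were nested. My mistake. I'll verify this first thing tomorrow when I'm back on the project. Thanks a lot. - Nikhil Silveira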
1 Answer


The problem in your code (and in the resulting JSON)

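All the triples in your data have the same subject (which is fine). This may be easier to see in the more readable Turtle format, or in N-Triples with one triple per line; I've included both at the end of this answer. Because every triple has the same subject, I suspect the problem is this line: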

jsonShell.put(String.valueOf(stmt.getSubject().visitWith(rdfVisitor)), innerJson);
//            |-----------------------------------------------------|
//                         same every time

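Each iteration overwrites the result of the previous one because, as noted, the key is the same every time. If you add some print statements inside the loop, I expect you'll see that you are in fact iterating over every triple in the model.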
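To make the overwrite concrete, here is a minimal sketch that reproduces the effect without Jena or JSON-java. It assumes a plain java.util.Map as a stand-in for JSONObject (whose put also replaces the value under an existing key) and String arrays as stand-ins for Jena statements; the class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class OverwriteDemo {
    /** Mimics the loop in the question: one entry per triple, keyed by the subject. */
    static Map<String, String[]> keyBySubject(List<String[]> triples) {
        Map<String, String[]> shell = new LinkedHashMap<>();
        for (String[] t : triples) {
            // t[0] is the subject; an identical key replaces the earlier entry.
            shell.put(t[0], t);
        }
        return shell;
    }

    public static void main(String[] args) {
        String s = "http://localhost:2020/resource/accounttransactions/1";
        String vocab = "http://localhost:2020/resource/vocab/";
        List<String[]> triples = Arrays.asList(
            new String[] { s, vocab + "accounttransactions_transactionCurrency", "USD" },
            new String[] { s, vocab + "accounttransactions_transactionType", "Cr" },
            new String[] { s, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
                           vocab + "accounttransactions" });

        Map<String, String[]> shell = keyBySubject(triples);
        // prints "3 triples visited, 1 entry kept"
        System.out.println(triples.size() + " triples visited, " + shell.size() + " entry kept");
    }
}
```

Only the triple that happens to be visited last survives, which would explain why the output contains a single triple even though the loop visits all of them.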

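I can't tell you what kind of key you should use there, because I'm not sure what purpose the key serves, given that the triple's subject is already encoded in the output. It seems you want some kind of statement ID, so you could use the string representation of the statement, or something else.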
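One possible alternative to a per-statement key, offered only as a sketch and not as something this answer prescribes, is to keep the subject as the key but nest a predicate-to-objects map under it, so nothing is overwritten. Plain Java collections stand in for JSONObject and for Jena statements here, and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class GroupBySubject {
    /** Groups {subject, predicate, object} arrays as subject -> predicate -> list of objects. */
    static Map<String, Map<String, List<String>>> group(List<String[]> triples) {
        Map<String, Map<String, List<String>>> bySubject = new LinkedHashMap<>();
        for (String[] t : triples) {
            bySubject
                .computeIfAbsent(t[0], k -> new LinkedHashMap<>()) // one map per subject
                .computeIfAbsent(t[1], k -> new ArrayList<>())     // one list per predicate
                .add(t[2]);                                        // keep every object
        }
        return bySubject;
    }

    public static void main(String[] args) {
        String s = "http://localhost:2020/resource/accounttransactions/1";
        String vocab = "http://localhost:2020/resource/vocab/";
        List<String[]> triples = Arrays.asList(
            new String[] { s, vocab + "accounttransactions_transactionCurrency", "USD" },
            new String[] { s, vocab + "accounttransactions_transactionType", "Cr" });
        System.out.println(group(triples));
    }
}
```

With the real libraries, the same shape could presumably be built from nested JSONObject and JSONArray instances. The list per predicate matters because a subject may have several values for the same property.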

An alternative using Jena and RDF/JSON

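I'd point out that Jena can serialize a model as RDF/JSON; if that's what you need, it may be an easier way to get JSON. The structure will differ from the one you're generating, of course, but that may not be a big problem. For example, where /jsonoutput.ttl is a local copy of your data, the following code writes JSON: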

import com.hp.hpl.jena.rdf.model.Model;
import com.hp.hpl.jena.rdf.model.ModelFactory;

public class JSONObjectTest {
    public static void main(String[] args) {
        Model model = ModelFactory.createDefaultModel();
        model.read( JSONObjectTest.class.getResourceAsStream( "/jsonoutput.ttl"), null, "N3" );
        model.write( System.out, "RDF/JSON" );
    }
}

The resulting JSON is:
{ 
  "http://localhost:2020/resource/accounttransactions/1" : { 
    "http://localhost:2020/resource/vocab/accounttransactions_transactionDate" : [ { 
      "type" : "literal" ,
      "value" : "2013-11-23" ,
      "datatype" : "http://www.w3.org/2001/XMLSchema#date"
    }
     ] ,
    "http://localhost:2020/resource/vocab/accounttransactions_transactionValue" : [ { 
      "type" : "literal" ,
      "value" : "12000" ,
      "datatype" : "http://www.w3.org/2001/XMLSchema#decimal"
    }
     ] ,
    "http://localhost:2020/resource/vocab/accounttransactions_id" : [ { 
      "type" : "literal" ,
      "value" : "1" ,
      "datatype" : "http://www.w3.org/2001/XMLSchema#integer"
    }
     ] ,
    "http://localhost:2020/resource/vocab/accounttransactions_destinationAccountNumber" : [ { 
      "type" : "literal" ,
      "value" : "47321896544567"
    }
     ] ,
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#type" : [ { 
      "type" : "uri" ,
      "value" : "http://localhost:2020/resource/vocab/accounttransactions"
    }
     ] ,
    "http://localhost:2020/resource/vocab/accounttransactions_transactionCurrency" : [ { 
      "type" : "literal" ,
      "value" : "USD"
    }
     ] ,
    "http://www.w3.org/2000/01/rdf-schema#label" : [ { 
      "type" : "literal" ,
      "value" : "accounttransactions #1"
    }
     ] ,
    "http://localhost:2020/resource/vocab/accounttransactions_transactionType" : [ { 
      "type" : "literal" ,
      "value" : "Cr"
    }
     ] ,
    "http://localhost:2020/resource/vocab/accounttransactions_destinationAccountId" : [ { 
      "type" : "uri" ,
      "value" : "http://localhost:2020/resource/bankaccounts/1"
    }
     ] ,
    "http://localhost:2020/resource/vocab/accounttransactions_originAccountNumber" : [ { 
      "type" : "literal" ,
      "value" : "DB48939239"
    }
     ]
  }
}

Your data in other formats

Data in Turtle / N3

@prefix db:    <http://localhost:2020/resource/> .
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix owl:   <http://www.w3.org/2002/07/owl#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix map:   <http://localhost:2020/resource/#> .
@prefix rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix vocab: <http://localhost:2020/resource/vocab/> .

<http://localhost:2020/resource/accounttransactions/1>
        a                             vocab:accounttransactions ;
        rdfs:label                    "accounttransactions #1" ;
        vocab:accounttransactions_destinationAccountId
                <http://localhost:2020/resource/bankaccounts/1> ;
        vocab:accounttransactions_destinationAccountNumber
                "47321896544567" ;
        vocab:accounttransactions_id  1 ;
        vocab:accounttransactions_originAccountNumber
                "DB48939239" ;
        vocab:accounttransactions_transactionCurrency
                "USD" ;
        vocab:accounttransactions_transactionDate
                "2013-11-23"^^xsd:date ;
        vocab:accounttransactions_transactionType
                "Cr" ;
        vocab:accounttransactions_transactionValue
                "12000"^^xsd:decimal .

Data in N-Triples

<http://localhost:2020/resource/accounttransactions/1> <http://localhost:2020/resource/vocab/accounttransactions_id> "1"^^<http://www.w3.org/2001/XMLSchema#integer> .
<http://localhost:2020/resource/accounttransactions/1> <http://localhost:2020/resource/vocab/accounttransactions_transactionCurrency> "USD" .
<http://localhost:2020/resource/accounttransactions/1> <http://localhost:2020/resource/vocab/accounttransactions_originAccountNumber> "DB48939239" .
<http://localhost:2020/resource/accounttransactions/1> <http://localhost:2020/resource/vocab/accounttransactions_transactionType> "Cr" .
<http://localhost:2020/resource/accounttransactions/1> <http://localhost:2020/resource/vocab/accounttransactions_transactionDate> "2013-11-23"^^<http://www.w3.org/2001/XMLSchema#date> .
<http://localhost:2020/resource/accounttransactions/1> <http://localhost:2020/resource/vocab/accounttransactions_destinationAccountId> <http://localhost:2020/resource/bankaccounts/1> .
<http://localhost:2020/resource/accounttransactions/1> <http://localhost:2020/resource/vocab/accounttransactions_transactionValue> "12000"^^<http://www.w3.org/2001/XMLSchema#decimal> .
<http://localhost:2020/resource/accounttransactions/1> <http://www.w3.org/2000/01/rdf-schema#label> "accounttransactions #1" .
<http://localhost:2020/resource/accounttransactions/1> <http://localhost:2020/resource/vocab/accounttransactions_destinationAccountNumber> "47321896544567" .
<http://localhost:2020/resource/accounttransactions/1> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://localhost:2020/resource/vocab/accounttransactions> .

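You're right: because the subjects are the same and a JSON object is an associative array, every put overwrites the previous entry for that key (the subject). Thanks for the RDF/JSON suggestion. The goal is to present the graph as JSON, so RDFDataMgr.write(baos, result.getGraph(), Lang.RDFJSON); works for me as well. - Nikhil Silveira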
