Finding phrase heads with the Stanford Parser (CoreNLP)

Question

8

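I am using Stanford CoreNLP 2013 to find the heads of phrases. I saw this thread, but the answer was not clear to me and I could not add a comment to continue it, so I apologize for the duplicate. What I currently have is the parse tree of a sentence, produced by Stanford CoreNLP (I have also tried the CoNLL format that Stanford CoreNLP generates). What I need is exactly the head of each noun phrase, and I do not know how to extract it from the dependencies and the parse tree.

What I do know is that if I have nsubj(x, y), then y is the head of the subject; if I have dobj(x, y), then y is the head of the direct object; and if I have iobj(x, y), then y is the head of the indirect object. However, I am not sure that this is the right way to find all phrase heads. If it is, which rules should I add to get the heads of all noun phrases? It may be worth mentioning that I need the noun phrase heads in Java code.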

- Alice1989

2 Answers

4

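You can extract the phrase you are interested in as an object of class Tree. You can then call the determineHead(Tree t) method of any class that implements the HeadFinder interface.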

- Chaitanya Shivade

- TheGT · Accepted Answer

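Since I cannot comment on Chaitanya's answer, I am adding more detail to it here.

The Stanford CoreNLP suite implements the Collins head-finding heuristics and a semantic head-finding heuristic in the following classes: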

CollinsHeadFinder
ModCollinsHeadFinder
SemanticHeadFinder

You just need to instantiate one of them and do the following.

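HeadFinder headFinder = new CollinsHeadFinder(); // or ModCollinsHeadFinder / SemanticHeadFinder
Tree tree = sentence.get(TreeCoreAnnotations.TreeAnnotation.class);
// determineHead returns the head child of the given node; pennPrint() writes it to System.out
headFinder.determineHead(tree).pennPrint();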

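You can iterate over the nodes of the tree and determine the head word wherever you need it.

PS: My answer is based on the StanfordCoreNLP suite released as of 20140104.

Here is a simple depth-first search that lets you extract the head words of all noun phrases in a sentence.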

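import java.util.List;
import edu.stanford.nlp.trees.HeadFinder;
import edu.stanford.nlp.trees.Tree;

// Depth-first traversal that prints every NP together with its head word.
// For the initial call, pass the root of the parse tree and null as its parent.
public static void dfs(Tree node, Tree parent, HeadFinder headFinder) {
    if (node == null || node.isLeaf()) {
        return;
    }
    // If the node is an NP, collect its terminal nodes to get the words in the NP.
    if ("NP".equals(node.value())) {
        System.out.println(" Noun Phrase is ");
        List<Tree> leaves = node.getLeaves();
        for (Tree leaf : leaves) {
            System.out.print(leaf.toString() + " ");
        }
        System.out.println();

        System.out.println(" Head string is ");
        System.out.println(node.headTerminal(headFinder, parent));
    }
    // Recurse into the children, passing the current node as their parent.
    for (Tree child : node.children()) {
        dfs(child, node, headFinder);
    }
}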
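
For intuition about what a head finder computes, here is a minimal, self-contained sketch of a Collins-style head rule for noun phrases. It does not use CoreNLP. The HeadSketch and Node classes, the sampleTree helper, and the single simplified rule (scan the children right to left for a noun tag, otherwise take the leftmost nested NP, otherwise the rightmost child) are all assumptions made for illustration. The actual CollinsHeadFinder applies a fuller table of rules per category, so treat this only as a rough picture of the idea.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class HeadSketch {

    /** Hypothetical minimal tree node; this is NOT CoreNLP's Tree class. */
    static final class Node {
        final String label;        // phrase category, POS tag, or the word itself for a leaf
        final List<Node> children;

        Node(String label, Node... children) {
            this.label = label;
            this.children = Arrays.asList(children);
        }

        boolean isLeaf() {
            return children.isEmpty();
        }
    }

    // Simplified set of noun tags (an assumption of this sketch).
    private static final Set<String> NOUN_TAGS =
            new HashSet<>(Arrays.asList("NN", "NNS", "NNP", "NNPS"));

    /** Picks the head child of a non-leaf node using one simplified rule. */
    static Node headChild(Node node) {
        List<Node> kids = node.children;
        if ("NP".equals(node.label)) {
            // Scan right to left for the first child carrying a noun tag.
            for (int i = kids.size() - 1; i >= 0; i--) {
                if (NOUN_TAGS.contains(kids.get(i).label)) {
                    return kids.get(i);
                }
            }
            // Otherwise take the leftmost nested NP, if there is one.
            for (Node kid : kids) {
                if ("NP".equals(kid.label)) {
                    return kid;
                }
            }
        }
        // Fallback for every other case: the rightmost child.
        return kids.get(kids.size() - 1);
    }

    /** Follows head children down to a leaf and returns the head word. */
    static String headWord(Node node) {
        Node current = node;
        while (!current.isLeaf()) {
            current = headChild(current);
        }
        return current.label;
    }

    /** Collects the head word of every NP, visiting nodes depth-first. */
    static List<String> npHeads(Node node) {
        List<String> heads = new ArrayList<>();
        collect(node, heads);
        return heads;
    }

    private static void collect(Node node, List<String> heads) {
        if (node.isLeaf()) {
            return;
        }
        if ("NP".equals(node.label)) {
            heads.add(headWord(node));
        }
        for (Node child : node.children) {
            collect(child, heads);
        }
    }

    /** (NP (NP (DT the) (NN head)) (PP (IN of) (NP (DT the) (NN phrase)))) */
    static Node sampleTree() {
        Node inner = new Node("NP",
                new Node("DT", new Node("the")),
                new Node("NN", new Node("head")));
        Node object = new Node("NP",
                new Node("DT", new Node("the")),
                new Node("NN", new Node("phrase")));
        Node pp = new Node("PP", new Node("IN", new Node("of")), object);
        return new Node("NP", inner, pp);
    }

    public static void main(String[] args) {
        System.out.println(npHeads(sampleTree()));
    }
}
```

For the sample tree, the outer NP has no noun-tagged child of its own, so the sketch descends into the leftmost nested NP and reports "head" for both the outer and the inner NP, then "phrase" for the NP inside the PP. Running main prints [head, head, phrase].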