斯坦福解析器(http://nlp.stanford.edu/software/lex-parser.shtml)生成上下文无关短语结构树,如下所示。有没有最好的方法来提取像所有名词短语(NP)和动词短语(NP)在树中一样的东西?是否有任何Python(或Java)库可以让我读取这些结构?谢谢。
(ROOT
(S
(S
(NP
(NP (DT The) (JJS strongest) (NN rain))
(VP
(ADVP (RB ever))
(VBN recorded)
(PP (IN in)
(NP (NNP India)))))
(VP
(VP (VBD shut)
(PRT (RP down))
(NP
(NP (DT the) (JJ financial) (NN hub))
(PP (IN of)
(NP (NNP Mumbai)))))
(, ,)
(VP (VBD snapped)
(NP (NN communication) (NNS lines)))
(, ,)
(VP (VBD closed)
(NP (NNS airports)))
(CC and)
(VP (VBD forced)
(NP
(NP (NNS thousands))
(PP (IN of)
(NP (NNS people))))
(S
(VP (TO to)
(VP
(VP (VB sleep)
(PP (IN in)
(NP (PRP$ their) (NNS offices))))
(CC or)
(VP (VB walk)
(NP (NN home))
(PP (IN during)
(NP (DT the) (NN night))))))))))
(, ,)
(NP (NNS officials))
(VP (VBD said)
(NP-TMP (NN today)))
(. .)))