Can I use spaCy in Python to find noun phrases that have specific neighbors? I want to get the noun phrases from my text that have a verb before and after them.
Analyze the dependency parse tree and look at the POS of the adjacent tokens.
>>> import spacy
>>> nlp = spacy.load('en')
>>> sent = u'run python program run, to make this work'
>>> parsed = nlp(sent)
>>> list(parsed.noun_chunks)
[python program]
>>> for noun_phrase in list(parsed.noun_chunks):
...     noun_phrase.merge(noun_phrase.root.tag_, noun_phrase.root.lemma_, noun_phrase.root.ent_type_)
...
python program
>>> [(token.text,token.pos_) for token in parsed]
[(u'run', u'VERB'), (u'python program', u'NOUN'), (u'run', u'VERB'), (u',', u'PUNCT'), (u'to', u'PART'), (u'make', u'VERB'), (u'this', u'DET'), (u'work', u'NOUN')]
By analyzing the POS of the adjacent tokens, you can get the noun phrases you want:
(u'run', u'VERB'), (u'python program', u'NOUN'), (u'run', u'VERB')
What does this line tell us about "python program"? – DhruvPathak

From https://spacy.io/usage/linguistic-features#dependency-parse
You can use noun chunks. Noun chunks are "base noun phrases" – flat phrases that have a noun as their head. You can think of noun chunks as a noun plus the words describing the noun – for example, "the lavish green grass" or "the world's largest tech fund". To get the noun chunks in a document, simply iterate over Doc.noun_chunks.
In:
import spacy
nlp = spacy.load('en_core_web_sm')
doc = nlp(u"Autonomous cars shift insurance liability toward manufacturers")
for chunk in doc.noun_chunks:
    print(chunk.text)
Out:
Autonomous cars
insurance liability
manufacturers
import spacy
nlp = spacy.load('en_core_web_sm')
nlp.add_pipe(nlp.create_pipe('merge_noun_chunks'))
doc = nlp(u"Autonomous cars shift insurance liability toward manufacturers")
for token in doc:
    print(token.text)
The output will be:
Autonomous cars
shift
insurance liability
toward
manufacturers