Converting plural nouns to singular (NLP)

8

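I have a list of plural nouns, e.g. apples, oranges, and so on. I want to convert all of them to singular nouns. Is there a tool for this purpose? Preferably in Java or Python.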

2 Answers

10

There is, for example, a library called inflect.

Example:

import inflect

p = inflect.engine()

words = ["apples", "sheep", "oranges", "cats", "people", "dice", "pence"]

for word in words:
    # singular_noun() returns False when it does not treat the word as a plural
    print(f"The singular of {word} is {p.singular_noun(word)}")

Output:

The singular of apples is apple
The singular of sheep is sheep
The singular of oranges is orange
The singular of cats is cat
The singular of people is person
The singular of dice is die
The singular of pence is pence

I tried this module. Overall it works, but it is not perfect. For example, if you pass 'as', 'does', or 'means', it returns 'a', 'doe', and 'mean', which is probably not what we want. - StayFoolish
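One way to work around the cases mentioned in the comment above is to wrap the singularizer with a small guard. The sketch below is only an illustration, not part of inflect: `make_safe_singularizer` and `keep_as_is` are hypothetical names, and the protected-word list is something you would have to maintain yourself. It also falls back to the input word when the wrapped function returns a falsy value, which is what inflect's `singular_noun` does for words it does not treat as plurals.

```python
def make_safe_singularizer(singularize, keep_as_is=()):
    """Wrap a singularizer with a protected-word list and a safe fallback.

    `singularize` is any callable that returns the singular form as a string,
    or a falsy value (inflect's singular_noun returns False) when it does not
    treat the input as a plural noun.
    """
    # Hypothetical stop list: words that look plural but must be left alone.
    protected = {w.lower() for w in keep_as_is}

    def to_singular(word):
        if word.lower() in protected:
            return word
        result = singularize(word)
        # Fall back to the original word instead of returning False.
        return result if result else word

    return to_singular


# Intended use with inflect (assumes the package is installed):
#   import inflect
#   to_singular = make_safe_singularizer(
#       inflect.engine().singular_noun,
#       keep_as_is={"as", "does", "means"},
#   )
#   singles = [to_singular(w) for w in ["apples", "as", "means"]]
```

A stop list is a blunt tool: it only covers words you have already noticed going wrong, so for free text it would probably make more sense to singularize only tokens that a part-of-speech tagger marks as plural nouns.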

2
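You can use the Java library SimpleNLG (https://github.com/simplenlg/simplenlg), or its Python wrapper PyNLG (https://github.com/mapado/pynlg) (pip install pynlg).
It ships with an extensive lexicon and recognizes the number forms of many nouns. You can set the number feature on a phrase and print its singular form. For simple tasks it works very well.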

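Lexicon lexicon = Lexicon.getDefaultLexicon();
NLGFactory nlgFactory = new NLGFactory(lexicon);

// Build a noun phrase from the plural and request the singular form.
NPPhraseSpec subject = nlgFactory.createNounPhrase("apples");
subject.setFeature(Feature.NUMBER, NumberAgreement.SINGULAR);

// Assumed final step, not shown in the original answer: realise the phrase as text.
Realiser realiser = new Realiser(lexicon);
System.out.println(realiser.realise(subject).getRealisation());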

This will give "Apple". By default, SimpleNLG treats the input as a noun phrase and converts it to the singular form.
