How to get synonyms from nltk WordNet Python

41

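WordNet is great, but I'm having a hard time getting synonyms in nltk. If you search for a word like 'small' here, it shows all of the synonyms.

Basically, I just need to know the following: wn.synsets('word')[i].option(), where option can be hypernyms or antonyms, but what is the option for getting synonyms?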


2
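A synset is already a list of synonyms. If you look at wn.synsets('small'), it has exactly the same top-level members as the web page. - abarnert
Also, wn.synsets('word')[i].hypernyms will just return a bound method; I think you want () on the end. - abarnert
Sorry, let me be more specific: I want the option that gets the "similar to" list for the first adjective. Some of the words include: atomic, subatomic, bantam. - user2758113
OK, WordNet (and NLTK) are very careful about their terminology. If you want something other than synonyms, searching for synonyms isn't going to help. - abarnert
See also: https://dev59.com/kmIk5IYBdhLWcg3wWtBz#19383914 - alvas
8 Answers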

63

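If you want the synonyms in a synset (that is, the lemmas that make up the set), you can get them with lemma_names():

>>> from nltk.corpus import wordnet as wn
>>> for ss in wn.synsets('small'):
...     print(ss.name(), ss.lemma_names())
...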

small.n.01 ['small']
small.n.02 ['small']
small.a.01 ['small', 'little']
minor.s.10 ['minor', 'modest', 'small', 'small-scale', 'pocket-size', 'pocket-sized']
little.s.03 ['little', 'small']
small.s.04 ['small']
humble.s.01 ['humble', 'low', 'lowly', 'modest', 'small']
...
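
If you would rather keep the per-sense grouping in a data structure than print it, something like the following might do. This is only a sketch: synonyms_by_sense and lookup are names I made up, and the import is done lazily inside lookup (my choice, not anything NLTK requires) so the pure helper can be used without the WordNet data installed.

```python
def synonyms_by_sense(synsets):
    """Map each synset's name to its lemma names (the synonyms for that sense)."""
    return {ss.name(): list(ss.lemma_names()) for ss in synsets}


def lookup(word):
    """Hypothetical convenience wrapper that queries WordNet through NLTK."""
    # Imported here so synonyms_by_sense() stays usable without NLTK data.
    from nltk.corpus import wordnet as wn
    return synonyms_by_sense(wn.synsets(word))
```

With the same WordNet data as in the listing above, lookup('small') should contain an entry such as 'small.a.01' mapped to ['small', 'little'].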

8
The OP really should mark this as the accepted answer. - user82216

21
You can use wordnet.synsets together with lemma_names() to get all of the synonyms:
Example:
from itertools import chain
from nltk.corpus import wordnet

synonyms = wordnet.synsets(text)
lemmas = set(chain.from_iterable([word.lemma_names() for word in synonyms]))

Demo:

>>> synonyms = wordnet.synsets('change')
>>> set(chain.from_iterable([word.lemma_names() for word in synonyms]))
set([u'interchange', u'convert', u'variety', u'vary', u'exchange', u'modify', u'alteration', u'switch', u'commute', u'shift', u'modification', u'deepen', u'transfer', u'alter', u'change'])
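
One thing to keep in mind with the set-based version is that a set throws away the order in which WordNet lists the senses. If that order matters to you, a variant along these lines might help. It is a sketch, and unique_lemma_names is a name I invented. It relies on dict preserving insertion order, which is guaranteed from Python 3.7 on.

```python
from itertools import chain


def unique_lemma_names(synsets):
    """Return every lemma name across the synsets, first occurrence first."""
    names = chain.from_iterable(ss.lemma_names() for ss in synsets)
    # dict.fromkeys drops duplicates while keeping insertion order (Python 3.7+).
    return list(dict.fromkeys(names))
```

It would be called as unique_lemma_names(wordnet.synsets('change')), with wordnet imported as in the example above.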

1
The first import should be 'from itertools import chain'. - Saradhi
Don't forget: from nltk.corpus import wordnet - Jason
This doesn't work when lemma_names() returns a nested list. For example, it fails for synonyms = wordnet.synsets('test'). - Johan
@Johan That's a separate problem, and you can solve it the way it's explained here: https://dev59.com/j14b5IYBdhLWcg3wYAf6#29244327 - Mazdak

13

You're probably interested in a Synset:

>>> wn.synsets('small')
[Synset('small.n.01'),
 Synset('small.n.02'),
 Synset('small.a.01'),
 Synset('minor.s.10'),
 Synset('little.s.03'),
 Synset('small.s.04'),
 Synset('humble.s.01'),
 Synset('little.s.07'),
 Synset('little.s.05'),
 Synset('small.s.08'),
 Synset('modest.s.02'),
 Synset('belittled.s.01'),
 Synset('small.r.01')]

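This is the same list of top-level entries that the web interface gives you.

If you also want the "similar to" list, that's not the same thing as synonyms. For that, you call similar_tos() on each Synset.

So, to show the same information as the website, start with something like this: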

for ss in wn.synsets('small'):
    print(ss)
    for sim in ss.similar_tos():
        print('    {}'.format(sim))

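Of course, the website also prints the part of speech (sim.pos()), list of lemmas (sim.lemma_names()), definition (sim.definition()), and examples (sim.examples()) for each synset, at both levels, and groups them by part of speech. It also adds links to other things you can follow, and so on. But this should be enough to get you started.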
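
To get a bit closer to what the website shows, the loop above could be extended roughly like this. It is only a sketch: describe and report are hypothetical helper names, and it assumes the NLTK 3 style in which name(), pos(), lemma_names(), definition() and similar_tos() are all methods on a Synset.

```python
def describe(ss, indent=''):
    """Format one synset: name, part of speech, lemma names, definition."""
    return '{}{} ({}) {}: {}'.format(
        indent, ss.name(), ss.pos(), ', '.join(ss.lemma_names()), ss.definition())


def report(synsets):
    """Return one line per synset, followed by its indented "similar to" synsets."""
    lines = []
    for ss in synsets:
        lines.append(describe(ss))
        for sim in ss.similar_tos():
            lines.append(describe(sim, indent='    '))
    return lines
```

After from nltk.corpus import wordnet as wn, you would print the result with print('\n'.join(report(wn.synsets('small')))). Grouping by part of speech and showing the example sentences are left out here.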


18
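This post is wrong in suggesting that wn.synsets('word') returns the synonyms of "word". Instead, the function returns a list of the different semantic concepts of "word". The synonyms of one concept, or synset, can be obtained with wn.synsets('word')[i].lemmas(). - char bugs
2
@charbugs, I agree: this answer is wrong. For example, "wiz" is a synonym for one sense of "whiz", i.e. it is a differently spelled word with the same meaning. If the answer we are commenting on were correct, the output of wn.synsets('whiz') would include "wiz", but it doesn't. However, the output of for synset in wn.synsets('whiz'): print synset.lemma_names() does include "wiz". - user82216
This answer seems better than the accepted one. Including similar_tos gives the extra output that the original question asked for. - Richard Shepherd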

4
A minimal program to print the synonyms of a given word.
from nltk.corpus import wordnet

for syn in wordnet.synsets("good"):
    for name in syn.lemma_names():
        print(name)

1
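Here are some helper functions that make NLTK easier to use, along with two examples of how to use them.
import nltk
from nltk.corpus import wordnet

def download_nltk_dependencies_if_needed():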
    try:
        nltk.word_tokenize('foobar')
    except LookupError:
        nltk.download('punkt')
    try:
        nltk.pos_tag(nltk.word_tokenize('foobar'))
    except LookupError:
        nltk.download('averaged_perceptron_tagger')

def get_some_word_synonyms(word):
    word = word.lower()
    synonyms = []
    synsets = wordnet.synsets(word)
    if (len(synsets) == 0):
        return []
    synset = synsets[0]
    lemma_names = synset.lemma_names()
    for lemma_name in lemma_names:
        lemma_name = lemma_name.lower().replace('_', ' ')
        if (lemma_name != word and lemma_name not in synonyms):
            synonyms.append(lemma_name)
    return synonyms

def get_all_word_synonyms(word):
    word = word.lower()
    synonyms = []
    synsets = wordnet.synsets(word)
    if (len(synsets) == 0):
        return []
    for synset in synsets:
        lemma_names = synset.lemma_names()
        for lemma_name in lemma_names:
            lemma_name = lemma_name.lower().replace('_', ' ')
            if (lemma_name != word and lemma_name not in synonyms):
                synonyms.append(lemma_name)
    return synonyms

Example 1: get_some_word_synonyms

This approach tends to return the most relevant synonyms, but some words (such as "angry") may not return any synonyms at all.

download_nltk_dependencies_if_needed()

words = ['dog', 'fire', 'erupted', 'throw', 'sweet', 'center', 'said', 'angry', 'iPhone', 'ThisIsNotARealWorddd', 'awesome', 'amazing', 'jim dandy', 'change']

for word in words:
    print('Synonyms for {}:'.format(word))
    synonyms = get_some_word_synonyms(word)
    for synonym in synonyms:
        print("    {}".format(synonym))

Example 1 output:
Synonyms for dog:
    domestic dog
    canis familiaris
Synonyms for fire:
Synonyms for erupted:
    erupt
    break out
Synonyms for throw:
Synonyms for sweet:
    henry sweet
Synonyms for center:
    centre
    middle
    heart
    eye
Synonyms for said:
    state
    say
    tell
Synonyms for angry:
Synonyms for iPhone:
Synonyms for ThisIsNotARealWorddd:
Synonyms for awesome:
    amazing
    awe-inspiring
    awful
    awing
Synonyms for amazing:
    amaze
    astonish
    astound
Synonyms for jim dandy:
Synonyms for change:
    alteration
    modification
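
As the Example 1 output shows, looking only at the first synset leaves words such as "fire" and "angry" with nothing. One possible middle ground, which is my own assumption and not part of the helpers above, is to walk the synsets in order and stop at the first one that offers any synonym besides the word itself. The name first_nonempty_synonyms is made up for this sketch. It takes the synsets as an argument, e.g. wordnet.synsets(word).

```python
def first_nonempty_synonyms(word, synsets):
    """Synonyms from the first synset that offers any besides the word itself."""
    word = word.lower()
    for synset in synsets:
        found = []
        for lemma_name in synset.lemma_names():
            # Same normalisation as the helpers above: lower-case, spaces for underscores.
            lemma_name = lemma_name.lower().replace('_', ' ')
            if lemma_name != word and lemma_name not in found:
                found.append(lemma_name)
        if found:
            return found
    return []
```

This keeps the result short like get_some_word_synonyms, while falling back to later senses only when the earlier ones contribute nothing.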

Example 2: get_all_word_synonyms

This approach returns all possible synonyms, but some of them may not be very relevant.

download_nltk_dependencies_if_needed()

words = ['dog', 'fire', 'erupted', 'throw', 'sweet', 'center', 'said', 'angry', 'iPhone', 'ThisIsNotARealWorddd', 'awesome', 'amazing', 'jim dandy', 'change']

for word in words:
    print('Synonyms for {}:'.format(word))
    synonyms = get_all_word_synonyms(word)
    for synonym in synonyms:
        print("    {}".format(synonym))

Example 2 output:
Synonyms for dog:
    domestic dog
    canis familiaris
    frump
    cad
    bounder
    blackguard
    hound
    heel
    frank
    frankfurter
    hotdog
    hot dog
    wiener
    wienerwurst
    weenie
    pawl
    detent
    click
    andiron
    firedog
    dog-iron
    chase
    chase after
    trail
    tail
    tag
    give chase
    go after
    track
Synonyms for fire:
    firing
    flame
    flaming
    ardor
    ardour
    fervor
    fervour
    fervency
    fervidness
    attack
    flak
    flack
    blast
    open fire
    discharge
    displace
    give notice
    can
    dismiss
    give the axe
    send away
    sack
    force out
    give the sack
    terminate
    go off
    arouse
    elicit
    enkindle
    kindle
    evoke
    raise
    provoke
    burn
    burn down
    fuel
Synonyms for erupted:
    erupt
    break out
    irrupt
    flare up
    flare
    break open
    burst out
    ignite
    catch fire
    take fire
    combust
    conflagrate
    come out
    break through
    push through
    belch
    extravasate
    break
    burst
    recrudesce
Synonyms for throw:
    stroke
    cam stroke
    shed
    cast
    cast off
    shake off
    throw off
    throw away
    drop
    thrust
    give
    flip
    switch
    project
    contrive
    bewilder
    bemuse
    discombobulate
    hurl
    hold
    have
    make
    confuse
    fox
    befuddle
    fuddle
    bedevil
    confound
Synonyms for sweet:
    henry sweet
    dessert
    afters
    confection
    sweetness
    sugariness
    angelic
    angelical
    cherubic
    seraphic
    dulcet
    honeyed
    mellifluous
    mellisonant
    gratifying
    odoriferous
    odorous
    perfumed
    scented
    sweet-scented
    sweet-smelling
    fresh
    unfermented
    sugared
    sweetened
    sweet-flavored
    sweetly
Synonyms for center:
    centre
    middle
    heart
    eye
    center field
    centerfield
    midpoint
    kernel
    substance
    core
    essence
    gist
    heart and soul
    inwardness
    marrow
    meat
    nub
    pith
    sum
    nitty-gritty
    center of attention
    centre of attention
    nerve center
    nerve centre
    snapper
    plaza
    mall
    shopping mall
    shopping center
    shopping centre
    focus on
    center on
    revolve around
    revolve about
    concentrate on
    concentrate
    focus
    pore
    rivet
    halfway
    midway
Synonyms for said:
    state
    say
    tell
    allege
    aver
    suppose
    read
    order
    enjoin
    pronounce
    articulate
    enounce
    sound out
    enunciate
    aforesaid
    aforementioned
Synonyms for angry:
    furious
    raging
    tempestuous
    wild
Synonyms for iPhone:
Synonyms for ThisIsNotARealWorddd:
Synonyms for awesome:
    amazing
    awe-inspiring
    awful
    awing
Synonyms for amazing:
    amaze
    astonish
    astound
    perplex
    vex
    stick
    get
    puzzle
    mystify
    baffle
    beat
    pose
    bewilder
    flummox
    stupefy
    nonplus
    gravel
    dumbfound
    astonishing
    awe-inspiring
    awesome
    awful
    awing
Synonyms for jim dandy:
Synonyms for change:
    alteration
    modification
    variety
    alter
    modify
    vary
    switch
    shift
    exchange
    commute
    convert
    interchange
    transfer
    deepen

0
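Maybe these aren't synonyms in WordNet's terminology, but I wanted my function to return all of the similar words, such as 'weeny', 'flyspeck', and so on. You can see them for the word 'small' at the link given by the author. I used the following code: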
from nltk.corpus import wordnet as wn

def get_all_synonyms(word):
    synonyms = []
    for ss in wn.synsets(word):
        synonyms.extend(ss.lemma_names())
        for sim in ss.similar_tos():
            synonyms_batch = sim.lemma_names()
            synonyms.extend(synonyms_batch)
    synonyms = set(synonyms)
    if word in synonyms:
        synonyms.remove(word)
    synonyms = [synonym.replace('_',' ') for synonym in synonyms]
    return synonyms

get_all_synonyms('small')

0

This works for me:

wordnet.synsets('change')[0].hypernyms()[0].lemma_names()


0

I recently wrote a synonym dictionary lookup program, and I used the following function:

from nltk.corpus import wordnet

def find_synonyms(keyword):

    synonyms = []
    for synset in wordnet.synsets(keyword):
        for lemma in synset.lemmas():
            synonyms.append(lemma.name())

    return str(synonyms)

But if you would rather host your own dictionary, you might be interested in my project for offline synonym dictionary lookup on my Github page:

https://github.com/syauqiex/offline_english_synonym_dictionary


Page content provided by Stack Overflow.