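WordNet is great, but I'm having a hard time getting synonyms out of it in nltk. If you search for words similar to 'small' here, it shows all of the synonyms.
Basically, I just need to know the following: wn.synsets('word')[i].option()
where option can be hypernyms or antonyms, but what is the option for getting synonyms?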
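If you want the synonyms in a synset (that is, the lemmas that make it up), you can get them with the lemma_names() method:
>>> from nltk.corpus import wordnet as wn
>>> for ss in wn.synsets('small'):
...     print(ss.name(), ss.lemma_names())
...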
small.n.01 ['small']
small.n.02 ['small']
small.a.01 ['small', 'little']
minor.s.10 ['minor', 'modest', 'small', 'small-scale', 'pocket-size', 'pocket-sized']
little.s.03 ['little', 'small']
small.s.04 ['small']
humble.s.01 ['humble', 'low', 'lowly', 'modest', 'small']
...
Use wordnet.synsets and the lemma names of each synset to get all the synonyms:
from itertools import chain
from nltk.corpus import wordnet

synonyms = wordnet.synsets(text)  # text is the word to look up
lemmas = set(chain.from_iterable([word.lemma_names() for word in synonyms]))
Demo:
>>> synonyms = wordnet.synsets('change')
>>> set(chain.from_iterable([word.lemma_names() for word in synonyms]))
set([u'interchange', u'convert', u'variety', u'vary', u'exchange', u'modify', u'alteration', u'switch', u'commute', u'shift', u'modification', u'deepen', u'transfer', u'alter', u'change'])
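Because a set discards order, the result above comes back in arbitrary order. Here is a minimal sketch that keeps first-seen order instead. It assumes only that each synset exposes lemma_names(); `StubSynset` and `ordered_lemma_names` are hypothetical names introduced here so the snippet runs without the WordNet corpus. In real use you would pass `wordnet.synsets('change')` instead of the stubs.

```python
from itertools import chain


class StubSynset:
    """Hypothetical stand-in for an NLTK Synset; only lemma_names() is modeled."""

    def __init__(self, names):
        self._names = list(names)

    def lemma_names(self):
        return list(self._names)


def ordered_lemma_names(synsets):
    """Return every lemma name once, in first-seen order."""
    all_names = chain.from_iterable(ss.lemma_names() for ss in synsets)
    # dict preserves insertion order (Python 3.7+), so this deduplicates stably.
    return list(dict.fromkeys(all_names))


# Stub data copied from the 'small' listing earlier in this thread.
stubs = [
    StubSynset(['small', 'little']),
    StubSynset(['little', 'small']),
    StubSynset(['humble', 'low', 'lowly', 'modest', 'small']),
]
print(ordered_lemma_names(stubs))
```

Keeping the order can matter because synsets come back from WordNet in a fixed sequence, so earlier names tend to belong to earlier senses.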
[...] from nltk.corpus import wordnet - Jason
[...] synonyms = wordnet.synsets('test') would fail. - Johan

You're probably interested in a Synset:
>>> wn.synsets('small')
[Synset('small.n.01'),
Synset('small.n.02'),
Synset('small.a.01'),
Synset('minor.s.10'),
Synset('little.s.03'),
Synset('small.s.04'),
Synset('humble.s.01'),
Synset('little.s.07'),
Synset('little.s.05'),
Synset('small.s.08'),
Synset('modest.s.02'),
Synset('belittled.s.01'),
Synset('small.r.01')]
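This is the same list of top-level entries that the web interface gives you.
If you also want the "similar to" list, that is not the same thing as synonyms. For that, call similar_tos() on each Synset.
So, to show the same information as the website, start with something like this:
for ss in wn.synsets('small'):
    print(ss)
    for sim in ss.similar_tos():
        print('    {}'.format(sim))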
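Of course, the website also prints each synset's part of speech (sim.pos()), list of lemmas (sim.lemma_names()), definition (sim.definition()), and examples (sim.examples()), and it groups them by part of speech at both levels. It also adds links to other things you can follow, and so on. But this should be enough to get you started.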
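The grouping by part of speech mentioned above could be sketched roughly like this. It is only an illustration: `StubSynset` and `group_by_pos` are hypothetical names, and the stub derives its part of speech from the synset name so the snippet runs without the WordNet corpus. With NLTK installed you would pass `wn.synsets('small')` instead.

```python
from collections import defaultdict


class StubSynset:
    """Hypothetical stand-in for an NLTK Synset (name, pos, lemma_names only)."""

    def __init__(self, name, lemma_names):
        self._name = name
        self._lemma_names = list(lemma_names)

    def name(self):
        return self._name

    def pos(self):
        # Synset names in this thread look like 'small.a.01';
        # the middle field is the part-of-speech tag.
        return self._name.rsplit('.', 2)[1]

    def lemma_names(self):
        return list(self._lemma_names)


def group_by_pos(synsets):
    """Map each part-of-speech tag to a list of (name, lemma_names) pairs."""
    groups = defaultdict(list)
    for ss in synsets:
        groups[ss.pos()].append((ss.name(), ss.lemma_names()))
    return dict(groups)


# Stub data taken from the listings in this thread (abridged).
stubs = [
    StubSynset('small.n.01', ['small']),
    StubSynset('small.a.01', ['small', 'little']),
    StubSynset('minor.s.10', ['minor', 'modest', 'small']),
    StubSynset('small.r.01', ['small']),
]
for pos, entries in group_by_pos(stubs).items():
    print(pos)
    for name, lemmas in entries:
        print('    {}: {}'.format(name, ', '.join(lemmas)))
```

The same loop could also print the definition and examples of each synset if you want something closer to the web page.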
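wn.synsets('word') does not return the synonyms of 'word'. Instead, it returns a list of the different semantic concepts of 'word'. The synonyms of one concept, or synset, can be obtained with wn.synsets('word')[i].lemmas(). - char bugs
The output of wn.synsets('whiz') should include "wiz", but in fact it does not. However, the output of for synset in wn.synsets('whiz'): print(synset.lemma_names()) does include "wiz". - user82216

from nltk.corpus import wordnet

for syn in wordnet.synsets("good"):
    for name in syn.lemma_names():
        print(name)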
import nltk
from nltk.corpus import wordnet


def download_nltk_dependencies_if_needed():
    # Download the tokenizer and tagger models only if they are missing.
    try:
        nltk.word_tokenize('foobar')
    except LookupError:
        nltk.download('punkt')
    try:
        nltk.pos_tag(nltk.word_tokenize('foobar'))
    except LookupError:
        nltk.download('averaged_perceptron_tagger')


def get_some_word_synonyms(word):
    # Only look at the first synset, which is usually the most common sense.
    word = word.lower()
    synonyms = []
    synsets = wordnet.synsets(word)
    if not synsets:
        return []
    synset = synsets[0]
    for lemma_name in synset.lemma_names():
        # WordNet joins multi-word lemmas with underscores.
        lemma_name = lemma_name.lower().replace('_', ' ')
        if lemma_name != word and lemma_name not in synonyms:
            synonyms.append(lemma_name)
    return synonyms


def get_all_word_synonyms(word):
    # Collect the lemma names of every synset, without duplicates.
    word = word.lower()
    synonyms = []
    synsets = wordnet.synsets(word)
    if not synsets:
        return []
    for synset in synsets:
        for lemma_name in synset.lemma_names():
            lemma_name = lemma_name.lower().replace('_', ' ')
            if lemma_name != word and lemma_name not in synonyms:
                synonyms.append(lemma_name)
    return synonyms
Example 1: get_some_word_synonyms
This approach tends to return the most relevant synonyms, but for some words (such as "angry") it returns none.
download_nltk_dependencies_if_needed()
words = ['dog', 'fire', 'erupted', 'throw', 'sweet', 'center', 'said', 'angry', 'iPhone', 'ThisIsNotARealWorddd', 'awesome', 'amazing', 'jim dandy', 'change']
for word in words:
    print('Synonyms for {}:'.format(word))
    synonyms = get_some_word_synonyms(word)
    for synonym in synonyms:
        print(" {}".format(synonym))
Synonyms for dog:
domestic dog
canis familiaris
Synonyms for fire:
Synonyms for erupted:
erupt
break out
Synonyms for throw:
Synonyms for sweet:
henry sweet
Synonyms for center:
centre
middle
heart
eye
Synonyms for said:
state
say
tell
Synonyms for angry:
Synonyms for iPhone:
Synonyms for ThisIsNotARealWorddd:
Synonyms for awesome:
amazing
awe-inspiring
awful
awing
Synonyms for amazing:
amaze
astonish
astound
Synonyms for jim dandy:
Synonyms for change:
alteration
modification
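Words like "angry" come back empty in Example 1 because their first synset contains only the word itself. One possible middle ground, sketched below under that assumption, is to walk the synsets in order and stop at the first one that yields anything. `StubSynset` and `first_nonempty_synonyms` are hypothetical names, and the stub lemma lists are inferred from the outputs in this thread so the snippet runs without the WordNet corpus. In real use you would pass `wordnet.synsets(word)`.

```python
class StubSynset:
    """Hypothetical stand-in for an NLTK Synset; only lemma_names() is modeled."""

    def __init__(self, names):
        self._names = list(names)

    def lemma_names(self):
        return list(self._names)


def first_nonempty_synonyms(word, synsets):
    """Return synonyms from the first synset that has any besides the word."""
    word = word.lower()
    for ss in synsets:
        found = []
        for name in ss.lemma_names():
            # WordNet joins multi-word lemmas with underscores.
            name = name.lower().replace('_', ' ')
            if name != word and name not in found:
                found.append(name)
        if found:
            return found
    return []


# Assumed shape of the 'angry' synsets: the first holds only the word itself.
angry_stubs = [
    StubSynset(['angry']),
    StubSynset(['angry', 'furious', 'raging', 'tempestuous', 'wild']),
]
print(first_nonempty_synonyms('angry', angry_stubs))
```

This keeps the "most relevant sense first" behaviour for words whose first synset is useful and only falls through when it is not.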
Example 2: get_all_word_synonyms
This approach returns every possible synonym, but some of them may not be very relevant.
download_nltk_dependencies_if_needed()
words = ['dog', 'fire', 'erupted', 'throw', 'sweet', 'center', 'said', 'angry', 'iPhone', 'ThisIsNotARealWorddd', 'awesome', 'amazing', 'jim dandy', 'change']
for word in words:
    print('Synonyms for {}:'.format(word))
    synonyms = get_all_word_synonyms(word)
    for synonym in synonyms:
        print(" {}".format(synonym))
Synonyms for dog:
domestic dog
canis familiaris
frump
cad
bounder
blackguard
hound
heel
frank
frankfurter
hotdog
hot dog
wiener
wienerwurst
weenie
pawl
detent
click
andiron
firedog
dog-iron
chase
chase after
trail
tail
tag
give chase
go after
track
Synonyms for fire:
firing
flame
flaming
ardor
ardour
fervor
fervour
fervency
fervidness
attack
flak
flack
blast
open fire
discharge
displace
give notice
can
dismiss
give the axe
send away
sack
force out
give the sack
terminate
go off
arouse
elicit
enkindle
kindle
evoke
raise
provoke
burn
burn down
fuel
Synonyms for erupted:
erupt
break out
irrupt
flare up
flare
break open
burst out
ignite
catch fire
take fire
combust
conflagrate
come out
break through
push through
belch
extravasate
break
burst
recrudesce
Synonyms for throw:
stroke
cam stroke
shed
cast
cast off
shake off
throw off
throw away
drop
thrust
give
flip
switch
project
contrive
bewilder
bemuse
discombobulate
hurl
hold
have
make
confuse
fox
befuddle
fuddle
bedevil
confound
Synonyms for sweet:
henry sweet
dessert
afters
confection
sweetness
sugariness
angelic
angelical
cherubic
seraphic
dulcet
honeyed
mellifluous
mellisonant
gratifying
odoriferous
odorous
perfumed
scented
sweet-scented
sweet-smelling
fresh
unfermented
sugared
sweetened
sweet-flavored
sweetly
Synonyms for center:
centre
middle
heart
eye
center field
centerfield
midpoint
kernel
substance
core
essence
gist
heart and soul
inwardness
marrow
meat
nub
pith
sum
nitty-gritty
center of attention
centre of attention
nerve center
nerve centre
snapper
plaza
mall
shopping mall
shopping center
shopping centre
focus on
center on
revolve around
revolve about
concentrate on
concentrate
focus
pore
rivet
halfway
midway
Synonyms for said:
state
say
tell
allege
aver
suppose
read
order
enjoin
pronounce
articulate
enounce
sound out
enunciate
aforesaid
aforementioned
Synonyms for angry:
furious
raging
tempestuous
wild
Synonyms for iPhone:
Synonyms for ThisIsNotARealWorddd:
Synonyms for awesome:
amazing
awe-inspiring
awful
awing
Synonyms for amazing:
amaze
astonish
astound
perplex
vex
stick
get
puzzle
mystify
baffle
beat
pose
bewilder
flummox
stupefy
nonplus
gravel
dumbfound
astonishing
awe-inspiring
awesome
awful
awing
Synonyms for jim dandy:
Synonyms for change:
alteration
modification
variety
alter
modify
vary
switch
shift
exchange
commute
convert
interchange
transfer
deepen
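Since the full list mixes closely and loosely related words, you may want some way to order it. One possible heuristic (an assumption of this sketch, not something WordNet or NLTK prescribes) is to rank each candidate by how many of the word's synsets contain it. `StubSynset` and `rank_synonyms` are hypothetical names, and the stub data is copied from the 'small' listing earlier in this thread so the snippet runs without the corpus.

```python
from collections import Counter


class StubSynset:
    """Hypothetical stand-in for an NLTK Synset; only lemma_names() is modeled."""

    def __init__(self, names):
        self._names = list(names)

    def lemma_names(self):
        return list(self._names)


def rank_synonyms(word, synsets):
    """Return (synonym, synset_count) pairs, most widely shared first."""
    word = word.lower()
    counts = Counter()
    for ss in synsets:
        # Use a set so each name counts at most once per synset.
        names = {n.lower().replace('_', ' ') for n in ss.lemma_names()}
        names.discard(word)
        counts.update(names)
    # Sort by descending count, then alphabetically for a stable result.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


stubs = [
    StubSynset(['small', 'little']),
    StubSynset(['minor', 'modest', 'small', 'small-scale',
                'pocket-size', 'pocket-sized']),
    StubSynset(['little', 'small']),
    StubSynset(['humble', 'low', 'lowly', 'modest', 'small']),
]
for synonym, count in rank_synonyms('small', stubs):
    print('{:2d}  {}'.format(count, synonym))
```

A name that shows up in several synsets is not guaranteed to be a better synonym, so treat the ranking as a rough filter only.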
from nltk.corpus import wordnet as wn


def get_all_synonyms(word):
    synonyms = []
    for ss in wn.synsets(word):
        synonyms.extend(ss.lemma_names())
        # Also collect the lemmas of the "similar to" synsets.
        for sim in ss.similar_tos():
            synonyms_batch = sim.lemma_names()
            synonyms.extend(synonyms_batch)
    synonyms = set(synonyms)
    if word in synonyms:
        synonyms.remove(word)
    synonyms = [synonym.replace('_', ' ') for synonym in synonyms]
    return synonyms


get_all_synonyms('small')
This worked for me:
wordnet.synsets('change')[0].hypernyms()[0].lemma_names()
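Note that the line above returns the lemma names of the first hypernym, so it gives broader terms rather than strict synonyms, and indexing with [0] raises IndexError whenever one of the lists is empty. Here is a hedged sketch of a guarded version. `StubSynset` and `first_hypernym_lemma_names` are hypothetical names, and the stub values are made up for illustration (they are not real WordNet output). In real use you would pass `wordnet.synsets('change')`.

```python
class StubSynset:
    """Hypothetical stand-in for an NLTK Synset (lemma_names and hypernyms)."""

    def __init__(self, names, hypernyms=()):
        self._names = list(names)
        self._hypernyms = list(hypernyms)

    def lemma_names(self):
        return list(self._names)

    def hypernyms(self):
        return list(self._hypernyms)


def first_hypernym_lemma_names(synsets):
    """Lemma names of the first hypernym of the first synset, or [] if none."""
    if not synsets:
        return []
    hypernyms = synsets[0].hypernyms()
    if not hypernyms:
        return []
    return hypernyms[0].lemma_names()


# Illustrative values only, not real WordNet data.
broader = StubSynset(['broader_term'])
narrow = StubSynset(['narrow_term'], hypernyms=[broader])

print(first_hypernym_lemma_names([narrow]))
print(first_hypernym_lemma_names([broader]))
print(first_hypernym_lemma_names([]))
```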
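I recently wrote a thesaurus lookup program, and I used the following function:
from nltk.corpus import wordnet


def find_synonyms(keyword):
    synonyms = []
    for synset in wordnet.synsets(keyword):
        for lemma in synset.lemmas():
            synonyms.append(lemma.name())
    # Return the list rendered as a string.
    return str(synonyms)
But if you would rather host your own dictionary, you may be interested in my project for offline synonym dictionary lookup on my GitHub page:
https://github.com/syauqiex/offline_english_synonym_dictionary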
[...] wn.synsets('small'), which has exactly the same top-level members as the web page. - abarnert
wn.synsets('word')[i].hypernyms just returns a bound method; I think you want () at the end … - abarnert