How can I compute the frequency of each dictionary key from its values in Python?

4
Suppose I have a dictionary of words and phrases in the following form.
{
    ('The brown fox',): [0], ('the race',): [0], ('Apple',): [1], 
    ('a company Apple',): [1], ('iphone',): [1], ('Paris',): [2],
    ('Delhi',): [2], ('London',): [2], ('world cities',): [2], 
    ('home',): [3, 4], ('order delivery food',): [3], ('simple voice command',): [3], 
    ('dinner',): [3], ('a long day',): [3], ('work',): [3], 
    ('teams',): [4], ('goal home',): [4], ('fox world',): [5], 
    ('a world class company',): [5], ('A geyser heating system',): [6], ('a lot',): [7], 
    ('the book Python',): [7], ('an amazing language',): [7], ('i',): [8], 
    ('a good boy',): [8], ('Team Performance',): [9], ('Revolv central automation device',): [10], 
    ('the switch way',): [11], ('play children',): [12]
}

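I want to compute the frequency of every word/phrase based on its key values.
For example, only the word "home" should have a frequency of 2 (because it appears under both key values 3 and 4). All other words/phrases should have a frequency of 1.
I tried using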
Counter(index.values()).most_common()
Is there a way to do this calculation in Python?
2 Answers
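As a side note on the attempt above, here is a minimal sketch of why it fails. It assumes `index` is the dictionary shown in the question (only a small subset is reproduced here). `Counter` tallies its items by using them as dictionary keys, and lists are not hashable, so the call raises a `TypeError` instead of returning counts.

```python
from collections import Counter

# Small subset of the dictionary from the question (assumed to be `index`).
index = {('home',): [3, 4], ('dinner',): [3], ('teams',): [4]}

try:
    # Counter tries to use each value (a list) as a key, which needs hashing.
    Counter(index.values())
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The message reports that a list is unhashable.
print(error_message)
```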

1
Mishra, you could try:
# One count per entry, in the dictionary's iteration order.
frequencies = [len(values) for values in your_dictionary.values()]

if you just want the frequencies on their own in a list.
Or, if you want to be able to look up the frequency from a word or phrase:
# Map each key (a one-element tuple holding the phrase) to its count.
frequency_from_phrase = {}
for key, values in your_dictionary.items():
    frequency_from_phrase[key] = len(values)

Thanks a lot @TechPerson. - M S

1
You can use a dictionary comprehension to get a dictionary with the phrases as keys and the counts as values.
d = {('The brown fox',): [0], ('the race',): [0], ('Apple',): [1], ('a company Apple',): [1], ('iphone',): [1], ('Paris',): [2], ('Delhi',): [2], ('London',): [2], ('world cities',): [2], ('home',): [3, 4], ('order delivery food',): [3], ('simple voice command',): [3], ('dinner',): [3], ('a long day',): [3], ('work',): [3], ('teams',): [4], ('goal home',): [4], ('fox world',): [5], ('a world class company',): [5], ('A geyser heating system',): [6], ('a lot',): [7], ('the book Python',): [7], ('an amazing language',): [7], ('i',): [8], ('a good boy',): [8], ('Team Performance',): [9], ('Revolv central automation device',): [10], ('the switch way',): [11], ('play children',): [12]}

frequency = {k[0]: len(v) for k, v in d.items()}

print(frequency)
# {'The brown fox': 1, 'the race': 1, 'Apple': 1, 'a company Apple': 1, 'iphone': 1, 'Paris': 1, 'Delhi': 1, 'London': 1, 'world cities': 1, 'home': 2, 'order delivery food': 1, 'simple voice command': 1, 'dinner': 1, 'a long day': 1, 'work': 1, 'teams': 1, 'goal home': 1, 'fox world': 1, 'a world class company': 1, 'A geyser heating system': 1, 'a lot': 1, 'the book Python': 1, 'an amazing language': 1, 'i': 1, 'a good boy': 1, 'Team Performance': 1, 'Revolv central automation device': 1, 'the switch way': 1, 'play children': 1}
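Since the question tried `Counter(...).most_common()`, the same comprehension can also be handed to `collections.Counter` to get that method back. This is only a sketch on a three-entry subset of `d`, not part of the original answer. The names `frequency` and `top` are illustrative.

```python
from collections import Counter

# Three-entry subset of the dictionary from the question.
d = {('home',): [3, 4], ('dinner',): [3], ('teams',): [4]}

# A Counter accepts a mapping of item -> count, so the comprehension fits directly.
frequency = Counter({k[0]: len(v) for k, v in d.items()})

# most_common(n) returns the n highest counts as (phrase, count) pairs.
top = frequency.most_common(1)
print(top)  # [('home', 2)]
```

Calling `most_common()` with no argument returns every phrase ordered from the highest count to the lowest.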

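Thanks a lot, this is helpful. - M S
Is it possible to print both the frequency and the key values while keeping the key values? For example, for the word "home" the key values are [1,2] and the frequency is 2. And how do I get the frequencies in sorted order? - M S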
1
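@MishraS - Since Python 3.7, dictionaries preserve insertion order, so you could do something like {k: (len(v), v) for k, v in sorted(d.items(), key=lambda x: len(x[1]), reverse=True)} to return a dictionary that maps each original key to a tuple of the count and the list of values, sorted by count in descending order. In earlier Python versions you could use something like OrderedDict. - benvc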
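The one-liner in the comment above can be sketched as a runnable snippet. It assumes Python 3.7 or later and uses a three-entry subset of `d`. The name `by_count` is illustrative.

```python
# Three-entry subset of the dictionary from the question.
d = {('dinner',): [3], ('home',): [3, 4], ('teams',): [4]}

# Sort the items by list length, highest first, then rebuild the dictionary.
# Each original key maps to a (count, values) tuple.
by_count = {
    k: (len(v), v)
    for k, v in sorted(d.items(), key=lambda item: len(item[1]), reverse=True)
}

for phrase, (count, positions) in by_count.items():
    print(phrase[0], count, positions)
# home 2 [3, 4]
# dinner 1 [3]
# teams 1 [4]
```

Because `sorted` is stable, entries with equal counts keep their original relative order.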
