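I have the following code to find the frequency of two-word phrases. I need to do the same for three-word phrases, but the code below does not seem to work for three-word phrases.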
from collections import Counter
import re
sentence = "I love TV show makes me happy, I love also comedy show makes me feel like flying"
words = re.findall(r'\w+', sentence)
two_words = [' '.join(ws) for ws in zip(words, words[1:])]
wordscount = {w:f for w, f in Counter(two_words).most_common() if f > 1}
wordscount
{'show makes': 2, 'makes me': 2, 'I love': 2}
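str.join should be deferred until the last step, once the minimum-count filter has been applied. - jpp
Feed nwise(words, 3) directly into the Counter, and apply str.join only when needed. - L3viathan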
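One way to read the two comments above, as a minimal sketch rather than a definitive answer: count the word tuples themselves and join them into strings only after the minimum-count filter. The nwise helper is not defined anywhere in the post, so the version here is a hypothetical stand-in built on zip, and the name three_words is likewise chosen only for illustration.

```python
from collections import Counter
import re

def nwise(seq, n):
    # Hypothetical helper (not defined in the post): yields overlapping
    # n-tuples by zipping the sequence with n shifted copies of itself.
    return zip(*(seq[i:] for i in range(n)))

sentence = "I love TV show makes me happy, I love also comedy show makes me feel like flying"
words = re.findall(r'\w+', sentence)

# Count tuples directly; no string joining happens at this stage.
counts = Counter(nwise(words, 3))

# Join only the phrases that pass the minimum-count filter.
three_words = {' '.join(ws): f for ws, f in counts.most_common() if f > 1}
print(three_words)  # {'show makes me': 2}
```

Calling nwise(words, 2) in place of nwise(words, 3) gives the same two-word counts as the original zip(words, words[1:]) version, so the phrase length becomes a single parameter.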