Counting letter frequency in a string (Python)

14

I am trying to count how many times each letter occurs in a word.

word = input("Enter a word")

Alphabet=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']

for i in range(0,26):
    print(word.count(Alphabet[i]))

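The current code prints a count for every letter of the alphabet, including letters that never occur in the word.

How can I list each letter and its frequency vertically, like this?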

word="Hello"

H 1

E 1

L 2

O 1


2
A 30-second search would show that you can use collections.Counter - Pythonista
3
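This looks like a homework question, so you may want to read the guidelines on how to ask such questions on SO. I will post a couple of answers shortly. - evadeflow
13 Answers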

37
from collections import Counter
counts = Counter(word)  # for word = "Hello": Counter({'l': 2, 'H': 1, 'e': 1, 'o': 1})
for letter, count in counts.items():  # one line per distinct character, no repeats
    print(letter, count)

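Try Counter: it builds a dictionary that holds the frequency of every item in a collection.

Alternatively, you could add a condition to your current code so that it prints only when word.count(Alphabet[i]) is greater than 0, but that is slower, because it rescans the word once for every letter of the alphabet.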
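The Counter approach can also produce the vertical, upper-case layout the question asks for. The following is a minimal sketch, assuming that only alphabetic characters should be counted and that case should be ignored; the helper name `format_letter_counts` is hypothetical and not part of the original answer.

```python
from collections import Counter


def format_letter_counts(word):
    """Return lines such as 'H 1', one per distinct letter, in order of first appearance."""
    # Upper-case first so that 'h' and 'H' are counted together; skip non-letters.
    counts = Counter(ch for ch in word.upper() if ch.isalpha())
    # Counter preserves insertion order on Python 3.7+, so letters come out
    # in the order in which they first appear in the word.
    return [f"{letter} {count}" for letter, count in counts.items()]


print("\n".join(format_letter_counts("Hello")))
```

For "Hello" this prints H 1, E 1, L 2 and O 1 on separate lines.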


13
def char_frequency(str1):
    dict = {}
    for n in str1:
        keys = dict.keys()
        if n in keys:
            dict[n] += 1
        else:
            dict[n] = 1
    return dict
print(char_frequency('google.com'))

10
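Thank you for this code snippet, which may provide some limited short-term help. A proper explanation of why this is a good solution would greatly improve its long-term value and make it more useful to future readers with similar questions. Please edit your answer to add some explanation, including the assumptions you have made. - Toby Speight
1
Simply iterate over the string and create a key in the dictionary for each newly seen element, or increase its value by 1 if the element has already been seen. - noob_coder
Using dict.keys() just for an 'in' test is pointless. - Tony Suffolk 66
Note that you should not name a dictionary "dict", since that is a reserved word (it is the constructor term for dictionaries!) - duhaime
ctrl c & ctrl v. Taken from here -> https://www.w3resource.com/python-exercises/string/python-data-type-string-exercise-2.php - Lord-shiv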
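The points raised in the comments above (the redundant keys() call and the dictionary named "dict") can be addressed with a small rewrite. This is only a sketch of one possible variant, not the answerer's code; the variable name `frequencies` is an arbitrary choice.

```python
def char_frequency(text):
    """Count how often each character occurs in text, using a plain dict."""
    frequencies = {}  # a neutral name, so the built-in `dict` is not shadowed
    for ch in text:
        # dict.get returns 0 for a character not seen before,
        # so no explicit membership test is needed.
        frequencies[ch] = frequencies.get(ch, 0) + 1
    return frequencies


print(char_frequency("google.com"))
```

Testing membership directly on the dictionary (`ch in frequencies`) would work as well; `get` simply folds the two branches into one line.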

9

As Pythonista said, this is a job for collections.Counter:

from collections import Counter
print(Counter('cats on wheels'))

This prints (on Python 3.7+; on older versions the order of the entries may differ):

Counter({'s': 2, ' ': 2, 'e': 2, 'c': 1, 'a': 1, 't': 1, 'o': 1, 'n': 1, 'w': 1, 'h': 1, 'l': 1})
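If only the most frequent characters are of interest, Counter also provides most_common. A short sketch reusing the same sample string; the choice of 3 is arbitrary.

```python
from collections import Counter

counts = Counter("cats on wheels")
# most_common(n) returns the n highest counts as (item, count) pairs.
top_three = counts.most_common(3)
for char, count in top_three:
    # repr() makes the space character visible in the output.
    print(repr(char), count)
```

Here the three pairs are the characters that occur twice: 's', the space and 'e'.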

2

2
s = input()
t = s.lower()

for i in range(len(s)):
    b = t.count(t[i])
    print("{} -- {}".format(s[i], b))

1

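Building on what LMc said, your code is already very close to working. You only need to post-process the results to drop the "irrelevant" output. Here is one way to make your code work: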

#!/usr/bin/env python3
# Lower-case the input so that 'H' and 'h' are counted as the same letter.
word = input("Enter a word: ").lower()

Alphabet = [
    'a','b','c','d','e','f','g','h','i','j','k','l','m',
    'n','o','p','q','r','s','t','u','v','w','x','y','z'
]

# Keep only the letters that actually occur in the word.
hits = [
    (letter, word.count(letter))
    for letter in Alphabet
    if word.count(letter)
]

for letter, frequency in hits:
    print(letter.upper(), frequency)

That said, the collections.Counter solution is more elegant and Pythonic.


1

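In some programs it makes sense to include every letter, even those with a count of zero. For example, if you want to compute the cosine distance between word distributions, you usually need all letters so that the vectors have the same length.

You can use the following: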

from collections import Counter 

def character_distribution_of_string(pass_string):
  letters = ["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o","p","q","r","s","t","u","v","w","x","y","z"]
  chars_in_string = Counter(pass_string)
  res = {}
  for letter in letters:
    if(letter in chars_in_string):
      res[letter] = chars_in_string[letter]
    else: 
      res[letter] = 0 
  return(res)

Usage:

character_distribution_of_string("This is a string that I want to know about")

Full character distribution:

{'a': 4,
 'b': 1,
 'c': 0,
 'd': 0,
 'e': 0,
 'f': 0,
 'g': 1,
 'h': 2,
 'i': 3,
 'j': 0,
 'k': 1,
 'l': 0,
 'm': 0,
 'n': 3,
 'o': 3,
 'p': 0,
 'q': 0,
 'r': 1,
 's': 3,
 't': 6,
 'u': 1,
 'v': 0,
 'w': 2,
 'x': 0,
 'y': 0,
 'z': 0}

You can easily extract the character vector:
list(character_distribution_of_string("This is a string that I want to know about").values())

giving...

[4, 1, 0, 0, 0, 0, 1, 2, 3, 0, 1, 0, 0, 3, 3, 0, 0, 1, 3, 6, 1, 0, 2, 0, 0, 0]
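To illustrate why fixed-length vectors are useful, here is a hedged sketch of a cosine similarity between two such letter vectors. It is not part of the original answer: the helper names `letter_vector` and `cosine_similarity` are hypothetical, the input is lower-cased first (unlike the function above, which counts lower-case letters only), and returning 0.0 for an empty vector is an assumption.

```python
import math
import string
from collections import Counter


def letter_vector(text):
    """Return 26 counts, one per lower-case ASCII letter, zeros included."""
    counts = Counter(text.lower())
    # A Counter returns 0 for missing keys, so absent letters need no special case.
    return [counts[letter] for letter in string.ascii_lowercase]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: a text without letters is similar to nothing
    return dot / (norm_u * norm_v)


print(cosine_similarity(letter_vector("listen"), letter_vector("silent")))
```

Anagrams such as "listen" and "silent" have identical letter vectors, so their similarity is 1 up to floating-point rounding, while words that share no letters score 0.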

0
For future reference: once you have a list containing all the words you want, say wordlist, this becomes very simple.
for numbers in range(len(wordlist)):
    if wordlist[numbers][0] == 'a':
        print(wordlist[numbers])

0
import string
word = input("Enter a word:  ")
word = word.lower()

Alphabet=list(string.ascii_lowercase)
res = []

for i in range(0,26): 
    res.append(word.count(Alphabet[i]))

for i in range (0,26):
    if res[i] != 0:
        print(str(Alphabet[i].upper()) + " " + str(res[i]))

0
If you want to avoid libraries and built-in helper functions, the following code may help:
s = "aaabbc"  # Sample string
dict_counter = {}  # Empty dict for holding characters
                   # as keys and count as values
for char in s:  # Traversing the whole string
                # character by character
    if not dict_counter or char not in dict_counter.keys(): # Checking whether the dict is
                                                            # empty or contains the character
        dict_counter.update({char: 1}) # If not then adding the
                                       # character to dict with count = 1
    elif char in dict_counter.keys(): # If the character is already
                                      # in the dict then update count
        dict_counter[char] += 1
for key, val in dict_counter.items(): # Looping over each key and
                                      # value pair for printing
    print(key, val)

Output:

a 3
b 2
c 1
