Counting Letter Frequency in a String (Python)

Question

14

I am trying to count the number of times each letter appears in a word.

word = input("Enter a word")

Alphabet=['a','b','c','d','e','f','g','h','i','j','k','l','m','n','o','p','q','r','s','t','u','v','w','x','y','z']

for i in range(0,26):
    print(word.count(Alphabet[i]))

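Currently this prints a count for every letter of the alphabet, including the ones that do not appear in the word.

How do I list the letters vertically, each with its frequency alongside, as in the following example?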

word="Hello"

H 1

E 1

L 2

O 1

- Kelvin San

2

A 30-second search would have shown you that you can use collections.Counter. - Pythonista

3

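This looks like a homework question, so you may want to read these guidelines on how to ask such questions on SO. I'll post a couple of answers shortly. - evadeflow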

13 Answers

13

def char_frequency(str1):
    dict = {}
    for n in str1:
        keys = dict.keys()
        if n in keys:
            dict[n] += 1
        else:
            dict[n] = 1
    return dict
print(char_frequency('google.com'))

- Uzam Hashmi

10

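Thank you for this code snippet, which might provide some limited short-term help. A proper explanation would greatly improve its long-term value by showing why this is a good solution to the problem, and would make it more useful to future readers with other, similar questions. Please edit your answer to add some explanation, including the assumptions you've made. - Toby Speight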

1

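Simply iterate through the string and create a key in the dictionary for each newly seen element or, if the element has already occurred, increase its value by 1. - noob_coder

Using dict.keys() just for an 'in' test is pointless. - Tony Suffolk 66

Note that you shouldn't name a dictionary "dict", as this is a reserved word (it's the constructor term for dictionaries!) - duhaime

Ctrl+C & Ctrl+V. From here -> https://www.w3resource.com/python-exercises/string/python-data-type-string-exercise-2.php - Lord-shiv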
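
The comments above point at two small cleanups: avoid reusing the name `dict` (strictly speaking it is a built-in name rather than a reserved keyword, so Python will not stop you from shadowing it), and test membership on the dictionary itself rather than on `.keys()`. A minimal sketch with those two changes applied; the names `text` and `counts` are illustrative choices, not taken from the answer:

```python
def char_frequency(text):
    """Return a dict mapping each character in text to how often it occurs."""
    counts = {}  # illustrative name; avoids shadowing the built-in dict
    for ch in text:
        if ch in counts:      # membership test works directly on the dict
            counts[ch] += 1
        else:
            counts[ch] = 1
    return counts


print(char_frequency('google.com'))
```

The behaviour is the same as the answer's version; only the naming and the membership test differ.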

9

As Pythonista said, this is a job for collections.Counter:

from collections import Counter
print(Counter('cats on wheels'))

This prints (on Python 3.7+, where equal counts keep their first-seen order):

Counter({'s': 2, ' ': 2, 'e': 2, 'c': 1, 'a': 1, 't': 1, 'o': 1, 'n': 1, 'w': 1, 'h': 1, 'l': 1})

- duhaime
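
To get from a `Counter` to the vertical listing the question asks for, one option is to upper-case the word, count only alphabetic characters, and print one letter per line. This is a sketch under those assumptions; the helper name `letter_lines` is hypothetical, and the first-seen ordering relies on Python 3.7+ dictionaries preserving insertion order:

```python
from collections import Counter


def letter_lines(word):
    """Return lines like 'L 2', one per distinct letter, in first-seen order."""
    # Upper-case first so 'H' and 'h' are counted together; skip non-letters.
    counts = Counter(ch for ch in word.upper() if ch.isalpha())
    return [f"{letter} {count}" for letter, count in counts.items()]


for line in letter_lines("Hello"):
    print(line)
```

For "Hello" this prints H 1, E 1, L 2 and O 1 on separate lines, matching the layout in the question.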

2

A simple and easy solution that does not use an external library:

string = input()
f = {}
for i in string:
  f[i] = f.get(i,0) + 1
print(f)

Here is a link about get(): https://docs.quantifiedcode.com/python-anti-patterns/correctness/not_using_get_to_return_a_default_value_from_a_dictionary.html, which explains why you should use get() to return a default value from a dictionary.

- soheshdoshi
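
To make the `get()` idiom concrete: `freq.get(ch, 0)` returns the stored count when the key exists and the default `0` when it does not, so a single line replaces the explicit `if`/`else` seen in other answers. A small sketch wrapping the same loop in a function; the name `count_chars` is hypothetical:

```python
def count_chars(text):
    """Count characters using dict.get with a default of 0."""
    freq = {}
    for ch in text:
        # get() returns 0 for a character not seen yet, so no KeyError is raised
        freq[ch] = freq.get(ch, 0) + 1
    return freq


print(count_chars("hello"))  # {'h': 1, 'e': 1, 'l': 2, 'o': 1}
```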

2

s = input()
t = s.lower()

for i in range(len(s)):
    b = t.count(t[i])
    print("{} -- {}".format(s[i], b))

- Swetank

1

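To piggyback on what LMc said, your code was already pretty close to functional. You just needed to post-process the result set to remove the "uninteresting" output. Here is one way to make your code work: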

#!/usr/bin/env python3
# Lower-case the input so that the 'H' in "Hello" is counted as 'h'.
word = input("Enter a word: ").lower()

Alphabet = [
    'a','b','c','d','e','f','g','h','i','j','k','l','m',
    'n','o','p','q','r','s','t','u','v','w','x','y','z'
]

# Keep only the letters that actually occur in the word.
hits = [
    (letter, word.count(letter))
    for letter in Alphabet
    if word.count(letter)
]

for letter, frequency in hits:
    print(letter.upper(), frequency)

However, the solution using collections.Counter is more elegant/Pythonic.

- evadeflow

1

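It can make sense to include every letter of the alphabet, even those with a count of zero. For example, if you want to compute the cosine difference between word distributions, you typically need all letters.

You can use this method: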

import string
from collections import Counter

def character_distribution_of_string(pass_string):
    # Counting is case-sensitive: 'T' and 'I' are not counted as 't' and 'i'.
    chars_in_string = Counter(pass_string)
    # A Counter returns 0 for missing keys, so every letter gets an entry.
    return {letter: chars_in_string[letter] for letter in string.ascii_lowercase}

Usage:

character_distribution_of_string("This is a string that I want to know about")

Full character distribution:

{'a': 4,
 'b': 1,
 'c': 0,
 'd': 0,
 'e': 0,
 'f': 0,
 'g': 1,
 'h': 2,
 'i': 3,
 'j': 0,
 'k': 1,
 'l': 0,
 'm': 0,
 'n': 3,
 'o': 3,
 'p': 0,
 'q': 0,
 'r': 1,
 's': 3,
 't': 6,
 'u': 1,
 'v': 0,
 'w': 2,
 'x': 0,
 'y': 0,
 'z': 0}

You can easily extract the vector of character counts:

list(character_distribution_of_string("This is a string that I want to know about").values())

giving...

[4, 1, 0, 0, 0, 0, 1, 2, 3, 0, 1, 0, 0, 3, 3, 0, 0, 1, 3, 6, 1, 0, 2, 0, 0, 0]
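
As an illustration of why the full 26-entry vector is useful, here is a sketch of a cosine comparison between two such vectors. It assumes "cosine difference" means one minus the cosine similarity; the helper names `letter_vector` and `cosine_similarity` are hypothetical, and the counting is case-sensitive like the function above:

```python
import math
import string
from collections import Counter


def letter_vector(text):
    """Return 26 case-sensitive counts for 'a'..'z'."""
    counts = Counter(text)
    return [counts[letter] for letter in string.ascii_lowercase]


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


similarity = cosine_similarity(letter_vector("listen"), letter_vector("silent"))
print(similarity)       # anagrams have identical vectors, so this is about 1.0
print(1 - similarity)   # the corresponding cosine distance, about 0.0
```

Because both vectors always have the same length and letter order, they can be compared position by position, which would not work if absent letters were dropped.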

- Cybernetic

0

For future reference: when you have a list containing all the words you want, say wordlist, it is pretty simple.

wordlist = ["apple", "banana", "avocado"]  # sample data, assumed for illustration

# Print every word that starts with the letter 'a'.
for word in wordlist:
    if word.startswith('a'):
        print(word)

- Stanley

0

import string
word = input("Enter a word:  ")
word = word.lower()

# Count each letter of the alphabet and print only those that occur.
for letter in string.ascii_lowercase:
    count = word.count(letter)
    if count != 0:
        print(letter.upper() + " " + str(count))

- Anaam

0

If you want to avoid libraries and built-in helper functions, the following code may help:

s = "aaabbc"  # Sample string
dict_counter = {}  # Maps each character to its count
for char in s:  # Traverse the string character by character
    if char not in dict_counter:
        dict_counter[char] = 1  # First occurrence: start the count at 1
    else:
        dict_counter[char] += 1  # Seen before: increment the count
for key, val in dict_counter.items():  # Print each character and its count
    print(key, val)

Output:

a 3
b 2
c 1

- Shahriar Rahman Zahin

- LMc · Accepted Answer

from collections import Counter
word = "Hello"
counts = Counter(word)  # Counter({'l': 2, 'H': 1, 'e': 1, 'o': 1})
for letter, count in counts.items():  # each distinct letter once, in first-seen order
    print(letter, count)

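Try using Counter, which creates a dictionary containing the frequency of every item in a collection.

Otherwise, you could add a condition to your current code so that it prints only when word.count(Alphabet[i]) is greater than 0, though that is slower.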
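
The conditional alternative mentioned above might look like the following sketch. It lower-cases the word first (an assumption, so that the "H" in "Hello" is counted) and uses `string.ascii_lowercase` in place of the hand-written `Alphabet` list; the function name `present_letters` is hypothetical:

```python
import string


def present_letters(word):
    """Return (letter, count) pairs for the letters a..z that occur in word."""
    lowered = word.lower()  # assumption: treat 'H' and 'h' as the same letter
    pairs = []
    for letter in string.ascii_lowercase:
        n = lowered.count(letter)   # one full scan of the word per letter
        if n > 0:                   # the extra condition: skip absent letters
            pairs.append((letter, n))
    return pairs


for letter, n in present_letters("Hello"):
    print(letter.upper(), n)
```

This scans the word 26 times, whereas Counter makes a single pass, which is why it is the slower option. Note also that the output comes out in alphabetical order (E, H, L, O) rather than in order of appearance.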