Count the number of occurrences of each letter in a string

Question

Count the number of occurrences of each letter in a string

4

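How can I count the number of occurrences of each letter (ignoring case) in a string in C? The output should be formatted as letter: # number of occurrences. I already have code that counts the occurrences of a single letter, but how do I count the occurrences of every letter in the string?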

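/* The signature was lost in the original post; count_char is a placeholder name. */
int count_char(const char *string, char ch)
{
    int count = 0;
    int i;

    //int length = strlen(string);

    /* Stop at the terminator rather than assuming a fixed length of 20. */
    for (i = 0; string[i] != '\0'; i++)
    {
        if (string[i] == ch)
        {
            count++;
        }
    }

    return count;
}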

Output:

a : 1
b : 0
c : 2
etc...

- user1786283

15 Answers

2

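After the accepted answer

A method that meets these specs: (IMO, the other answers do not meet all of them)

It is practical/efficient when char has a wide range. Example: CHAR_BIT is 16 or 32, so it does not use bool Used[1 << CHAR_BIT];
It works for very long strings (uses size_t rather than int).
It does not rely on ASCII. (Uses Upper[])

Behavior is defined when a char < 0. The is...() functions are defined for EOF and for values representable as unsigned char.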

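#include <ctype.h>
#include <stddef.h>
#include <string.h>

static const char Upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
static const char Lower[] = "abcdefghijklmnopqrstuvwxyz";

// Count must point to an array of at least 26 size_t elements
void LetterOccurrences(size_t *Count, const char *s) {
  memset(Count, 0, sizeof *Count * 26);
  while (*s) {
    unsigned char ch = *s;  // non-negative value, safe for isalpha()
    if (isalpha(ch)) {
      // Look the letter up by position, so no character-set assumption
      const char *caseset = Upper;
      const char *p = strchr(caseset, ch);
      if (p == NULL) {
        caseset = Lower;
        p = strchr(caseset, ch);
      }
      if (p != NULL) {
        Count[p - caseset]++;
      }
    }
    s++;
  }
}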

// Sample usage
char *s = foo();
size_t Count[26];
LetterOccurrences(Count, s);
for (int i=0; i<26; i++)
  printf("%c : %zu\n", Upper[i], Count[i]);

- chux - Reinstate Monica

1

Like this:

int counts[26];
memset(counts, 0, sizeof(counts));
char *p = string;
while (*p) {
    counts[tolower(*p++) - 'a']++;
}

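This code assumes that the string is null-terminated and that it contains only characters from a to z or from A to Z, inclusive.

To understand how this works, recall that after conversion to lowercase each letter has a code between that of a and that of z, and that these codes are consecutive. As a result, tolower(*p) - 'a' evaluates to a number between 0 and 25, inclusive, which is the letter's sequential position in the alphabet.

This code combines ++ and *p to shorten the program.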
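
Not part of the original answer: the sketch below spells out the index arithmetic described above and adds a guard so that non-letters are skipped instead of producing an out-of-range index. The names letter_index and count_letters are hypothetical, and the code assumes that the lowercase letters are contiguous, as they are in ASCII.

```c
#include <ctype.h>

/* Hypothetical helper: maps a letter to its 0-based position in the
   alphabet, or returns -1 for anything outside a..z / A..Z.
   Assumes the lowercase letters are contiguous, as in ASCII. */
int letter_index(char c)
{
    /* The cast keeps the argument to tolower() non-negative. */
    int lower = tolower((unsigned char)c);
    if (lower < 'a' || lower > 'z')
        return -1;
    return lower - 'a';
}

/* Counts letters in a null-terminated string, ignoring case and
   skipping every character that letter_index() rejects.
   counts must have room for 26 entries. */
void count_letters(const char *p, int counts[26])
{
    for (int i = 0; i < 26; i++)
        counts[i] = 0;
    while (*p) {
        int idx = letter_index(*p++);
        if (idx >= 0)
            counts[idx]++;
    }
}
```

A caller would declare int counts[26], call count_letters(string, counts), and then print counts[i] next to 'a' + i for i from 0 to 25.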

- Sergey Kalinichenko

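Oops, actually neither of us paid full attention to the OP's question ;-) He meant "ignoring case". - user529758

@H2CO3 You're right, thanks! I added tolower and expanded the assumptions. Thanks a lot! - Sergey Kalinichenko

Undefined behavior if the string contains non-letter characters. - chqrlie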

1

A simple approach is to create an array of 26 ints, each one holding the count for one of the letters a-z:

int alphacount[26] = {0}; //[0] = 'a', [1] = 'b', etc

Then loop over the string and increment the count for each letter:

for(int i = 0; i<strlen(mystring); i++)      //for the whole length of the string
    if(isalpha(mystring[i]))
        alphacount[tolower(mystring[i])-'a']++;  //make the letter lower case (if it's not)
                                                 //then use it as an offset into the array
                                                 //and increment

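This is a simple idea that works for A-Z and a-z. If you want to count uppercase and lowercase separately, just make the counter array 52 entries long and subtract the correct ASCII offset.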
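
Not part of the original answer: a minimal sketch of the 52-counter variant mentioned above. The function name count_letters_by_case and the slot layout (lowercase in slots 0..25, uppercase in slots 26..51) are assumptions made for illustration, and the range checks assume ASCII, where each case occupies a contiguous range.

```c
/* Hypothetical sketch: slots 0..25 count 'a'..'z' and slots 26..51
   count 'A'..'Z'. Assumes ASCII, where both ranges are contiguous.
   counts must have room for 52 entries. */
void count_letters_by_case(const char *s, int counts[52])
{
    for (int i = 0; i < 52; i++)
        counts[i] = 0;
    for (; *s != '\0'; s++) {
        /* Work on the unsigned value so comparisons behave the same
           whether plain char is signed or unsigned. */
        unsigned char ch = (unsigned char)*s;
        if (ch >= 'a' && ch <= 'z')
            counts[ch - 'a']++;
        else if (ch >= 'A' && ch <= 'Z')
            counts[26 + (ch - 'A')]++;
    }
}
```

With this layout the count for 'b' is counts[1] and the count for 'B' is counts[27]; any other character is ignored.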

- Mike

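Negative char values cause undefined behavior, the loop test expression is inefficient, and the code assumes that lowercase letters are contiguous in the character set. - chqrlie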

1

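#include <ctype.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    printf("PLEASE ENTER A STRING\n");
    printf("GIVE ONLY ONE SPACE BETWEEN WORDS\n");
    printf("PRESS ENTER WHEN FINISHED\n");

    char str[100];
    int arr[26]={0};
    int ch;
    int i;

    /* fgets bounds the read; gets cannot and was removed in C11 */
    if (fgets(str, sizeof str, stdin) == NULL)
        return 1;
    int n=strlen(str);

    for(i=0;i<n;i++)
    {
        /* the cast avoids undefined behavior for negative char values */
        ch=tolower((unsigned char)str[i]);
        if(ch>='a' && ch<='z')   /* 97..122 in ASCII */
        {
            arr[ch-'a']++;
        }
    }
    for(i='a';i<='z';i++)
        printf("%c OCCURS %d NUMBER OF TIMES\n",i,arr[i-'a']);
    return 0;
}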

- mrc_03

This function can return the number of occurrences of a character in a given string. - mrc_03

0

Here is C code with a user-defined function:

/* C Program to count the frequency of characters in a given String */

#include <stdio.h>
#include <string.h>

const char letters[] = "abcdefghijklmnopqrstuvwxyz";

void find_frequency(const char *string, int *count);

int main() {
    char string[100];
    int count[26] = { 0 };
    int i;

    printf("Input a string: ");
    if (!fgets(string, sizeof string, stdin))
        return 1;

    find_frequency(string, count);

    printf("Character Counts\n");

    for (i = 0; i < 26; i++) {
        printf("%c\t%d\n", letters[i], count[i]);
    }
    return 0;
}

void find_frequency(const char *string, int *count) {
    int i;
    const char *p;  /* position of the current character within letters */
    for (i = 0; string[i] != '\0'; i++) {
        p = strchr(letters, string[i]);
        if (p != NULL) {
            count[p - letters]++;
        }
    }
}

- Pratik Patil

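Undefined behavior if the lowercase letters are not contiguous in the native character set, as is the case in EBCDIC. gets() is obsolete and a security hazard. - chqrlie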

0

You can use the following code.

#include <ctype.h>
#include <stdio.h>

int main(void)
{
    int i = 0,j=0,count[26]={0};
    char ch = 97;
    char string[100]="Hello how are you buddy ?";
    for (i = 0; i < 100; i++)
    {
        for(j=0;j<26;j++)
            {
            if (tolower(string[i]) == (ch+j))
                {
                    count[j]++;
                }
        }
    }
    for(j=0;j<26;j++)
        {

            printf("\n%c -> %d",97+j,count[j]);

    }

}

Hope this helps.

- CCoder

1

For the string The quick brown fox jumps over the lazy dog., it gives the following result:

a -> 0 b -> 1 c -> 1 d -> 0 e -> 1 f -> 1 g -> 0 h -> 1 i -> 1 j -> 0 k -> 1 l -> 0 m -> 0 n -> 1 o -> 2 p -> 0 q -> 1 r -> 1 s -> 0 t -> 1 u -> 1 v -> 0 w -> 1 x -> 1 y -> 0 z -> 0

but every letter should be 1. - user1786283

@user1786283, are you sure you didn't write "jumped"? - user529758

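Negative char values cause undefined behavior, the way lowercase letters are detected is inefficient, the ASCII codes are hard-coded, the indentation is wrong... - chqrlie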

0

#include<stdio.h>

void frequency_counter(char* str)
{
    int count[256] = {0};  //partial initialization
    int i;

    for(i=0;str[i];i++)
        count[str[i]]++;

    for(i=0;str[i];i++) {
        if(count[str[i]]) {
            printf("%c %d \n",str[i],count[str[i]]);
            count[str[i]]=0;
        }
    }
}

int main(void)
{
    char str[] = "The quick brown fox jumped over the lazy dog.";
    frequency_counter(str);
}

- kamalnayan242

What is frequency_counter? - eyllanesc

Feel free to write a short explanation of this code. - joce

char values must be cast as unsigned char when used as index values: count[(unsigned char)str[i]]++; - chqrlie
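
To illustrate the cast suggested in the comment above, here is a minimal sketch that is not taken from the thread. The name count_bytes is hypothetical, and sizing the table as UCHAR_MAX + 1 rather than 256 is a choice made here so that the sketch does not depend on 8-bit chars.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: tallies every character of s into table, which
   must have UCHAR_MAX + 1 entries. The cast keeps the index
   non-negative even on platforms where plain char is signed. */
void count_bytes(const char *s, size_t table[UCHAR_MAX + 1])
{
    for (size_t i = 0; i <= UCHAR_MAX; i++)
        table[i] = 0;
    for (; *s != '\0'; s++)
        table[(unsigned char)*s]++;
}
```

Without the cast, a character such as '\xE9' can have a negative value when char is signed, and indexing the table with it would read and write outside the array.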

0

for (int i = 0; i < word.length(); i++) {
    int counter = 0;
    for (int j = 0; j < word.length(); j++) {
        if (word.charAt(i) == word.charAt(j))
            counter++;
    } // inner for
    JOptionPane.showMessageDialog(null, word.charAt(i) + " found " + counter + " times");
} // outer for

- mero

0

#include<stdio.h>
#include<string.h>

#define filename "somefile.txt"

int main()
{
    FILE *fp;
    int count[26] = {0}, i, c;  
    char ch;
    char alpha[27] = "abcdefghijklmnopqrstuvwxyz";
    fp = fopen(filename,"r");
    if(fp == NULL)
        printf("file not found\n");
    while( (ch = fgetc(fp)) != EOF) {
        c = 0;
        while(alpha[c] != '\0') {

            if(alpha[c] == ch) {
                count[c]++; 
            }
            c++;
        }
    }
    for(i = 0; i<26;i++) {
        printf("character %c occurred %d number of times\n",alpha[i], count[i]);
    }
    return 0;
}

- Megharaj

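If the file cannot be opened, the code has undefined behavior because you do not exit the function. ch must be defined as int to detect EOF correctly. - chqrlie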
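
The two points raised in the comment above can be sketched as follows. This is an illustration rather than code from the thread: the name count_letters_in_stream and its return convention are hypothetical, and the range check assumes that the lowercase letters are contiguous, as in ASCII.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: counts letters read from fp, ignoring case.
   Returns 0 on success and -1 if fp is NULL, so the caller can bail
   out instead of reading from a stream that failed to open.
   Assumes 'a'..'z' are contiguous, as in ASCII. */
int count_letters_in_stream(FILE *fp, int counts[26])
{
    int ch;  /* int, not char, so EOF stays distinct from every byte */

    if (fp == NULL)
        return -1;
    for (int i = 0; i < 26; i++)
        counts[i] = 0;
    while ((ch = fgetc(fp)) != EOF) {
        /* fgetc() returns the byte as an unsigned char value, which
           is a valid argument for tolower(). */
        int lower = tolower(ch);
        if (lower >= 'a' && lower <= 'z')
            counts[lower - 'a']++;
    }
    return 0;
}
```

A caller would pass the result of fopen() directly and check the return value before printing the counts.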


- user529758 · Accepted Answer

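Suppose that char is 8 bits wide on your system and that all the characters you want to count are encoded as non-negative numbers. In that case, you can write: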

const char *str = "The quick brown fox jumped over the lazy dog.";

int counts[256] = { 0 };

int i;
size_t len = strlen(str);

for (i = 0; i < len; i++) {
    counts[(int)(str[i])]++;
}

for (i = 0; i < 256; i++) {
    if (counts[i] != 0) {
        printf("The %c. character has %d occurrences.\n", i, counts[i]);
    }
}

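Note that this counts all the characters in the string. If you are 100% sure that your string contains only letters (no digits, no whitespace, no punctuation), then 1. asking for "case insensitivity" starts to make sense, and 2. you can reduce the number of entries to the number of letters in the English alphabet (namely 26), and you can write something like this: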

#include <ctype.h>
#include <stdio.h>
#include <string.h>

const char *str = "TheQuickBrownFoxJumpedOverTheLazyDog";

int counts[26] = { 0 };

int i;
size_t len = strlen(str);

for (i = 0; i < len; i++) {
    // Skip non-letters so that we don't shoot ourselves in the foot
    unsigned char c = (unsigned char)str[i];
    if (!isalpha(c)) continue;

    counts[(int)(tolower(c) - 'a')]++;
}

for (i = 0; i < 26; i++) {
    printf("'%c' has %2d occurrences.\n", i + 'a', counts[i]);
}