How do I compress a string in C, replacing repeated characters with their counts?

4
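I have a large string, char myStr[] = "AAAABBBCCCCCCDDDEFGHHIJJ". I want to pass this string to my string-compression function, which should return the string in the following format: myStr = "A4B3C6D3EFGH2IJ2". In addition, the replacement must happen in the same string that was passed in. Creating a temporary array is not allowed.
Below is my function. I can't figure out how to remove the duplicates and replace them with their count in the same string.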
 #include<stdio.h>
 #include<string.h>


 char* StrCompress(char myStr[])
 {
char *s = myStr;
int len = strlen(myStr);
char *in = myStr;
int count =0;
int i=0;


while(*(s) != '\0')
{
    if(*(s)==*(s+1))
    {
        count++;

        if(count == 1)
        {
            in = s;
        }
        s++;

    }
    else
    {
        //myStr[count-1]=count;
        memcpy(in+1,s+1,count);
        s=in;
        count =0;

    }
    i++;
}

return myStr;



}

int main(){

char myStr[] ="AAAABBBCCCCCEEFGIIJJJKLMNNNNOOO";

printf("Compressed String is : %s\n",StrCompress(&myStr));

return 0;

}

2
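It looks like you are trying to implement a form of run-length encoding (RLE), but with the count+data tuples reversed. Is that correct? Also, can we assume that your string never contains digits, since they would completely break your algorithm? - WhozCraig
13 Answers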

4
A slightly modified version:
char* StrCompress(char myStr[])
{
  char *s, *in;
  for (s = myStr, in = myStr; *s; s++) {
    int count = 1;
    in[0] = s[0]; in++;
    while (s[0] == s[1]) {
      count++;
      s++;
    }   
    if (count > 1) {
      int len = sprintf(in, "%d", count);
      in += len;
    }   
  }
  in[0] = 0;
  return myStr;
}

Also, when calling the function with an array name, you should not use the address-of operator:
StrCompress(myStr); // not StrCompress(&myStr)
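As a small illustration of why the address-of operator is wrong here: `myStr` decays to a `char *` pointing at the first element, while `&myStr` is a pointer to the whole array, of type `char (*)[N]`. Both hold the same address, but only the first matches a `char[]` parameter. The helper names below (`first_char`, `same_address`) are made up for this sketch.

```c
/* Hypothetical helper: takes the same parameter type as StrCompress. */
char first_char(char s[])            /* identical to: char *s */
{
    return s[0];
}

/* Returns 1 when both pointer forms hold the same address. */
int same_address(void)
{
    char myStr[] = "AAB";
    char *decayed = myStr;                  /* pointer to the first element */
    char (*whole)[sizeof myStr] = &myStr;   /* pointer to the entire array  */

    /* Same address, different types: only `decayed` (or plain `myStr`)
       can be passed to first_char without a type mismatch. */
    return (void *)decayed == (void *)whole && first_char(myStr) == 'A';
}
```

Passing `&myStr` often appears to work because the address is identical, but compilers typically report an incompatible pointer type.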

If you can assume that a character never repeats more than 9 times, you can use in[0] = '0' + count instead of sprintf:

if (count > 1) {
  in[0] = '0' + count;
  in++;
}   

Sorry, I didn't notice... now I like your answer even more.. :) - Spooferman
1
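@faezshingeri Updated the multi-digit version to remove the buffer. - perreal
@perreal Is it possible to do the reverse in the same array, or should we use another array? For example: A4B3DFG2 becomes AAAABBBDFGG. - Spooferman
@faezshingeri, it is probably possible, but the array must have enough room to hold the expanded version. You can expand starting from the end of the string, but you first have to compute the size of the expanded version. - perreal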
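The right-to-left expansion suggested in this comment could look roughly like the sketch below. The function name `StrExpand` and its `cap` parameter (total buffer size in bytes) are assumptions made for illustration, not part of the original answer. It also assumes the data characters are never digits, and that every explicit count is at least 2 with no leading zero, which is what the compressor in this answer produces. Under those assumptions the write position never falls behind the read position, so unread input is not overwritten.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: expands e.g. "A4B3DFG2" to "AAAABBBDFGG" in place.
   Returns 0 on success, -1 if the input is malformed or buf is too small. */
int StrExpand(char *buf, size_t cap)
{
    size_t clen = strlen(buf);
    size_t elen = 0;
    size_t i = 0;

    /* Pass 1: compute the expanded length and validate the counts. */
    while (i < clen) {
        size_t count = 0;
        size_t start;

        i++;                                  /* the data character */
        start = i;
        while (i < clen && isdigit((unsigned char)buf[i])) {
            count = count * 10 + (size_t)(buf[i] - '0');
            i++;
        }
        if (i > start && (buf[start] == '0' || count < 2))
            return -1;                        /* would shrink; unsafe in place */
        elen += (i > start) ? count : 1;
    }
    if (elen + 1 > cap)
        return -1;                            /* no room for expansion + '\0' */

    /* Pass 2: fill from the end so unread input is never overwritten. */
    {
        size_t r = clen;                      /* read position (exclusive)  */
        size_t w = elen;                      /* write position (exclusive) */

        buf[w] = '\0';
        while (r > 0) {
            size_t count = 0, mult = 1;
            int has_digits = 0;
            char c;

            while (r > 1 && isdigit((unsigned char)buf[r - 1])) {
                count += mult * (size_t)(buf[r - 1] - '0');
                mult *= 10;
                has_digits = 1;
                r--;
            }
            c = buf[--r];
            if (!has_digits)
                count = 1;
            while (count-- > 0)
                buf[--w] = c;
        }
    }
    return 0;
}
```

The caller has to supply a buffer that is large enough, for example `char a[32] = "A4B3DFG2"; StrExpand(a, sizeof a);`. Very large counts could overflow `size_t`; the sketch does not guard against that.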
2
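As @Alun pointed out, this solution has a bug, and the version with the extra buffer is correct. Input containing a run of exactly two characters (e.g. "AABBBCCA") outputs only A2, because sprintf copies 2\0 into "in", so the pointer "s" sees the end of the string and the for loop exits immediately. - Laurence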
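One way to avoid the problem described in this comment is to read the whole run first, then format the count into a small local buffer and copy only the digits, so no '\0' lands on unread input. This is a minimal sketch, not the answer's original buffered version; the name `StrCompressFixed` is made up here. It relies on the fact that a run of n >= 2 characters never encodes to more than n characters, so the write pointer cannot overtake the read pointer.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical variant of StrCompress that handles runs of any length. */
char *StrCompressFixed(char myStr[])
{
    char *s = myStr;     /* read pointer  */
    char *in = myStr;    /* write pointer */

    while (*s != '\0') {
        char c = *s;
        size_t count = 0;

        /* Consume the whole run before writing anything. */
        while (*s == c) {
            count++;
            s++;
        }

        *in++ = c;
        if (count > 1) {
            char num[32];    /* small scratch buffer for the digits only */
            int len = snprintf(num, sizeof num, "%zu", count);

            memcpy(in, num, (size_t)len);   /* copy digits, not the '\0' */
            in += len;
        }
    }
    *in = '\0';
    return myStr;
}
```

With this version, "AABBBCCA" becomes "A2B3C2A". The scratch buffer holds only the digits of one count, not a copy of the string.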

2
#include<stdio.h>

char* StrCompress(char myStr[])
{
    char *s = myStr;
    char *r, *p;
    int count, i;

    while (*s)
    {
        /*initially only 1 character of a kind is present*/
        count = 1;

        /*we check whether current character matches the next one*/
        while (*s && *s == *(s+1))
        {
            /*if yes,then increase the count due to the match 
            and increment the string pointer to next */
            count++;
            s++;
        }

        if (count > 1) /*if more than one character of a kind is present*/
        {
            /*assign the value of count to the second occurrence of a particular character (single digit only, so count must not exceed 9)*/
            *(s - count + 2) = count + '0';

            /*delete all other occurrences except the first and second one by shifting the rest of the array left*/
            for (i = 0; i < count - 2; i++)
            {
                p = s + 1;
                r = s;

                while (*r)
                    *r++ = *p++;

                s--;
            }
        }
        s++;
    }

    return myStr;
}

int main()
{
    char myStr[] = "AAAABBBCCCCCCDDDEFGHHIJJ";

    printf("Compressed String is : %s\n", StrCompress(myStr));

    return 0;
}

1
import java.util.Scanner;

public class Compress {
    public static void main(String... args) {
        Scanner sc = new Scanner(System.in);
        System.out.println("Enter the String:");
        String str = sc.next();
        int count = 1;
        for (int i = 0; i < str.length(); i++) {
            // A run ends at the last character or where the next character differs
            if (i + 1 < str.length() && str.charAt(i) == str.charAt(i + 1)) {
                count++;
            } else {
                System.out.print(str.charAt(i));
                if (count > 1) {
                    System.out.print(count);
                }
                count = 1;
            }
        }
    }
}

Please provide more details. - Farbod Ahmadian

0
public class StringCompression {
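    public static String compress(String str) {
        if (str.isEmpty()) {
            return str;
        }
        StringBuilder result = new StringBuilder();
        int i;
        // Start at 1: the first character of a run already counts once.
        int count = 1;
        for (i = 0; i < str.length() - 1; i++, count++) {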
            if (str.charAt(i) != str.charAt(i + 1)) {
                result.append(str.charAt(i)).append(count);
                count = 0;
            }
        }

        result.append(str.charAt(i)).append(count);
        return result.toString();
    }

    public static void main(String[] args) {
        String string = "aaassssdddaaaggghhhfgreeeeeeedrrrrr";
        String x= compress(string);
        System.err.println(x);
    }
}

2
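Avoid code-only answers. Add some information on how the code answers the question, why the problem exists in the first place, or other relevant details. - Fabulous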

0
Here is another implementation, for anyone who needs it. FYI, this technique is called run-length encoding.
#include <iostream>
#include <string>

void CompressString (std::string str)
{
    //count will keep track of the number of occurrences of the current character
    unsigned int count = 1;

    //new string to store the values from the original string
    std::string str2 = "";

    //store the first letter of the string initially
    char ch = str[0];

    //run a loop from the second character of the string, since the first character is stored in "ch"
    for (unsigned int i = 1; i < str.length(); i++)
    {
        if (str[i] == ch)
            count++;
        else
        {
            str2 = str2 + ch + std::to_string (count);
            ch = str[i];
            count = 1;
        }
    }

    //for cases like aabbb
    str2 = str2 + ch + std::to_string (count);

    //check if after compression, the length of the string reduces or not
    if (str.length() > str2.length())
        std::cout << str2 << std::endl;
    else
        std::cout << str << std::endl;
}

int main ()
{
    std::cout << "Enter a string to compress: ";
    std::string str;
    getline (std::cin, str);

    std::cout << "Compressed string is: ";
    CompressString (str);
    return 0;
}

0

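Here is another Java program that rewrites the string as it goes. Java strings are immutable, so each step actually builds a new String; a StringBuilder could be used instead of String.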

public static void main(String[] args) {

    String a = "aaabbccaaaddj";


        for(int i=0;i<a.length();i++){
            int c=i+1;
            int duplicateCharCount=1;
            while(c<a.length()&&a.charAt(c)==a.charAt(i)){
                ++c;
                ++duplicateCharCount;
            }

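                String countText=String.valueOf(duplicateCharCount);
                a=a.substring(0,i+1)+countText+a.substring(i+duplicateCharCount);
                i+=countText.length(); // skip every digit just inserted, not only the first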


        }
        System.out.println(a);
    }

0
import java.util.Scanner;

public class RunLength {
    public static void main(String[] args) {
        System.out.print("enter the string");
        String s = new Scanner(System.in).nextLine();
        StringBuilder s2 = new StringBuilder();

        for (int i = 0; i < s.length(); i++) {
            int count = 1;
            s2.append(s.charAt(i));
            // Consume the rest of the run that starts at i
            while (i + 1 < s.length() && s.charAt(i + 1) == s.charAt(i)) {
                count++;
                i++;
            }
            s2.append(count);
        }

        System.out.print(s2);
    }
}

0
public static String compress(String str) {
    StringBuilder result = new StringBuilder();
    int i = 0;
    int count = 0;
    while(i < str.length() - 1) {
        count++;
        if (str.charAt(i) != str.charAt(i + 1)) {
            result.append(str.charAt(i)).append(count);
            count = 0;
        }
        i++;
    }
    result.append(str.charAt(i)).append(count + 1);
    return result.toString();
}

0

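I made two assumptions when writing this code:

  1. The available space is twice the size of the string. For example, if we are encoding "ab", at least 4 bytes must be allocated (plus one for the terminating '\0').

  2. A run of identical letters is at most 999 long. If 1000 identical characters may appear next to each other, the size of the "count_str" character array must be increased accordingly.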

    #include <stdio.h>
    #include <string.h>

    char *compress(char *input)
    {
        size_t i = 0;        /* start of the current run */
        char count_str[4];   /* up to 3 digits plus '\0' (runs of at most 999) */

        while (input[i] != '\0') {
            size_t len = strlen(input);
            size_t j = i + 1;
            int count = 1;
            int digits;

            /* advance j to the first character after the run starting at i */
            while (input[j] != '\0' && input[j] == input[i]) {
                count++;
                j++;
            }

            digits = snprintf(count_str, sizeof count_str, "%d", count);

            /* We assume there is enough space at the end to move the string.
               memmove shifts the rest of the string, including its '\0', so
               that it starts right after the slot reserved for the count.
               "aaab": 'b' moves one step back, giving "aab"; writing the
               count then gives "a3b".
               "ab": 'b' moves one step forward, giving "abb"; writing the
               count then gives "a1b". */
            memmove(input + i + 1 + digits, input + j, len - j + 1);
            memcpy(input + i + 1, count_str, (size_t)digits);

            i += 1 + (size_t)digits;   /* continue after the count just written */
        }

        return input;
    }

    int main(void)
    {
        char arr[50] = "aaab";
        printf("%s\n", compress(arr));   /* prints "a3b1" */
        return 0;
    }
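The first assumption (twice the string size) is a safe upper bound, because a single character grows to at most two characters. For a left-to-right in-place encoder that writes a count after every character, the buffer must hold the longest intermediate string, which can be longer than the final result when short runs come before long ones. The helper below, whose name `compress_peak_size` is invented for this sketch, computes that peak so the capacity can be checked before encoding.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bytes (including '\0') that a left-to-right in-place
   encoder needs at its worst moment, when a count follows every character. */
size_t compress_peak_size(const char *input)
{
    size_t remaining = strlen(input);   /* raw characters not yet encoded */
    size_t written = 0;                 /* encoded characters so far      */
    size_t peak = remaining;
    size_t i = 0;

    while (input[i] != '\0') {
        size_t run = 1;
        size_t n;

        while (input[i + run] == input[i])
            run++;

        written += 1;                        /* the character itself  */
        for (n = run; n > 0; n /= 10)
            written++;                       /* one byte per digit    */
        remaining -= run;
        i += run;

        if (written + remaining > peak)
            peak = written + remaining;
    }
    return peak + 1;                         /* room for the '\0' */
}
```

For "ab" this returns 5 ("a1b1" plus the terminator), which is why a buffer of exactly 4 bytes would be one byte short.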

0
void stringCompression(char a[]) {
    int i, count=1,j=0;
    for(i=0;a[i]!='\0';i++){
        if(a[i]==a[i+1]){
            count++;
        }
        else if(a[i]!=a[i+1]){
            if(count>1){
                a[j++]=a[i];
                a[j++]=(char)('0'+count); /* single digit only: valid for count <= 9 */
            }
            else if(count==1){
                a[j++]=a[i];
            }
            count=1;
        }
    }
    a[j]='\0';
}

1
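In general, answers are much more helpful when they explain what the code is intended to do and why it solves the problem without introducing others. - DCCoder
@DCCoder is right. This matters even more when answering a seven-year-old question that already has twelve answers, including an accepted one. What makes your approach different from the other answers? What new information does it provide? Why should a reader try this approach rather than the others? - Jeremy Caney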
