C语言中递归练习

Question

C语言中递归练习

6

如何解决以下涉及递归的问题？

实现一个带有原型的函数 char *repeat(char *s, int n)，使其创建并返回一个字符串，该字符串由输入字符串 s 的 n 次重复组成。例如：如果输入是“Hello”和 3，则输出为“HelloHelloHello”。只使用递归结构。

我的解决方案对我来说似乎非常丑陋，我正在寻找更简洁的解决方案。以下是我的代码：

char *repeat(char *s, int n) {
  if(n==0) {
     char *ris = malloc(1);
     ris[0] = '\0';
     return ris;
  }
  int a = strlen(s);
  char *ris = malloc(n*a+1);
  char *ris_pre = repeat(s,n-1);
  strcpy(ris,ris_pre);
  strcpy(ris+(n-1)*a,s);
  free(ris_pre);
  return ris;
}

- random

5个回答

3

尝试使用这种方法，只分配一次字符串：

char *repeat(char *s, int n) {
   int srcLength = strlen(s);
   int destLength = srcLength * n + 1;      
   char *result = malloc(destLength);
   result[0] = '\0'; // This is for strcat calls to work properly

   return repeatInternal(s, result, n);
}

char *repeatInternal(char *s, char *result, int n) {
  if(n==0) {
     return result;
  }

  strcat(s, result);  
  return repeat(result, s, n-1);
}

第二个repeat方法应该只被第一个方法使用。(第一个方法是原型方法)

注意: 我没有编译/测试过它，但这应该可以工作。

- giorashc

难道它不是说“输入字符串s的n 重复次数”吗？因此，0次重复将是空字符串。 - Asherah

你应该分配 srcLength * n + 1 字节。你当前的计算不会为空终止符留出空间。 - interjay

@interjay 你是对的，这也会解决在开头烦人的 if。 - giorashc

C语言没有函数重载，你需要给第二个函数改个名字。顺便说一下，你可以把第一个函数中的if语句去掉，简化一下代码。 - interjay

谢谢，我也更新了。（很久没写 C 代码了 :)) - giorashc

显示剩余2条评论

2

这是第一条：

char *repeat (char *str, int n)
{
  char *ret_str, *new_str;

  if (n == 0)
  {
    ret_str = strdup ("");
    return ret_str;
  }
  ret_str = repeat (str, n-1);
  new_str = malloc (sizeof (char) * strlen (str) * (n + 1));
  new_str[0] = '\0';
  strcpy (new_str, ret_str);
  strcat (new_str, str);
  free (ret_str);
  return new_str;
}

使用realloc()可以得到更整洁的代码。

char *repeat (char *str, int n)
{
  char *ret_str;

  if (n == 0)
  {
    ret_str = strdup ("");
    return ret_str;
  }
  ret_str = repeat (str, n-1);
  ret_str = realloc (ret_str, sizeof (char) * strlen (str) * (n + 1));
  strcat (ret_str, str);
  return ret_str;
}

编辑1

好的，这个更加精简。

char *repeat (char *str, int n)
{
  static char *ret_str;
  static int n_top = -1;

  if (n >= n_top)
    ret_str = calloc (sizeof (char), strlen (str) * n + 1);
  if (n <= 0)
    return ret_str;

  n_top = n;

  return strcat (repeat (str, n-1), str);
}

我们使用静态缓冲区来保存最终字符串，因此在所有递归级别中都使用单个缓冲区。 static int n_top 保存了递归调用的上一个 n 值。初始化为 -1，以处理当使用 n = 0 调用时，返回一个空字符串的情况（并使用 calloc 进行初始化为 0）。在第一次递归调用时，值为 -1，因此仅在顶层时 n > n_top 为真（因为 n 总是在减少），在这种情况下整个缓冲区被分配给 ret_str。否则，我们找到底部条件，即当 n 变为 0 时。此时，当 n = 0 时，我们将预分配的静态缓冲区 ret_str 的地址返回给递归树中的父调用者。然后每个递归级别都使用附加的 str 并移交给上一个级别使用该单个缓冲区，直到它到达 main。

编辑 2：

更紧凑的版本，但不太好看。

char *repeat (char *str, int n)
{
  static int n_top;
  n_top = (n_top == 0)? n: n_top;
  return (n <= 0)?(n=n_top,n_top=0,calloc (sizeof (char), strlen (str) * n + 1)):strcat (repeat (str, n-1), str);
}

上一个紧凑的代码存在问题，如果您使用repeat (str, n); repeat (str, 0);调用，则会出现问题。这种实现克服了这个问题，并且它甚至更加紧凑，也仅使用一个函数。

请注意，这里有一个丑陋的(n=n_top,n_top=0,calloc(sizeof(char),strlen(str)*n+1))。在此，我们确保在向上回滚时使用n_top的值来分配内存，然后将n_top重置为0，以便该函数在下一次从main()或其他主调用程序（非递归）中调用时具有n_top设置为0。这可以用更易读的方式完成，但这看起来很酷。我建议坚持使用更易读的方法。 编辑3 疯狂版本 这克服了重复的strlen()调用。只调用一次strlen()，然后使用当前深度中字符串长度的值以及n的值来查找offset值，该值指示正在返回的最终字符串的结尾（其地址未存储在任何中间变量中，仅返回并传递）。将字符串传递给memcpy时，我们添加偏移量，并通过将offset添加到立即下一个深度返回的答案字符串中来将源内存位置提供给memcpy。这实际上为memcpy提供了紧接着字符串结尾之后的位置，之后memcpy复制长度为str_len的str。请注意，memcpy将返回它传递的目标地址，即此深度的答案字符串结束地址，但我们需要实际的起始地址，这是通过从返回值回退offset来实现的，因此在返回之前减去offset。

请注意，这仍然使用单个函数:D

char *repeat (char *str, int n)
{
  static int n_top, str_len;
  int offset = 0;

  (n_top == 0)?(n_top = n,str_len = strlen (str)):(offset = str_len * (n_top-n));
  return (n <= 0)?(n=n_top,n_top=0,malloc (str_len * n + 1)):(memcpy (repeat (str, n-1) + offset, str, str_len) - offset);
}

一些注意事项：

我们可能已经执行了offset = str_len * (n-1)，在这种情况下，在第一层中，str将被复制到偏移量为0的位置，从后续递归深度开始，它会将字符串从反向复制到答案字符串中。
当执行memcpy时，我们告诉它要复制n个字节，这不包括\0。但是，由于我们使用calloc为最终目的地内存分配了终止字符`'\0'的空间，因此它被初始化为0。因此，最终字符串将以'\0'结尾。
char的sizeof始终为1
为了使其看起来更紧凑和神秘，可以删除offset计算，并在最后一个return表达式中直接计算偏移量。
请勿在现实生活中使用此代码。

- phoxis

1

具有静态缓冲区的版本仅适用于一次：您不能执行例如 repeat("Hello") 然后执行 repeat("Cruel World") - 它会崩溃！ - anatolyg

对于紧凑版本的建议：将ret_str初始化为NULL，并在if (n > n_top)情况下，在重新分配空间之前，如果不为空，则释放它，或者realloc并memset为0。 - slashmais

@slashmais：好建议，但我避免这样做是为了使它模块化，因为释放字符串的责任在于调用者，就像其他字符串库函数一样。如果调用者不释放，那么我们就会有一个泄漏。该函数不应该关心它是否在之前或之后被调用。 - phoxis

是的，尽管我觉得更“正确”的repeat()函数原型应该包括第三个参数，即由调用者分配足够大小的字符串（'strlen(s) * n +1'）。 - slashmais

1

@phoxis，我确实注意到你是唯一一个使用单个函数完成它的人。这值得认可。 - aps2012

显示剩余8条评论

0

这里有一个需要更多代码，但是时间复杂度为 O(log n) 而不是 O(n) 的解决方案：

// Return a string containing 'n' copies of 's'
char *repeat(int n, char *s) {
  return concat((n-1) * strlen(s), strdup(s));
}

// Append 'charsToAdd' characters from 's' to 's', charsToAdd >= 0
char *concat(int charsToAdd, char *s) {
  int oldLen = strlen(s);
  if (charsToAdd <= n) {  // Copy only part of the original string.
    char *longerString = malloc((oldLen + charsToAdd + 1) * sizeof(char));
    strcpy(longerString, s);
    strncat(longerString, s, charsToAdd);
    return longerString;
  } else { // Duplicate s and recurse.
    char *longerString = malloc((2 * oldLen + 1) * sizeof(char));
    strcpy(longerString, s);
    strcat(longerString, s);
    free(s);  // Free the old string; the recusion will allocate a new one.
    return concat(charsToAdd - oldLen, longerString);
  }
}

- Adam Liss

它需要在堆栈上额外的O(log n)空间，而不是时间。 - anatolyg

@anatolyg - 不确定你的意思。每次调用concat()都会使字符串长度加倍，因此例如重复计数为64只需要6次调用，而所有其他解决方案都需要63或64次。因此，这个解决方案既节省时间又节省堆栈空间。 - Adam Liss

@AdamLiss，实际上，你的算法运行时间复杂度为O(n)。你的执行时间需要O(log n)次递归，这点是没错的，但是在这种情况下，n 指的是结果的长度（即结果中字节的数量）。你的实现仍然为(n * strlen(s))字节分配内存并复制(n * strlen(s))字节以完成操作。它不会通过复制和仅分配(log(n) * strlen(s))字节来创建一个(n * strlen(s))字节的结果，因此不能称之为O(log n)的运行时间。 - aps2012

@aps2012 感谢澄清。我猜程序的运行时间取决于它在strcat()和strcpy()中花费的大部分时间（在这种情况下，执行时间将为O(n)），或者在malloc()中花费的时间以及malloc()的执行时间与正在分配的字节数之间的关系。我倾向于认为它会花费大量时间来复制数据，这确实会使其在O(n)中执行。 - Adam Liss

@AdamLiss，你的分析是正确的，因为它在strcat()和strlen()中花费了大部分时间。然而，我意识到这实际上使得运行时间为O(n^2)，而不是O(n)。原因是strcat()是在结果上执行的（每次迭代都在增长），而不是在原始字符串上执行。我曾经使用过strcat()，但现在我已经将其优化掉，以获得真正的O(n)。看一下吧。 - aps2012

0

一个可能的解决方案：

#include <stdio.h>
#include <stdlib.h>
#include <string.h>


char *repeat(char *s, int n)
{
    static char *sret=NULL;
    static int isnew=1;

    if (!s || !s[0])
    {
        if (sret) { free(sret); sret=NULL; }
        return "";
    }

    if (n<=0) return "";

    if (isnew)
    {
        int nbuf = strlen(s)*n + 1;
        sret = (char*)realloc(sret, nbuf);
        memset(sret, 0, nbuf);
        isnew=0;
    }

    strcat(sret,s);
    repeat(s, n-1);
    isnew = 1;
    return sret;
}

int main()
{
    char *s = repeat("Hello",50);
    printf("%s\n", s);

    s = repeat("Bye",50);
    printf("%s\n", s);

    repeat(NULL,0); /* this free's the static buffer in repeat() */

    s = repeat("so long and farewell",50);
    printf("%s\n", s);

    return 0;
}

[编辑]
aps2012的解决方案的变体，使用单个函数，但带有静态整数：

char *repeat(char *s, int n)
{
    static int t=0;
    return (n > 0) 
        ? (t += strlen(s),strcat(repeat(s, n - 1), s)) 
        : strcpy(malloc(t + 1), "");
}

调用者必须使用free()释放返回的字符串，以避免内存泄漏。

- slashmais

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- aps2012 · Accepted Answer

一个更整洁、更优雅的解决方案（我称其为“基本解决方案”）如下所示：

基本解决方案

char *internalRepeat(char *s, int n, size_t total)
{
    return (n > 0)
        ? strcat(internalRepeat(s, n - 1, total + strlen(s)), s)
        : strcpy(malloc(total + 1), "");
}

char *repeat(char *s, int n)
{
    return internalRepeat(s, n, 0);
}

这就是递归的美妙之处。解决方案的关键在于使用递归逐步构建结果的长度。参数total完成此操作（不包括NUL终止符）。当递归终止时，结果缓冲区被一次性分配（包括NUL终止符），然后我们使用递归展开将每个s的副本附加到结果中。基本解决方案的行为如下：

对于空字符串的任意重复次数，返回零长度字符串。
对于非空字符串的零或负迭代，返回零长度字符串。
对于非零正数重复次数的非空字符串，返回非零长度字符串。

如果您根据上述函数创建一个程序，则以下语句：

printf("Repeat \"\" 0 times: [%s]\n", repeat("", 0));
printf("Repeat \"\" 3 times: [%s]\n", repeat("", 3));
printf("Repeat \"abcde\" 0 times: [%s]\n", repeat("abcde", 0));
printf("Repeat \"abcde\" 1 times: [%s]\n", repeat("abcde", 1));
printf("Repeat \"abcde\" 4 times: [%s]\n", repeat("abcde", 4));

将会产生以下输出：

Repeat "" 0 times: []
Repeat "" 3 times: []
Repeat "abcde" 0 times: []
Repeat "abcde" 1 times: [abcde]
Repeat "abcde" 4 times: [abcdeabcdeabcdeabcde]

编辑: 优化解决方案如下。如果您对优化技术感兴趣，请继续阅读。

这里所有的其他提议主要都是以O(n^2)为主，并在每次迭代时分配内存。尽管基本解决方案非常优雅，仅使用单个malloc()，并且只需要两个语句，但出人意料的是，基本解决方案的运行时间也是O(n^2)。如果字符串s很长，这将使它非常低效，并且意味着基本解决方案不比这里的任何其他提议更有效。

优化解决方案

以下是一个实际上以O(n)运行的最优解决方案：

char *internalRepeat(char *s, int n, size_t total, size_t len)
{
    return (n > 0)
        ? strcpy(internalRepeat(s, n - 1, total, len), s) + len
        : strcpy(malloc(total + 1), "");
}

char *repeat(char *s, int n)
{
    int len = strlen(s);

    return internalRepeat(s, n, n * len, len) - (n * len);
}

正如您所见，现在它有三个语句，并使用一个名为len的参数来缓存s的长度。它递归地使用len来计算结果缓冲区中第n个s副本的位置，从而使我们能够将每次添加s到结果的strcat()替换为strcpy()。这给出了一个实际运行时间为O(n)，而不是O(n^2)。

基本和优化方案之间有什么区别呢？

所有其他解决方案都至少使用了n次strcat()在字符串s上，以将n个s的副本附加到结果。这就是问题所在，因为strcat()的实现隐藏了一种低效性。在内部，strcat()可以被认为是：

strcat = strlen + strcpy

即，在追加时，您必须先找到要追加的字符串的末尾然后才能进行追加本身。这种隐藏开销意味着，实际上创建字符串的n个副本需要n个长度检查和n个物理复制操作。然而，真正的问题在于对于每个s的副本，我们追加的结果变得更长了。这意味着在strcat()中每个连续的长度检查都在结果上变得更长了。如果我们现在将这两种解决方案使用“我们必须扫描或复制s的次数”作为比较基础，我们可以看到这两种解决方案的差异所在。

对于字符串s的n个副本，基本解决方案的执行情况如下：

strlen's/iteration: 2
strcpy's/iteration: 1

Iteration | Init | 1 | 2 | 3 | 4 | ... | n |   Total    |
----------+------+---+---+---+---+-----+---+------------+
Scan "s"  |   0  | 1 | 2 | 3 | 4 | ... | n | (n+1)(n/2) |
Copy "s"  |   0  | 1 | 1 | 1 | 1 | ... | 1 |     n      |

而优化后的解决方案则像这样表现：

strlen's/iteration: 0
strcpy's/iteration: 1

Iteration | Init | 1 | 2 | 3 | 4 | ... | n |    Total   |
----------+------+---+---+---+---+-----+---+------------+
Scan "s"  |   1  | 0 | 0 | 0 | 0 | ... | 0 |      1     |
Copy "s"  |   0  | 1 | 1 | 1 | 1 | ... | 1 |      n     |

如表所示，由于 strcat() 中内置的长度检查，基本解决方案对我们的字符串执行 (n^2 + n)/2 次扫描，而优化解决方案始终执行 (n + 1) 次扫描。这就是为什么基本解决方案（以及所有依赖于 strcat() 的其他解决方案）的执行时间为 O(n^2)，而优化解决方案的执行时间为 O(n)。

在实际情况下，O(n) 和 O(n^2) 的运行时间差异非常大，特别是当使用大型字符串时。例如，让我们取一个 1MB 的字符串 s 并创建它的 1,000 个副本（== 1GB）。如果我们有一颗每个时钟周期可以扫描或复制 1 字节的 1GHz CPU，那么将生成 1,000 个 s 副本如下：

注意：上面的性能表中取出的 n 表示对 s 进行单次扫描。

Basic:  (n + 1) * (n / 2) + n = (n ^ 2) / 2 + (3n / 2)
                              = (10^3 ^ 2) / 2 + (3 * 10^3) / 2
                              = (5 * 10^5) + (1.5 * 10^2)
                              = ~(5 * 10^5) (scans of "s")
                              = ~(5 * 10^5 * 10^6) (bytes scanned/copied)
                              = ~500 seconds (@1GHz, 8 mins 20 secs).

Optimised: (n + 1)            = 10^3 + 1
                              = ~10^3 (scans of "s")
                              = ~10^3 * 10^6 (bytes scanned/copied)
                              = 1 second (@1Ghz)

如您所见，优化后的解决方案几乎即时完成，而基本解决方案需要近10分钟才能完成。然而，如果您认为将字符串s缩小会有帮助，那么下一个结果可能会让您感到震惊。同样，在一个处理器每个时钟周期处理1字节的1GHz计算机上，我们将s设置为1KB（比原来小1000倍），并复制1,000,000次（总共==1GB，与之前相同）。这给出：

Basic:  (n + 1) * (n / 2) + n = (n ^ 2) / 2 + (3n / 2)
                              = (10^6 ^ 2) / 2 + (3 * 10^6) / 2
                              = (5 * 10^11) + (1.5 * 10^5)
                              = ~(5 * 10^11) (scans of "s")
                              = ~(5 * 10^11 * 10^3) (bytes scanned/copied)
                              = ~50,000 seconds (@1GHz, 833 mins)
                              = 13hrs, 53mins, 20 secs

Optimised: (n + 1)            = 10^6 + 1
                              = ~10^6 (scans of "s")
                              = ~10^6 * 10^3 (bytes scanned/copied)
                              = 1 second (@1Ghz)

这是一个非常惊人的差异。优化后的解决方案在写入的数据总量相同时与之前的表现相同。然而，基本解决方案需要停顿超过半天才能构建结果。这就是O(n)和O(n^2)运行时间之间的差异。