

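Consider writing an implementation of some not-so-obvious algorithm in C. As an example, take recursive quicksort, which I found in K. N. King's book "C Programming: A Modern Approach, 2nd Edition" (it is available from here). The most interesting part consists of the following two definitions: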

void quicksort(int a[], int low, int high)
{
    int middle;

    if (low >= high)
        return;    /* zero or one element: nothing to sort */

    middle = split(a, low, high);
    quicksort(a, low, middle - 1);
    quicksort(a, middle + 1, high);
}

int split(int a[], int low, int high)
{
    int part_element = a[low];

    for (;;) {
        /* scan from the right for an element smaller than the pivot */
        while (low < high && part_element <= a[high])
            high--;
        if (low >= high)
            break;
        a[low++] = a[high];

        /* scan from the left for an element larger than the pivot */
        while (low < high && a[low] <= part_element)
            low++;
        if (low >= high)
            break;
        a[high--] = a[low];
    }

    a[high] = part_element;
    return high;
}

Both while loops can be optimized by removing the low < high tests:

for (;;) {
    while (part_element < a[high])
        high--;
    if (low >= high)
        break;
    a[low++] = a[high];
    a[high] = part_element;

    while (a[low] <= part_element)
        low++;
    if (low >= high)
        break;
    a[high--] = a[low];
    a[low] = part_element;
}


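Ways to check that the array accesses are still valid include:

  • debugging manually with gdb on some concrete data
  • passing the source code to static analysis tools such as splint or cppcheck
  • running valgrind with the --tool=exp-sgcheck switch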

For example, take a five-element array {8, 1, 2, 3, 4}:

#define N 5

int main(void)
{
    int a[N] = {8, 1, 2, 3, 4}, i;

    quicksort(a, 0, N - 1);

    printf("After sort:");
    for (i = 0; i < N; i++)
        printf(" %d", a[i]);
    putchar('\n');

    return 0;
}


After sort: 1 1 2 4 8

1. GDB

(gdb) p low
$1 = 3
(gdb) p high
$2 = 4
(gdb) p a[low]
$3 = 1
(gdb) p part_element
$4 = 8
(gdb) s
47              low++;
(gdb) s
46          while (a[low] <= part_element)
(gdb) s
47              low++;
(gdb) s
46          while (a[low] <= part_element)
(gdb) p low
$5 = 5
(gdb) p high
$6 = 4
(gdb) bt full
#0  split (a=0x7fffffffe140, low=5, high=4) at qsort.c:46
        part_element = 8
#1  0x00000000004005df in quicksort (a=0x7fffffffe140, low=0, high=4) at qsort.c:30
        middle = <value optimized out>
#2  0x0000000000400656 in main () at qsort.c:14
        a = {4, 1, 2, 1, 8}
        i = <value optimized out>


As the session shows, low has moved past the end of the array (valid indices are 0 to 4):

(gdb) p low
$5 = 5

2. Static analysis tools

$ splint -retvalint -exportlocal qsort.c 
Splint 3.1.2 --- 07 Feb 2011

Finished checking --- no warnings

$ cppcheck qsort.c 
Checking qsort.c...

3. Valgrind with --tool=exp-sgcheck

$ valgrind --tool=exp-sgcheck ./a.out 
==5480== exp-sgcheck, a stack and global array overrun detector
==5480== NOTE: This is an Experimental-Class Valgrind Tool
==5480== Copyright (C) 2003-2012, and GNU GPL'd, by OpenWorks Ltd et al.
==5480== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==5480== Command: ./a.out
==5480== Invalid read of size 4
==5480==    at 0x4005A0: split (qsort.c:46)
==5480==    by 0x4005DE: quicksort (qsort.c:30)
==5480==    by 0x400655: main (qsort.c:14)
==5480==  Address 0x7ff000114 expected vs actual:
==5480==  Expected: stack array "a" of size 20 in frame 2 back from here
==5480==  Actual:   unknown
==5480==  Actual:   is 0 after Expected
After sort: 1 1 2 4 8
==5480== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)

The location at 0x4005A0: split (qsort.c:46) matches the place I found manually with gdb.

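I usually rely on valgrind to debug memory problems. - Pavel
Electric Fence (-lefence) can also help with this. - Will
Valgrind is very helpful with dynamic memory (I'm almost sure it is guaranteed to find every illegal memory operation), but with the stack it is less clear-cut. Still, I think it is the best tool available, even though it does not always find the problem. gcc's -fstack-protector-all sometimes helps too, at very low cost. - keltar
Thanks, keltar. I found in the Valgrind documentation that for stack arrays I need to use exp-sgcheck, which is still experimental. I know that C does not provide array bounds checking and that the sizeof information is lost once an array is passed to a function, so tracking it is not easy. Apart from that, I think static code analysis tools could also be useful here. - Grzegorz Szpetkowski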

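How to make sure that every access or write to a stack-allocated array is valid (i.e. does not provoke undefined behavior)?

What about using clang on Linux with the options -fsanitize=address and -fsanitize=undefined? It is also available in gcc: http://gcc.gnu.org/gcc-4.8/changes.html

clang with the option -fsanitize=undefined

This is an example: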
#include <stdlib.h>

#define N 5

int main(int argc, char *argv[])
{
  int a[N] = {8, 1, 2, 3, 4}, i;

  int r = 0;
  int end = atoi(argv[1]);
  for (int i = 0; i != end; ++i)
    r += a[i];    /* out of bounds once i reaches N */

  return r;
}


clang -fno-omit-frame-pointer -fsanitize=undefined -g out_boundary.c -o out_boundary_clang

$ ./out_boundary_clang 5
$ ./out_boundary_clang 6
out_boundary.c:12:10: runtime error: index 5 out of bounds for type 'int [5]'
Illegal instruction (core dumped)


Program terminated with signal 4, Illegal instruction.
#0  main (argc=2, argv=0x7fff3a1c28c8) at out_boundary.c:12
12          r += a[i];
(gdb) p i
$1 = 5

clang with the option -fsanitize=address


The tool can detect the following types of bugs:

* Out-of-bounds accesses to heap, stack and globals
* Use-after-free
* Use-after-return (to some extent)
* Double-free, invalid free
* Memory leaks (experimental)

clang -fno-omit-frame-pointer -fsanitize=address -g out_boundary.c -o out_boundary_clang


$ ./out_boundary_clang 6 2>&1 | asan_symbolize.py
==9634==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7fff91bb2ad4 at pc 0x459c67 bp 0x7fff91bb2910 sp 0x7fff91bb2908
READ of size 4 at 0x7fff91bb2ad4 thread T0
    #0 0x459c66 in main out_boundary.c:12
    #1 0x3a1d81ed1c in __libc_start_main ??:0
    #2 0x4594ac in _start ??:0
Address 0x7fff91bb2ad4 is located in stack of thread T0 at offset 244 in frame
    #0 0x45957f in main out_boundary.c:6
  This frame has 8 object(s):
    [32, 36) ''
    [96, 100) ''
    [160, 168) ''
    [224, 244) 'a'
    [288, 292) 'i'
    [352, 356) 'r'
    [416, 420) 'end'
    [480, 484) 'i1'
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
      (longjmp and C++ exceptions *are* supported)
Shadow bytes around the buggy address:
  0x10007236e500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e510: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e520: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e530: 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1
  0x10007236e540: 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2
=>0x10007236e550: 00 f4 f4 f4 f2 f2 f2 f2 00 00[04]f4 f2 f2 f2 f2
  0x10007236e560: 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f2 f2 f2 f2
  0x10007236e570: 04 f4 f4 f4 f2 f2 f2 f2 04 f4 f4 f4 f3 f3 f3 f3
  0x10007236e580: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e590: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x10007236e5a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:     fa
  Heap right redzone:    fb
  Freed heap region:     fd
  Stack left redzone:    f1
  Stack mid redzone:     f2
  Stack right redzone:   f3
  Stack partial redzone: f4
  Stack after return:    f5
  Stack use after scope: f8
  Global redzone:        f9
  Global init order:     f6
  Poisoned by user:      f7
  ASan internal:         fe


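@skwllsp: At the time of writing, Cygwin ships gcc 5.2.0, but without -fsanitize=address and -fsanitize=undefined enabled. Since I don't know where to download and how to compile libasan and libubsan, I cannot enable them. - user2284570
Note: with gcc you need to link the libasan library, e.g.: gcc -fsanitize=address myprog.c -o myprog -lasan - mondaugen
To add to @mondaugen's comment, the same goes for g++, e.g.: g++ -fsanitize=address myprog.c -o myprog -lasan. It is also compatible with gdb. - VectorVortec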
