考虑使用C语言为一些不太明显的算法编写实现。例如,让我们考虑递归快速排序,我在K.N. King的《C程序设计:现代方法》第二版书中找到了这个算法实现,可以从这里下载。最有趣的部分由以下两个定义组成:
void quicksort(int a[], int low, int high)
{
int middle;
if (low >= high)
return;
middle = split(a, low, high);
quicksort(a, low, middle - 1);
quicksort(a, middle + 1, high);
}
int split(int a[], int low, int high)
{
int part_element = a[low];
for (;;) {
while (low < high && part_element <= a[high])
high--;
if (low >= high)
break;
a[low++] = a[high];
while (low < high && a[low] <= part_element)
low++;
if (low >= high)
break;
a[high--] = a[low];
}
a[high] = part_element;
return high;
}
通过移除low < high
的测试,可以优化两个while
循环:
for (;;) {
while (part_element < a[high])
high--;
if (low >= high)
break;
a[low++] = a[high];
a[high] = part_element;
while (a[low] <= part_element)
low++;
if (low >= high)
break;
a[high--] = a[low];
a[low] = part_element;
}
如何确保对于在堆栈(stack)上分配的数组的每个访问或写入都是有效的(即不会引起未定义的行为)?我已经尝试了以下方法:
- 手动使用
gdb
调试一些实际数据 - 将源代码传递给静态分析工具,如
split
或cppcheck
valgrind
使用--tool=exp-sgcheck
开关
例如,有一个五个元素的数组{8, 1, 2, 3, 4}
:
#define N 5
int main(void)
{
int a[N] = {8, 1, 2, 3, 4}, i;
quicksort(a, 0, N - 1);
printf("After sort:");
for (i = 0; i < N; i++)
printf(" %d", a[i]);
putchar('\n');
return 0;
}
结果是(很可能是实现相关的):
After sort: 1 1 2 4 8
1. GDB
(gdb) p low
$1 = 3
(gdb) p high
$2 = 4
(gdb) p a[low]
$3 = 1
(gdb) p part_element
$4 = 8
(gdb) s
47 low++;
(gdb) s
46 while (a[low] <= part_element)
(gdb) s
47 low++;
(gdb) s
46 while (a[low] <= part_element)
(gdb) p low
$5 = 5
(gdb) p high
$6 = 4
(gdb) bt full
#0 split (a=0x7fffffffe140, low=5, high=4) at qsort.c:46
part_element = 8
#1 0x00000000004005df in quicksort (a=0x7fffffffe140, low=0, high=4) at qsort.c:30
middle = <value optimized out>
#2 0x0000000000400656 in main () at qsort.c:14
a = {4, 1, 2, 1, 8}
i = <value optimized out>
正如你所看到的,low
变量已经超出了范围:
(gdb) p low
$5 = 5
2. 静态分析工具
$ splint -retvalint -exportlocal qsort.c
Splint 3.1.2 --- 07 Feb 2011
Finished checking --- no warnings
$ cppcheck qsort.c
Checking qsort.c...
3. 使用 --tool=exp-sgcheck
进行 Valgrind 检查
$ valgrind --tool=exp-sgcheck ./a.out
==5480== exp-sgcheck, a stack and global array overrun detector
==5480== NOTE: This is an Experimental-Class Valgrind Tool
==5480== Copyright (C) 2003-2012, and GNU GPL'd, by OpenWorks Ltd et al.
==5480== Using Valgrind-3.8.1 and LibVEX; rerun with -h for copyright info
==5480== Command: ./a.out
==5480==
==5480== Invalid read of size 4
==5480== at 0x4005A0: split (qsort.c:46)
==5480== by 0x4005DE: quicksort (qsort.c:30)
==5480== by 0x400655: main (qsort.c:14)
==5480== Address 0x7ff000114 expected vs actual:
==5480== Expected: stack array "a" of size 20 in frame 2 back from here
==5480== Actual: unknown
==5480== Actual: is 0 after Expected
==5480==
After sort: 1 1 2 4 8
==5480==
==5480== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
位置0x4005A0:split(qsort.c:46)
与我手动在gdb
中找到的位置相匹配。
gcc
的-fstack-protector-all
有时也会在成本很低的情况下提供帮助。 - keltarexp-sgcheck
,这仍然是实验性的。我知道C语言不提供数组边界检查,而且在将数组传递给函数后,sizeof
信息会丢失,因此跟踪它并不容易。除此之外,我认为静态代码分析工具在这里也可能很有用。 - Grzegorz Szpetkowski