How to demonstrate cache misses?

Question

4

Inspired by Meyers, I have been reading about computer caches and wanted to run an experiment that demonstrates what is described there. Here is what I tried:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    typedef uint8_t data_t;

    const uint64_t max = (uint64_t)1<<30;
    const unsigned cycles = 1000;
    const uint64_t step = 63;  // tried also for 64

    volatile data_t acu = 0;
    volatile data_t *arr = malloc(sizeof(data_t) * max);
    for (uint64_t i = 0; i < max; ++i)
        arr[i] = ~i;

    for(unsigned c = 0; c < cycles; ++c)
        for (uint64_t i = 0; i < max; i += step)
            acu += arr[i];

    printf("%llu\n", (unsigned long long)max);  // uint64_t is not guaranteed to match %lu

    return 0;
}

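Then I simply run gcc --std=c99 -O0 test.c && time ./a.out. I have checked that my CPU's cache line is 64 bytes long. By setting step = 64, I was trying to generate cache misses more often than with step = 63. However, step = 63 actually runs slightly faster. I suspect I am a "victim" of prefetching, since my RAM reads are sequential. How can I improve my array-traversal example to demonstrate the cost of a cache miss?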

- Vorac

1 Answer


- Vincent · Accepted Answer

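You will still get plenty of cache misses with step = 63. The first two accesses (bytes 0 and 63) fall on the same cache line, but each of the next 63 accesses lands on a new line and causes a miss, hitting byte offsets 62, 61, 60, and so on within their respective lines. A better measurement would show the difference between step = 1 (almost no cache misses) and step = 64 (a cache miss on every access), with max adjusted so that the total number of accesses is the same in both runs.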
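
A minimal sketch of that comparison, not the answerer's own code. The names `count_new_lines`, `time_walk`, and `run_demo` are hypothetical helpers, and `LINE_SIZE` assumes the 64-byte cache line reported in the question. `count_new_lines` is only a model: it counts accesses that land on a different line than the previous one and ignores prefetching and cache capacity. `time_walk` performs the same number of accesses for any stride, so the two timings are comparable.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define LINE_SIZE 64u  /* assumed cache-line size in bytes (from the question) */

/* Model only: how many of `accesses` strided reads (indices 0, step, 2*step, ...)
   land on a different cache line than the previous read. */
uint64_t count_new_lines(uint64_t accesses, uint64_t step)
{
    uint64_t count = 0;
    uint64_t prev_line = UINT64_MAX;  /* sentinel: no line touched yet */
    for (uint64_t n = 0, i = 0; n < accesses; ++n, i += step) {
        uint64_t line = i / LINE_SIZE;
        if (line != prev_line) {
            ++count;
            prev_line = line;
        }
    }
    return count;
}

/* Read `accesses` bytes of arr with the given stride; return elapsed CPU seconds.
   The caller must make sure arr holds at least (accesses - 1) * step + 1 bytes. */
double time_walk(const volatile uint8_t *arr, uint64_t accesses,
                 uint64_t step, uint8_t *sink)
{
    assert(arr != NULL && sink != NULL);
    volatile uint8_t acu = 0;  /* volatile so the reads are not optimized away */
    clock_t start = clock();
    for (uint64_t n = 0, i = 0; n < accesses; ++n, i += step)
        acu += arr[i];
    clock_t end = clock();
    *sink = acu;
    return (double)(end - start) / CLOCKS_PER_SEC;
}

/* Driver: same number of accesses for step = 1 and step = LINE_SIZE.
   Allocates 1 GiB, matching max in the question. Returns 0 on success. */
int run_demo(void)
{
    const uint64_t accesses = (uint64_t)1 << 24;
    const uint64_t size = accesses * LINE_SIZE;
    uint8_t *arr = malloc(size);
    if (arr == NULL)
        return 1;
    for (uint64_t i = 0; i < size; ++i)
        arr[i] = (uint8_t)~i;

    uint8_t sink = 0;
    double t1 = time_walk(arr, accesses, 1, &sink);
    double t64 = time_walk(arr, accesses, LINE_SIZE, &sink);
    printf("step = 1:  %.3f s\n", t1);
    printf("step = %u: %.3f s\n", LINE_SIZE, t64);

    free(arr);
    return 0;
}
```

Call `run_demo()` from `main` to get the two timings. In the model, 65 accesses with step = 63 touch 64 distinct lines, which is why step = 63 and step = 64 behave almost identically. Both walks are still sequential, so a hardware prefetcher may hide part of the miss cost; visiting the lines in a shuffled order would be one way to reduce that effect.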