The kernel source document "Documentation/memory-barriers.txt" contains the following illustration:
    CPU 1                   CPU 2
    ======================= =======================
            { B = 7; X = 9; Y = 8; C = &Y }
    STORE A = 1
    STORE B = 2
    <write barrier>
    STORE C = &B            LOAD X
    STORE D = 4             LOAD C (gets &B)
                            LOAD *C (reads B)
Without intervention, CPU 2 may perceive the events on CPU 1 in some effectively random order, despite the write barrier issued by CPU 1:
    +-------+       :      :                :       :
    |       |       +------+                +-------+  | Sequence of update
    |       |------>| B=2  |-----       --->| Y->8  |  | of perception on
    |       |  :    +------+     \          +-------+  | CPU 2
    | CPU 1 |  :    | A=1  |      \     --->| C->&Y |  V
    |       |       +------+       |        +-------+
    |       |   wwwwwwwwwwwwwwww   |        :       :
    |       |       +------+       |        :       :
    |       |  :    | C=&B |---    |        :       :       +-------+
    |       |  :    +------+   \   |        +-------+       |       |
    |       |------>| D=4  |    ----------->| C->&B |------>|       |
    |       |       +------+       |        +-------+       |       |
    +-------+       :      :       |        :       :       |       |
                                   |        :       :       |       |
                                   |        :       :       | CPU 2 |
                                   |        +-------+       |       |
        Apparently incorrect --->  |        | B->7  |------>|       |
        perception of B (!)        |        +-------+       |       |
                                   |        :       :       |       |
                                   |        +-------+       |       |
        The load of X holds --->    \       | X->9  |------>|       |
        up the maintenance           \      +-------+       |       |
        of coherence of B             ----->| B->2  |       +-------+
                                            +-------+
                                            :       :
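I don't understand this. Since we have a write barrier, every store issued before it must have taken effect by the time C = &B is executed, which means B should already equal 2 at that point. So when CPU 2 loads C and gets &B, B should already be 2. Why would CPU 2 see B as 7? I'm really confused.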
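Loading from &B is one potentially useful way to satisfy a load whose address is not yet known. Value prediction is one mechanism; some DEC Alpha models had a banked L1d cache that could produce this effect. Branch prediction is another. So it really can happen, but the mechanism is much stranger than simple hardware prefetching. - Peter Cordes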
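To make the pairing concrete, here is a minimal userspace sketch of the same pattern using C11 atomics and pthreads rather than the kernel's own barrier primitives. It assumes that a release store stands in for CPU 1's write barrier plus STORE C = &B, and that an acquire load stands in for the reader-side barrier that the diagram lacks. The names cpu1, cpu2 and run_once are hypothetical and chosen only for this illustration.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Same variables as the example; initial state is set in run_once(). */
static int B;
static int Y;
static _Atomic(int *) C;

/* Plays the role of CPU 1: STORE B = 2, <write barrier>, STORE C = &B. */
static void *cpu1(void *arg)
{
    (void)arg;
    B = 2;
    /* The release store orders the write to B before the write to C. */
    atomic_store_explicit(&C, &B, memory_order_release);
    return NULL;
}

/* Plays the role of CPU 2: LOAD C, then LOAD *C. */
static void *cpu2(void *arg)
{
    int *seen = arg;
    /* The acquire load is the reader-side half of the pairing
     * (an assumption of this sketch; the diagram has no such barrier). */
    int *p = atomic_load_explicit(&C, memory_order_acquire);
    *seen = *p;
    return NULL;
}

/* Hypothetical helper: run both threads once, return what CPU 2 read,
 * or -1 if a thread could not be created. */
int run_once(void)
{
    pthread_t t1, t2;
    int seen = -1;

    /* Initial state from the example: { B = 7; Y = 8; C = &Y } */
    B = 7;
    Y = 8;
    atomic_store(&C, &Y);

    if (pthread_create(&t2, NULL, cpu2, &seen) != 0)
        return -1;
    if (pthread_create(&t1, NULL, cpu1, NULL) != 0) {
        pthread_join(t2, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return seen;
}
```

With both halves in place, run_once() should return either 8 (CPU 2 still saw C == &Y) or 2 (CPU 2 saw C == &B and therefore the new B), but not 7. The write barrier alone only constrains the order in which CPU 1's stores become visible; it is the reader-side ordering that rules out the stale B.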