C语言中的原子交换

Question

C语言中的原子交换

7

我觉得我在这里漏掉了一些显而易见的东西。我有这样的代码：

int *a = <some_address>;
int *b = <another_address>;

// ...

// desired atomic swap of addresses a and b here
int *tmp = a;
a = b; b = tmp;

我希望能够原子化地完成此操作。我已经查看了 __c11_atomic_exchange 和 OSX 上的 OSAtomic 方法，但是似乎没有任何方法能够直接进行原子交换。它们可以将值原子化地写入 2 个变量中的 1 个，但从来没有同时完成两者。

有什么想法吗？

- Tim

7

你所遗漏的是同时与两个内存位置进行原子操作的事实不存在。如果你想要原子地交换两个对象，你需要用某种互斥量来保护它们。 - Casey

1

谢谢，Casey。那真是一个“傻瓜”时刻... - Tim

只有少数机器曾经具有原子操作两个不相交位置的能力，例如68020到68040具有CAS2指令。如果您拥有像Intel RTM这样的事务性内存，则可以将其作为事务执行。但这种情况非常罕见，以至于ISO C和C++不公开此类操作，否则atomic_int在几乎所有架构上都必须是非无锁的（因为它不能通过其他方式合成而不锁定）。 - Peter Cordes

2个回答

0

快速概述：基于自旋锁的原子交换答案

这段代码片段是我们将要构建的。

对于在多核系统上使用基本自旋锁（高速、无上下文切换互斥）方法，这是自旋锁最好的应用场景，您可以像这样对变量值进行原子交换（在本例中为指针）：

int *a = &some_int1;  // ptr a
int *b = &some_int2;  // ptr b
spinlock_t spinlock = SPINLOCK_UNLOCKED;

// swap pointers `a` and `b` (note: swapping the pointers themselves, not the
// integer values they point to)
atomic_swap_int_ptr(&spinlock, &a, &b);

以下是详细信息：

C11中通用自旋锁方法（用于包装需要原子操作的任何“关键部分”）

如果您的架构支持使用数组“对”进行无锁原子交换（在此处的主要答案中提供），则请考虑。它可能是无锁的，因此比使用互斥锁或自旋锁更有效率。

我想深入探讨一下这个问题，并尝试使用自旋锁。我没有对各种技术（数组“对”，互斥锁和自旋锁）进行速度分析，但对我来说，进行这样的分析非常有趣和学术意义，以了解它们在速度上的比较。

无论如何，正如我所说的：您也可以使用互斥锁。

或者，如果您不希望线程在互斥锁不可用时进入睡眠状态，则可以通过使用自旋锁来提高效率，因为自旋锁必须自旋的次数通常很少，因此自旋比使线程进入睡眠状态并进行整个上下文切换更快。

C11有一整套原子类型和函数，可以让您轻松实现自己的自旋锁以完成此操作。

但请记住，在裸机（没有操作系统）单核（处理器）系统上，基本自旋锁不能使用，因为如果主上下文获取锁，然后在锁仍被占用时，ISR触发并尝试获取锁，那么ISR将无限期地阻塞。

为了避免这种情况，必须使用更复杂的锁定机制，例如带有超时的自旋锁、自动重新调度自身以稍后时间获取锁的ISR、自动放弃等。

因此，自旋锁最好保留给具有操作系统和调度程序的多核系统。

使用`_Atomic`类型在C11中实现基本自旋锁

#include <stdbool.h>
#include <stdatomic.h>


#define SPINLOCK_UNLOCKED 0
#define SPINLOCK_LOCKED   1

typedef volatile atomic_bool spinlock_t;

// This is how you'd create a new spinlock to use in the functions below.
// spinlock_t spinlock = SPINLOCK_UNLOCKED;

/// \brief  Set the spin lock to `SPINLOCK_LOCKED`, and return the previous 
///  state of the `spinlock` object.
/// \details    If the previous state was `SPINLOCK_LOCKED`, you can know that
///  the lock was already taken, and you do NOT now own the lock. If the
///  previous state was `SPINLOCK_UNLOCKED`, however, then you have now locked
///  the lock and own the lock.
bool atomic_test_and_set(spinlock_t* spinlock)
{
    bool spinlock_val_old = atomic_exchange(spinlock, SPINLOCK_LOCKED);
    return spinlock_val_old;
}

/// Spin (loop) until you take the spin lock.
void spinlock_lock(spinlock_t* spinlock)
{
    // Spin lock: loop forever until we get the lock; we know the lock was
    // successfully obtained after exiting this while loop because the
    // atomic_test_and_set() function locks the lock and returns the previous
    // lock value. If the previous lock value was SPINLOCK_LOCKED then the lock
    // was **already** locked by another thread or process. Once the previous
    // lock value was SPINLOCK_UNLOCKED, however, then it indicates the lock
    // was **not** locked before we locked it, but now it **is** locked because
    // we locked it, indicating we own the lock.
    while (atomic_test_and_set(spinlock) == SPINLOCK_LOCKED) {};
}

/// Release the spin lock.
void spinlock_unlock(spinlock_t* spinlock)
{
    *spinlock = SPINLOCK_UNLOCKED;
}

现在，您可以像这样在多核系统中使用自旋锁：

/// Atomically swap two pointers to int (`int *`). 
/// NB: this does NOT swap their contents, meaning: the integers they point to.
/// Rather, it swaps the pointers (addresses) themselves.
void atomic_swap_int_ptr(spinlock_t * spinlock, int ** ptr1, int ** ptr2)
{
    spinlock_lock(spinlock)
    // critical, atomic section start

    int * temp = *ptr1;
    *ptr1 = *ptr2;
    *ptr2 = temp;
    
    // critical, atomic section end
    spinlock_unlock(spinlock)
}


// Demo to perform an address swap

int *a = &some_int1;
int *b = &some_int2;
spinlock_t spinlock = SPINLOCK_UNLOCKED;

// swap pointers `a` and `b` (note: swapping the pointers themselves, not the
// integer values they point to)
atomic_swap_int_ptr(&spinlock, &a, &b);  // <=========== final answer

参考资料

https://en.wikipedia.org/wiki/Test-and-set#Pseudo-C_implementation_of_a_spin_lock
***** C11并发支持库和atomic_bool和其他类型：https://en.cppreference.com/w/c/thread
C11 atomic_exchange()函数：https://en.cppreference.com/w/c/atomic/atomic_exchange

其他链接

请注意，您也可以使用C11的struct atomic_flag以及其相应的atomic_flag_test_and_set()函数，而不是像我上面所做的那样使用atomic_exchange()创建自己的atomic_test_and_set()函数。
C11中的_Atomic类型：https://en.cppreference.com/w/c/language/atomic
C++11使用头文件<atomic>实现自旋锁

待办事项（最新的在顶部）

[ ] 20230418: 升级此实现并使用以下更改进行速度分析。请参阅C并发文档：https://en.cppreference.com/w/c/thread：
1. [ ] 使用atomic_flag而不是atomic_bool：https://en.cppreference.com/w/c/atomic/atomic_flag（重点添加）：
  
  atomic_flag是一种原子布尔类型。与其他原子类型不同，它保证是无锁的。与atomic_bool不同，atomic_flag不提供加载或存储操作。
  1. 确保使用ATOMIC_FLAG_INIT进行初始化。
  2. 此外，使用atomic_flag_test_and_set()代替我上面的自定义atomic_test_and_set()，并使用atomic_flag_clear()来清除或“取消设置”标志。
[ ] 20220922: 此实现需要进行修改以添加安全的反死锁机制，例如自动延迟、超时和重试，以防止在单核心和裸机系统上出现死锁。请参见我的笔记：Add basic mutex (lock) and spin lock implementations in C11, C++11, AVR, and STM32和What are the various ways to disable and re-enable interrupts in STM32 microcontrollers in order to implement atomic access guards?

- Gabriel Staples

1

Spinlocks（仅需要获取/释放语义）使用的是自旋锁，而不是默认的seq_cst。顺便说一句，您的措辞可能会被误解为在多核系统上，自旋锁是最好的锁定方式。我认为您的意思是，在任何可能考虑使用自旋锁的系统中，它们只有在多核系统上才有意义，并且最好使用抢占式多任务处理。同意在这里使用自旋锁是一个合理的选择。 - Peter Cordes

1

如果您想为要交换的两个对象使用单独的锁（或锁的哈希表），那么可以允许不同线程在交换不同对时进行更多的并行操作。为了避免死锁，按地址递增的顺序获取锁，这样，如果两个操作都需要锁定相同的一对对象，则避免了一个锁定一个对象和另一个锁定另一个对象的情况。（如果使用锁的哈希表，则需要检查它们是否都散列到同一个条目，并仅锁定该条目。）但这是更多的锁操作，如果一个锁没有竞争激烈，可能会更慢。 - Peter Cordes

1

如果我没记错的话，一些编译器在非无锁 std::atomic 的情况下使用自旋锁而不是完整的互斥锁。 - Peter Cordes

1

你可能会发现无论使用哪种架构，GCC生成的汇编代码都是一样的。至少在x86架构中，它使用xchg指令来实现atomic_flag_test_and_set_explicit。(对于.clear()或atomic_bool赋值操作，如果不使用memory_order_release让其只使用更便宜的mov指令。) 也许对于一些ISA，如果存在特殊的test-and-set指令但原子交换更加麻烦，它会使用原子交换指令。如果https://godbolt.org/上有一些历史的ISAs（如m68k），这可能很有趣。 - Peter Cordes

1

相关：Performance pthread_spinlock_t is 2x better than my own implementation with lock free std::atomic_flag, around std::list展示了使用C++ std::atomic_flag手动实现的自旋锁，并且对其在Skylake上进行了一些变体的性能测试（包括/不包括x86 pause指令，以及在尝试另一个RMW之前进行只读检查）。这些性能测试可能不准确且具有误导性。 - undefined

显示剩余7条评论

网页内容由stack overflow 提供, 点击上面的

可以查看英文原文，
原文链接

- Jens Gustedt · Accepted Answer

这取决于你的架构，实际和有效的操作方式有所不同。无论如何，你需要确保要交换的两个指针在内存中是相邻的，最简单的方法是将它们设置为类似以下的形式：

typedef struct pair { void *a[2]; } pair;

如果您有一个完整的C11编译器，那么您可以使用_Atomic(pair)作为原子类型来交换这样一个pair的内容：首先将实际的pair加载到临时变量中，构造一个新的pair并交换两个值，进行比较和交换以存储新值，必要时进行迭代。

inline
void pair_swap(_Atomic(pair) *myPair) {
  pair actual = { 0 };
  pair future = { 0 };

  while (!atomic_compare_exchange_weak(myPair, &actual, future)) {
      future.a[0] = actual.a[1];
      future.a[1] = actual.a[0];
  }
}

根据您的架构，这可能会使用无锁原语操作实现，也可能不会。现代的64位架构通常具有128位原子性，这可能只需要一些128位汇编指令。

但这很大程度上取决于您平台上原子操作的实现质量。P99将为您提供一个实现原子操作“类似函数”的部分，并支持检测到的128位交换的实现；在我的机器上（64位Intel Linux），这有效地编译为lock cmpxchg16b指令。

C语言中的原子交换

快速概述：基于自旋锁的原子交换答案

C11中通用自旋锁方法（用于包装需要原子操作的任何“关键部分”）

使用_Atomic类型在C11中实现基本自旋锁

参考资料

其他链接

待办事项（最新的在顶部）

使用`_Atomic`类型在C11中实现基本自旋锁