Is swapping two variables guaranteed to be atomic in Python?

32
3 Answers

58

Let's see:

>>> import dis
>>> x = 1
>>> y = 2
>>> def swap_xy():
...   global x, y
...   (x, y) = (y, x)
... 
>>> dis.dis(swap_xy)
  3           0 LOAD_GLOBAL              0 (y)
              3 LOAD_GLOBAL              1 (x)
              6 ROT_TWO             
              7 STORE_GLOBAL             1 (x)
             10 STORE_GLOBAL             0 (y)
             13 LOAD_CONST               0 (None)
             16 RETURN_VALUE    

It doesn't look atomic: another thread could change the values of x and y between the two LOAD_GLOBAL bytecodes, before or after ROT_TWO, or between the two STORE_GLOBAL bytecodes.
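The listing above comes from a Python 2 interpreter, and offsets and opcode names vary between versions. A minimal sketch for checking your own interpreter, assuming CPython 3 and counting opcodes rather than relying on an exact listing:

```python
import dis

x, y = 1, 2

def swap_xy():
    global x, y
    x, y = y, x

# Collect opcode names only; offsets and the opcode that sits between
# the loads and the stores differ between CPython versions.
opnames = [ins.opname for ins in dis.get_instructions(swap_xy)]
loads = opnames.count("LOAD_GLOBAL")
stores = opnames.count("STORE_GLOBAL")

print(opnames)
print("loads:", loads, "stores:", stores)
```

If the swap compiles to more than one load and more than one store, a thread switch can land between any two of them.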

If you want to swap two variables atomically, you'll need a lock or a mutex.
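A minimal sketch of the lock-based approach in Python 3. The names `swap_lock`, `read_xy`, and `worker` are made up for illustration; the key assumption is that every reader and writer of x and y takes the same lock:

```python
import threading

x, y = 1, 2
swap_lock = threading.Lock()  # hypothetical name; guards both x and y

def swap_xy():
    global x, y
    with swap_lock:
        # Loads and stores all happen while the lock is held.
        x, y = y, x

def read_xy():
    # Readers take the same lock so they never see a half-finished swap.
    with swap_lock:
        return x, y

def worker(iterations, failures):
    for _ in range(iterations):
        swap_xy()
        a, b = read_xy()
        if a == b:
            failures.append((a, b))

failures = []
threads = [threading.Thread(target=worker, args=(10000, failures))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print("inconsistent snapshots seen:", len(failures))
```

The lock only helps if all code touching x and y uses it; a thread that assigns to x directly bypasses the protection.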

For those who want empirical proof:

>>> def swap_xy_repeatedly():
...   while 1:
...     swap_xy()
...     if x == y:
...       # If all swaps are atomic, there will never be a time when x == y.
...       # (of course, this depends on "if x == y" being atomic, which it isn't;
...       #  but if "if x == y" isn't atomic, what hope have we for the more complex
...       #  "x, y = y, x"?)
...       print 'non-atomic swap detected'
...       break
... 
>>> import threading
>>> t1 = threading.Thread(target=swap_xy_repeatedly)
>>> t2 = threading.Thread(target=swap_xy_repeatedly)
>>> t1.start()
>>> t2.start()
>>> non-atomic swap detected

4

I was wrong.

I have changed my view.

Kragen Sitaker writes:

Someone recommended using the idiom

spam, eggs = eggs, spam

to get a thread-safe swap. Does this really work? (...)
So if this thread loses control anywhere between the first LOAD_FAST
and the last STORE_FAST, a value could get stored by another thread
into "b" which would then be lost. There isn't anything keeping this
from happening, is there?

Nope. In general not even a simple assignment is necessarily thread safe since performing the assignment may invoke special methods on an object which themselves may require a number of operations. Hopefully the object will have internally locked its "state" values, but that's not always the case.

But it's really dictated by what "thread safety" means in a particular application, because to my mind there are many levels of granularity of such safety so it's hard to talk about "thread safety". About the only thing the Python interpreter is going to give you for free is that a built-in data type should be safe from internal corruption even with native threading. In other words if two threads have a=0xff and a=0xff00, a will end up with one or the other, but not accidentally 0xffff as might be possible in some other languages if a isn't protected.

With that said, Python also tends to execute in such a fashion that you can get away with an awful lot without formal locking, if you're willing to live on the edge a bit and have implied dependencies on the actual objects in use. There was a decent discussion along those lines here in c.l.p a while back - search groups.google.com for the "Critical sections and mutexes" thread among others.

Personally, I explicitly lock shared state (or use constructs designed for exchanging shared information properly amongst threads, such as Queue.Queue) in any multi-threaded application. To my mind it's the best protection against maintenance and evolution down the road.

-- -- David
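David's suggestion of handing data between threads through a queue instead of sharing variables could look roughly like this. It is a sketch in Python 3, where the module is named `queue` (it was `Queue` in Python 2); the `producer` and `consumer` functions and the `None` sentinel are illustrative choices, not part of the quoted message:

```python
import queue
import threading

def producer(q, n):
    # Hand work items to the consumer; the queue does its own locking.
    for i in range(n):
        q.put(i)
    q.put(None)  # sentinel: tells the consumer there is nothing more

def consumer(q, results):
    while True:
        item = q.get()
        if item is None:
            break
        results.append(item * item)

q = queue.Queue()
results = []
p = threading.Thread(target=producer, args=(q, 5))
c = threading.Thread(target=consumer, args=(q, results))
p.start()
c.start()
p.join()
c.join()

print(results)
```

Only the consumer thread ever touches `results`, so no explicit lock is needed around it.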


2
Why? The GIL? The disassembly does not suggest atomicity (see @jemfinch's answer). - kennytm
1
(By the way, the comment above is not a rhetorical question.) - kennytm
@Kenny: It was a misunderstanding on my part about how tuple unpacking works under the hood. - Esteban Küber

-1

Python atomic operations for shared data types.

https://sharedatomic.top

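The module can be used for atomic operations under multi-process and multi-thread conditions. High performance Python! High concurrency, high performance!

Example of the atomic API with multiple processes and multiple threads:

You need to follow these steps to use the module:

  1. Create the function to be run by the child processes; refer to UIntAPIs, IntAPIs, BytearrayAPIs, StringAPIs, SetAPIs, ListAPIs. Each process can create multiple threads.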

     def process_run(a):
       def subthread_run(a):
         a.array_sub_and_fetch(b'\0x0F')
    
       threadlist = []
       for t in range(5000):
           threadlist.append(Thread(target=subthread_run, args=(a,)))
    
       for t in range(5000):
           threadlist[t].start()
    
       for t in range(5000):
           threadlist[t].join()
    
  2. Create the shared bytearray

    a = atomic_bytearray(b'ab', length=7, paddingdirection='r', paddingbytes=b'012', mode='m')
    
  3. Start the processes/threads that use the shared bytearray

     processlist = []
     for p in range(2):
       processlist.append(Process(target=process_run, args=(a,)))

     for p in range(2):
       processlist[p].start()

     for p in range(2):
       processlist[p].join()

     assert a.value == int.to_bytes(27411031864108609, length=8, byteorder='big')
    

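Does this library have the ability to atomically swap two separate variables, or different parts of an atomic_bytearray? No modern CPU instruction set allows that (only some old ones, e.g. m68k), except via transactional memory. Also, please don't post the same answer on multiple questions, especially without adapting it to the specific question. This was also posted on "Does python have atomic CompareAndSet operation?" (and on a few questions about the atomicity of plain ints, where it doesn't actually apply). - Peter Cordes
We don't have a function that swaps two pointers, but we do have one that shifts three. shared_atomic.atomic_int.int_shift(v: _cffi_backend._CDataBase, n: _cffi_backend._CDataBase, r: _cffi_backend._CDataBase) atomically exchanges the values of 3 pointers in 2 groups: it stores n in v after storing v in r. Parameters: v - pointer to v; n - pointer to n; r - pointer to r. - Xiquan Ren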
