Using a global variable with a thread

Question

115

How do I share a global variable between threads?

My Python code example is:

from threading import Thread
import time
a = 0  #global variable

def thread1(threadname):
    #read variable "a" modify by thread 2

def thread2(threadname):
    while 1:
        a += 1
        time.sleep(1)

thread1 = Thread( target=thread1, args=("Thread-1", ) )
thread2 = Thread( target=thread2, args=("Thread-2", ) )

thread1.join()
thread2.join()

I don't know how to get the two threads to share one variable.

- Mauro Midolo

5 Answers

56

In a function:

a += 1

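will be interpreted by the compiler as "assign to a", which makes a a local variable of that function. That is not what you want. It will most likely fail with an error about a not being initialized, since the (local) a has indeed not been initialized: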

>>> a = 1
>>> def f():
...     a += 1
... 
>>> f()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f
UnboundLocalError: local variable 'a' referenced before assignment

You might get what you want with the global keyword (very much frowned upon, and for good reasons), like so:

>>> def f():
...     global a
...     a += 1
... 
>>> a
1
>>> f()
>>> a
2

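In general, however, you should avoid global variables, because they get out of hand very quickly. This is especially true for multithreaded programs, where you have no synchronization mechanism that lets thread1 know when a has been modified. In short: threads are complicated, and you cannot expect an intuitive understanding of the order in which events happen when two or more threads work on the same value. The language, compiler, OS and processor can all play a role and may reorder operations for speed, practicality or any other reason.

The proper way to do this kind of thing is to use Python's sharing tools (locks and friends), or better, to communicate data through a Queue instead of sharing it, for example: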

from threading import Thread
from queue import Queue
import time

def thread1(threadname, q):
    # read the values of "a" produced by thread 2
    while True:
        a = q.get()  # blocks until a value is available
        if a is None: return # Poison pill
        print(a)

def thread2(threadname, q):
    a = 0
    for _ in range(10):
        a += 1
        q.put(a)
        time.sleep(1)
    q.put(None) # Poison pill

queue = Queue()
thread1 = Thread( target=thread1, args=("Thread-1", queue) )
thread2 = Thread( target=thread2, args=("Thread-2", queue) )

thread1.start()
thread2.start()
thread1.join()
thread2.join()

- val

This solves a big problem, and it looks like the right way to go about it. - Abhidemon

This is the approach I use to solve the synchronization problem. - Zhang LongQI

1

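I have a few questions. First, if I have several variables to share between threads, do I need a separate queue for each variable? Second, why are the queues in the program above synchronized? Shouldn't each one act as a local copy inside each function? - user4340135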

2

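This is old, but I'll answer anyway. The queue itself is not synchronized, no more than the variable a. It is the default blocking behavior of the queue that creates the synchronization. The statement a = q.get() blocks (waits) until a value is available. The variable q is local: if you assign a different value to it, that only happens locally. But the queue assigned to it in the code is the one defined in the main thread. - user6627712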
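On the first question above, one queue can carry several variables if each message says which variable it refers to. The following is only a minimal sketch under that assumption: the names `producer`, `consumer` and `latest`, and the `(name, value)` tuple convention, are illustrative and do not appear in the original answer.

```python
from queue import Queue
from threading import Thread

def producer(q):
    # Each message is a (name, value) tuple, so one queue serves several variables.
    for i in range(3):
        q.put(("a", i))
        q.put(("b", i * 10))
    q.put(None)  # poison pill: tells the consumer to stop

def consumer(q, latest):
    while True:
        item = q.get()  # blocks until the producer has put something
        if item is None:
            return
        name, value = item
        latest[name] = value  # only the consumer thread writes to this dict

q = Queue()
latest = {}  # hypothetical holder for the most recent value of each variable
t_prod = Thread(target=producer, args=(q,))
t_cons = Thread(target=consumer, args=(q, latest))
t_cons.start()
t_prod.start()
t_prod.join()
t_cons.join()
print(latest)  # read by the main thread only after both threads have finished
```

Both threads receive the same Queue object as an argument; inside each function `q` is just a local name for that one object, which is why no global declaration is needed.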

2

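You don't always need a queue to share information between threads. The example in chepner's answer is perfectly fine. Also, a queue is not always the right tool. A queue is useful if, for example, you want to block until a value is available. It is useless if the two threads compete for a shared resource. Finally, using global variables in threads is no worse than elsewhere. They can actually be more natural. For example, your thread might be nothing more than a block of code, say a loop, that needs its own process. The local scope is then created artificially when you put the loop in a function. - user6627712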

9

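You should consider using a lock, such as threading.Lock. See the lock objects documentation for more information.

With the accepted answer, thread1 CAN print 10, which is not what you want, because a may change between the while a < 10 check and the print. You can run the following code to understand the bug more easily.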

def thread1(threadname):
    while True:
        # a is read twice; thread2 may change it between the two reads
        if a % 2 and not a % 2:
            print("unreachable.")

def thread2(threadname):
    global a
    while True:
        a += 1

Using a lock prevents a from being changed while it is read more than once:

from threading import Lock

lock_a = Lock()  # one lock, shared by both threads

def thread1(threadname):
    while True:
        lock_a.acquire()
        if a % 2 and not a % 2:
            print("unreachable.")
        lock_a.release()

def thread2(threadname):
    global a
    while True:
        lock_a.acquire()
        a += 1
        lock_a.release()

If a thread works with the variable for a long time, copying it to a local variable first is a good choice.
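The copy-to-local advice might look roughly like this. It is a sketch, not the answerer's code: `writer`, `check_once`, `local_a` and the iteration counts are assumed names and values chosen for illustration. The lock is held only for the single read of `a`, and every later test uses the private copy.

```python
from threading import Lock, Thread

a = 0
lock_a = Lock()

def writer(n):
    global a
    for _ in range(n):
        with lock_a:  # same lock the reader uses
            a += 1

def check_once():
    with lock_a:      # hold the lock only long enough to read a once
        local_a = a
    # local_a cannot change underneath us, so both tests see the same value
    return bool(local_a % 2 and not local_a % 2)

t = Thread(target=writer, args=(2000,))
t.start()
hits = sum(check_once() for _ in range(2000))  # times the "unreachable" branch was taken
t.join()
print(hits, a)
```

Because the lock is released before the checks run, the writer is blocked for a shorter time than it would be if the whole if statement ran under the lock.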

- Jason Pan

5

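Thanks to Jason Pan for suggesting that method. The if statement in thread1 is not atomic, so while it executes, thread2 can intrude on thread1 and cause the unreachable code to be reached. I have organized the ideas from the earlier posts into a complete demonstration program (below), which I ran with Python 2.7.

With some careful analysis I'm sure we could gain further insight, but for now I think the important thing is to show what happens when non-atomic behavior meets threading.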

# ThreadTest01.py - Demonstrates that if non-atomic actions on
# global variables are not protected, tasks can intrude on each other.
from threading import Thread
import time

# global variables
a = 0
NN = 100

def thread1(threadname):
    while True:
      if a % 2 and not a % 2:
          print("unreachable.")
    # end of thread1

def thread2(threadname):
    global a
    for _ in range(NN):
        a += 1
        time.sleep(0.1)
    # end of thread2

thread1 = Thread(target=thread1, args=("Thread1",))
thread1.daemon = True  # the endless reader must not keep the process alive after thread2 ends
thread2 = Thread(target=thread2, args=("Thread2",))

thread1.start()
thread2.start()

thread2.join()
# end of ThreadTest01.py

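As predicted, when the example runs, the "unreachable" code is sometimes reached and produces output.

What's more, when I inserted a lock acquire/release pair into thread1, the probability of the "unreachable" message being printed dropped sharply. To see the message at all I had to reduce the sleep time to 0.01 seconds and increase NN to 1000.

With a lock acquire/release pair in thread1 I did not expect to see the message at all, but it was still there. After I also inserted a lock acquire/release pair into thread2, the message no longer appeared. In hindsight, the increment statement in thread2 is probably non-atomic as well.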

- Krista M Hill

1

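You need the lock in both threads because these are cooperative "advisory" locks (not "mandatory" ones). You are right that the increment statement is non-atomic. - Darkonaut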
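The "advisory" point in the comment above can be illustrated with a small deterministic sketch. The names `rude_writer` and `polite_writer` and the 0.2 second timeout are assumptions made for this illustration. While the main thread holds the lock, a thread that ignores the lock still changes `a`, whereas a thread that acquires the lock has to wait.

```python
from threading import Lock, Thread

a = 0
lock_a = Lock()

def rude_writer():
    global a
    a += 1  # never touches lock_a, so nothing stops this write

def polite_writer():
    global a
    with lock_a:  # waits until the current holder releases the lock
        a += 1

with lock_a:  # the main thread holds the lock for this whole block
    rude = Thread(target=rude_writer)
    rude.start()
    rude.join()
    seen_while_locked = a  # the rude writer got through despite the held lock

    polite = Thread(target=polite_writer)
    polite.start()
    polite.join(timeout=0.2)
    polite_blocked = polite.is_alive()  # still waiting for lock_a

polite.join()  # the lock is free now, so the polite writer can finish
print(seen_while_locked, polite_blocked, a)
```

A lock protects a variable only by convention: every piece of code that touches the variable has to acquire the same lock.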

0

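Well, here is a running example:

WARNING! NEVER DO THIS AT HOME/WORK! Only in the classroom ;)

Use semaphores, locks and similar tools to avoid race conditions.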

from threading import Thread
import time

a = 0  # global variable


def thread1(threadname):
    global a
    for k in range(100):
        print("{} {}".format(threadname, a))
        time.sleep(0.1)
        if k == 5:
            a += 100


def thread2(threadname):
    global a
    for k in range(10):
        a += 1
        time.sleep(0.2)


thread1 = Thread(target=thread1, args=("Thread-1",))
thread2 = Thread(target=thread2, args=("Thread-2",))

thread1.start()
thread2.start()

thread1.join()
thread2.join()

and the output:

Thread-1 0
Thread-1 1
Thread-1 2
Thread-1 2
Thread-1 3
Thread-1 3
Thread-1 104
Thread-1 104
Thread-1 105
Thread-1 105
Thread-1 106
Thread-1 106
Thread-1 107
Thread-1 107
Thread-1 108
Thread-1 108
Thread-1 109
Thread-1 109
Thread-1 110
Thread-1 110
Thread-1 110
Thread-1 110
Thread-1 110
Thread-1 110
Thread-1 110
Thread-1 110

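If the timing were right, the a += 100 operation would be lost:

At time T, the processor executes a+100 in thread 1 and gets 104. But it stops there and switches to the other thread. At T+1, thread 2 executes a+1 with the old value of a, a == 4, so it computes 5. At T+2 the processor jumps back to thread 1 and writes a=104 to memory. Back in thread 2, at T+3, it writes a=5 to memory. Voila! The next print instruction will print 5 instead of 104.

A VERY nasty bug to reproduce and catch.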
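The interleaving described above can be replayed by hand in a single thread. This is only an illustration of the lost update, assuming `a` holds 4 at time T as in the description; the temporaries `t1_result` and `t2_result` are hypothetical stand-ins for the value each thread has computed but not yet stored.

```python
a = 4  # value both threads see at time T

t1_result = a + 100  # T:   thread 1 computes 104 but is paused before storing it
t2_result = a + 1    # T+1: thread 2 reads the stale 4 and computes 5
a = t1_result        # T+2: thread 1 stores 104
a = t2_result        # T+3: thread 2 stores 5, overwriting 104

print(a)  # 5: the += 100 has been lost
```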

- visoft

1

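Please consider adding a correct implementation as well. That would be very helpful for people learning how to share data between threads. - JS.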

1

Added to the "TODO" list :) - visoft
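One possible corrected version of the example, offered as a sketch rather than as the answerer's own fix: every read-modify-write of `a` goes through a single lock. The names `a_lock` and `add_to_a` are assumptions, and the loop count and sleep times are shortened so the script finishes quickly.

```python
from threading import Lock, Thread
import time

a = 0            # global variable
a_lock = Lock()  # guards every access to a

def add_to_a(amount):
    global a
    with a_lock:  # the read, the addition and the store happen as one unit
        a += amount

def thread1(threadname):
    for k in range(20):
        with a_lock:
            current = a  # take a consistent snapshot for printing
        print("{} {}".format(threadname, current))
        time.sleep(0.01)
        if k == 5:
            add_to_a(100)

def thread2(threadname):
    for k in range(10):
        add_to_a(1)
        time.sleep(0.02)

t1 = Thread(target=thread1, args=("Thread-1",))
t2 = Thread(target=thread2, args=("Thread-2",))
t1.start()
t2.start()
t1.join()
t2.join()
print("final", a)  # always 110: no update can be lost
```

The order of the printed intermediate values still depends on timing; only the final total is guaranteed.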

- chepner · Accepted Answer

You just need to declare a as global in thread2, so that you aren't modifying an a that is local to that function.

def thread2(threadname):
    global a
    while True:
        a += 1
        time.sleep(1)

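In thread1 you don't need to do anything special, as long as you don't try to modify the value of a (which would create a local variable that shadows the global one; use global a if you need to).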

def thread1(threadname):
    #global a       # Optional if you treat a as read-only
    while a < 10:
        print(a)
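Put together, the two functions above might be run as follows. This is a sketch with several assumptions that are not in the answer: the sleep times are shortened, the endless writer is started as a daemon thread so the script can exit, and the reader records values in a hypothetical `seen` list instead of printing them.

```python
from threading import Thread
import time

a = 0      # global variable
seen = []  # values observed by the reader (illustrative addition)

def thread2(threadname):
    global a
    while True:
        a += 1
        time.sleep(0.01)

def thread1(threadname):
    # no "global a" needed: a is only read here
    while a < 10:
        seen.append(a)  # second read of a; it may differ from the one in the condition
        time.sleep(0.005)

writer = Thread(target=thread2, args=("Thread-2",), daemon=True)
reader = Thread(target=thread1, args=("Thread-1",))
reader.start()
writer.start()
reader.join()  # returns once the reader has seen a reach 10
print(seen)
```

Since `a` is read once in the loop condition and again in the loop body, a value of 10 can occasionally be recorded, which is the issue another answer in this thread raises.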