我目前正在尝试找到最佳的方式来交替运行两个线程并让它们互相等待。
最佳组合应该是快速且 CPU 成本低的。
到目前为止,我发现了三种方法,并将它们放在一个演示应用程序中展示了我发现的问题。
使用 TMonitor 遵循经典的等待/脉冲模式表现不太好,因为所有的锁定(根据 SamplingProfiler 的数据,大部分时间都在这些函数里)。我尝试了使用 Windows 事件(SyncObjs.TEvent)实现相同的效果,但表现类似(即不太好)。
使用调用 TThread.Yield 的等待循环表现最好,但显然会大量占用 CPU 周期。如果切换非常快,这并不重要,但当线程实际等待时就会产生影响(您可以在演示中看到这一点)。
使用 TSpinWait 表现得很好(如果不是这三种方法中表现最好),但只有当切换非常快时才能达到最佳性能。随着切换时间越长,性能越差,因为 TSpinWait 的工作原理如此。
由于多线程不是我擅长的领域,我想知道是否有一些组合方式或完全不同的方法可以实现在两种情况下(快速和缓慢切换)都有良好的性能。
program PingPongThreads;
{$APPTYPE CONSOLE}
{$R *.res}
uses
Classes,
Diagnostics,
SyncObjs,
SysUtils;
type
TPingPongThread = class(TThread)
private
fCount: Integer;
protected
procedure Execute; override;
procedure Pong; virtual;
public
procedure Ping; virtual;
property Count: Integer read fCount;
end;
TPingPongThreadClass = class of TPingPongThread;
TMonitorThread = class(TPingPongThread)
protected
procedure Pong; override;
procedure TerminatedSet; override;
public
procedure Ping; override;
end;
TYieldThread = class(TPingPongThread)
private
fState: Integer;
protected
procedure Pong; override;
public
procedure Ping; override;
end;
TSpinWaitThread = class(TPingPongThread)
private
fState: Integer;
protected
procedure Pong; override;
public
procedure Ping; override;
end;
{ TPingPongThread }
procedure TPingPongThread.Execute;
begin
while not Terminated do
Pong;
end;
procedure TPingPongThread.Ping;
begin
TInterlocked.Increment(fCount);
end;
procedure TPingPongThread.Pong;
begin
TInterlocked.Increment(fCount);
end;
{ TMonitorThread }
procedure TMonitorThread.Ping;
begin
inherited;
TMonitor.Enter(Self);
try
if Suspended then
Start
else
TMonitor.Pulse(Self);
TMonitor.Wait(Self, INFINITE);
finally
TMonitor.Exit(Self);
end;
end;
procedure TMonitorThread.Pong;
begin
inherited;
TMonitor.Enter(Self);
try
TMonitor.Pulse(Self);
if not Terminated then
TMonitor.Wait(Self, INFINITE);
finally
TMonitor.Exit(Self);
end;
end;
procedure TMonitorThread.TerminatedSet;
begin
TMonitor.Enter(Self);
try
TMonitor.Pulse(Self);
finally
TMonitor.Exit(Self);
end;
end;
{ TYieldThread }
procedure TYieldThread.Ping;
begin
inherited;
if Suspended then
Start
else
fState := 3;
while TInterlocked.CompareExchange(fState, 2, 1) <> 1 do
TThread.Yield;
end;
procedure TYieldThread.Pong;
begin
inherited;
fState := 1;
while TInterlocked.CompareExchange(fState, 0, 3) <> 3 do
if Terminated then
Abort
else
TThread.Yield;
end;
{ TSpinWaitThread }
procedure TSpinWaitThread.Ping;
var
w: TSpinWait;
begin
inherited;
if Suspended then
Start
else
fState := 3;
w.Reset;
while TInterlocked.CompareExchange(fState, 2, 1) <> 1 do
w.SpinCycle;
end;
procedure TSpinWaitThread.Pong;
var
w: TSpinWait;
begin
inherited;
fState := 1;
w.Reset;
while TInterlocked.CompareExchange(fState, 0, 3) <> 3 do
if Terminated then
Abort
else
w.SpinCycle;
end;
procedure TestPingPongThread(threadClass: TPingPongThreadClass; quickSwitch: Boolean);
const
MAXCOUNT = 10000;
var
t: TPingPongThread;
i: Integer;
sw: TStopwatch;
w: TSpinWait;
begin
t := threadClass.Create(True);
try
for i := 1 to MAXCOUNT do
begin
t.Ping;
if not quickSwitch then
begin
// simulate some work
w.Reset;
while w.Count < 20 do
w.SpinCycle;
end;
if i = 1 then
begin
if not quickSwitch then
begin
Writeln('Check CPU usage. Press <Enter> to continue');
Readln;
end;
sw := TStopwatch.StartNew;
end;
end;
Writeln(threadClass.ClassName, ' quick switches: ', quickSwitch);
Writeln('Duration: ', sw.ElapsedMilliseconds, ' ms');
Writeln('Call count: ', t.Count);
Writeln;
finally
t.Free;
end;
end;
procedure Main;
begin
TestPingPongThread(TMonitorThread, False);
TestPingPongThread(TYieldThread, False);
TestPingPongThread(TSpinWaitThread, False);
TestPingPongThread(TMonitorThread, True);
TestPingPongThread(TYieldThread, True);
TestPingPongThread(TSpinWaitThread, True);
end;
begin
try
Main;
except
on E: Exception do
Writeln(E.ClassName, ': ', E.Message);
end;
Writeln('Press <Enter> to exit');
Readln;
end.
更新:
我想到了一种事件和自旋等待的组合:
constructor TSpinEvent.Create;
begin
inherited Create(nil, False, False, '');
end;
procedure TSpinEvent.SetEvent;
begin
fState := 1;
inherited;
end;
procedure TSpinEvent.WaitFor;
var
startCount: Cardinal;
begin
startCount := TThread.GetTickCount;
while TInterlocked.CompareExchange(fState, 0, 1) <> 1 do
begin
if (TThread.GetTickCount - startCount) >= YieldTimeout then // YieldTimeout = 10
inherited WaitFor(INFINITE)
else
TThread.Yield;
end;
end;
当进行快速切换时,这种方法的性能仅比基于纤程的实现慢5到6倍,并且添加Ping调用之间的一些工作时,性能减慢不到1%。当使用纤程时,它当然会在2个核心上运行而不是1个。