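Does this mean that every time an I/O operation completes, a new thread pool thread is acquired for it? Or is there a dedicated number of threads for this?
It would be terribly inefficient to create a new thread for every I/O request, to the point of defeating the purpose. Instead, the runtime starts with a small number of threads (the exact number depends on your environment) and adds and removes worker threads as needed (the exact algorithm likewise varies with your environment). Every major version of .NET has changed this implementation, but the basic idea stays the same: the runtime does its best to create and maintain only as many threads as it needs to service all I/O efficiently. On my system (Windows 8.1, .NET 4.5.2), a brand-new console application has only 3 threads in the process on entering Main, and that number does not increase until actual work is requested.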
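So does this mean I will have 1000 IOCP thread pool threads running at the same time once everything has finished?
No. When you issue an I/O request, a thread will be waiting on the completion port to pick up the result and call whatever callback was registered to handle it (whether through a BeginXXX method or as the continuation of a task). If you use a task and do not await it, the task simply ends there and the thread is returned to the thread pool.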
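What if you did await it? The results of 1000 I/O requests will not really arrive all at once, since the interrupts do not all arrive at the same time, but let's say the interval between them is much shorter than the time we need to process them. In that case, the thread pool will keep spinning up threads to handle the results until it reaches its maximum, and any further results will end up queued on the completion port. Depending on how the pool is configured, those threads may take some time to spin up.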
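For comparison with the BeginXXX style used in the rest of this answer, here is a minimal sketch of a single awaited read. The class name AwaitedReadSketch, the method ReadOnceAsync and the temporary file are my own placeholders for illustration. The code after the await is the "continuation of a task" mentioned above; which thread runs it depends on the runtime and on whether the read happened to complete synchronously, so the sketch only prints what it observes.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

static class AwaitedReadSketch
{
    // Hypothetical helper: issues one asynchronous read and returns the byte count.
    public static async Task<int> ReadOnceAsync(string path)
    {
        var buffer = new byte[1024];
        using (var stream = new FileStream(
            path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite,
            buffer.Length, FileOptions.Asynchronous))
        {
            // The method is suspended here; no thread is blocked while the read is pending.
            int bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);

            // Everything after the await is the continuation of the read task.
            Console.WriteLine(
                "Continuation on thread {0} (thread pool thread: {1})",
                Thread.CurrentThread.ManagedThreadId,
                Thread.CurrentThread.IsThreadPoolThread);
            return bytesRead;
        }
    }

    static void Main()
    {
        // A throwaway file so the sketch does not depend on any particular path.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello");
        try
        {
            // Blocking on the task is acceptable in a console Main; avoid it in UI code.
            int n = ReadOnceAsync(path).GetAwaiter().GetResult();
            Console.WriteLine("Read {0} bytes", n);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

If the continuation returns promptly, as it does here, the thread that ran it goes straight back to the pool, which is the behaviour the rest of this answer relies on.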
Consider the following (deliberately awful) toy program:
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;

static class Program {
    static void Main(string[] args) {
        printThreadCounts();
        var buffer = new byte[1024];
        const int requestCount = 30;
        int pendingRequestCount = requestCount;
        for (int i = 0; i != requestCount; ++i) {
            // FileOptions.Asynchronous requests overlapped I/O, so completions
            // are delivered through the I/O completion port.
            var stream = new FileStream(
                @"C:\Windows\win.ini",
                FileMode.Open, FileAccess.Read, FileShare.ReadWrite,
                buffer.Length, FileOptions.Asynchronous
            );
            stream.BeginRead(
                buffer, 0, buffer.Length,
                delegate {
                    Interlocked.Decrement(ref pendingRequestCount);
                    // Deliberately awful: block the completion thread forever.
                    Thread.Sleep(Timeout.Infinite);
                }, null
            );
        }
        // Report thread counts once per second until every callback has started.
        do {
            printThreadCounts();
            Thread.Sleep(1000);
        } while (Thread.VolatileRead(ref pendingRequestCount) != 0);
        Console.WriteLine(new String('=', 40));
        printThreadCounts();
    }

    // Prints busy pool threads (maximum minus available) and the OS thread count.
    private static void printThreadCounts() {
        int completionPortThreads, maxCompletionPortThreads;
        int workerThreads, maxWorkerThreads;
        ThreadPool.GetMaxThreads(out maxWorkerThreads, out maxCompletionPortThreads);
        ThreadPool.GetAvailableThreads(out workerThreads, out completionPortThreads);
        Console.WriteLine(
            "Worker threads: {0}, Completion port threads: {1}, Total threads: {2}",
            maxWorkerThreads - workerThreads,
            maxCompletionPortThreads - completionPortThreads,
            Process.GetCurrentProcess().Threads.Count
        );
    }
}
On my system (which has 8 logical processors), the output is as follows (results may vary on your system):
Worker threads: 0, Completion port threads: 0, Total threads: 3
Worker threads: 0, Completion port threads: 8, Total threads: 12
Worker threads: 0, Completion port threads: 9, Total threads: 13
Worker threads: 0, Completion port threads: 11, Total threads: 15
Worker threads: 0, Completion port threads: 13, Total threads: 17
Worker threads: 0, Completion port threads: 15, Total threads: 19
Worker threads: 0, Completion port threads: 17, Total threads: 21
Worker threads: 0, Completion port threads: 19, Total threads: 23
Worker threads: 0, Completion port threads: 21, Total threads: 25
Worker threads: 0, Completion port threads: 23, Total threads: 27
Worker threads: 0, Completion port threads: 25, Total threads: 29
Worker threads: 0, Completion port threads: 27, Total threads: 31
Worker threads: 0, Completion port threads: 29, Total threads: 33
========================================
Worker threads: 0, Completion port threads: 30, Total threads: 34
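When we issue 30 asynchronous requests, the thread pool quickly makes 8 threads available to handle the results, but after that it only spins up new threads at a leisurely pace of about 2 per second. This demonstrates that if you want to make proper use of system resources, you had better make sure your I/O processing completes quickly. Indeed, let's change our delegate to the following, which represents "proper" processing of the request: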
            stream.BeginRead(
                buffer, 0, buffer.Length,
                ar => {
                    stream.EndRead(ar);
                    Interlocked.Decrement(ref pendingRequestCount);
                }, null
            );
Result:
Worker threads: 0, Completion port threads: 0, Total threads: 3
Worker threads: 0, Completion port threads: 1, Total threads: 11
========================================
Worker threads: 0, Completion port threads: 0, Total threads: 11
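Again, results may vary between systems and between runs. Here we barely glimpse the completion port threads in action, and the 30 requests we issued complete without any new threads being spun up. You should find that you can change "30" to "100" or even "100000": our loop cannot start requests faster than they complete. Note, however, that the results are skewed heavily in our favor, because the "I/O" reads the same bytes over and over and is serviced from the operating system cache rather than from disk. This is not meant to demonstrate realistic throughput, of course, only the difference in overhead.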
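The "proper" callback above still leaves every FileStream open and shares one buffer across all requests, which is fine for a toy. As a rough sketch of what a slightly tidier version might look like, the following gives each request its own buffer, disposes the stream in the callback, and waits on a CountdownEvent instead of polling. The names CompletionSketch and ReadMany are placeholders of mine, and error handling is intentionally minimal.

```csharp
using System;
using System.IO;
using System.Threading;

static class CompletionSketch
{
    // Hypothetical helper: issues requestCount asynchronous reads of the same
    // file and returns the total number of bytes read across all of them.
    public static int ReadMany(string path, int requestCount)
    {
        int totalBytes = 0;
        using (var done = new CountdownEvent(requestCount))
        {
            for (int i = 0; i != requestCount; ++i)
            {
                var buffer = new byte[1024]; // one buffer per request
                var stream = new FileStream(
                    path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite,
                    buffer.Length, FileOptions.Asynchronous);
                stream.BeginRead(
                    buffer, 0, buffer.Length,
                    ar =>
                    {
                        try
                        {
                            // EndRead returns the byte count and rethrows any I/O error.
                            // An exception here is unhandled on a pool thread and would
                            // end the process; a real program would catch and record it.
                            Interlocked.Add(ref totalBytes, stream.EndRead(ar));
                        }
                        finally
                        {
                            stream.Dispose();
                            done.Signal(); // keep the callback short so the thread is freed
                        }
                    }, null);
            }
            done.Wait(); // block the caller until every callback has run
        }
        return totalBytes;
    }

    static void Main()
    {
        // A throwaway file so the sketch does not depend on any particular path.
        string path = Path.GetTempFileName();
        File.WriteAllText(path, "hello");
        try
        {
            Console.WriteLine("Total bytes read: {0}", ReadMany(path, 30));
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

The point is the same as in the answer's own snippet: the callback does a small, bounded amount of work and returns, so a handful of completion port threads can service all requests.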
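To repeat these results with worker threads rather than completion port threads, simply change FileOptions.Asynchronous to FileOptions.None. This makes file access synchronous, and the asynchronous operations are then completed on worker threads rather than through the completion port. With the original, deliberately awful callback the output is: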
Worker threads: 0, Completion port threads: 0, Total threads: 3
Worker threads: 8, Completion port threads: 0, Total threads: 15
Worker threads: 9, Completion port threads: 0, Total threads: 16
Worker threads: 10, Completion port threads: 0, Total threads: 17
Worker threads: 11, Completion port threads: 0, Total threads: 18
Worker threads: 12, Completion port threads: 0, Total threads: 19
Worker threads: 13, Completion port threads: 0, Total threads: 20
Worker threads: 14, Completion port threads: 0, Total threads: 21
Worker threads: 15, Completion port threads: 0, Total threads: 22
Worker threads: 16, Completion port threads: 0, Total threads: 23
Worker threads: 17, Completion port threads: 0, Total threads: 24
Worker threads: 18, Completion port threads: 0, Total threads: 25
Worker threads: 19, Completion port threads: 0, Total threads: 26
Worker threads: 20, Completion port threads: 0, Total threads: 27
Worker threads: 21, Completion port threads: 0, Total threads: 28
Worker threads: 22, Completion port threads: 0, Total threads: 29
Worker threads: 23, Completion port threads: 0, Total threads: 30
Worker threads: 24, Completion port threads: 0, Total threads: 31
Worker threads: 25, Completion port threads: 0, Total threads: 32
Worker threads: 26, Completion port threads: 0, Total threads: 33
Worker threads: 27, Completion port threads: 0, Total threads: 34
Worker threads: 28, Completion port threads: 0, Total threads: 35
Worker threads: 29, Completion port threads: 0, Total threads: 36
========================================
Worker threads: 30, Completion port threads: 0, Total threads: 37
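Here the thread pool spins up one worker thread per second, rather than the two per second it managed for completion port threads. Obviously these numbers are implementation-dependent and may change in new releases.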
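Finally, let's demonstrate the use of ThreadPool.SetMinThreads to ensure that a minimum number of threads is available to complete requests. If we go back to FileOptions.Asynchronous and add ThreadPool.SetMinThreads(50, 50) to the Main of our toy program, the result is: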
Worker threads: 0, Completion port threads: 0, Total threads: 3
Worker threads: 0, Completion port threads: 31, Total threads: 35
========================================
Worker threads: 0, Completion port threads: 30, Total threads: 35
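Now, instead of patiently adding a couple of threads per second, the thread pool keeps spinning up threads until the maximum is reached (which does not happen in this case, so the final count stays at 30). Of course, all 30 of these threads are stuck in infinite waits, but if this had been a real system, those 30 threads would presumably now be doing useful, if not terribly efficient, work. I would not try this with 100000 requests, though.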
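For reference, here is a small sketch of how the SetMinThreads call might be made and checked. SetMinThreads returns false, and changes nothing, if a requested value is negative or exceeds the corresponding maximum reported by GetMaxThreads, so the sketch clamps the request first. MinThreadsSketch and RaiseMinimums are placeholder names of mine, and 50 is just the value used above.

```csharp
using System;
using System.Threading;

static class MinThreadsSketch
{
    // Hypothetical helper: raises both pool minimums, clamped to the current maximums.
    public static bool RaiseMinimums(int workerThreads, int completionPortThreads)
    {
        int maxWorkers, maxCompletionPorts;
        ThreadPool.GetMaxThreads(out maxWorkers, out maxCompletionPorts);

        // A request above the maximum would be rejected outright, so clamp it.
        return ThreadPool.SetMinThreads(
            Math.Min(workerThreads, maxWorkers),
            Math.Min(completionPortThreads, maxCompletionPorts));
    }

    static void Main()
    {
        int workers, completionPorts;
        ThreadPool.GetMinThreads(out workers, out completionPorts);
        Console.WriteLine("Minimums before: {0} worker, {1} completion port",
            workers, completionPorts);

        bool accepted = RaiseMinimums(50, 50);

        ThreadPool.GetMinThreads(out workers, out completionPorts);
        Console.WriteLine("Accepted: {0}; minimums after: {1} worker, {2} completion port",
            accepted, workers, completionPorts);
    }
}
```

Raising the minimums only removes the delay before new threads are created on demand; it does not create the threads up front, and it does nothing to make slow callbacks cheaper.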