我目前正在运行一些JMeter测试,以测试Web服务的性能。它使用了非常高的CPU。对于一个JMeter请求线程,它使用的CPU占比在10-30%之间(取决于请求类型)。当我将它提升到仅有15个线程时,我获得了约95%的CPU利用率。自然地,我想弄清楚发生了什么。我进行了Hprof CPU采样(我尝试了times选项,但启动我的服务需要一个半小时,而且没有消息会通过)。以下是该采样结果的片段(在15分钟内)。
CPU样本开始(总数= 220846)2014年8月22日星期五13:38:54 排名 自身 累计 计数 跟踪方法 1 14.96% 14.96% 33038 300514 java.net.PlainSocketImpl.socketAccept 2 14.84% 29.80% 32776 301258 sun.nio.ch.EPollArrayWrapper.epollWait 3 12.45% 42.26% 27505 313002 sun.nio.ch.EPollArrayWrapper.epollWait 4 7.48% 49.73% 16517 300604 java.net.PlainSocketImpl.socketAccept 5 7.18% 56.91% 15856 303203 sun.nio.ch.EPollArrayWrapper.epollWait 6 6.18% 63.09% 13639 313001 sun.nio.ch.ServerSocketChannelImpl.accept0 7 6.04% 69.13% 13329 304259 sun.nio.ch.EPoll.epollWait 8 5.11% 74.23% 11275 307102 sun.nio.ch.EPollArrayWrapper.epollWait对于这些顶部样本的相应堆栈:
TRACE 300514: java.net.PlainSocketImpl.socketAccept(:Unknown line) java.net.AbstractPlainSocketImpl.accept(:Unknown line) java.net.ServerSocket.implAccept(:Unknown line) java.net.ServerSocket.accept(:Unknown line) sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(:Unknown line) sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(:Unknown line) java.lang.Thread.run(:Unknown line) TRACE 301258: sun.nio.ch.EPollArrayWrapper.epollWait(:Unknown line) sun.nio.ch.EPollArrayWrapper.poll(:Unknown line) sun.nio.ch.EPollSelectorImpl.doSelect(:Unknown line) sun.nio.ch.SelectorImpl.lockAndDoSelect(:Unknown line) sun.nio.ch.SelectorImpl.select(:Unknown line) org.apache.tomcat.util.net.NioBlockingSelector$BlockPoller.run(NioBlockingSelector.java:327) TRACE 313002: sun.nio.ch.EPollArrayWrapper.epollWait(:Unknown line) sun.nio.ch.EPollArrayWrapper.poll(:Unknown line) sun.nio.ch.EPollSelectorImpl.doSelect(:Unknown line) sun.nio.ch.SelectorImpl.lockAndDoSelect(:Unknown line) sun.nio.ch.SelectorImpl.select(:Unknown line) org.apache.tomcat.util.net.NioEndpoint$Poller.run(NioEndpoint.java:1163) java.lang.Thread.run(:Unknown line) TRACE 300604: java.net.PlainSocketImpl.socketAccept(:Unknown line) java.net.AbstractPlainSocketImpl.accept(:Unknown line) java.net.ServerSocket.implAccept(:Unknown line) java.net.ServerSocket.accept(:Unknown line) sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(:Unknown line) sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(:Unknown line) sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(:Unknown line) java.lang.Thread.run(:Unknown line) TRACE 303203: sun.nio.ch.EPollArrayWrapper.epollWait(:Unknown line) sun.nio.ch.EPollArrayWrapper.poll(:Unknown line) sun.nio.ch.EPollSelectorImpl.doSelect(:Unknown line) sun.nio.ch.SelectorImpl.lockAndDoSelect(:Unknown line) sun.nio.ch.SelectorImpl.select(:Unknown line) net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:217) net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:836) TRACE 313001: sun.nio.ch.ServerSocketChannelImpl.accept0(:Unknown line) sun.nio.ch.ServerSocketChannelImpl.accept(:Unknown line) org.apache.tomcat.util.net.NioEndpoint$Acceptor.run(NioEndpoint.java:793) java.lang.Thread.run(:Unknown line) TRACE 304259: sun.nio.ch.EPoll.epollWait(:Unknown line) sun.nio.ch.EPollPort$EventHandlerTask.poll(:Unknown line) sun.nio.ch.EPollPort$EventHandlerTask.run(:Unknown line) java.lang.Thread.run(:Unknown line) TRACE 307102: sun.nio.ch.EPollArrayWrapper.epollWait(:Unknown line) sun.nio.ch.EPollArrayWrapper.poll(:Unknown line) sun.nio.ch.EPollSelectorImpl.doSelect(:Unknown line) sun.nio.ch.SelectorImpl.lockAndDoSelect(:Unknown line) sun.nio.ch.SelectorImpl.select(:Unknown line) net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:217) net.spy.memcached.MemcachedConnection.run(MemcachedConnection.java:836)如您所见,超过一半的CPU使用率似乎来自应该在等待的线程。这不应该占用CPU时间,对吗?
我看到了这个帖子http://www.brendangregg.com/blog/2014-06-09/java-cpu-sampling-using-hprof.html,它让我想到这个结果可能是误导性的,但我的“top -H”结果显示最大的CPU使用率,Zabbix监控也是如此。因此,它实际上正在消耗CPU。然而,那里有一个链接引用了hprof作者的一句话:
当您有Java线程没有使用CPU,但设法保持活动状态时,当它们没有使用CPU时,它们的堆栈跟踪将似乎消耗大量的CPU时间。有人能解释一下为什么会出现这种情况以及在这些情况下我可以做些什么来减少CPU使用率吗?或者所有CPU使用率指标都是误导性的吗?如果是这样,了解服务的真正CPU利用率的更好方法是什么?