Lambda performance improvement, Java 8 vs 11

12

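I ran some JMH tests comparing lambda expressions and method references, looking something like this:

IntStream......reduce(Integer::max)
vs.
IntStream.......reduce((i1, i2) -> Integer.max(i1, i2))
What I noticed is that in Java 8 the method reference ran roughly 5 times faster than the lambda. When I ran the tests on Java 11, both approaches were as fast as the method reference on Java 8. So on Java 11 there is no significant performance difference between the lambda and the method reference.
My question is: what changed between Java 8 and 11 that accounts for this speedup? I am using OpenJDK.
Edit: my benchmark: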
import java.util.List;
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;
import org.openjdk.jmh.infra.Blackhole;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@State(Scope.Benchmark)
@Fork(value = 1, jvmArgs = {"-XX:CompileThreshold=5000"})
@Warmup(iterations = 2)
public class FindMaxInt {

    @Param({"10000", "1000000", "10000000"})
    private int n;

    private List<Integer> data;

    @Setup
    public void setup() {
        // createData() is not shown in the original post
        data = createData();
    }

    // Reduction with a method reference
    @Benchmark
    public void streamWithMethodReference(final Blackhole blackhole) {
        int max = data.stream().mapToInt(Integer::intValue).reduce(Integer.MIN_VALUE, Integer::max);
        blackhole.consume(max);
    }

    // Same reduction with an equivalent lambda
    @Benchmark
    public void streamWithLambda(final Blackhole blackhole) {
        int max = data.stream().mapToInt(Integer::intValue).reduce(Integer.MIN_VALUE, (i1, i2) -> Integer.max(i1, i2));
        blackhole.consume(max);
    }
}

1
Please show your benchmark results. I can't reproduce the effect you describe. - apangin
1
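A lambda has one more level of indirection than a method reference, so during JIT compilation an expression with a lambda may hit the inlining depth limit sooner. Try rerunning your test with -XX:MaxInlineLevel=20. - apangin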
1
I still don't understand why the results on Java 8 and 11 are so different. I edited my post and added the benchmark @apangin. - Johan Wiström
1 Answer

23

This is a combination of the effects described in this and that answer.

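The different results come from differences in the inlining tree. A lambda has one more level of indirection than a method reference, so an expression using a lambda may reach the inlining depth limit earlier during JIT compilation. The default is -XX:MaxInlineLevel=9.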
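The extra level of indirection can be made visible with reflection. The sketch below is not from the original answer: the class name LambdaIndirection is made up for illustration, and it assumes compilation with javac, which (as an implementation detail) moves a lambda body into a private synthetic method named lambda$..., while a reference to a static method such as Integer::max gets no such method.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntBinaryOperator;

// Hypothetical demo class, not part of the benchmark in the question.
final class LambdaIndirection {
    // Lambda: javac emits its body as a private synthetic method on this class,
    // and that method in turn calls Integer.max (one extra frame to inline).
    static final IntBinaryOperator LAMBDA = (a, b) -> Integer.max(a, b);

    // Method reference: linked directly to Integer.max, no extra method.
    static final IntBinaryOperator METHOD_REF = Integer::max;

    /** Names of compiler-generated lambda body methods declared on this class. */
    static List<String> syntheticLambdaMethods() {
        List<String> names = new ArrayList<>();
        for (Method m : LambdaIndirection.class.getDeclaredMethods()) {
            // The "lambda$" prefix is a javac naming convention (assumption).
            if (m.isSynthetic() && m.getName().startsWith("lambda$")) {
                names.add(m.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(LAMBDA.applyAsInt(3, 7));
        System.out.println(METHOD_REF.applyAsInt(3, 7));
        // With javac, exactly one synthetic method is expected: the lambda body.
        System.out.println(syntheticLambdaMethods());
    }
}
```

Both operators compute the same result; only the lambda contributes an additional method between the stream machinery and Integer.max, which is the frame that shows up as lambda$streamWithLambda$0 in the inlining log.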

Run the benchmark with -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining to see the whole inlining tree. Here is what we get on JDK 8:
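For readers who want to look at an inlining log without setting up JMH, a rough standalone sketch follows. It is not the benchmark from the question: the class InliningProbe, its createData helper, the fixed seed and the iteration count are all assumptions chosen for illustration, and the log it produces will not match the JMH output line for line.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical standalone probe; names and sizes are illustrative only.
final class InliningProbe {
    // Stand-in for the createData() method that the question does not show.
    static List<Integer> createData(int n) {
        Random random = new Random(42);
        List<Integer> data = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            data.add(random.nextInt());
        }
        return data;
    }

    static int maxWithMethodReference(List<Integer> data) {
        return data.stream().mapToInt(Integer::intValue)
                .reduce(Integer.MIN_VALUE, Integer::max);
    }

    static int maxWithLambda(List<Integer> data) {
        return data.stream().mapToInt(Integer::intValue)
                .reduce(Integer.MIN_VALUE, (i1, i2) -> Integer.max(i1, i2));
    }

    public static void main(String[] args) {
        List<Integer> data = createData(10_000);
        long sink = 0;
        // Repeat the calls so the JIT has a chance to compile the hot paths.
        for (int i = 0; i < 5_000; i++) {
            sink += maxWithLambda(data);
            sink += maxWithMethodReference(data);
        }
        // Print the accumulated value so the work is not trivially dead code.
        System.out.println(sink);
    }
}
```

Running it as `java -XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining InliningProbe` should print inlining decisions for maxWithLambda and maxWithMethodReference. Unlike JMH, this does no forking or measurement, so it is only suitable for inspecting the log, not for timing.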

1563  560       4       bench.FindMaxInt::streamWithLambda (38 bytes)
                           @ 3   java.util.stream.IntPipeline::<init> (7 bytes)   inline (hot)
                             @ 3   java.util.stream.AbstractPipeline::<init> (91 bytes)   inline (hot)
                               @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                 @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                               @ 51   java.util.stream.StreamOpFlag::combineOpFlags (9 bytes)   inline (hot)
                                 @ 2   java.util.stream.StreamOpFlag::getMask (30 bytes)   inline (hot)
                               @ 66   java.util.stream.IntPipeline$StatelessOp::opIsStateful (2 bytes)   inline (hot)
                           @ 4   java.util.Collection::stream (11 bytes)   inline (hot)
                            \-> TypeProfile (5120/5120 counts) = java/util/ArrayList
                             @ 1   java.util.ArrayList::spliterator (12 bytes)   inline (hot)
                               @ 8   java.util.ArrayList$ArrayListSpliterator::<init> (26 bytes)   inline (hot)
                                 @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 7   java.util.stream.StreamSupport::stream (19 bytes)   inline (hot)
                               @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                               @ 11   java.util.stream.StreamOpFlag::fromCharacteristics (37 bytes)   inline (hot)
                                 @ 1   java.util.ArrayList$ArrayListSpliterator::characteristics (4 bytes)   inline (hot)
                                  \-> TypeProfile (5124/5124 counts) = java/util/ArrayList$ArrayListSpliterator
                               @ 15   java.util.stream.ReferencePipeline$Head::<init> (8 bytes)   inline (hot)
                                 @ 4   java.util.stream.ReferencePipeline::<init> (8 bytes)   inline (hot)
                                   @ 4   java.util.stream.AbstractPipeline::<init> (55 bytes)   inline (hot)
                                     @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                           @ 9   java.lang.invoke.LambdaForm$MH/883049899::linkToTargetMethod (8 bytes)   force inline by annotation
                             @ 4   java.lang.invoke.LambdaForm$MH/1922154895::identity_L (8 bytes)   force inline by annotation
                           @ 14   java.util.stream.ReferencePipeline::mapToInt (26 bytes)   inline (hot)
                            \-> TypeProfile (5120/5120 counts) = java/util/stream/ReferencePipeline$Head
                             @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                             @ 22   java.util.stream.ReferencePipeline$4::<init> (20 bytes)   inline (hot)
                               @ 16   java.util.stream.IntPipeline$StatelessOp::<init> (29 bytes)   inline (hot)
                                 @ 3   java.util.stream.IntPipeline::<init> (7 bytes)   inline (hot)
                                   @ 3   java.util.stream.AbstractPipeline::<init> (91 bytes)   inline (hot)
                                     @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                                     @ 51   java.util.stream.StreamOpFlag::combineOpFlags (9 bytes)   inline (hot)
                                       @ 2   java.util.stream.StreamOpFlag::getMask (30 bytes)   inline (hot)
                                     @ 66   java.util.stream.IntPipeline$StatelessOp::opIsStateful (2 bytes)   inline (hot)
                           @ 21   java.lang.invoke.LambdaForm$MH/883049899::linkToTargetMethod (8 bytes)   force inline by annotation
                             @ 4   java.lang.invoke.LambdaForm$MH/1922154895::identity_L (8 bytes)   force inline by annotation
                           @ 26   java.util.stream.IntPipeline::reduce (16 bytes)   inline (hot)
                            \-> TypeProfile (5120/5120 counts) = java/util/stream/ReferencePipeline$4
                             @ 3   java.util.stream.ReduceOps::makeInt (18 bytes)   inline (hot)
                               @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                               @ 14   java.util.stream.ReduceOps$5::<init> (16 bytes)   inline (hot)
                                 @ 12   java.util.stream.ReduceOps$ReduceOp::<init> (10 bytes)   inline (hot)
                                   @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 6   java.util.stream.AbstractPipeline::evaluate (94 bytes)   inline (hot)
                               @ 50   java.util.stream.AbstractPipeline::isParallel (8 bytes)   inline (hot)
                               @ 80   java.util.stream.TerminalOp::getOpFlags (2 bytes)   inline (hot)
                                \-> TypeProfile (5130/5130 counts) = java/util/stream/ReduceOps$5
                               @ 85   java.util.stream.AbstractPipeline::sourceSpliterator (265 bytes)   inline (hot)
                                 @ 79   java.util.stream.AbstractPipeline::isParallel (8 bytes)   inline (hot)
                               @ 88   java.util.stream.ReduceOps$ReduceOp::evaluateSequential (18 bytes)   inline (hot)
                                 @ 2   java.util.stream.ReduceOps$5::makeSink (5 bytes)   inline (hot)
                                   @ 1   java.util.stream.ReduceOps$5::makeSink (16 bytes)   inline (hot)
                                     @ 12   java.util.stream.ReduceOps$5ReducingSink::<init> (15 bytes)   inline (hot)
                                       @ 11   java.lang.Object::<init> (1 bytes)   inline (hot)
                                 @ 6   java.util.stream.AbstractPipeline::wrapAndCopyInto (18 bytes)   inline (hot)
                                   @ 3   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                                   @ 9   java.util.stream.AbstractPipeline::wrapSink (37 bytes)   inline (hot)
                                     @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                                     @ 23   java.util.stream.ReferencePipeline$4::opWrapSink (10 bytes)   inline (hot)
                                      \-> TypeProfile (5081/5081 counts) = java/util/stream/ReferencePipeline$4
                                       @ 6   java.util.stream.ReferencePipeline$4$1::<init> (11 bytes)   inline (hot)
                                         @ 7   java.util.stream.Sink$ChainedReference::<init> (16 bytes)   inline (hot)
                                           @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                                           @ 6   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                                   @ 13   java.util.stream.AbstractPipeline::copyInto (53 bytes)   inline (hot)
                                     @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                                     @ 9   java.util.stream.AbstractPipeline::getStreamAndOpFlags (5 bytes)   accessor
                                     @ 12   java.util.stream.StreamOpFlag::isKnown (19 bytes)   inline (hot)
                                     @ 20   java.util.Spliterator::getExactSizeIfKnown (25 bytes)   inline (hot)
                                      \-> TypeProfile (5081/5081 counts) = java/util/ArrayList$ArrayListSpliterator
                                       @ 1   java.util.ArrayList$ArrayListSpliterator::characteristics (4 bytes)   inline (hot)
                                       @ 19   java.util.ArrayList$ArrayListSpliterator::estimateSize (11 bytes)   inline (hot)
                                         @ 1   java.util.ArrayList$ArrayListSpliterator::getFence (48 bytes)   inline (hot)
                                           @ 38   java.util.ArrayList::access$000 (5 bytes)   accessor
                                     @ 25   java.util.stream.Sink$ChainedReference::begin (11 bytes)   inline (hot)
                                      \-> TypeProfile (5081/5081 counts) = java/util/stream/ReferencePipeline$4$1
                                       @ 5   java.util.stream.ReduceOps$5ReducingSink::begin (9 bytes)   inline (hot)
                                        \-> TypeProfile (5079/5079 counts) = java/util/stream/ReduceOps$5ReducingSink
                                     @ 32   java.util.ArrayList$ArrayListSpliterator::forEachRemaining (129 bytes)   inline (hot)
                                       @ 51   java.util.ArrayList::access$000 (5 bytes)   accessor
                                       @ 99   java.util.stream.ReferencePipeline$4$1::accept (23 bytes)   inline (hot)
                                         @ 12   bench.FindMaxInt$$Lambda$8/390011259::applyAsInt (8 bytes)   inline (hot)
                                          \-> TypeProfile (13752/13752 counts) = bench/FindMaxInt$$Lambda$8
                                           @ 4   java.lang.Integer::intValue (5 bytes)   accessor
                                         @ 17   java.util.stream.ReduceOps$5ReducingSink::accept (19 bytes)   inline (hot)
                                          \-> TypeProfile (13752/13752 counts) = java/util/stream/ReduceOps$5ReducingSink
                                           @ 10   bench.FindMaxInt$$Lambda$9/208515840::applyAsInt (6 bytes)   inline (hot)
                                            \-> TypeProfile (9107/9107 counts) = bench/FindMaxInt$$Lambda$9
                                             @ 2   bench.FindMaxInt::lambda$streamWithLambda$0 (6 bytes)   inline (hot)
                                               @ 2   java.lang.Integer::max (6 bytes)   inlining too deep
                                     @ 38   java.util.stream.Sink$ChainedReference::end (10 bytes)   inline (hot)
                                       @ 4   java.util.stream.Sink::end (1 bytes)   inline (hot)
                                        \-> TypeProfile (5125/5125 counts) = java/util/stream/ReduceOps$5ReducingSink
                                 @ 12   java.util.stream.ReduceOps$5ReducingSink::get (5 bytes)   inline (hot)
                                   @ 1   java.util.stream.ReduceOps$5ReducingSink::get (8 bytes)   inline (hot)
                                     @ 4   java.lang.Integer::valueOf (32 bytes)   inline (hot)
                                       @ 28   java.lang.Integer::<init> (10 bytes)   inline (hot)
                                         @ 1   java.lang.Number::<init> (5 bytes)   inline (hot)
                                           @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 12   java.lang.Integer::intValue (5 bytes)   accessor
                           @ 34   org.openjdk.jmh.infra.Blackhole::consume (28 bytes)   disallowed by CompilerOracle

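The key lines are the ones below. They show that inlining stopped right at the final call to Integer.max, because the default limit of 9 levels had been reached.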

@ 2   bench.FindMaxInt::lambda$streamWithLambda$0 (6 bytes)   inline (hot)
  @ 2   java.lang.Integer::max (6 bytes)   inlining too deep
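The limit in effect on a particular JVM can be read at run time rather than taken on trust. This is a small sketch, assuming a HotSpot-based JVM where the com.sun.management.HotSpotDiagnosticMXBean interface is available; the class name InlineLimit is made up for illustration.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper; works only on HotSpot-based JVMs (assumption).
final class InlineLimit {
    /** Returns the current value of a HotSpot VM option as a string. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        // Reflects the default, or whatever -XX:MaxInlineLevel=... was passed.
        System.out.println("MaxInlineLevel = " + vmOption("MaxInlineLevel"));
    }
}
```

Starting the program with -XX:MaxInlineLevel=20, as suggested in the comments on the question, should be reflected in the printed value, which makes it easy to confirm that a flag really reached the forked JVM.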

On JDK 11 the inlining tree has a very different shape:

1588  705       4       bench.FindMaxInt::streamWithLambda (38 bytes)
                           @ 4   java.util.Collection::stream (11 bytes)   inline (hot)
                            \-> TypeProfile (5263/5263 counts) = java/util/ArrayList
                             @ 1   java.util.ArrayList::spliterator (12 bytes)   inline (hot)
                               @ 8   java.util.ArrayList$ArrayListSpliterator::<init> (26 bytes)   inline (hot)
                                 @ 6   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 7   java.util.stream.StreamSupport::stream (19 bytes)   inline (hot)
                               @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                               @ 11   java.util.stream.StreamOpFlag::fromCharacteristics (37 bytes)   inline (hot)
                                 @ 1   java.util.ArrayList$ArrayListSpliterator::characteristics (4 bytes)   inline (hot)
                                  \-> TypeProfile (5125/5125 counts) = java/util/ArrayList$ArrayListSpliterator
                               @ 15   java.util.stream.ReferencePipeline$Head::<init> (8 bytes)   inline (hot)
                                 @ 4   java.util.stream.ReferencePipeline::<init> (8 bytes)   inline (hot)
                                   @ 4   java.util.stream.AbstractPipeline::<init> (55 bytes)   inline (hot)
                                     @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                           @ 9   java.lang.invoke.Invokers$Holder::linkToTargetMethod (8 bytes)   force inline by annotation
                             @ 4   java.lang.invoke.LambdaForm$MH/0x0000000800060440::invoke (8 bytes)   force inline by annotation
                           @ 14   java.util.stream.ReferencePipeline::mapToInt (26 bytes)   inline (hot)
                            \-> TypeProfile (5263/5263 counts) = java/util/stream/ReferencePipeline$Head
                             @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                             @ 22   java.util.stream.ReferencePipeline$4::<init> (20 bytes)   inline (hot)
                               @ 16   java.util.stream.IntPipeline$StatelessOp::<init> (29 bytes)   inline (hot)
                                 @ 3   java.util.stream.IntPipeline::<init> (7 bytes)   inline (hot)
                                   @ 3   java.util.stream.AbstractPipeline::<init> (91 bytes)   inline (hot)
                                     @ 1   java.util.stream.PipelineHelper::<init> (5 bytes)   inline (hot)
                                       @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                                     @ 51   java.util.stream.StreamOpFlag::combineOpFlags (9 bytes)   inline (hot)
                                       @ 2   java.util.stream.StreamOpFlag::getMask (30 bytes)   inline (hot)
                                     @ 66   java.util.stream.IntPipeline$StatelessOp::opIsStateful (2 bytes)   inline (hot)
                           @ 21   java.lang.invoke.Invokers$Holder::linkToTargetMethod (8 bytes)   force inline by annotation
                             @ 4   java.lang.invoke.LambdaForm$MH/0x0000000800060440::invoke (8 bytes)   force inline by annotation
                           @ 26   java.util.stream.IntPipeline::reduce (16 bytes)   inline (hot)
                            \-> TypeProfile (5263/5263 counts) = java/util/stream/ReferencePipeline$4
                             @ 3   java.util.stream.ReduceOps::makeInt (18 bytes)   inline (hot)
                               @ 1   java.util.Objects::requireNonNull (14 bytes)   inline (hot)
                               @ 14   java.util.stream.ReduceOps$6::<init> (16 bytes)   inline (hot)
                                 @ 12   java.util.stream.ReduceOps$ReduceOp::<init> (10 bytes)   inline (hot)
                                   @ 1   java.lang.Object::<init> (1 bytes)   inline (hot)
                             @ 6   java.util.stream.AbstractPipeline::evaluate (94 bytes)   inline (hot)
                               @ 50   java.util.stream.AbstractPipeline::isParallel (8 bytes)   inline (hot)
                               @ 80   java.util.stream.TerminalOp::getOpFlags (2 bytes)   inline (hot)
                                \-> TypeProfile (5362/5362 counts) = java/util/stream/ReduceOps$6
                               @ 85   java.util.stream.AbstractPipeline::sourceSpliterator (265 bytes)   inline (hot)
                                 @ 79   java.util.stream.AbstractPipeline::isParallel (8 bytes)   inline (hot)
                               @ 88   java.util.stream.ReduceOps$ReduceOp::evaluateSequential (18 bytes)   already compiled into a big method
                             @ 12   java.lang.Integer::intValue (5 bytes)   accessor
                           @ 34   org.openjdk.jmh.infra.Blackhole::consume (28 bytes)   disallowed by CompileCommand
Here the inlining tree is cut off earlier, and for a different reason:
@ 88   java.util.stream.ReduceOps$ReduceOp::evaluateSequential (18 bytes)   already compiled into a big method

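The default garbage collector in JDK 11 has changed to G1. Because of G1 barriers, the compiled code comes out larger, so the inlining heuristics keep the hottest loop, forEachRemaining, from being inlined into the streamWithLambda method.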

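So this is not actually an optimization in JDK 11; it is rather the opposite. However, because the inlining tree is cut off outside the hottest loop, this benchmark performs better overall.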
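Since the explanation hinges on which collector is active, it may be worth confirming what a given JVM actually selected. The sketch below uses only the standard java.lang.management API; the class name ActiveCollectors is hypothetical, and the exact collector names are JVM-specific strings, so treat them as informational.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for checking which collectors the running JVM uses.
final class ActiveCollectors {
    /** Names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // On HotSpot, names starting with "G1" indicate the G1 collector.
        System.out.println(collectorNames());
    }
}
```

Note that HotSpot may pick a different default on small machines, so the output is more reliable than the JDK version alone. One way to test the explanation above would be to rerun the JDK 11 benchmark with -XX:+UseParallelGC and compare the inlining tree; whether the result then matches JDK 8 is not something the original answer reports.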

[Image: inlining tree]


What do you mean by "hottest loop"? - Thomas Banderas
3
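The loop that consumes the most CPU time in the code, which is the one that iterates over the stream elements. You can find it here: http://hg.openjdk.java.net/jdk/jdk/file/cd701366fcf8/src/java.base/share/classes/java/util/ArrayList.java#l1652 - apangin
After some reading, should I understand (it isn't stated explicitly here) that the JIT optimized the hot loop in both cases, but Java 8 de-optimized it by compiling some hot methods separately, so that the (hottest) Integer.max is now uncompiled? And that Java 11, thanks to a G1 quirk, accidentally doesn't do this and therefore keeps the original compilation? - drekbour
@drekbour Deoptimization is irrelevant here. Besides, everything here is compiled. Integer.max is not inlined, which is a different thing from not being compiled. - apangin
Got it, s/compile/inline/g. But why does Java 11 stay fast in this case? (Also: is tiered compilation at play here?) - drekbour
@drekbour Inlining. In JDK 11, the method forEachRemaining is compiled as a single unit. The loop compiled by JDK 8 has a method call inside it (which means overhead). - apangin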

Content provided by Stack Overflow.