How does Java 11 string concatenation performance compare with Java 8?

6

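Does anyone know why I see such a large performance difference when running this code on Java 8 and Java 11?

Without any runtime flags, this code appears to run much slower under Java 11 than under Java 8.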

import java.util.Date;

public class PerformanceExperiment {
    public static volatile String s = "";

    public static void main(String[] args)
    {                         
        System.out.println("Starting performance test");
        String s1 = "STRING ONE";
        String s2 = "STRING TWO";
        long now1 = (new Date()).getTime();
        for (long i = 0; i < 1_000_000_00; i++)
        {
            s = "abc " + s1 + " def " + s2;
        }
        long now2 = (new Date()).getTime();
        System.out.println("initial block took " + (now2 - now1) + "ms");
        for (long i = 0; i < 4_000_000_00; i++)
        {
            s = "abc " + s1 + " def " + s2;
        }
        long now3 = (new Date()).getTime();
        System.out.println("Main block took " + (now3 - now2) + "ms");
    }
}

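I have tried many command-line flags, but none of them brought Java 11 up to Java 8's performance.

I have only tested on Windows, so it may behave differently on other operating systems.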


1
Good read: https://redfin.engineering/java-string-concatenation-which-way-is-best-8f590a7d22a8 - Mebin Joe
1
Does the problem really exist? What are your numbers? - VSB
1
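I have been using Java almost exclusively for over a decade, and I only just learned that you can put underscores in int literals... mind blown - ryvantage
@Alex, why don't you post your output? You say "significantly slower", but what counts as significant? Does this program print anything? - ryvantage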
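This may be a bug in the JDK itself. I tried your code and got: initial block took 1809ms, Main block took 5918ms. I tried it again after adding "final" to "s1" and "s2" and got: initial block took 118ms, Main block took 432ms. I then raised the iteration count to Integer.MAX_VALUE and got: initial block took 31300ms, Main block took 31127ms. And with Integer.MAX_VALUE plus final: initial block took 1578ms, Main block took 1327ms - Bogdan B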
3 Answers

8

I modified your application to:

  1. Use System.nanoTime() instead of new Date(), for better precision (see this answer for more information: https://dev59.com/WHI-5IYBdhLWcg3wm5pH#1776053).
  2. Run under the Netbeans profiler.
  3. Loop 10 times.
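The timing change in step 1 can be sketched roughly as follows. This is a minimal illustration rather than the exact code behind the numbers below: the class name NanoTimingSketch, the helper timeMillis, and the reduced iteration count are all assumptions made for the example.

```java
import java.util.concurrent.TimeUnit;

final class NanoTimingSketch {
    // Volatile sink, as in the question, so the JIT cannot discard the concatenation.
    static volatile String sink = "";

    /** Runs the task once and returns the elapsed time in whole milliseconds. */
    static long timeMillis(Runnable task) {
        long start = System.nanoTime(); // meant for measuring intervals, not tied to wall-clock time
        task.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        String s1 = "STRING ONE";
        String s2 = "STRING TWO";
        // Repeat the measurement so later runs execute JIT-compiled code.
        for (int run = 0; run < 10; run++) {
            long ms = timeMillis(() -> {
                // Hypothetical, much smaller iteration count than in the question.
                for (int i = 0; i < 1_000_000; i++) {
                    sink = "abc " + s1 + " def " + s2;
                }
            });
            System.out.println("run " + run + " took " + ms + "ms");
        }
    }
}
```

System.nanoTime() only gives meaningful values as a difference between two readings, which is why the helper returns an elapsed duration rather than a timestamp.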

With Netbeans 8.2 and JDK 8 v181:

Starting performance test 0
initial block took 3147ms
Main block took 9469ms
Starting performance test 1
initial block took 2398ms
Main block took 9601ms
Starting performance test 2
initial block took 2463ms
Main block took 9671ms
Starting performance test 3
initial block took 2464ms
Main block took 9565ms
Starting performance test 4
initial block took 2410ms
Main block took 9672ms
Starting performance test 5
initial block took 2418ms
Main block took 9598ms
Starting performance test 6
initial block took 2384ms
Main block took 9733ms
Starting performance test 7
initial block took 2402ms
Main block took 9610ms
Starting performance test 8
initial block took 2509ms
Main block took 11222ms
Starting performance test 9
initial block took 2455ms
Main block took 10661ms

And the profiler showed this telemetry:

[Screenshot: Netbeans profiler telemetry for the JDK 8 run]

With Netbeans 10.0 and JDK 11.0.2:

Starting performance test 0
initial block took 3760ms
Main block took 15056ms
Starting performance test 1
initial block took 3734ms
Main block took 14602ms
Starting performance test 2
initial block took 3615ms
Main block took 14762ms
Starting performance test 3
initial block took 3748ms
Main block took 14534ms
Starting performance test 4
initial block took 3628ms
Main block took 14759ms
Starting performance test 5
initial block took 3625ms
Main block took 14959ms
Starting performance test 6
initial block took 3987ms
Main block took 14967ms
Starting performance test 7
initial block took 3803ms
Main block took 14701ms
Starting performance test 8
initial block took 3599ms
Main block took 14762ms
Starting performance test 9
initial block took 3627ms
Main block took 14434ms

[Screenshot: Netbeans profiler telemetry for the JDK 11 run]

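My conclusion: JDK 11 does more work to be memory efficient. Note that the number of "surviving generations" in the garbage collector is far lower, and that memory usage and its volatility are also significantly lower. The tradeoff appears to be speed, but the speed difference is smaller than the difference in memory usage.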

6

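TL;DR: This needs a better benchmark, better settings to control for the differences between versions, and so on. Most of the benchmarking problems are easily fixed by using JMH. The current test behavior seems to be explained by a questionable benchmarking approach and by the change in the default GC.

Consider: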

public class PerformanceExperiment {
    public static volatile String s = "";

    public static void main(String[] args) {
        for (int c = 0; c < 5; c++) {
            test();
        }
    }

    public static void test() {
        String s1 = "STRING ONE";
        String s2 = "STRING TWO";
        long time1 = System.currentTimeMillis();
        for (long i = 0; i < 4_000_000_00; i++) {
            s = "abc " + s1 + " def " + s2;
        }
        long time2 = System.currentTimeMillis();
        System.out.println("Main block took " + (time2 - time1) + "ms");
    }
}

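First, it uses more convenient timing. Second, it measures the same bytecode block each time, whereas the original test warms up on the "initial block" loop and then goes on to measure a completely cold second loop.

Next, JIT compilation will hit the method, and you want to re-enter the method so that the optimized code runs; otherwise you are running the intermediate "on-stack replacement" (OSR) code. The outer iteration that calls test() takes care of this. On top of that, you want to enter the method several times to capture the most optimized version.

Since the test allocates a lot of memory, you need to fix the heap size.

So, here we go: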

$ ~/Install/jdk8u191-rh/bin/javac PerformanceExperiment.java
$ ~/Install/jdk8u191-rh/bin/java -Xms2g -Xmx2g PerformanceExperiment
Main block took 10024ms
Main block took 9768ms
Main block took 7249ms
Main block took 7235ms
Main block took 7205ms

...and this is 11.0.2 on the same bytecode:

$ ~/Install/jdk11.0.2/bin/java -Xms2g -Xmx2g PerformanceExperiment
Main block took 9775ms
Main block took 10825ms
Main block took 8635ms
Main block took 8616ms
Main block took 8622ms

And here is 11.0.2 with a matching GC (9+ changed the default GC to G1, see JEP 248):

$ ~/Install/jdk11.0.2/bin/java -Xms2g -Xmx2g -XX:+UseParallelGC PerformanceExperiment
Main block took 9281ms
Main block took 9129ms
Main block took 6725ms
Main block took 6688ms
Main block took 6684ms
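
When comparing runs like these, it can help to print which collector the JVM actually selected. The following is a small sketch using the standard java.lang.management API; the class name GcReport and the method collectorNames are made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcReport {
    /** Returns the names of the garbage collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Names depend on the collector in use, for example "G1 Young Generation"
        // under G1 or "PS Scavenge" under -XX:+UseParallelGC.
        System.out.println("Java " + System.getProperty("java.version")
                + " collectors: " + collectorNames());
    }
}
```

Running it once per JDK, with and without -XX:+UseParallelGC, shows whether two measurements are really using the same collector.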

Additionally, there is a volatile store on every tiny iteration, which costs quite a bit and very likely skews the benchmark results.
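
One way to take the volatile store out of the loop is to fold each result into a local value and use that value once at the end. This is only a sketch under that assumption: the class NoVolatileSink and its run method are hypothetical, and removing the store also changes what is measured, because the JIT is then free to optimize the concatenation more aggressively. JMH's result-consuming facilities are the more robust option.

```java
final class NoVolatileSink {
    /**
     * Concatenates in a loop and adds each result's length to a local
     * checksum instead of writing to a volatile field.
     */
    static long run(String s1, String s2, int iterations) {
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            String s = "abc " + s1 + " def " + s2;
            checksum += s.length(); // keeps the result observable without a volatile store
        }
        return checksum;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long checksum = run("STRING ONE", "STRING TWO", 1_000_000);
        long ms = (System.nanoTime() - start) / 1_000_000;
        // Printing the checksum makes the loop's work visible to the program's output.
        System.out.println("checksum " + checksum + ", took " + ms + "ms");
    }
}
```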

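There are also interactions with indified string concatenation (JEP 280), thread-local handshakes (JEP 312), and other VM fixes, but those should only become visible after compiling with a target newer than 8, which is beyond the scope of this exercise.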
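
To make the JEP 280 point concrete: javac targeting Java 8 compiles the + expression into a StringBuilder chain much like viaBuilder below, while javac targeting 9 or later emits an invokedynamic call bootstrapped by java.lang.invoke.StringConcatFactory. The sketch only shows the two source-level shapes; the class and method names are made up for illustration.

```java
final class ConcatShapes {
    /** The expression from the question; its bytecode depends on the javac target. */
    static String viaPlus(String s1, String s2) {
        return "abc " + s1 + " def " + s2;
    }

    /** Roughly what javac generates for viaPlus when targeting Java 8. */
    static String viaBuilder(String s1, String s2) {
        return new StringBuilder()
                .append("abc ")
                .append(s1)
                .append(" def ")
                .append(s2)
                .toString();
    }

    public static void main(String[] args) {
        String a = viaPlus("STRING ONE", "STRING TWO");
        String b = viaBuilder("STRING ONE", "STRING TWO");
        // Both shapes produce the same string; only the generated bytecode differs.
        System.out.println(a.equals(b));
    }
}
```

Compiling this once with javac --release 8 and once with javac --release 11, then inspecting each result with javap -c, shows the difference in the bytecode of viaPlus.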


1
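I agree with all the points you make. Still, as I see it, a fairly simple and small code sample can take longer to run on the newer JDK by default, without any flags. - Alex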
4
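The JDKs chose differently: they make different tradeoffs, which leads to visibly different performance behavior. Unless you can show that "that code sample" matters more than everything else, there is no problem here. In other words, your test certainly measures something, but the real question is whether it measures what you need. - Aleksey Shipilev
@alex has a point... We suffered from this behavior change in a robotics Java application. During competition, the extra time taken by the first few calls in our logging code that needed string concatenation threw us off balance. Thanks to this thread we eventually found the root cause, but too late. - Billy
@Billy, never change settings right before "production"; that is a golden rule. - Eugene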

0

https://bugs.java.com/bugdatabase/view_bug.do?bug_id=JDK-8221760 is the actual bug - hakamairi
