G1 young GC does not free memory - to-space exhausted

13

I am using G1GC with JDK 1.7.

Java HotSpot(TM) 64-Bit Server VM (24.79-b02) for linux-amd64 JRE (1.7.0_79-b15), built on Apr 10 2015 11:34:48 by "java_re" with gcc 4.3.0 20080428 (Red Hat 4.3.0-8)
Memory: 4k page, physical 32826020k(12590436k free), swap 33431548k(33358800k free)

CommandLine flags: -XX:AutoBoxCacheMax=3000000 -XX:+DisableExplicitGC 
-XX:G1NewSizePercent=20 -XX:+HeapDumpOnOutOfMemoryError -XX:InitialHeapSize=10737418240 
-XX:InitiatingHeapOccupancyPercent=70 -XX:MaxDirectMemorySize=1073741824 -XX:MaxGCPauseMillis=1000 
-XX:MaxHeapSize=10737418240 
-XX:-OmitStackTraceInFastThrow -XX:+PrintGC -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCDetails 
-XX:+PrintGCTimeStamps -XX:+UnlockExperimentalVMOptions 
-XX:+UseCompressedOops -XX:+UseG1GC

Most of the time the minor GCs work fine, but an abnormal full GC still happens once or twice every hour.

This is the log of a normal young GC.

3443.100: [GC pause (young), 0.3021260 secs]
  [Parallel Time: 277.6 ms, GC Workers: 4]
  [GC Worker Start (ms): Min: 3443100.5, Avg: 3443100.6, Max: 3443100.6, Diff: 0.1]
  [Ext Root Scanning (ms): Min: 2.9, Avg: 3.0, Max: 3.1, Diff: 0.2, Sum: 11.9]
  [Update RS (ms): Min: 33.5, Avg: 33.6, Max: 33.9, Diff: 0.4, Sum: 134.5]
     [Processed Buffers: Min: 180, Avg: 204.8, Max: 227, Diff: 47, Sum: 819]
  [Scan RS (ms): Min: 76.0, Avg: 76.2, Max: 76.3, Diff: 0.3, Sum: 304.9]
  [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.3]
  [Object Copy (ms): Min: 164.4, Avg: 164.4, Max: 164.5, Diff: 0.1, Sum: 657.7]
  [Termination (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0]
  [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.0, Sum: 0.2]
  [GC Worker Total (ms): Min: 277.3, Avg: 277.4, Max: 277.4, Diff: 0.1, Sum: 1109.5]
  [GC Worker End (ms): Min: 3443377.9, Avg: 3443378.0, Max: 3443378.0, Diff: 0.0]
  [Code Root Fixup: 0.2 ms]
  [Code Root Migration: 0.3 ms]
  [Clear CT: 2.0 ms]
  [Other: 22.1 ms]
  [Choose CSet: 0.0 ms]
  [Ref Proc: 15.7 ms]
  [Ref Enq: 0.5 ms]
  [Free CSet: 3.2 ms]
  [Eden: 5996.0M(5996.0M)->0.0B(5648.0M) Survivors: 148.0M->196.0M Heap: 8934.5M(10.0G)->2997.2M(10.0G)]
  [Times: user=1.13 sys=0.00, real=0.30 secs]

This is an unusual GC log from just before the full GC. It repeated two or three times without freeing any memory.

3482.422: [GC pause (young) (to-space exhausted), 3.4878580 secs]
[Parallel Time: 1640.5 ms, GC Workers: 4]
  [GC Worker Start (ms): Min: 3482421.8, Avg: 3482422.4, Max: 3482424.0, Diff: 2.2]
  [Ext Root Scanning (ms): Min: 2.1, Avg: 3.2, Max: 3.8, Diff: 1.7, Sum: 12.6]
  [Update RS (ms): Min: 104.8, Avg: 105.2, Max: 105.6, Diff: 0.8, Sum: 421.0]
     [Processed Buffers: Min: 201, Avg: 221.2, Max: 236, Diff: 35, Sum: 885]
  [Scan RS (ms): Min: 75.1, Avg: 75.2, Max: 75.3, Diff: 0.1, Sum: 300.8]
  [Code Root Scanning (ms): Min: 0.0, Avg: 0.1, Max: 0.1, Diff: 0.1, Sum: 0.2]
  [Object Copy (ms): Min: 1455.9, Avg: 1456.1, Max: 1456.2, Diff: 0.3, Sum: 5824.2]
  [Termination (ms): Min: 0.0, Avg: 0.1, Max: 0.2, Diff: 0.2, Sum: 0.4]
  [GC Worker Other (ms): Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.1]
  [GC Worker Total (ms): Min: 1638.2, Avg: 1639.8, Max: 1640.4, Diff: 2.2, Sum: 6559.3]
  [GC Worker End (ms): Min: 3484062.2, Avg: 3484062.2, Max: 3484062.2, Diff: 0.0]
[Code Root Fixup: 0.2 ms]
[Code Root Migration: 0.5 ms]
[Clear CT: 2.0 ms]
[Other: 1844.7 ms]
  [Choose CSet: 0.0 ms]
  [Ref Proc: 60.1 ms]
  [Ref Enq: 0.5 ms]
  [Free CSet: 1.2 ms]
[Eden: 5648.0M(5648.0M)->0.0B(1876.0M) Survivors: 196.0M->172.0M Heap: 9441.9M(10.0G)->9352.3M(10.0G)]
[Times: user=9.29 sys=0.05, real=3.49 secs]

Then it starts a full GC.

3490.812: [Full GC 9626M->1879M(10G), 7.6059670 secs]
[Eden: 0.0B(2048.0M)->0.0B(6144.0M) Survivors: 0.0B->0.0B Heap: 9626.3M(10.0G)->1879.5M(10.0G)], [Perm: 33901K->33901K(36864K)]
[Times: user=10.24 sys=0.00, real=7.61 secs]

Why did the last young GC not free any memory?


1
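I don't know the details of the JVM 7 GC, but in general a young generation GC can fail if there are many pointers from the old generation into the young generation, even if those pointers sit in garbage in the old generation. The algorithm can't determine when this is happening. One approach is to watch what is recovered in successive young generation collections and trigger a full collection when the young generation results are poor. This could well be what's happening here. Once the old garbage is freed, the young generation can follow. - Gene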
1
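Some of your parameters might constrain G1's heuristics too much. Are you sure they are necessary? Also, are you aiming for low pause times or high throughput? And you should upgrade to Java 8; G1GC still has a lot of issues under Java 7. - the8472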
1 Answer

17
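You are suffering from evacuation failures, visible as the to-space exhausted part of the collection's start message. This happens when there is not enough free space on the heap to copy surviving objects, promoted objects, or both, and the heap cannot be expanded any further.
Monica Beckwith writes in "Tips for Tuning the Garbage First Garbage Collector":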

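A G1 GC evacuation failure is very expensive:

  • For successfully copied objects, G1 needs to update the references and the regions have to be tenured.
  • For unsuccessfully copied objects, G1 will self-forward them and tenure the regions in place.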
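Typically, when G1 is forced to do this, it cannot keep up with the allocation rate and is eventually forced into a full GC by an allocation failure. That is probably why you see a full GC after a couple of evacuation failures.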
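The difference between the two pauses in the question can be read straight off their heap summaries. Here is a minimal sketch of that comparison, assuming the JDK 7 `-XX:+PrintGCDetails` line format shown in the logs above; the class and method names are made up for illustration and are not part of any GC tooling.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogSketch {
    // Matches the heap summary of a G1 pause, e.g. "Heap: 9441.9M(10.0G)->9352.3M(10.0G)".
    // Assumption: the format is the one printed by the JDK 7 logs in the question.
    private static final Pattern HEAP = Pattern.compile(
        "Heap: ([0-9.]+)([BKMG])\\([^)]*\\)->([0-9.]+)([BKMG])\\([^)]*\\)");

    // Converts a logged size to megabytes based on its unit suffix.
    public static double toMb(double value, String unit) {
        switch (unit.charAt(0)) {
            case 'B': return value / (1024.0 * 1024.0);
            case 'K': return value / 1024.0;
            case 'G': return value * 1024.0;
            default:  return value; // 'M'
        }
    }

    // Megabytes reclaimed by the pause, or NaN if the line carries no heap summary.
    public static double reclaimedMb(String line) {
        Matcher m = HEAP.matcher(line);
        if (!m.find()) {
            return Double.NaN;
        }
        double before = toMb(Double.parseDouble(m.group(1)), m.group(2));
        double after = toMb(Double.parseDouble(m.group(3)), m.group(4));
        return before - after;
    }

    // True if the pause header reports an evacuation failure.
    public static boolean isEvacuationFailure(String line) {
        return line.contains("(to-space exhausted)");
    }

    public static void main(String[] args) {
        String normal = "[Eden: 5996.0M(5996.0M)->0.0B(5648.0M) Survivors: 148.0M->196.0M "
            + "Heap: 8934.5M(10.0G)->2997.2M(10.0G)]";
        String failed = "[Eden: 5648.0M(5648.0M)->0.0B(1876.0M) Survivors: 196.0M->172.0M "
            + "Heap: 9441.9M(10.0G)->9352.3M(10.0G)]";
        System.out.println("normal pause reclaimed MB: " + reclaimedMb(normal));
        System.out.println("failed pause reclaimed MB: " + reclaimedMb(failed));
    }
}
```

For the two log lines above, the normal pause reclaims roughly 5937 MB while the to-space exhausted pause reclaims only about 90 MB, leaving the heap almost full right before the full GC.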
Monica also writes about possible solutions in "Garbage First Garbage Collector Tuning":
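  • Increase the value of the -XX:G1ReservePercent option (and the total heap accordingly) to increase the amount of reserve memory for "to-space".

  • Start the marking cycle earlier by reducing -XX:InitiatingHeapOccupancyPercent.

  • You can also increase the value of the -XX:ConcGCThreads option to increase the number of parallel marking threads.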

Additionally, increasing the heap size is another option for reducing the likelihood of evacuation failures.
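To put rough numbers on the first two options, here is a small arithmetic sketch for the 10 GB heap from the question. It assumes both percentages are applied to the total heap size. The question does not set G1ReservePercent, so the sketch assumes HotSpot's default of 10. The alternative values 20 and 45 are purely illustrative, not recommendations.

```java
public class G1ThresholdSketch {
    // Megabytes G1 tries to keep free as "to-space" reserve.
    public static long reserveMb(long heapMb, int g1ReservePercent) {
        return heapMb * g1ReservePercent / 100;
    }

    // Heap occupancy in megabytes at which a concurrent marking cycle is started.
    public static long markingThresholdMb(long heapMb, int ihopPercent) {
        return heapMb * ihopPercent / 100;
    }

    public static void main(String[] args) {
        long heapMb = 10240; // -XX:MaxHeapSize=10737418240 from the question

        // Assumed default reserve (10) versus a hypothetical larger reserve (20).
        System.out.println("reserve at 10%: " + reserveMb(heapMb, 10) + " MB");
        System.out.println("reserve at 20%: " + reserveMb(heapMb, 20) + " MB");

        // The question's setting (70) versus a hypothetical lower setting (45).
        System.out.println("marking starts at 70%: " + markingThresholdMb(heapMb, 70) + " MB");
        System.out.println("marking starts at 45%: " + markingThresholdMb(heapMb, 45) + " MB");
    }
}
```

Under these assumptions, marking only starts once about 7 GB of the heap is occupied, and the failed pause began with 9441.9 MB in use, which already cuts into a 1 GB reserve on a 10 GB heap.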

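Why do we also need to increase the total heap size when we increase G1ReservePercent? Is it because the percentage increase has to translate into absolute space? That seems a bit counterintuitive, since simply increasing the total heap size would already increase the G1 reserve in absolute terms. - Moiz Raja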
1
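@MoizRaja If you increase G1ReservePercent without increasing the heap size, more memory is reserved for the garbage collector, so less memory is available to your application. To compensate, the total heap size needs to increase. You are correct that increasing the total heap size also increases the reserve. So if you increase both G1ReservePercent and the heap size, you get an increase in reserve space from both changes. - K Erlandsson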
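The trade-off described in this comment can be worked through with a few lines of arithmetic. This is a simplified sketch: it treats the reserve as a fixed slice of the heap, whereas G1 uses it as a target rather than a hard partition, and the 20 percent figure is a hypothetical value chosen only for illustration.

```java
public class ReserveTradeoffSketch {
    // Heap left for the application once the reserve is set aside (simplified model).
    public static double usableMb(double heapMb, int reservePercent) {
        return heapMb * (100 - reservePercent) / 100.0;
    }

    // Total heap needed so that the application keeps the given usable space.
    public static double heapNeededMb(double usableMb, int reservePercent) {
        return usableMb * 100.0 / (100 - reservePercent);
    }

    public static void main(String[] args) {
        double heapMb = 10240; // 10 GB heap from the question

        double usableAt10 = usableMb(heapMb, 10);
        double usableAt20 = usableMb(heapMb, 20);
        System.out.println("usable at 10% reserve: " + usableAt10 + " MB");
        System.out.println("usable at 20% reserve: " + usableAt20 + " MB");

        // Heap size that keeps the application's share unchanged at a 20% reserve.
        System.out.println("heap needed at 20% reserve: " + heapNeededMb(usableAt10, 20) + " MB");
    }
}
```

In this model, raising the reserve from 10 to 20 percent on an unchanged 10 GB heap shrinks the application's share from 9216 MB to 8192 MB, and keeping 9216 MB usable would take a heap of 11520 MB.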
