This appendix contains examples of garbage collection verbosegc output that indicate a problem with garbage collection. The default collector is used unless otherwise specified. Unless otherwise specified, the verbosegc output is generated with
-verbosegc -XX:+PrintGCDetails
The format for the diagnostic output presented here is the format used in 1.4.2. The format is not guaranteed to be the same in later releases. In particular, the output associated with -XX:+PrintGCDetails is almost guaranteed to change, since it reflects the current needs of JVM developers.
With these options a minor collection produces output of the form
[GC [<collector>: <starting occupancy1> -> <ending occupancy1>, <pause time1> secs] <starting occupancy3> -> <ending occupancy3>, <pause time3> secs]
<collector> is an internal name for the collector used in the minor collection
<starting occupancy1> is the occupancy of the young generation before the collection
<ending occupancy1> is the occupancy of the young generation after the collection
<pause time1> is the pause time in seconds for the minor collection.
<starting occupancy3> is the occupancy of the entire heap before the collection
<ending occupancy3> is the occupancy of the entire heap after the collection
<pause time3> is the pause time for the entire garbage collection. This includes the time for a major collection, if one was done.
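As a sketch of how these fields can be pulled out programmatically, the following Python fragment parses a 1.4.2-style minor collection line with a regular expression. The function and field names here are our own, not part of any JVM tool.

```python
import re

# Pattern for the minor collection line format described above.
LINE = re.compile(
    r"\[GC \[(?P<collector>\w+): (?P<young_before>\d+)K->(?P<young_after>\d+)K"
    r"\((?P<young_cap>\d+)K\), (?P<pause1>[\d.]+) secs\] "
    r"(?P<heap_before>\d+)K->(?P<heap_after>\d+)K\((?P<heap_cap>\d+)K\), "
    r"(?P<pause3>[\d.]+) secs\]"
)

def parse_minor(line):
    """Return a dict of the fields in a minor collection line, or None."""
    m = LINE.search(line)
    if m is None:
        return None
    d = m.groupdict()
    for k in d:
        if k != "collector":
            d[k] = float(d[k]) if "." in d[k] else int(d[k])
    return d

sample = ("[GC [DefNew: 4032K->64K(4032K), 0.0429742 secs] "
          "9350K->7748K(32704K), 0.0431096 secs]")
fields = parse_minor(sample)
print(fields["collector"], fields["young_before"], fields["heap_after"])
```

Lines from major collections or from other collectors will not match this pattern, so parse_minor returns None for them.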
In the discussions that follow, snippets of the verbosegc output are presented as examples of the information that can be gathered. When statements are made regarding a quantity such as pause time, the numbers quoted are averages over all the data gathered, not just over the snippet presented. If a range of values is given, that range is from the snippet presented.
The young generation heap size in this first example is about 4 Mbytes with an overall heap size of about 32 Mbytes.
[GC [DefNew: 4032K->64K(4032K), 0.0429742 secs] 9350K->7748K(32704K), 0.0431096 secs]
[GC [DefNew: 4032K->64K(4032K), 0.0403446 secs] 11716K->10121K(32704K), 0.0404867 secs]
[GC [DefNew: 4032K->64K(4032K), 0.0443969 secs] 14089K->12562K(32704K), 0.0445251 secs]
This output seems reasonable from the point of view of the time spent in garbage collection, but note that although the occupancy of the young generation drops substantially at each collection (e.g., in the first line by 3968K, from 4032K to 64K), the occupancy of the entire heap drops by considerably less (by 1602K, from 9350K to 7748K). This indicates that only about 40% of the objects in the young generation were garbage; the rest survived the collection and were promoted into the old generation.
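The 40% figure comes from simple arithmetic on the two occupancy drops. A minimal Python sketch (the helper name is ours, for illustration only):

```python
def young_gen_garbage_fraction(young_before_k, young_after_k,
                               heap_before_k, heap_after_k):
    """Split the bytes cleared from the young generation into garbage
    (freed from the whole heap) and survivors promoted to the old gen."""
    cleared = young_before_k - young_after_k   # left the young generation
    freed = heap_before_k - heap_after_k       # actually reclaimed
    promoted = cleared - freed                 # survived and was promoted
    return freed / cleared, promoted

# First line of the example above: 4032K->64K young, 9350K->7748K heap.
frac, promoted_k = young_gen_garbage_fraction(4032, 64, 9350, 7748)
print(round(frac, 2), promoted_k)  # about 0.40 garbage, 2366K promoted
```

The same arithmetic applied to the 8 Mbyte run below gives the roughly 68% garbage fraction quoted there.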
Increasing the young generation size to increase the free space after the collection
The young generation heap size in this next run is increased to 8 Mbytes.
[GC [DefNew: 8128K->64K(8128K), 0.0453670 secs] 13000K->7427K(32704K), 0.0454906 secs]
[GC [DefNew: 8128K->64K(8128K), 0.0388632 secs] 15491K->9663K(32704K), 0.0390013 secs]
[GC [DefNew: 8128K->64K(8128K), 0.0388610 secs] 17727K->11829K(32704K), 0.0389919 secs]
With an 8 Mbyte size most of the young generation is garbage at the time of the minor collection. In the first line the young generation occupancy decreases by 8064K, from 8128K to 64K, while the entire heap decreases by 5573K, from 13000K to 7427K, meaning about 68% of the young generation was garbage. As would be expected, the size of the young generation does not change the amount of live objects that survive the minor collection (about 2.4 Mbytes in each case), but there are fewer minor collections, and the cost of each collection, in terms of minor collection pause time, is comparable (the fractions of a second listed at the end of each line).
Application time and stopped time
To illustrate the cost of the 4 Mbyte young generation versus the 8 Mbyte young generation, the output below was generated with the additional command line options -XX:+PrintGCApplicationConcurrentTime and -XX:+PrintGCApplicationStoppedTime, which measure the amount of time the application runs between collection pauses and the length of the collection pauses, respectively. In the run with a 4 Mbyte young generation the application runs for .53 to .91 second between minor collection pauses of .045 to .047 second. That is between about 5% and 8% overhead for minor collections.
Application time: 0.5291524 seconds
[GC [DefNew: 3968K->64K(4032K), 0.0460948 secs] 7451K->6186K(32704K), 0.0462350 secs]
Total time for which application threads were stopped: 0.0468229 seconds
Application time: 0.5279058 seconds
[GC [DefNew: 4032K->64K(4032K), 0.0447854 secs] 10154K->8648K(32704K), 0.0449156 secs]
Total time for which application threads were stopped: 0.0453124 seconds
Application time: 0.9063706 seconds
[GC [DefNew: 4032K->64K(4032K), 0.0464574 secs] 12616K->11187K(32704K), 0.0465921 secs]
Total time for which application threads were stopped: 0.0470484 seconds
For the 8 Mbyte young generation:
Application time: 1.3874623 seconds
[GC [DefNew: 8064K->63K(8128K), 0.0509215 secs] 11106K->5994K(32704K), 0.0510972 secs]
Total time for which application threads were stopped: 0.0517092 seconds
Application time: 1.5225065 seconds
[GC [DefNew: 8127K->63K(8128K), 0.0432982 secs] 14058K->8273K(32704K), 0.0434172 secs]
Total time for which application threads were stopped: 0.0440447 seconds
Application time: 1.4263524 seconds
[GC [DefNew: 8127K->64K(8128K), 0.0363538 secs] 16337K->10381K(32704K), 0.0364811 secs]
Total time for which application threads were stopped: 0.0369103 seconds
The application runs for 1.38 to 1.52 seconds between minor collection pauses of .036 to .051 second. The overhead for minor collection pauses is between 2.6% and 3.6%.
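The overhead percentages above are the stopped time taken as a fraction of total wall-clock time. A small sketch of that computation (the function name is ours):

```python
def gc_overhead(app_seconds, stopped_seconds):
    """Fraction of wall-clock time spent with application threads stopped."""
    return stopped_seconds / (app_seconds + stopped_seconds)

# Worst pair from the 8 Mbyte run above (1.387s running, .052s stopped):
print(f"{gc_overhead(1.3874623, 0.0517092):.1%}")  # 3.6%
```

The percentages quoted in the text are averages over the whole run, so they will not match any single pair of lines exactly.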
In going from a young generation heap size of 4 Mbytes to 8 Mbytes the overhead due to minor collections was roughly cut in half. What happens if the young generation size is doubled again, to 16 Mbytes?
[GC [DefNew: 16000K->16000K(16192K), 0.0000574 secs][Tenured: 2973K->2704K(16384K), 0.1012650 secs] 18973K->2704K(32576K), 0.1015066 secs]
[GC [DefNew: 16000K->16000K(16192K), 0.0000518 secs][Tenured: 2704K->2535K(16384K), 0.0931034 secs] 18704K->2535K(32576K), 0.0933519 secs]
[GC [DefNew: 16000K->16000K(16192K), 0.0000498 secs][Tenured: 2535K->2319K(16384K), 0.0860925 secs] 18535K->2319K(32576K), 0.0863350 secs]
This is an example of the young generation being too large relative to the tenured generation: promotion from the young generation cannot be guaranteed (the young generation is about half of the entire heap, roughly the same size as the tenured generation), so the young generation cannot be collected on its own and only major collections occur. Note that in this case the young generation is still collected, but only as part of the more expensive major collection.
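The promotion guarantee referred to here can be sketched as a worst-case check: every object in the young generation might survive a minor collection, so the tenured generation must have at least that much free space. This is a simplified illustration; the function name and the exact policy details are ours, not the JVM's.

```python
def promotion_guaranteed(young_capacity_k, tenured_free_k):
    """Worst case for a minor collection: every object in the young
    generation survives and must be promoted, so the tenured generation's
    free space must be able to absorb the whole young generation."""
    return tenured_free_k >= young_capacity_k

# 16 Mbyte young generation vs. a 16 Mbyte tenured generation that
# already holds about 2.7 Mbytes of live data (from the logs above):
print(promotion_guaranteed(16192, 16384 - 2704))  # False
```

Because the check fails, every collection falls back to a full (major) collection, which is exactly what the log lines above show.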
With the young generation sized at 8 Mbytes and the overall heap at 32 Mbytes the major collection pauses are about .13 second.
[GC [DefNew: 8128K->8128K(8128K), 0.0000558 secs][Tenured: 17746K->2309K(24576K), 0.1247669 secs] 25874K->2309K(32704K), 0.1250098 secs]
If the heap size is increased to 64 Mbytes, the major collection pauses increase to about .21 second.
[GC [DefNew: 8128K->8128K(8128K), 0.0000369 secs][Tenured: 50059K->5338K(57344K), 0.2218912 secs] 58187K->5338K(65472K), 0.2221317 secs]
but the major collections occur at about half the frequency. To see this, print time stamps at each collection by adding the option -XX:+PrintGCTimeStamps. For the 32 Mbyte heap the major collections occur about every 10-11 seconds (only the major collections are shown).
111.042: [GC 111.042: [DefNew: 8128K->8128K(8128K), 0.0000505 secs]111.042: [Tenured: 18154K->2311K(24576K), 0.1290354 secs] 26282K->2311K(32704K), 0.1293306 secs]
122.463: [GC 122.463: [DefNew: 8128K->8128K(8128K), 0.0000560 secs]122.463: [Tenured: 18630K->2366K(24576K), 0.1322560 secs] 26758K->2366K(32704K), 0.1325284 secs]
133.896: [GC 133.897: [DefNew: 8128K->8128K(8128K), 0.0000443 secs]133.897: [Tenured: 18240K->2573K(24576K), 0.1340199 secs] 26368K->2573K(32704K), 0.1343218 secs]
144.112: [GC 144.112: [DefNew: 8128K->8128K(8128K), 0.0000544 secs]144.112: [Tenured: 16564K->2304K(24576K), 0.1246831 secs] 24692K->2304K(32704K), 0.1249602 secs]
For the 64 Mbyte heap the major collections are occurring about every 30 seconds.
90.597: [GC 90.597: [DefNew: 8128K->8128K(8128K), 0.0000542 secs]90.597: [Tenured: 49841K->5141K(57344K), 0.2129882 secs] 57969K->5141K(65472K), 0.2133274 secs]
120.899: [GC 120.899: [DefNew: 8128K->8128K(8128K), 0.0000550 secs]120.899: [Tenured: 50384K->2430K(57344K), 0.2216590 secs] 58512K->2430K(65472K), 0.2219384 secs]
153.968: [GC 153.968: [DefNew: 8128K->8128K(8128K), 0.0000511 secs]153.968: [Tenured: 51164K->2309K(57344K), 0.2193906 secs] 59292K->2309K(65472K), 0.2196372 secs]
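The collection periods quoted above can be read off by differencing the printed time stamps. A quick sketch:

```python
def mean_period(timestamps):
    """Average spacing in seconds between consecutive GC time stamps."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return sum(gaps) / len(gaps)

# Major-collection time stamps from the two runs above:
print(round(mean_period([111.042, 122.463, 133.896, 144.112]), 1))  # 32 Mbyte heap: 11.0
print(round(mean_period([90.597, 120.899, 153.968]), 1))            # 64 Mbyte heap: 31.7
```

Dividing the major collection pause by this period gives the major collection overhead for each configuration.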
Which heap size, 32 Mbytes or 64 Mbytes, is better? The 64 Mbyte heap gives higher throughput, while the 32 Mbyte heap gives shorter pause times.
Even larger young generation sizes
To pursue higher throughput, consider even larger heaps. For a 256 Mbyte total heap with a 64 Mbyte young generation, compared to the 32 Mbyte heap with an 8 Mbyte young generation, the minor collection pauses are only about 10% longer, reflecting the fact that the number of objects surviving a collection is about the same. The major collection pauses are about 3.8 times longer, reflecting the larger size of the heap.
[GC [DefNew: 64575K->959K(64576K), 0.0457646 secs] 196016K->133633K(261184K), 0.0459067 secs]
[GC [DefNew: 64575K->64575K(64576K), 0.0000573 secs][Tenured: 132673K->5437K(196608K), 0.4959855 secs] 197249K->5437K(261184K), 0.4962533 secs]
[GC [DefNew: 63616K->959K(64576K), 0.0360258 secs] 69053K->7600K(261184K), 0.0361663 secs]
Try the parallel young generation collectors
If the minor collection pauses are too long, try one of the parallel young generation collectors. Adding -XX:+UseParallelGC produces the following output. In this example adaptive sizing was turned off with -XX:-UseAdaptiveSizePolicy for simplicity. Time stamps were also added for comparison with other results.
500.285: [GC 51526K->2678K(253952K), 0.0120328 secs]
506.734: [GC 51830K->2646K(253952K), 0.0117832 secs]
513.068: [GC 51798K->2742K(253952K), 0.0124632 secs]
519.566: [GC 51894K->2550K(253952K), 0.0122923 secs]
Here the average minor collection pause is about .015 second, a 68% reduction from the roughly .046 second minor pauses seen with the serial collector above. Adaptive sizing was turned off because of the anomalous behavior exhibited in this next set of output with adaptive sizing on.
[GC 62752K->2992K(259328K), 0.0126936 secs]
[GC 62896K->60192K(259328K), 0.0127579 secs]
[GC 63008K->3120K(259328K), 0.0123150 secs]
[GC 63024K->60256K(259328K), 0.0120565 secs]
[GC 63072K->3024K(259328K), 0.0127215 secs]
[GC 62928K->60208K(259328K), 0.0113090 secs]
[GC 63024K->3136K(259328K), 0.0133799 secs]
[GC 63040K->60256K(259328K), 0.0135459 secs]
Adaptive sizing adjusts the size of the survivor spaces. In the alternating minor collections shown above, the size of the survivor spaces is oscillating between values that alternately cause a minimal collection of the young generation and a nearly complete collection of the young generation. Although this is strange behavior, the use of adaptive sizing may still yield higher throughput than turning it off.
Alternatively, using -XX:+UseParNewGC yields
497.905: [GC 497.905: [ParNew: 64576K->960K(64576K), 0.0255372 secs] 155310K->93003K(261184K), 0.0256767 secs]
506.305: [GC 506.305: [ParNew: 64576K->960K(64576K), 0.0276291 secs] 156619K->94267K(261184K), 0.0277958 secs]
514.565: [GC 514.565: [ParNew: 64576K->960K(64576K), 0.0261376 secs] 157883K->95711K(261184K), 0.0262927 secs]
522.838: [GC 522.838: [ParNew: 64576K->960K(64576K), 0.0316625 secs] 159327K->97331K(261184K), 0.0318099 secs]
which shows about a 44% decrease in the minor collection pauses. The time stamps were added to show the period of the minor collections. With -XX:+UseParallelGC the minor collections occur about every 6.5 seconds and take .015 second (for a minor collection overhead of .23%). With -XX:+UseParNewGC the collections occur about every 8 seconds and take about .026 second (for a minor collection overhead of .33%).
Try the concurrent low pause collector
The major collection pauses for the 256 Mbyte total heap with the 64 Mbyte young generation are about .489 second. If this is too long, the concurrent low pause collector can be tried with the command line option -XX:+UseConcMarkSweepGC (while also keeping -XX:+UseParNewGC).
[GC [ParNew: 64576K->960K(64576K), 0.0377639 secs] 140122K->78078K(261184K), 0.0379598 secs]
[GC [ParNew: 64576K->960K(64576K), 0.0329313 secs] 141694K->79533K(261184K), 0.0331324 secs]
[GC [ParNew: 64576K->960K(64576K), 0.0413880 secs] 143149K->81128K(261184K), 0.0416101 secs]
[GC [1 CMS-initial-mark: 80168K(196608K)] 81144K(261184K), 0.0059036 secs]
[CMS-concurrent-mark: 0.129/0.129 secs]
[CMS-concurrent-preclean: 0.007/0.007 secs]
[GC[Rescan (non-parallel) [grey object rescan, 0.0020879 secs][root rescan, 0.0144199 secs], 0.0166258 secs][weak refs processing, 0.0000411 secs] [1 CMS-remark: 80168K(196608K)] 82493K(261184K),
[CMS-concurrent-sweep: 1.208/1.208 secs]
[CMS-concurrent-reset: 0.036/0.036 secs]
[GC [ParNew: 64576K->960K(64576K), 0.0311520 secs] 66308K->4171K(261184K), 0.0313513 secs]
[GC [ParNew: 64576K->960K(64576K), 0.0348341 secs] 67787K->5695K(261184K), 0.0350776 secs]
[GC [ParNew: 64576K->960K(64576K), 0.0359806 secs] 69311K->7154K(261184K), 0.0362064 secs]
With the application used for these examples, the time to do a concurrent collection is short relative to the time between minor collections. This is typically not the case; more often there are minor collections during the concurrent collection. Note that the concurrent phases may be long (e.g., the CMS-concurrent-sweep takes 1.2 seconds), but the application is not stopped during the concurrent phases, so the application does not see a pause. Conversely, although the application is not paused during the concurrent phases, neither does it have use of all the available processors (one is being used by the garbage collection thread). The pauses come from the CMS-initial-mark and CMS-remark phases; in this snippet of output the larger of those is .017 second. Over the run the average CMS-remark pause (always the longer of the two) was .019 second. The maximum pause for a major collection (as compared to the default collector) was thus reduced from .489 second to .019 second, a 96% reduction. Note also that the minor collection pause increased to .035 second, which is higher than the .026 second minor pause using only -XX:+UseParNewGC. This is indicative of the overhead associated with the concurrency of the major collection.
Try the incremental low pause collector
The major collection times can also be reduced by use of the incremental low pause collector. Use the command line option -Xincgc.
[GC [DefNew: 64575K->959K(64576K), 0.0616204 secs][Train: 5537K->5474K(196608K), 0.0108803 secs] 68664K->6434K(261184K), 0.0730248 secs]
[GC [DefNew: 64575K->959K(64576K), 0.0559078 secs][Train: 6813K->6749K(196608K), 0.0115077 secs] 70050K->7709K(261184K), 0.0680289 secs]
[GC [DefNew: 64575K->959K(64576K), 0.0607615 secs][Train: 8178K->8114K(196608K), 0.0112705 secs] 71325K->9074K(261184K), 0.0727902 secs]
During each minor collection part of the old generation is collected, so there is no separate major collection pause. The minor collection pauses here increase to about .078 second. If the incremental collection of the tenured generation cannot be completed before the tenured generation is full, a major collection occurs, indicated by 'MSC' in the verbosegc output.
[GC [DefNew: 64575K->64575K(64576K), 0.0005311 secs][Train MSC: 68226K->135K(196608K), 0.4347857 secs] 132802K->2565K(261184K), 0.4397831 secs]