Understanding JVM Garbage Collection – Part 6 (Serial and Parallel Collector) 

JVM GC Implementation

The JVM GC is a pluggable subsystem that enables the JVM to execute the same Java program with different GC implementations. Using a new GC algorithm does not require any change to the existing Java program. However, the chosen GC implementation may significantly impact the program's performance.
 
Java provides three types of collectors: serial, parallel, and mostly concurrent collectors. Generally, the JVM uses two kinds of collectors: evacuating collectors for the young generation and compacting collectors for the old generation.
 
Let's delve into each of the different Java GC collectors.
 

Serial Collector

The Serial GC is the simplest Java GC algorithm. It uses a single thread for garbage collection and can be selected explicitly with -XX:+UseSerialGC. It is the default collector on single-core 32-bit machines and, due to its low memory footprint, the default collector for HotSpot's client JVMs. It is suitable for single-processor machines and small heap sizes. Consider using the serial collector on multiprocessor hardware only if your heap size is less than about 100 MB.

The serial collector is a generational garbage collector that uses an evacuating (also known as mark-and-copy) collector for the young generation and a mark-sweep-compact (MSC) collector for the old generation. The young generation collector is called Serial, while the old generation collector is called Serial (MSC). Because a single thread does all the work while the application is stopped, the serial collector is impractical for most modern server-side applications.
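To check which collectors a given JVM is actually running with, one option is to ask the standard management API. The sketch below is only an illustration: the class name ActiveCollectors is made up for this example, and the collector names that get printed are JVM-specific, so no particular output is assumed here.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Illustrative helper (hypothetical name): lists the collectors this JVM reports.
final class ActiveCollectors {

    // Returns the names of the garbage collector beans registered in this JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Typically one bean for the young generation and one for the old generation.
        for (String name : collectorNames()) {
            System.out.println(name);
        }
    }
}
```

Running it once with -XX:+UseSerialGC and once with -XX:+UseParallelGC should show a different pair of names for each collector.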
 
 
 

Parallel Collector

The parallel collector is the default garbage collector on 64-bit machines with two or more CPUs (up to Java 8). Like the serial collector, it is a generational collector, but it uses multiple threads to collect both the young and old generations.

By default, the parallel collector derives the number of garbage collection threads from the number of cores or hardware threads. On machines with up to eight hardware threads, it uses one GC thread per hardware thread. On larger machines, it uses all of the first eight plus roughly 5/8 of the remainder, so the overall fraction approaches 5/8 as the core count grows. The default can be overridden with -XX:ParallelGCThreads.

Using multiple garbage collection threads speeds up collection on multiprocessor hardware, hence the name throughput collector. Parallel GC is one of the best ways to minimize the total time spent on garbage collection and maximize throughput. However, every collection still stops all application threads, so it can lead to long pause times, making it unsuitable for certain applications, especially those with large heap sizes.
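The default thread count can be approximated with a few lines of arithmetic. The sketch below assumes the commonly described HotSpot heuristic (all of the first eight hardware threads, plus 5/8 of any beyond that); the class and method names are made up for illustration, and the exact value on a given JVM version may differ.

```java
// Rough estimate of the default number of parallel GC threads.
// Assumption: threads = N for N <= 8, otherwise 8 + (N - 8) * 5 / 8.
final class GcThreadEstimate {

    static int defaultParallelGcThreads(int hardwareThreads) {
        if (hardwareThreads <= 8) {
            return hardwareThreads;
        }
        // Integer arithmetic: the fractional part is truncated.
        return 8 + (hardwareThreads - 8) * 5 / 8;
    }

    public static void main(String[] args) {
        int n = Runtime.getRuntime().availableProcessors();
        System.out.println(n + " hardware threads -> about "
                + defaultParallelGcThreads(n) + " GC threads");
    }
}
```

Under this assumption, a 16-thread machine would get 13 GC threads and a 32-thread machine 23. If the estimate does not suit the workload, -XX:ParallelGCThreads sets the count explicitly.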

Initially, the parallel collector was only available for the young generation and was enabled with the -XX:+UseParallelGC flag. This flag enabled parallel GC for the young generation, while the old generation still used a serial collector. As heap sizes grew, it became evident that parallel GC was needed for both the young and old generations. Around Java 6, a parallel old generation collector was introduced, which could be turned on using -XX:+UseParallelOldGC. In Java 7 (from update 4), it became the default old generation collector whenever the parallel collector is in use. On JDK 8 and later, simply using -XX:+UseParallelGC therefore turns on the parallel collector for both the old and young generations. The separate -XX:+UseParallelOldGC argument is redundant; it was later deprecated (in JDK 14) and should no longer be used.
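One way to confirm which of these flags are in effect is to read them back from the running JVM. This is a minimal sketch that assumes a HotSpot-based JVM, since HotSpotDiagnosticMXBean is a HotSpot-specific interface; the class name GcFlagCheck and the "unavailable" fallback are choices made for this example.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Reads GC-related VM options from the running JVM (HotSpot only).
final class GcFlagCheck {

    // Returns the option's current value, or "unavailable" if it cannot be read.
    static String vmOption(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the option does not exist on this JVM version.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseParallelGC     = " + vmOption("UseParallelGC"));
        System.out.println("ParallelGCThreads = " + vmOption("ParallelGCThreads"));
    }
}
```

Starting the program with -XX:+UseParallelGC should report the first option as true.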

The parallel collector uses a mark-and-copy collector called ParallelScavenge for the young generation and a mark-sweep-compact collector called ParallelOld for the old generation collection, similar to the serial collector. However, the main difference is the use of multiple threads in the parallel collector.

Multi-threaded young generation collections can leave the tenured space fragmented, because several threads promote young generation objects into it at the same time. Each minor GC thread reserves a portion of the tenured space for object promotion, called a promotion buffer. Since a GC thread promotes only into its own buffer, the buffers may not be fully used, and the unused gaps show up as fragmentation. A major GC, or garbage collection of the old generation, can reduce fragmentation via compaction.

As of Java 6, old generation GC cycles are also multi-threaded, with ParallelOld GC not only collecting garbage but also compacting the heap. Compaction helps eliminate fragmentation and is only performed if the heap has reached a certain level of fragmentation. Compaction eliminates memory waste and leads to a better memory layout, but it does have a cost associated with it. The Parallel collector divides the heap into regions, with the number of parallel GC threads determining the number of regions. Each GC thread marks and sweeps objects in its own region, which prevents concurrency issues between threads.

During the compaction phase, the GC aims to move live data from higher regions into lower regions. The GC first fills up region 1 with live data, then moves on to filling region 2, and so on. This frees up the higher regions while fully utilizing the lower regions, thus eliminating fragmentation.
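The effect of compaction can be shown with a toy model. The sketch below treats the heap as an array of slots, where 0 means free and any other number is a live object; it is not how ParallelOld is implemented (a real collector moves objects in place and updates every reference to them), and the class name is invented for this example.

```java
import java.util.Arrays;

// Toy model of compaction: live objects slide toward the low end of the heap.
final class SlidingCompaction {

    // 0 marks a free slot; any other value is the id of a live object.
    static int[] compact(int[] heap) {
        int[] compacted = new int[heap.length]; // all slots start out free (0)
        int next = 0;                           // next free slot at the low end
        for (int slot : heap) {
            if (slot != 0) {
                compacted[next++] = slot;       // keep live objects in their original order
            }
        }
        return compacted;
    }

    public static void main(String[] args) {
        int[] fragmented = {1, 0, 2, 0, 0, 3};
        System.out.println(Arrays.toString(compact(fragmented))); // prints [1, 2, 3, 0, 0, 0]
    }
}
```

After compaction, all free space sits in one contiguous block at the high end, which is what makes later allocations and promotions cheap.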

 

The parallel collector's GC process is illustrated in the diagram above. The diagram assumes that we have three GC threads, and thus the tenured heap area is divided into three regions (not to be confused with G1GC regions). Each GC thread is responsible for its own region. At the end of the mark and sweep phase, we can see a fragmented heap. The compaction process then moves objects from higher regions to lower regions to free up space.

The parallel collector can be tuned using three main JVM parameters, which help optimize the collector for your application's needs. These parameters are:

  • Soft pause time goal - The maximum pause time goal can be specified using -XX:MaxGCPauseMillis. This is a soft goal, and the JVM will try to keep garbage collection under the specified pause time goal.
  • Maximum throughput goal - GC throughput is measured using a ratio that compares the amount of time spent doing GC to the amount of time spent doing application work. Use the -XX:GCTimeRatio VM argument to set the throughput goal. For example, -XX:GCTimeRatio=9 means that the application should spend no more than 10% of its total time doing GC. The value is calculated using the formula 1/(1+N). In our case, this is 1/(1+9), which is 0.10 or 10%.
  • Maximum footprint goal - The maximum footprint goal establishes the total heap memory allocated. Use the -Xmx VM argument to set the maximum amount of heap memory available to the JVM. For example, -Xmx1G sets a maximum heap size of 1 GB.
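The throughput arithmetic is easy to check in code. The following sketch applies the 1/(1+N) formula to a GCTimeRatio value and prints the maximum heap size the running JVM reports; the class and method names are invented for this example.

```java
// Converts a -XX:GCTimeRatio value N into the maximum share of time allowed for GC.
final class ThroughputGoal {

    // Implements 1 / (1 + N).
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.printf("GCTimeRatio=9  -> %.0f%% of time in GC%n", maxGcFraction(9) * 100);
        System.out.printf("GCTimeRatio=99 -> %.0f%% of time in GC%n", maxGcFraction(99) * 100);
        // Maximum heap the JVM will attempt to use; it roughly tracks the -Xmx setting.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: about " + maxHeapMb + " MB");
    }
}
```

A ratio of 9 allows 10% of total time for GC, while a ratio of 99 allows only 1%, so larger values express a stricter throughput goal.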

In the event that all of the above goals cannot be met, the collector focuses on them in the following priority order:

  1. Pause time goal
  2. Throughput goal
  3. Memory footprint goal

The collector maintains statistics on every GC run, and the collected statistics are used to establish if the pause time and throughput goals have been met. If the goals have not been met, future GC runs are adjusted to help meet these goals. Generation sizes, both young and old, are contracted and expanded to help meet these goals. To meet pause time goals, the GC might need to shrink the generation, while meeting throughput goals might require expanding the generation.
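The priority order can be pictured as a simple decision made after each GC run. The sketch below is a toy model, not HotSpot's actual sizing policy: the class, the enum values, and the inputs are all hypothetical, and the final branch (shrinking when both goals are met) reflects how the footprint goal is usually described rather than anything measured here.

```java
// Toy model of the goal priorities; this is NOT HotSpot's real adaptive sizing code.
final class AdaptiveSizingSketch {

    // Hypothetical actions a sizing policy could take after a GC run.
    enum Action { SHRINK_FOR_PAUSE, GROW_FOR_THROUGHPUT, SHRINK_FOR_FOOTPRINT }

    static Action decide(double avgPauseMillis, double pauseGoalMillis,
                         double gcTimeFraction, double gcFractionGoal) {
        // 1. Pause time goal: a smaller generation takes less time to collect.
        if (avgPauseMillis > pauseGoalMillis) {
            return Action.SHRINK_FOR_PAUSE;
        }
        // 2. Throughput goal: a larger generation fills up less often.
        if (gcTimeFraction > gcFractionGoal) {
            return Action.GROW_FOR_THROUGHPUT;
        }
        // 3. Footprint goal: both goals are met, so memory can be given back.
        return Action.SHRINK_FOR_FOOTPRINT;
    }

    public static void main(String[] args) {
        System.out.println(decide(300, 200, 0.05, 0.10)); // prints SHRINK_FOR_PAUSE
        System.out.println(decide(100, 200, 0.20, 0.10)); // prints GROW_FOR_THROUGHPUT
        System.out.println(decide(100, 200, 0.05, 0.10)); // prints SHRINK_FOR_FOOTPRINT
    }
}
```

The ordering of the checks is the point of the example: a missed pause time goal wins over a missed throughput goal, and footprint is only considered once both are satisfied.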
