JVM GC Evolution The JVM’s GC algorithms have evolved over the past three decades. The very first GC algorithm shipped with the JVM was the serial collector. Initially, Java was popular on the client side (Java Applets), and the serial collector was perfect for its GC needs. However, the serial collector worked well only on small […]

Understanding JVM Garbage Collection – Part 4
JVM GC Terminology The terminology used to describe garbage collection algorithms can be confusing. In this section, we will explain some commonly used terms and their meanings. Live set – The live set refers to the number of live objects in a heap. The size of the live set has a significant impact on the […]

Understanding JVM Garbage Collection – Part 3
Weak Generational Hypothesis The weak generational hypothesis has had a profound impact on the JVM’s heap layout. Understanding it is essential to understanding the various GC algorithms and approaches. The weak generational hypothesis states that most objects die young. In other words, most objects created will be garbage collected very quickly. […]

Understanding JVM Garbage Collection – Part 2
GC Mark, Sweep, and Compact Basics The mark and sweep algorithm is the basis for garbage collection in the JVM. The actual algorithms used by the JVM are considerably more complex, but mark and sweep must be understood first. As you might have guessed, there […]
Understanding JVM Garbage Collection – Part 1
Java is a popular language for business and enterprise computing. Java and the JVM have enjoyed enormous success over the last two decades. In the late 1990s, Java was considered a slow language. That changed with the introduction of the HotSpot virtual machine in April 1999. HotSpot improved performance via “just in time” compilation and adaptive optimisation. Since […]
CAP Theorem
The CAP theorem is a tool used to make system designers aware of trade-offs while designing networked shared-data systems. CAP has influenced the design of many distributed data systems. It made designers aware of a wide range of trade-offs to consider while designing distributed data systems. Over the years the CAP theorem has been widely […]
Reasons for unbalanced Cassandra Cluster
Sometimes an Apache Cassandra cluster can end up in an unbalanced state. An unbalanced state is one where data is unevenly distributed across a cluster or across locally configured data directories. There are a number of reasons this can happen. In this blog post, I will cover two basic reasons this might happen. A cluster can end up […]
Understanding an Apache Cassandra Memtable Flush
A recent question on the Apache Cassandra mailing list triggered this blog post. The question revolved around the events that trigger a memtable flush. Understanding the root cause of a memtable flush is essential to understanding Apache Cassandra better. Another question that frequently crops up is the size of an SSTable as a result of […]

Cassandra Query Language (CQL) Tutorial
Apache Cassandra and the Cassandra Query Language (CQL) have evolved over the past couple of years. Key improvements include significant storage engine improvements, the introduction of SSTable Attached Secondary Indexes (SASI), materialized views, and simple role-based authentication. This post is an update to “A Practical Introduction to Cassandra Query Language”. The tutorial will concentrate […]
Configuring Apache Cassandra Cluster with Docker
This tutorial outlines the steps to install and configure Apache Cassandra using Docker. Docker provides an easy way to create an Apache Cassandra cluster. Using Docker, we will get an Apache Cassandra cluster up and running in minutes.