Kafka Consumer Groups, Partitions & Offsets


In the last post, you created a simple Kafka producer and consumer in Java. In this one, we’ll go deeper into how Kafka handles message consumption at scale using consumer groups, partitions, and offsets. These concepts are key to building scalable and reliable Kafka-based systems.

What Is a Consumer Group?

A consumer group is a group of consumer instances that work together to consume messages from a set of Kafka topics. Kafka ensures that each partition is consumed by only one consumer within a group at a time.

Why Use Consumer Groups?

  • To scale horizontally by adding more consumers to share the load.
  • To enable parallel processing across partitions.
  • To provide fault tolerance — if one consumer fails, another can take over.
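
As a minimal sketch, the settings that place a consumer in a group can be collected in a plain java.util.Properties object. The class and method names below are hypothetical; the broker address localhost:9092 and the group name my-java-group are borrowed from the CLI example later in this post, and String keys and values are assumed.

```java
import java.util.Properties;

// Hypothetical helper: builds the configuration for one member of a consumer group.
class GroupConsumerConfig {

    static Properties forGroup(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        // Every consumer started with the same group.id joins the same group
        props.put("group.id", groupId);
        // Assumes String keys and values; swap in other deserializers as needed
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        return props;
    }
}
```

Each consumer created from these properties, for example with new KafkaConsumer<String, String>(props), would join my-java-group and receive a share of the topic's partitions.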

Partitions and Parallelism

Partitions allow Kafka to scale horizontally. When a topic has multiple partitions, Kafka distributes those partitions across brokers and across consumers in a group.

Example: If a topic has 4 partitions and a group has 2 consumers, each consumer will consume from 2 partitions.

[Figure: Kafka consumer group diagram]

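Key Rule:

Keep the number of consumers in a group ≤ the number of partitions. Each partition is assigned to at most one consumer in the group, so any consumers beyond the partition count receive no partitions and sit idle.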
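
The arithmetic behind the example and the rule can be sketched in a few lines. This is a simplified, in-memory model loosely based on range-style assignment, not Kafka's actual assignor code; the class name, method name, and consumer names are made up for illustration, and the consumer list is assumed to be non-empty.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of spreading one topic's partitions over a group's consumers.
class PartitionAssignmentSketch {

    static Map<String, List<Integer>> assign(int partitionCount, List<String> consumers) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        int base = partitionCount / consumers.size();   // partitions every consumer gets
        int extra = partitionCount % consumers.size();  // leftovers go to the first consumers
        int next = 0;
        for (int i = 0; i < consumers.size(); i++) {
            int count = base + (i < extra ? 1 : 0);
            List<Integer> owned = new ArrayList<>();
            for (int j = 0; j < count; j++) {
                owned.add(next++);
            }
            assignment.put(consumers.get(i), owned);
        }
        return assignment;
    }

    public static void main(String[] args) {
        // 4 partitions, 2 consumers: two partitions each
        System.out.println(assign(4, List.of("c1", "c2")));
        // prints {c1=[0, 1], c2=[2, 3]}

        // 4 partitions, 5 consumers: the fifth consumer is idle
        System.out.println(assign(4, List.of("c1", "c2", "c3", "c4", "c5")));
        // prints {c1=[0], c2=[1], c3=[2], c4=[3], c5=[]}
    }
}
```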

What Are Offsets?

An offset is a sequential numeric ID assigned to each record within a partition. Kafka uses committed offsets to track how far each consumer group has read in each partition.

  • Offsets are unique within a partition.
  • Consumers use them to resume reading after restarts.
  • Kafka allows clients to commit offsets manually or automatically.
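
To make the resume behaviour concrete, here is a small in-memory model with no Kafka client involved. The class and method names are hypothetical, and a List stands in for one partition's log. It follows Kafka's convention that the committed offset is the offset of the next record to read, that is, the last processed offset plus one.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model of one partition: the list index plays the role of the offset.
class OffsetSketch {

    // Returns every record from the committed offset to the end of the log.
    static List<String> readFrom(List<String> partitionLog, long committedOffset) {
        return new ArrayList<>(partitionLog.subList((int) committedOffset, partitionLog.size()));
    }

    public static void main(String[] args) {
        List<String> log = List.of("a", "b", "c", "d", "e"); // offsets 0 to 4
        long committed = 3; // offsets 0, 1 and 2 were processed before the restart
        System.out.println(readFrom(log, committed)); // prints [d, e]
    }
}
```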

💡 Auto vs Manual Offset Commit

Auto Commit (default)

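When enable.auto.commit is true (the default), the consumer client commits offsets in the background at the interval set by auto.commit.interval.ms. This is simple, but the commit is not tied to your processing: after a crash, records processed since the last commit are delivered again, and if records are handed off to another thread, an offset can be committed before its record has finished processing, so that record is skipped on restart.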
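
The two settings named above can be sketched as consumer properties. The helper class and its method names are hypothetical; the property keys are the ones mentioned in the text, and 5000 ms is the interval Kafka uses when none is set. The resulting properties would be merged with the connection and group settings before creating a consumer.

```java
import java.util.Properties;

// Hypothetical helper: commit-related settings only.
class CommitConfigSketch {

    // Periodic background commits at the given interval
    static Properties autoCommit(long intervalMs) {
        Properties props = new Properties();
        props.put("enable.auto.commit", "true");
        props.put("auto.commit.interval.ms", Long.toString(intervalMs));
        return props;
    }

    // No background commits: the application decides when to commit
    static Properties manualCommit() {
        Properties props = new Properties();
        props.put("enable.auto.commit", "false");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(autoCommit(5000).getProperty("auto.commit.interval.ms")); // prints 5000
        System.out.println(manualCommit().getProperty("enable.auto.commit"));        // prints false
    }
}
```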

Manual Commit

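Set enable.auto.commit to false and commit offsets yourself once processing has succeeded. This gives at-least-once delivery: if the consumer crashes before the commit, the uncommitted records are delivered again rather than lost, so processing should tolerate occasional repeats.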


// Example: manual commit, called after the records from the latest poll() have been processed
consumer.commitSync();
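
The effect of committing only after processing can be illustrated with an in-memory model that does not use the Kafka client. Everything here is hypothetical: the class name, the runBatch method, and the simulated crash. The assignment to committedOffset stands in for the consumer.commitSync() call.

```java
import java.util.ArrayList;
import java.util.List;

// In-memory model of "process the batch, then commit" on a single partition.
class CommitAfterProcessingSketch {
    private final List<String> log;
    private long committedOffset = 0;                  // next offset to read after a restart
    final List<String> processed = new ArrayList<>();  // every record handled, repeats included

    CommitAfterProcessingSketch(List<String> log) {
        this.log = log;
    }

    // Handles one batch starting at the committed offset and commits only if the
    // whole batch succeeds. failAt is the offset where a crash is simulated, or -1.
    void runBatch(int batchSize, long failAt) {
        int start = (int) committedOffset;
        int end = Math.min(start + batchSize, log.size());
        for (int offset = start; offset < end; offset++) {
            if (offset == failAt) {
                return; // simulated crash: nothing is committed for this batch
            }
            processed.add(log.get(offset));
        }
        committedOffset = end; // stands in for consumer.commitSync()
    }

    long committedOffset() {
        return committedOffset;
    }

    public static void main(String[] args) {
        CommitAfterProcessingSketch consumer =
                new CommitAfterProcessingSketch(List.of("a", "b", "c", "d"));
        consumer.runBatch(2, -1); // processes a and b, commits offset 2
        consumer.runBatch(2, 3);  // processes c, crashes at d, commits nothing
        consumer.runBatch(2, -1); // after restart: processes c again, then d
        System.out.println(consumer.processed);         // prints [a, b, c, c, d]
        System.out.println(consumer.committedOffset()); // prints 4
    }
}
```

No record is lost, but c is handled twice. That repeat is the cost of at-least-once delivery, which is why processing logic is usually written to be idempotent.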
  

Rebalancing

When a consumer joins or leaves a group (including when it crashes), Kafka performs a rebalance, redistributing the partitions among the consumers that remain. Rebalancing is what makes scaling and failover work, but consumption can pause while it runs, so frequent rebalances hurt throughput.
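
A rough picture of what a rebalance changes, again as an in-memory model. Round-robin ownership is used here only for brevity; in a real cluster the consumer's configured assignment strategy decides who gets what. The class, method, and consumer names are illustrative.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Simplified model: recompute partition owners whenever group membership changes.
class RebalanceSketch {

    // Maps each partition to an owner, round-robin over the current members
    static Map<Integer, String> owners(int partitionCount, List<String> members) {
        Map<Integer, String> result = new TreeMap<>();
        for (int partition = 0; partition < partitionCount; partition++) {
            result.put(partition, members.get(partition % members.size()));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two consumers share four partitions
        System.out.println(owners(4, List.of("c1", "c2"))); // prints {0=c1, 1=c2, 2=c1, 3=c2}
        // c2 leaves the group: c1 takes over its partitions
        System.out.println(owners(4, List.of("c1")));       // prints {0=c1, 1=c1, 2=c1, 3=c1}
    }
}
```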

Try It Yourself

  1. Create a topic with multiple partitions.
  2. Start two consumers with the same group.id.
  3. Produce messages to the topic.
  4. Observe how the partitions are split between consumers.

Consumer Group CLI Tools

You can inspect a consumer group's partition assignments, committed offsets, and lag using:


kafka-consumer-groups.sh --bootstrap-server localhost:9092 --describe --group my-java-group
  

Conclusion

Understanding how Kafka partitions and consumer groups work is essential for designing scalable applications. Offsets give you control over where processing resumes, and consumer groups let you scale horizontally while each partition is still read by only one consumer in the group.
