1.

What is an offset in Kafka? What are the different ways to commit an offset? Where does Kafka maintain offset?

Answer»

As we already know, a Kafka topic is divided into partitions. The records inside each partition are ordered, and each record is identified by an offset, which is its sequential position within the partition. A consumer uses offsets to keep track of which record it should read next. Kafka maintains two types of offsets for a consumer:

Current Offset

  1. It is a pointer to the last record that Kafka has sent to the consumer in the most recent poll. Because this offset advances with every poll, the consumer does not get the same record twice.

Committed Offset

  1. It is a pointer to the last record that a consumer has successfully processed. It plays an important role during partition rebalancing: when a partition is assigned to a new consumer, that consumer uses the committed offset to determine where to start reading records from.
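
The difference between the two offsets can be sketched with a toy model. This is not a real Kafka client: the class PartitionConsumer and its methods are hypothetical names used only for illustration, the partition is a plain Python list whose index plays the role of the offset, and both offsets are stored as "next record to read" for simplicity.

```python
class PartitionConsumer:
    """Toy model of one consumer reading one partition (not a real Kafka client)."""

    def __init__(self, log, committed=0):
        self.log = log               # the partition: list index == offset
        self.position = committed    # current offset: next record to fetch
        self.committed = committed   # committed offset: where a successor resumes

    def poll(self, max_records=2):
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)  # advance so the next poll returns new records
        return batch

    def commit(self):
        self.committed = self.position  # record the progress made so far


log = ["a", "b", "c", "d", "e"]

first = PartitionConsumer(log)
batch1 = first.poll()   # reads offsets 0 and 1
first.commit()          # progress up to offset 2 is now committed
batch2 = first.poll()   # reads offsets 2 and 3, but does not commit them

# Rebalance: a new consumer takes over and starts from the committed offset,
# so the uncommitted records are delivered again.
second = PartitionConsumer(log, committed=first.committed)
resumed = second.poll()
```

In this sketch the first consumer never sees a record twice because its position moves forward on every poll, while the second consumer re-reads "c" and "d" because they were polled but never committed.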

There are two ways to commit an offset:

  1. Auto-commit: Enabled by default; it can be turned off by setting the property enable.auto.commit to false. Though convenient, it can cause duplicate records to be processed: if the consumer fails after processing records but before the next automatic commit, those records are read again.
  2. Manual-commit: Auto-commit is turned off, and the application commits the offset itself once the record has been processed.
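
Why the commit strategy affects duplicates can be illustrated with a small simulation. It is only a sketch under simplifying assumptions: run_with_crash is a hypothetical helper, commits happen after a fixed number of records rather than on a timer (real auto-commit is interval-based, controlled by auto.commit.interval.ms), and a crash is modelled by simply stopping the loop.

```python
def run_with_crash(records, commit_every, crash_after):
    """Process records, committing after every `commit_every` records,
    and crash once `crash_after` records have been processed.
    Returns (records processed before the crash, committed offset)."""
    committed = 0
    processed = []
    for offset, record in enumerate(records):
        if offset == crash_after:
            break                    # simulated crash: uncommitted progress is lost
        processed.append(record)
        if (offset + 1) % commit_every == 0:
            committed = offset + 1   # next offset to read after a restart
    return processed, committed


records = ["r0", "r1", "r2", "r3", "r4", "r5"]

# Infrequent commits, similar in effect to periodic auto-commit.
done, committed = run_with_crash(records, commit_every=4, crash_after=5)
replayed = done[committed:]          # already processed, but read again on restart

# Commit after each processed record, as with a manual commit per record.
done_manual, committed_manual = run_with_crash(records, commit_every=1, crash_after=5)
replayed_manual = done_manual[committed_manual:]
```

With infrequent commits, record "r4" is processed before the crash and again after the restart. Committing after each record leaves nothing to replay in this scenario, at the cost of many more commit requests.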

Prior to Kafka v0.9, ZooKeeper was used to store consumer offsets. From v0.9 onwards, the committed offset for each topic partition is stored in an internal Kafka topic called __consumer_offsets.
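
As a rough mental model, the offsets topic can be thought of as a log of commit records keyed by consumer group, topic, and partition, where only the newest value per key matters. The sketch below assumes that simplified view; the function compact and the sample groups "billing" and "audit" are hypothetical, and the real record format is more involved.

```python
def compact(commit_log):
    """Keep only the latest offset per (group, topic, partition) key,
    the way a compacted topic retains the newest value for each key."""
    latest = {}
    for group, topic, partition, offset in commit_log:
        latest[(group, topic, partition)] = offset  # later commits overwrite earlier ones
    return latest


# Hypothetical commit records, oldest first.
commit_log = [
    ("billing", "orders", 0, 10),
    ("billing", "orders", 1, 7),
    ("billing", "orders", 0, 25),
    ("audit", "orders", 0, 3),
]

offsets = compact(commit_log)
```

Here each consumer group keeps its own committed offset for the same partition, which is why two groups can read one topic independently.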


