1.

Let’s say that a producer is writing records to a Kafka topic at 10000 messages/sec while the consumer is only able to read 2500 messages per second. What are the different ways in which you can scale up your consumer?

Answer:

The answer to this question hinges on two concepts: partitions within a topic, and consumer groups.

A Kafka topic is divided into partitions. Each message sent by the producer is routed to one of the topic's partitions based on the message key (by default, a hash of the key modulo the number of partitions). Here we can assume that the keys are chosen such that messages end up distributed roughly evenly across the partitions.
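For illustration, here is a minimal sketch of a keyed producer using the standard Java client. The topic name "orders", the broker address, and the key values are assumptions made for the example:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class OrderProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            for (int i = 0; i < 10; i++) {
                // The key decides the partition (hash of key mod partition count),
                // so well-spread keys distribute messages evenly across partitions.
                String key = "order-" + i;
                producer.send(new ProducerRecord<>("orders", key, "payload-" + i));
            }
        }
    }
}
```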

A consumer group is a way to bundle consumers together so that they share the work of reading a topic and thereby increase the throughput of the consumer application. Within a group, each partition is assigned to exactly one consumer, though a single consumer may own several partitions. For example, with 4 partitions and 4 consumers in the group, each consumer reads from a single partition. With 6 partitions and 4 consumers, all 6 partitions are still consumed, but some consumers handle 2 partitions each, so parallelism is capped at 4. Conversely, if there are more consumers than partitions, the extra consumers sit idle. It is therefore ideal to maintain a 1-to-1 mapping between partitions and consumers in the group.
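As a sketch of how group membership is expressed in code: every instance that starts with the same group.id joins the same group, and Kafka divides the topic's partitions among them. The group name "order-processors" and the broker address are assumptions for the example:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class OrderConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address
        props.put("group.id", "order-processors");        // same group.id on every instance
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("orders"));
            while (true) {
                // Each instance in the group is assigned a subset of the topic's partitions.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                            record.partition(), record.offset(), record.value());
                }
            }
        }
    }
}
```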

Now, to scale up processing on the consumer side, two things can be done:

  1. The number of partitions in the topic can be increased (say, from the existing 1 to 4).
  2. A consumer group can be created with 4 consumer instances attached to it.

Doing both allows the topic to be read in parallel and scales the consumer side from 2,500 messages/sec to roughly 10,000 messages/sec (4 consumers x 2,500 messages/sec each), as sketched below.
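A minimal sketch of step 1 using the Kafka AdminClient; the topic name "orders" and broker address are the same assumptions as in the earlier sketches. Step 2 then amounts to starting four instances of a consumer like OrderConsumer above, all sharing the same group.id:

```java
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewPartitions;

public class IncreasePartitions {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // assumed broker address

        try (AdminClient admin = AdminClient.create(props)) {
            // Grow the "orders" topic from its current partition count to 4.
            admin.createPartitions(
                    Collections.singletonMap("orders", NewPartitions.increaseTo(4)))
                 .all()
                 .get();
        }
    }
}
```

One caveat worth mentioning: increasing the partition count changes which partition a given key hashes to for future messages, so per-key ordering is only guaranteed within a partition from that point onward.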


