1.

What do you mean by an unbalanced cluster in Kafka? How can you balance it?

Answer»

Adding a new broker to an existing Kafka cluster is as simple as giving it a unique broker id, listeners, and log directory in its server.properties file. However, the new broker is not assigned any partitions of the cluster's existing topics, so it does little work until partitions are moved to it or new topics are created. A cluster is referred to as unbalanced if it has either of the following problems:
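As a rough illustration, the per-broker settings could be rendered as below. The property names broker.id, listeners, and log.dirs are standard Kafka broker settings; the helper name, host name, port, and directory path are made-up placeholders.

```python
def broker_properties(broker_id: int, host: str, log_dir: str, port: int = 9092) -> str:
    """Render the server.properties lines that must differ between brokers.

    host, port, and log_dir are placeholder values for illustration only.
    """
    lines = [
        f"broker.id={broker_id}",
        f"listeners=PLAINTEXT://{host}:{port}",
        f"log.dirs={log_dir}",
    ]
    return "\n".join(lines) + "\n"


# Hypothetical new broker joining the cluster with id 4.
new_broker = broker_properties(4, "kafka-4.example.internal", "/var/lib/kafka/data-4")
print(new_broker)
```

A broker started with these settings joins the cluster but holds no partitions of existing topics until a reassignment moves some to it.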

Leader Skew: 

Consider the following scenario: a topic with three partitions and a replication factor of three across three brokers. 

The leader handles all reads and writes for a partition. Followers send fetch requests to the leader to receive its most recent messages. Followers exist solely for redundancy and fail-over purposes.

Consider the case of a broker that has failed. The failed broker may have been the leader for many partitions. For each of those partitions, a follower on one of the other brokers is promoted to leader. Because fail-over to an out-of-sync replica is not allowed, a follower must be in sync with the leader in order to be promoted.

If a second broker then goes down (say brokers 1 and 3 have both failed), all of the leaders end up on the one remaining broker, so there is no redundancy.

When brokers 1 and 3 come back online, the partitions regain some redundancy, but the leaders stay concentrated on broker 2.
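The failure sequence can be traced with a small simulation. This is a simplified model, not Kafka's controller logic: the replica lists are an assumed layout for the three-partition topic, and every live replica is treated as in sync.

```python
# Assumed replica lists for a topic with three partitions and replication
# factor 3. The first broker in each list is the preferred replica.
replicas = {0: [1, 2, 3], 1: [2, 3, 1], 2: [3, 1, 2]}
leaders = {partition: brokers[0] for partition, brokers in replicas.items()}
live = {1, 2, 3}


def fail(broker: int) -> None:
    """Take a broker down and hand its leaderships to a surviving replica."""
    live.discard(broker)
    for partition, leader in leaders.items():
        if leader == broker:
            # Simplification: the first live replica is assumed to be in sync.
            leaders[partition] = next(b for b in replicas[partition] if b in live)


def recover(broker: int) -> None:
    """Bring a broker back. It rejoins as a follower; leaders do not move."""
    live.add(broker)


fail(1)
fail(3)
recover(1)
recover(3)
print(leaders)  # {0: 2, 1: 2, 2: 2}
```

All three partitions are still led by broker 2 after the other brokers return, which is the imbalance the scenario describes.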

As a result, the Kafka brokers have a leader imbalance. A cluster is in a leader-skewed condition when a broker is the leader for more partitions than the number of partitions divided by the number of brokers.
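A minimal sketch of that check follows. The function name and the input shape (a mapping from partition to its leader's broker id) are assumptions made for illustration.

```python
from collections import Counter


def leader_skewed_brokers(leaders: dict[int, int], brokers: list[int]) -> list[int]:
    """Return the brokers that lead more than partitions / brokers partitions."""
    fair_share = len(leaders) / len(brokers)
    counts = Counter(leaders.values())
    return sorted(b for b in brokers if counts[b] > fair_share)


# Three partitions on three brokers: the fair share is one leader each.
after_failover = {0: 2, 1: 2, 2: 2}  # every partition led by broker 2
balanced = {0: 1, 1: 2, 2: 3}        # one leader per broker

print(leader_skewed_brokers(after_failover, [1, 2, 3]))  # [2]
print(leader_skewed_brokers(balanced, [1, 2, 3]))        # []
```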

Solving the leader skew problem:

To tackle this problem, Kafka can move leadership back to the preferred replicas. This can be accomplished in one of two ways:

  • The auto.leader.rebalance.enable=true broker option allows the controller node to transfer leadership back to the preferred replicas, restoring the even distribution.
  • Running kafka-preferred-replica-election.sh triggers a preferred replica election. The utility takes a mandatory list of ZooKeeper hosts and an optional JSON file listing topic partitions. If no list is provided, the utility uses ZooKeeper to retrieve all of the cluster's topic partitions and runs the election for every one of them, which can be time-consuming. Custom scripts can generate a list containing only the topics and partitions that are required, automating the process across the cluster. (From Kafka 2.4 onward, kafka-leader-election.sh supersedes this tool.)
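A custom script of the kind mentioned above might build the partition list like this. The layout (a top-level "partitions" array of topic and partition entries) is the one the election tool reads; the helper name and the topic name my_topic are hypothetical.

```python
import json


def election_payload(topic_partitions: dict[str, list[int]]) -> str:
    """Build the JSON document listing the partitions to run the election for."""
    entries = [
        {"topic": topic, "partition": partition}
        for topic, partitions in sorted(topic_partitions.items())
        for partition in partitions
    ]
    return json.dumps({"partitions": entries}, indent=2)


# Hypothetical topic; restrict the election to its three partitions.
payload = election_payload({"my_topic": [0, 1, 2]})
print(payload)
```

The output would be saved to a file and passed to the tool, so that only the listed partitions are processed instead of every partition in the cluster.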

Broker Skew:

Consider a Kafka cluster with nine brokers and a topic named "sample_topic". In this example, the topic's partitions are assigned to the brokers as follows:

Broker Id   Number of Partitions   Partitions                 Is Skewed?
0           3                      (0, 7, 8)                  No
1           4                      (0, 1, 8, 9)               No
2           5                      (0, 1, 2, 9, 10)           No
3           6                      (1, 2, 3, 9, 10, 11)       Yes
4           6                      (2, 3, 4, 10, 11, 12)      Yes
5           6                      (3, 4, 5, 11, 12, 13)      Yes
6           5                      (4, 5, 6, 12, 13)          No
7           4                      (5, 6, 7, 13)              No
8           3                      (6, 7, 8)                  No

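On brokers 3, 4, and 5, the topic “sample_topic” is skewed. A broker is considered skewed for a given topic when it holds more of that topic's partitions than the average number per broker. Here the 42 partition replicas spread over nine brokers average about 4.67 per broker; brokers 2 and 6, which hold five, are not marked as skewed, which is consistent with rounding the average up to five.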
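The check can be sketched as follows, using the assignment from the example (reading broker 3's partitions as 1, 2, 3, 9, 10, 11). The threshold, the average rounded up to a whole number, is an assumption chosen because it reproduces the example's result; a strict comparison against the raw average would also flag brokers 2 and 6.

```python
import math

# Broker id -> partitions of sample_topic hosted on that broker.
assignment = {
    0: [0, 7, 8],
    1: [0, 1, 8, 9],
    2: [0, 1, 2, 9, 10],
    3: [1, 2, 3, 9, 10, 11],
    4: [2, 3, 4, 10, 11, 12],
    5: [3, 4, 5, 11, 12, 13],
    6: [4, 5, 6, 12, 13],
    7: [5, 6, 7, 13],
    8: [6, 7, 8],
}

total_replicas = sum(len(parts) for parts in assignment.values())  # 42
average = total_replicas / len(assignment)                         # about 4.67
threshold = math.ceil(average)                                     # assumed rounding: 5

skewed = [broker for broker, parts in assignment.items() if len(parts) > threshold]
print(skewed)  # [3, 4, 5]
```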

Solving the broker skew problem:

The following steps can be used to solve it:

  • Generate a candidate assignment using the partition reassignment tool (kafka-reassign-partitions.sh) with the --generate option. The tool prints both the current and the proposed replica assignments.
  • Save the proposed assignment to a JSON file.
  • Run the partition reassignment tool with the --execute option and that JSON file to apply the reassignment.
  • After the partition reassignment is complete, run the kafka-preferred-replica-election.sh tool to complete the balancing.
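The JSON file from the second step could be written by a short script such as the one below. The document layout ("version" plus a "partitions" array whose entries carry topic, partition, and replicas) is the one the reassignment tool reads; the helper name and the three-partition proposal are hypothetical and would normally come from the --generate output.

```python
import json


def reassignment_json(topic: str, proposed: dict[int, list[int]]) -> str:
    """Format a proposed partition -> replica list mapping for the tool."""
    entries = [
        {"topic": topic, "partition": partition, "replicas": replicas}
        for partition, replicas in sorted(proposed.items())
    ]
    return json.dumps({"version": 1, "partitions": entries}, indent=2)


# Hypothetical proposal covering three partitions of sample_topic.
proposed = {0: [0, 1, 2], 1: [3, 4, 5], 2: [6, 7, 8]}
document = reassignment_json("sample_topic", proposed)
print(document)
```

The resulting file is the one handed to kafka-reassign-partitions.sh through its --reassignment-json-file option when the reassignment is executed.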

