InterviewSolution
| 1. |
What is Kafka, and what components does it have? What is the role of ZooKeeper in Kafka, and what is the sequence for starting the Kafka services? |
|
Answer» Kafka is a distributed messaging system that exchanges large volumes of streaming and log data between processes, applications, and servers. Distributed messaging is built on queues that can handle a high volume of data and pass messages from one end to another. Kafka is suitable for both offline and online message consumption. Before going further, we need to know the components that make up Kafka; the details are below.
Kafka Broker: A Kafka cluster consists of one or more servers, each running Kafka and called a Kafka broker. Producers are processes that publish data into Kafka topics on the brokers; consumers then pull the messages from those topics.
Kafka Topics: A topic is a category or feed name under which messages are stored and published. All Kafka messages are organized into topics: to send a message you send it to a specific topic, and to read messages you read them from a specific topic.
Kafka Topic Partition: Each topic is divided into a number of partitions, and each partition holds its messages in an ordered sequence. Ordering is guaranteed only within a partition. Each message in a partition is identified by its offset, an incremental ID that the broker assigns as the message is appended. (What older Kafka releases kept in ZooKeeper is each consumer group's last-read offset, not the message offsets themselves.) An offset is meaningful only within its own partition and has no meaning across partitions. A topic may have any number of partitions. By default there is no fixed rule for which partition a message is written to, but a producer can attach a key to a message: all messages with the same key go to the same partition.
Kafka Producers: Producers write data to a topic. To do so, a producer needs the topic name and at least one broker to connect to; Kafka then routes the data to the right partition on the right broker automatically. A producer can also request an acknowledgment for the data it writes. The acknowledgment levels are:
acks=0: the producer does not wait for any acknowledgment.
acks=1: the producer waits for the partition leader to acknowledge the write.
acks=all: the producer waits for the leader and all in-sync replicas to acknowledge the write.
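The rule that messages with the same key land in the same partition, and that offsets count up independently inside each partition, can be illustrated with a small in-memory sketch. This is only a toy model: the `Topic` class, the partition count of 3, and the use of CRC32 as the hash are assumptions made for the illustration. Kafka's own default partitioner uses a different hash function, so real partition numbers will differ.

```python
import zlib
from typing import List, Tuple

NUM_PARTITIONS = 3  # hypothetical partition count for this illustration


def choose_partition(key: bytes, num_partitions: int) -> int:
    """Map a message key to a partition index.

    CRC32 is a stand-in hash chosen for this sketch; it is not the
    hash Kafka itself uses.
    """
    return zlib.crc32(key) % num_partitions


class Topic:
    """Toy in-memory topic: one append-only list per partition."""

    def __init__(self, num_partitions: int) -> None:
        self.partitions: List[List[str]] = [[] for _ in range(num_partitions)]

    def send(self, key: bytes, value: str) -> Tuple[int, int]:
        """Append a keyed message and return (partition, offset)."""
        partition = choose_partition(key, len(self.partitions))
        self.partitions[partition].append(value)
        # The offset is simply the position of the message in its partition.
        offset = len(self.partitions[partition]) - 1
        return partition, offset


topic = Topic(NUM_PARTITIONS)

# Three messages with the same key: same partition, offsets 0, 1, 2.
results = [topic.send(b"user-42", f"event-{i}") for i in range(3)]

# A message with another key may or may not share that partition.
other = topic.send(b"user-7", "event-x")

for partition, offset in results:
    print(f"key=user-42 partition={partition} offset={offset}")
print(f"key=user-7 partition={other[0]} offset={other[1]}")
```

Because the partition is derived only from the key, every message for `user-42` stays in order relative to the others with that key, which is the practical reason for using keys.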
Kafka Consumer: Consumers read data from topics. Because a topic is divided into partitions, a consumer reads from the partitions of the topic, and it reads each partition in sequence, in offset order. A consumer needs to specify the topic name and a broker. Once a consumer connects to one broker, Kafka makes sure it is connected to the entire cluster.
Kafka Consumer Group: A consumer group consists of multiple consumer processes and has one unique group ID. Within a group, each partition is read by exactly one consumer instance, although one consumer may read several partitions. If the number of consumers exceeds the number of partitions, the extra consumers stay inactive. For example, with 6 partitions and 8 consumers in a single consumer group, 2 consumers will be inactive.
Two types of messaging patterns are available:
1. Point-to-point messaging system: Messages are kept in a queue. One or more consumers can read from the queue, but a particular message is read by only one consumer. Point-to-point messaging is used when a single message must be received by exactly one consumer. Multiple consumers may be listening on the queue, and multiple producers may be sending to it, but each message is delivered to only one receiver.
2. Publish-subscribe messaging system: Message producers are called publishers and message consumers are called subscribers. A topic can have multiple receivers, and every receiver gets a copy of each message. A few points that explain the publish-subscribe messaging system:
Messages are shared through a channel called a topic.
Topics sit in a centralized place where producers can publish and consumers can read messages.
Each message is delivered to one or more consumers, called subscribers.
The publisher does not know which subscriber receives which message or topic.
A single message created by one publisher may be copied and distributed to hundreds or thousands of subscribers.
Role of ZooKeeper in Kafka: ZooKeeper is a mandatory component in the classic Kafka architecture described here (newer Kafka releases can also run without it, in KRaft mode). It helps manage the Kafka brokers and supports leader election for partitions. It also maintains cluster membership: when a broker is added or removed, a topic is created or deleted, or a broker goes down or comes back up, ZooKeeper tracks the change and notifies Kafka.
ZooKeeper also stores topic configuration, such as the number of partitions a topic has and the leader of each partition.
The sequence for starting the Kafka services:
Start ZooKeeper: Kafka ships with a ZooKeeper configuration file. In this example it contains the following properties:
dataDir=/tmp/zookeeper
clientPort=2183
[root@xxxx]# /bin/zookeeper-server-start.sh /config/zookeeper.properties
Start the Kafka broker: You can start the broker with the default configuration file. In this example it contains the following properties:
broker.id=0
log.dir=/tmp/kafka-logs
zookeeper.connect=localhost:2183
Here there is one broker, whose ID is 0, and it connects to ZooKeeper on port 2183.
[root@xxxx]# /bin/kafka-server-start.sh /config/server.properties
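The broker can only register itself if the port in zookeeper.connect matches the port ZooKeeper listens on. A minimal sketch of that check is shown here. It assumes the two files contain exactly the properties listed in this answer, that the ZooKeeper property is spelled clientPort as in standard ZooKeeper configuration, and that zookeeper.connect names a single host:port pair. The helper `parse_properties` is a hypothetical name written for this example, not part of Kafka.

```python
from typing import Dict


def parse_properties(text: str) -> Dict[str, str]:
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# File contents copied from the properties listed in this answer.
ZOOKEEPER_PROPERTIES = """
dataDir=/tmp/zookeeper
clientPort=2183
"""

SERVER_PROPERTIES = """
broker.id=0
log.dir=/tmp/kafka-logs
zookeeper.connect=localhost:2183
"""

zk = parse_properties(ZOOKEEPER_PROPERTIES)
broker = parse_properties(SERVER_PROPERTIES)

# Split "host:port"; this sketch handles only the single-host form.
zk_host, _, zk_port = broker["zookeeper.connect"].rpartition(":")
ports_match = zk_port == zk["clientPort"]

print(f"broker.id={broker['broker.id']} connects to {zk_host}:{zk_port}")
print(f"matches ZooKeeper clientPort: {ports_match}")
```

Running a check like this before starting the services catches the common mistake of changing the port in one file but not the other.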
Create a topic: The following example creates a topic with a single partition and a single replica:
[root@xxxx]# /bin/kafka-create-topic.sh --zookeeper localhost:2183 --replica 1 --partition 1 --topic examtopic
This creates a topic named examtopic. (kafka-create-topic.sh belongs to old Kafka releases; later releases create topics with kafka-topics.sh --create.)
Start a console producer:
[root@xxxx]# /bin/kafka-console-producer.sh --broker-list localhost:9090 --topic examtopic
--broker-list gives the server and port of the brokers; in this example the server is localhost and the port is 9090. The command starts a producer client that accepts the messages you type and publishes them to the cluster, where a consumer can then read them. For example, type:
Hi Bibhu, How are you?
Start a console consumer:
[root@xxxx]# /bin/kafka-console-consumer.sh --zookeeper localhost:2183 --topic examtopic --from-beginning
The consumer runs with its default configuration properties, which are defined in the consumer.properties file.
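Once several consumers share a group ID, the consumer-group rule described earlier applies: each partition has exactly one owner in the group, and surplus consumers sit idle. The sketch below works through the 6-partition, 8-consumer example from this answer. It uses a plain round-robin assignment as a stand-in; Kafka's real assignment strategies are more involved, and the function name `assign_partitions` and the consumer names are hypothetical.

```python
from typing import Dict, List


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Give every partition exactly one owner, round-robin over consumers."""
    if not consumers:
        return {}
    assignment: Dict[str, List[int]] = {c: [] for c in consumers}
    for partition in range(num_partitions):
        owner = consumers[partition % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# The example from the text: 6 partitions, 8 consumers in one group.
consumers = [f"consumer-{i}" for i in range(8)]
assignment = assign_partitions(6, consumers)

# Consumers that received no partition are the inactive ones.
idle = [name for name, parts in assignment.items() if not parts]

for name, parts in assignment.items():
    print(name, parts)
print("idle consumers:", len(idle))  # 2, matching the example in the text
```

With fewer consumers than partitions the same function hands several partitions to each consumer, for example `assign_partitions(6, ["a", "b"])` gives three partitions to each.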
|
|