InterviewSolution
1. As per the configuration, HDFS is in High Availability mode with automatic failover. Briefly explain the daemon that takes care of the failover.
Answer» High Availability (HA) was introduced in Hadoop 2 to solve the NameNode single point of failure in Hadoop 1. The HA architecture runs two NameNodes at the same time: an Active NameNode and a Passive/Standby NameNode. Whenever the Active NameNode goes down, either because the server crashes or through a graceful failover during a maintenance period, control passes to the Passive/Standby NameNode, which reduces cluster downtime. There are two problems in maintaining consistency in the HDFS High Availability cluster:
As noted above, there are two types of failover:
A. Graceful failover: we initiate the failover manually, for routine maintenance.
B. Automatic failover: the failover is initiated automatically when the Active NameNode fails or crashes.

In either case, the Passive/Standby NameNode can take an exclusive lock in ZooKeeper, indicating that it wants to become the next Active NameNode.

In an HDFS High Availability cluster, Apache ZooKeeper is the service that makes automatic failover possible. While the Active NameNode is running, it holds a session in ZooKeeper. If the Active NameNode fails, that session expires and ZooKeeper notifies the Passive/Standby NameNode so that it can initiate the failover process.

The daemon that takes care of the failover is the ZKFailoverController (ZKFC), a ZooKeeper client that also monitors and manages the state of the NameNode. Each NameNode host runs its own ZKFC, which periodically checks the health of its local NameNode.

When ZooKeeper is installed in your cluster, make sure the following processes (daemons) are running on the Active NameNode, the Standby NameNode, and the DataNode.
Running jps (the Java Virtual Machine Process Status Tool) on the Active NameNode should show the following daemons:
Running jps on the Standby NameNode should show the following daemons:
Running jps on a DataNode should show the following daemons:
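The exact daemon list depends on how the cluster is laid out, so the check can be scripted rather than memorised. The following is a minimal Python sketch, not an official Hadoop tool. It assumes ZooKeeper and the JournalNode run on the same hosts as the NameNodes, and it relies on the process names that `jps` normally prints for these daemons (`NameNode`, `DFSZKFailoverController`, `JournalNode`, `QuorumPeerMain`, `DataNode`). The `EXPECTED` table, the function names, and the sample output are all illustrative.

```python
# Expected daemons per host role. This layout is an ASSUMPTION (ZooKeeper and
# JournalNode co-located with the NameNodes); adjust it to match your cluster.
EXPECTED = {
    "namenode": {"NameNode", "DFSZKFailoverController", "JournalNode", "QuorumPeerMain"},
    "datanode": {"DataNode"},
}


def parse_jps(output):
    """Return the set of process names from jps output ("<pid> <name>" per line)."""
    names = set()
    for line in output.splitlines():
        parts = line.split(maxsplit=1)
        if len(parts) == 2:  # skip blank lines and entries without a name
            names.add(parts[1].strip())
    return names


def missing_daemons(role, jps_output):
    """Return the expected daemons for `role` that do not appear in the jps output."""
    return EXPECTED[role] - parse_jps(jps_output)


# Hypothetical jps output from a NameNode host where ZooKeeper is not running.
SAMPLE = """\
2101 NameNode
2287 JournalNode
2450 DFSZKFailoverController
2933 Jps
"""

print(missing_daemons("namenode", SAMPLE))
```

On a real host, the text passed to `missing_daemons` would come from running `jps` as the user that owns the Hadoop processes. An empty result means every expected daemon was found.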
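To make the lock-and-session mechanism described in the answer more concrete, here is a toy, in-memory Python model. It is only a sketch: `ToyZooKeeper` and `ToyZKFC` are hypothetical classes written for illustration, not the Hadoop implementation and not a real ZooKeeper client. Fencing of the old Active NameNode and the actual state transitions of the NameNode are left out.

```python
class ToyZooKeeper:
    """In-memory stand-in for the single exclusive lock used for the election."""

    def __init__(self):
        self.lock_holder = None  # name of the controller holding the lock, if any
        self.watchers = []       # callbacks fired when the lock is released

    def try_acquire(self, owner):
        """Grant the lock only if nobody holds it."""
        if self.lock_holder is None:
            self.lock_holder = owner
            return True
        return False

    def release(self, owner):
        """Model the lock disappearing (session expiry or a controller giving it up)."""
        if self.lock_holder == owner:
            self.lock_holder = None
            for notify in list(self.watchers):
                notify()


class ToyZKFC:
    """One failover controller per NameNode: health check plus election."""

    def __init__(self, name, zk):
        self.name = name
        self.zk = zk
        self.healthy = True
        self.state = "standby"
        zk.watchers.append(self.join_election)  # be told when the lock frees up

    def join_election(self):
        # Only a controller whose local NameNode is healthy may become active.
        if self.healthy and self.zk.try_acquire(self.name):
            self.state = "active"

    def health_check(self, namenode_ok):
        self.healthy = namenode_ok
        if not namenode_ok:
            self.state = "failed"
            self.zk.release(self.name)  # the other controller is notified


zk = ToyZooKeeper()
nn1, nn2 = ToyZKFC("nn1", zk), ToyZKFC("nn2", zk)
nn1.join_election()
nn2.join_election()
print(nn1.state, nn2.state)  # active standby

nn1.health_check(namenode_ok=False)  # simulate the Active NameNode failing
print(nn1.state, nn2.state)  # failed active
```

The point of the sketch is the ordering: the lock can have only one holder, so at most one NameNode is active at a time, and the standby takes over only after it has been notified that the lock was released.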