1.

What is meant by High Availability in HDFS? What are failover and fencing and what role do they play in making the system highly available?

Answer»

High availability in HDFS means that the system has no single point of failure, stays available around the clock with little or no impact on client applications, and recovers from failures automatically, without manual intervention.

To implement High Availability in HDFS, a pair of NameNodes is set up in an active-standby configuration. The standby (passive) node is kept in sync with the active node, and both nodes have access to shared storage. Whenever the Active node modifies the namespace, it logs a record of the modification to an edit log stored in the shared directory. The Standby node constantly watches this directory and applies each new edit to its own namespace, thereby staying in sync with the Active node.
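
The edit-log tailing described above can be sketched in a few lines of Python. This is a toy in-memory model, not Hadoop code: the names SharedEditLog, ActiveNameNode, StandbyNameNode and apply_edit are hypothetical, the shared directory is replaced by a list, and the namespace is reduced to a set of paths.

```python
from dataclasses import dataclass, field


@dataclass
class SharedEditLog:
    """Hypothetical stand-in for the shared edits directory (in memory)."""
    edits: list = field(default_factory=list)

    def append(self, op, path):
        self.edits.append((op, path))


def apply_edit(namespace, op, path):
    """Apply one logged modification to a namespace (a set of paths)."""
    if op == "mkdir":
        namespace.add(path)
    elif op == "delete":
        namespace.discard(path)
    else:
        raise ValueError(f"unknown op: {op}")


class ActiveNameNode:
    def __init__(self, log):
        self.log = log
        self.namespace = set()

    def modify(self, op, path):
        # Record the modification in the shared log and apply it locally.
        self.log.append(op, path)
        apply_edit(self.namespace, op, path)


class StandbyNameNode:
    def __init__(self, log):
        self.log = log
        self.namespace = set()
        self.next_index = 0  # position of the first edit not yet applied

    def tail_edits(self):
        """Apply every edit that appeared since the last call."""
        new_edits = self.log.edits[self.next_index:]
        for op, path in new_edits:
            apply_edit(self.namespace, op, path)
        self.next_index += len(new_edits)
        return len(new_edits)


log = SharedEditLog()
active = ActiveNameNode(log)
standby = StandbyNameNode(log)

active.modify("mkdir", "/data")
active.modify("mkdir", "/tmp")
active.modify("delete", "/tmp")

applied = standby.tail_edits()
print(applied, standby.namespace == active.namespace)
```

Because the standby replays the same edits in the same order, its namespace ends up identical to the active node's, which is what lets it take over quickly.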

If the active NameNode fails, the standby node takes over and starts servicing client requests. The transition from standby to active is managed by the Failover Controller, which uses ZooKeeper to ensure that only one NameNode is active at a given time. A failover controller process runs alongside each NameNode, monitors that NameNode for failures using a heartbeat mechanism, and triggers a failover when a failure is detected.
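
As a rough illustration of the failover logic above, the following Python sketch simulates two failover controllers competing for a single lock. It makes several simplifying assumptions: LockService is a hypothetical in-memory stand-in for ZooKeeper, the heartbeat is reduced to a boolean health flag, and tick() is an invented name for one monitoring round. It is not how the real failover controller is implemented.

```python
class LockService:
    """Hypothetical stand-in for a ZooKeeper lock: one holder at a time."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, name):
        if self.holder is None:
            self.holder = name
        return self.holder == name

    def release(self, name):
        if self.holder == name:
            self.holder = None


class NameNode:
    def __init__(self, name):
        self.name = name
        self.healthy = True
        self.state = "standby"


class FailoverController:
    def __init__(self, namenode, lock):
        self.nn = namenode
        self.lock = lock

    def tick(self):
        """One monitoring round: health check, then lock handling."""
        if not self.nn.healthy:
            # The monitored NameNode failed: give up the lock so that
            # the peer's controller can promote its own NameNode.
            self.lock.release(self.nn.name)
            self.nn.state = "failed"
        elif self.lock.try_acquire(self.nn.name):
            self.nn.state = "active"
        else:
            self.nn.state = "standby"


lock = LockService()
nn1, nn2 = NameNode("nn1"), NameNode("nn2")
fc1, fc2 = FailoverController(nn1, lock), FailoverController(nn2, lock)

fc1.tick()
fc2.tick()
before = (nn1.state, nn2.state)

nn1.healthy = False  # simulate a failure of the active NameNode
fc1.tick()
fc2.tick()
after = (nn1.state, nn2.state)
print(before, after)
```

Since only the lock holder is ever marked active, the two NameNodes in this model can never be active at the same moment.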

However, a failover must never leave two NameNodes active at the same time, because two active NameNodes would corrupt data. Fencing prevents this scenario by guaranteeing that only one NameNode is active at a given time. The JournalNodes perform fencing by allowing only one NameNode to be the writer at a time. During a failover, the Standby NameNode takes over the responsibility of writing to the JournalNodes, which forbids any other NameNode from remaining active.
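
One way to picture single-writer fencing is with epoch numbers: every new writer announces a higher epoch, and each JournalNode rejects writes that carry an older one. The sketch below is a simplified model under that assumption. The class and method names (JournalNode, Writer, new_epoch, journal, FencedError) are illustrative, loosely inspired by the quorum-based journal design, and the real protocol involves more steps than shown here.

```python
class FencedError(Exception):
    """Raised when a writer has been fenced off by a newer one."""


class JournalNode:
    def __init__(self):
        self.promised_epoch = 0
        self.edits = []

    def new_epoch(self, epoch):
        # Promise to accept writes only from this epoch or a later one.
        if epoch <= self.promised_epoch:
            raise FencedError(f"epoch {epoch} is not newer than {self.promised_epoch}")
        self.promised_epoch = epoch

    def journal(self, epoch, edit):
        if epoch < self.promised_epoch:
            raise FencedError(f"stale epoch {epoch}")
        self.edits.append(edit)


class Writer:
    """A NameNode acting as the writer to a set of JournalNodes."""

    def __init__(self, name, journals, epoch):
        self.name = name
        self.journals = journals
        self.epoch = epoch
        for jn in journals:
            jn.new_epoch(epoch)  # this step fences every older writer

    def write(self, edit):
        acks = 0
        for jn in self.journals:
            try:
                jn.journal(self.epoch, edit)
                acks += 1
            except FencedError:
                pass
        # A write succeeds only if a majority of JournalNodes accept it.
        if acks <= len(self.journals) // 2:
            raise FencedError(f"{self.name} lost its write quorum")


journals = [JournalNode() for _ in range(3)]

old_active = Writer("nn1", journals, epoch=1)
old_active.write("mkdir /a")

new_active = Writer("nn2", journals, epoch=2)  # failover: nn2 takes over
new_active.write("mkdir /b")

try:
    old_active.write("mkdir /c")  # nn1 still believes it is active
    old_fenced = False
except FencedError:
    old_fenced = True
print(old_fenced)
```

In this model the old writer's edit never reaches any journal, so even a NameNode that wrongly believes it is still active cannot alter the shared edit log.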


