Answer» Hadoop is an open-source framework intended to store and process big data in a distributed manner. Hadoop's essential components: - HDFS (Hadoop Distributed File System) – HDFS is Hadoop's primary storage system. Large datasets are stored on HDFS, which is designed to hold massive volumes of data on clusters of commodity hardware (a minimal write example appears after this list).
- Hadoop MapReduce – MapReduce is Hadoop's data-processing layer. It processes structured and unstructured data already stored in HDFS, and it handles high volumes of data in parallel by splitting the work into independent tasks. Processing happens in two stages: Map and Reduce. In simple terms, Map is the stage where data blocks are read and made available to the executors (nodes/containers) for processing, and Reduce is the stage where all the processed data is collected and collated (see the word-count sketch after this list).
- YARN – YARN (Yet Another Resource Negotiator) is Hadoop's processing framework. It handles resource management and allows multiple data-processing engines, such as real-time streaming, data science, and batch processing, to run over the data stored in HDFS (a small client sketch follows the list).
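
To make the HDFS bullet concrete, here is a minimal sketch of writing a file to HDFS through the Java `FileSystem` API. The `hdfs://localhost:9000` address and the `/user/demo/hello.txt` path are placeholder assumptions for a local single-node cluster; in practice `fs.defaultFS` would come from `core-site.xml`.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    // Placeholder NameNode address; normally read from core-site.xml.
    conf.set("fs.defaultFS", "hdfs://localhost:9000");
    FileSystem fs = FileSystem.get(conf);

    // Hypothetical path used only for this illustration.
    Path file = new Path("/user/demo/hello.txt");
    try (FSDataOutputStream out = fs.create(file, true)) {
      out.writeUTF("Hello, HDFS!");
    }
    System.out.println("Exists: " + fs.exists(file));
    fs.close();
  }
}
```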
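The Map and Reduce stages described above are easiest to see in the classic word-count job: the mapper reads input lines and emits `(word, 1)` pairs, and the reducer collects all pairs for a word and sums them. This follows the standard Hadoop MapReduce API; the input and output paths passed in `args` are assumptions.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map stage: read each input line and emit a (word, 1) pair per token.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce stage: collect all counts emitted for a word and sum them.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // e.g. an HDFS input dir
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // must not already exist
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```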
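Finally, to show YARN acting as the cluster's resource manager, here is a small sketch that asks the ResourceManager for the applications it is tracking, using the `YarnClient` API. This assumes a reachable ResourceManager configured in `yarn-site.xml`; it is an illustration, not part of the MapReduce job above.

```java
import java.util.List;

import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class ListYarnApps {
  public static void main(String[] args) throws Exception {
    // Connects to the ResourceManager address found in yarn-site.xml.
    YarnClient client = YarnClient.createYarnClient();
    client.init(new YarnConfiguration());
    client.start();

    // Ask YARN for every application it currently knows about.
    List<ApplicationReport> apps = client.getApplications();
    for (ApplicationReport app : apps) {
      System.out.printf("%s  %s  %s%n",
          app.getApplicationId(), app.getName(), app.getYarnApplicationState());
    }
    client.stop();
  }
}
```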