InterviewSolution
| 1. |
You have installed a cluster with HDFS and MapReduce version 2 on YARN. You have no dfs.hosts entries in your hdfs-site.xml configuration file. You configure a new worker node by setting fs.default.name in its configuration files to point to the NameNode on your cluster, and you start the DataNode daemon on that worker node. What do you have to do on the cluster to allow the worker node to join, and start storing HDFS blocks? |
|
Answer» Java garbage collection is the process by which Java programs perform automatic memory management: a technique that automatically handles the allocation and deallocation of memory. Java programs compile to bytecode that can be run on a Java Virtual Machine (JVM); bytecode is the compiled format of a Java program, and once a program has been converted to bytecode it can be executed by a JVM and transferred across a network. While a Java program runs on the JVM, the JVM consumes memory, called heap memory, which is the part of memory dedicated to the program.

A Hadoop mapper is a Java process, and every Java process has its own heap memory. The maximum heap allocation is configured via mapred.map.child.java.opts (or mapreduce.map.java.opts in Hadoop 2). If the mapper process runs out of heap memory, it throws an out-of-memory exception such as:

ERROR: java.lang.RuntimeException: java.lang.OutOfMemoryError

The Java heap size should be smaller than the Hadoop container memory limit, because some memory must be reserved for the Java code itself; it is usually recommended to reserve about 20% of the container for this. If these settings are correct, Java-based Hadoop tasks will not be killed by Hadoop, and you will not see the "Killing container" error.

To execute the actual map or reduce task, YARN runs a JVM within the container, and the Hadoop property mapreduce.{map|reduce}.java.opts is passed to this JVM. It can include -Xmx to set the maximum heap size of the JVM. Example:

hadoop jar <jarName> -Dmapreduce.reduce.memory.mb=4096 -Dmapreduce.reduce.java.opts=-Xmx3276m |
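As a minimal sketch of the same settings in configuration-file form (assuming Hadoop 2 property names, with the 4096/3276 figures mirroring the example above), the container limit and JVM heap can be set together in mapred-site.xml so the heap stays roughly 20% below the container:

```xml
<!-- mapred-site.xml: a sketch, assuming Hadoop 2 property names -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value> <!-- container limit enforced by YARN -->
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx3276m</value> <!-- ~80% of the container, leaving headroom for non-heap memory -->
</property>
```

The same pair exists for reducers (mapreduce.reduce.memory.mb and mapreduce.reduce.java.opts); keeping the -Xmx value below the container limit is what prevents YARN from killing the task.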
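The 20% reservation rule described above can be sketched as a small helper (the function name `heap_size_mb` is hypothetical, not part of any Hadoop API):

```python
def heap_size_mb(container_mb: int, reserve_fraction: float = 0.2) -> int:
    """Return an -Xmx value (in MB) that leaves reserve_fraction of the
    YARN container free for non-heap memory (code, stacks, metaspace)."""
    return int(container_mb * (1 - reserve_fraction))

# For a 4096 MB container, reserving 20% for non-heap memory:
print(f"-Xmx{heap_size_mb(4096)}m")  # → -Xmx3276m
```

This reproduces the figure in the example command line: 80% of a 4096 MB container is 3276 MB.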
|