MapReduce memory tuning

Where does memory overflow occur during Hadoop data processing, and how is memory tuned for each case?

1. Mapper/Reducer-stage JVM memory overflow (usually the heap)

1) JVM heap memory overflow: when heap memory runs out, one of the following exceptions is usually thrown (a minimal reproduction sketch follows the list):

The first: "java.lang.OutOfMemoryError: GC overhead limit exceeded";

The second: "Error: Java heap space";

The third: "running beyond physical memory limits. Current usage: 4.3 GB of 4.3 GB physical memory used; 7.4 GB of 13.2 GB virtual memory used. Killing container".
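As a minimal sketch (my own illustration, not taken from the job logs above), the first two errors can be reproduced by exhausting a deliberately small heap:

import java.util.ArrayList;
import java.util.List;

// Run with a small heap, e.g. java -Xmx64m HeapOomDemo, to trigger
// java.lang.OutOfMemoryError: Java heap space (or "GC overhead limit
// exceeded" if the GC thrashes before allocation finally fails).
public class HeapOomDemo {
    public static void main(String[] args) {
        List<long[]> hog = new ArrayList<>();
        while (true) {
            hog.add(new long[1_000_000]); // ~8 MB retained per iteration
        }
    }
}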

2) Stack memory overflow: throws java.lang.StackOverflowError.

This often appears in SQL (a statement with too many condition combinations keeps being parsed into recursive calls) or in MR code that itself recurses. Deep recursion builds a long chain of method calls whose frames exhaust the thread stack, as in the sketch below. This error generally indicates a problem in the program itself.
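A minimal sketch (my own illustration) of unbounded recursion overflowing the JVM stack:

public class StackOverflowDemo {
    static long depth = 0;

    static void recurse() {
        depth++;      // each call pushes another frame onto the thread stack
        recurse();    // no base case, so the stack eventually overflows
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            System.out.println("java.lang.StackOverflowError at depth " + depth);
        }
    }
}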

2. MRAppMaster memory overflow

If a job's input data is very large, it spawns a huge number of Mappers and Reducers. This puts pressure on the MRAppMaster (the manager of the current job) and can eventually exhaust its memory, so the job fails with an OOM.

The typical exception information is:

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Socket Reader #1 for port 30703"

Halting due to Out Of Memory Error...

Halting due to Out Of Memory Error...

Halting due to Out Of Memory Error...

3. Non-JVM memory overflow

The exception information is generally: java.lang.OutOfMemoryError: Direct buffer memory

The application itself uses operating-system (direct/off-heap) memory without keeping it under control; a memory leak then leads to memory overflow.

Error resolution: parameter tuning

1. Tuning parameters for Mapper/Reducer-stage JVM heap overflow

MapReduce currently controls memory mainly through the following two groups of parameters (increase the values below):

Mapper:

mapreduce.map.java.opts=-Xmx2048m (JVM heap memory; note the prefix is mapreduce, not mapred)

mapreduce.map.memory.mb=2304 (container memory)

Reducer:

mapreduce.reduce.java.opts=-Xmx2048m (JVM heap memory)

mapreduce.reduce.memory.mb=2304 (container memory)

Note: In YARN container mode, map/reduce tasks run inside containers, so mapreduce.{map|reduce}.memory.mb must be larger than the -Xmx value given in mapreduce.{map|reduce}.java.opts. The maximum JVM heap set via mapreduce.{map|reduce}.java.opts is generally about 0.75 times memory.mb, because some headroom must be reserved for non-heap memory (JVM code, thread stacks, etc.).
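For illustration, a job driver might set the two groups of parameters together, keeping -Xmx at roughly 0.75 times memory.mb (a minimal sketch; the concrete sizes are assumptions, not values from this post):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class MemoryTunedDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        conf.setInt("mapreduce.map.memory.mb", 4096);       // map container: 4096 MB
        conf.set("mapreduce.map.java.opts", "-Xmx3072m");   // heap = 0.75 * 4096
        conf.setInt("mapreduce.reduce.memory.mb", 4096);    // reduce container: 4096 MB
        conf.set("mapreduce.reduce.java.opts", "-Xmx3072m");
        Job job = Job.getInstance(conf, "memory-tuned-job");
        // ... set mapper/reducer classes and input/output paths here, then submit:
        // System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}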

2. MRAppMaster:

yarn.app.mapreduce.am.command-opts=-Xmx1024m (JVM heap memory)

yarn.app.mapreduce.am.resource.mb=1536 (container memory)

Note that inside Hive ETL jobs, set these in the following way:

set mapreduce.map.child.java.opts="-Xmx3072m" (note: the -Xmx setting must be wrapped in quotes; without them, various errors occur)

set mapreduce.map.memory.mb=3288

or

set mapreduce.reduce.child.java.opts="xxx"

set mapreduce.reduce.memory.mb=xxx
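The AM parameters above can likewise be raised from a Java driver before the MRAppMaster is launched (a minimal sketch; the sizes are illustrative assumptions):

import org.apache.hadoop.conf.Configuration;

public class AmMemoryTuning {
    public static void main(String[] args) {
        Configuration conf = new Configuration();
        conf.setInt("yarn.app.mapreduce.am.resource.mb", 2048);       // AM container: 2048 MB
        conf.set("yarn.app.mapreduce.am.command-opts", "-Xmx1536m");  // AM heap ~= 0.75 * 2048
        System.out.println(conf.get("yarn.app.mapreduce.am.command-opts"));
    }
}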

YARN parameters involved:

• yarn.scheduler.minimum-allocation-mb (minimum allocation unit, default 1024 MB)

• yarn.scheduler.maximum-allocation-mb (maximum allocation, default 8192 MB)

• yarn.nodemanager.vmem-pmem-ratio (ratio of virtual to physical memory, default 2.1)

• yarn.nodemanager.resource.memory-mb (total memory a NodeManager may allocate to containers)

YARN's ResourceManager (RM) allocates memory, CPU, and other resources to applications through queues. By default the RM grants an ApplicationMaster (AM) containers of at most 8192 MB ("yarn.scheduler.maximum-allocation-mb"), and the default minimum allocation is 1024 MB ("yarn.scheduler.minimum-allocation-mb"). The AM can only request resources from the RM in increments of "yarn.scheduler.minimum-allocation-mb" and not exceeding "yarn.scheduler.maximum-allocation-mb", so the AM rounds the values of "mapreduce.map.memory.mb" and "mapreduce.reduce.memory.mb" up to multiples of "yarn.scheduler.minimum-allocation-mb". The RM rejects requests for more than 8192 MB or for amounts not divisible by 1024 MB. (These limits differ under different configurations.)
