Hadoop out-of-memory (OOM) classification, parameter tuning, and optimization (with code that simulates each OOM type and analyzes its cause)

Classification of out-of-memory errors during MapReduce job execution

1. JVM memory overflow in the Mapper/Reducer stage (usually the heap)

1) JVM heap memory overflow: when heap memory is insufficient, one of the following exceptions is generally thrown:

The first: "java.lang.OutOfMemoryError: GC overhead limit exceeded";

The second: "Error: Java heap space";

The third: "running beyond physical memory limits. Current usage: 4.3 GB of 4.3 GB physical memory used; 7.4 GB of 13.2 GB virtual memory used. Killing container".

2) Stack overflow: the exception thrown is java.lang.StackOverflowError.

It often appears in SQL workloads (a statement with too many combined conditions gets parsed into deeply recursive calls) or when the MR code itself recurses. The error means the method call chain on the stack has grown too long, which usually indicates a problem in how the program is written.

2. Insufficient memory of MRAppMaster

If a job's input data is large, a large number of Mappers and Reducers are generated, which puts heavy pressure on the MRAppMaster (the manager of the current job) and can eventually exhaust its memory. The OOM message generally appears while the job is running.

The exception information is:

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Socket Reader #1 for port 30703"

Halting due to Out Of Memory Error...

Halting due to Out Of Memory Error...

Halting due to Out Of Memory Error...

3. Non-JVM memory overflow

The exception information is generally: java.lang.OutOfMemoryError: Direct buffer memory

This happens when code requests memory directly from the operating system (for example, through direct byte buffers) but fails to manage it properly: the memory leaks until it overflows.

Error resolution: parameter tuning

1. Parameter tuning for JVM heap overflow in the Mapper/Reducer stage

Currently, MapReduce controls memory mainly through two sets of parameters (increase the following values):

Mapper:

mapreduce.map.java.opts=-Xmx2048m (JVM heap size for the map task; note the prefix is mapreduce, not mapred)

mapreduce.map.memory.mb=2304 (container's memory)

Reducer:

mapreduce.reduce.java.opts=-Xmx2048m (JVM heap size for the reduce task)

mapreduce.reduce.memory.mb=2304 (memory of container)

Note: In YARN container mode, the map/reduce task runs inside a container, so mapreduce.{map|reduce}.memory.mb must be larger than the -Xmx value in mapreduce.{map|reduce}.java.opts. A common rule of thumb is to set -Xmx to about 0.75 times memory.mb, leaving headroom for non-heap JVM memory (native code, thread stacks, and so on).
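The 0.75 rule of thumb above can be expressed as a tiny helper. This is a hypothetical sketch for illustration only (the class and method names are not from any Hadoop API):

```java
public class HeapSizing {
    // Rule of thumb from the text: set -Xmx to ~0.75 of the container size,
    // reserving the rest for non-heap JVM memory.
    static int heapMbForContainer(int containerMb) {
        return (int) (containerMb * 0.75);
    }

    public static void main(String[] args) {
        int containerMb = 2304; // mapreduce.map.memory.mb
        System.out.println("-Xmx" + heapMbForContainer(containerMb) + "m"); // -Xmx1728m
    }
}
```

(The article's own example pairs -Xmx2048m with a 2304 MB container, a ratio closer to 0.89; 0.75 is the safer default it recommends.)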

2. MRAppMaster parameter tuning:

yarn.app.mapreduce.am.command-opts=-Xmx1024m (default parameter, representing jvm heap memory)

yarn.app.mapreduce.am.resource.mb=1536 (memory of container)

Note that in Hive ETL, set it as follows:

set mapreduce.map.child.java.opts="-Xmx3072m" (note: the -Xmx value must be quoted; omitting the quotation marks causes various errors)

set mapreduce.map.memory.mb=3288

or

set mapreduce.reduce.child.java.opts="xxx"

set mapreduce.reduce.memory.mb=xxx

Related YARN parameters:

• yarn.scheduler.minimum-allocation-mb (minimum allocation unit, default 1024 MB)

• yarn.scheduler.maximum-allocation-mb (maximum allocation, default 8192 MB)

• yarn.nodemanager.vmem-pmem-ratio (ratio of virtual to physical memory, default 2.1)

• yarn.nodemanager.resource.memory-mb (total memory a NodeManager makes available to containers)

YARN's ResourceManager (RM) allocates memory, CPU, and other resources to applications through logical queues. By default, the RM allows an AM to request container resources of at most 8192 MB (yarn.scheduler.maximum-allocation-mb), and the minimum allocation is 1024 MB (yarn.scheduler.minimum-allocation-mb). The AM can only request resources from the RM in increments of yarn.scheduler.minimum-allocation-mb, never exceeding yarn.scheduler.maximum-allocation-mb. The AM is responsible for normalizing mapreduce.map.memory.mb and mapreduce.reduce.memory.mb to values divisible by yarn.scheduler.minimum-allocation-mb; the RM rejects requests for more than 8192 MB of memory or for amounts not divisible by 1024 MB. (The exact limits vary with configuration.)
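The normalization described above (round the request up to a multiple of the scheduler minimum, cap it at the maximum) can be sketched as follows. This is a hypothetical helper mirroring the rule, not YARN's actual code:

```java
public class YarnNormalize {
    // Round requestedMb up to a multiple of minAllocMb, capped at maxAllocMb,
    // as the RM/AM normalization rule described in the text.
    static int normalize(int requestedMb, int minAllocMb, int maxAllocMb) {
        int rounded = ((requestedMb + minAllocMb - 1) / minAllocMb) * minAllocMb;
        return Math.min(rounded, maxAllocMb);
    }

    public static void main(String[] args) {
        // A 2304 MB task request is rounded up to the next 1024 MB multiple.
        System.out.println(normalize(2304, 1024, 8192)); // 3072
        // Requests beyond the maximum are capped.
        System.out.println(normalize(9000, 1024, 8192)); // 8192
    }
}
```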

For the workflow of Yarn, please refer to

Error simulation and analysis

1. Error simulation (JVM architecture and the GC garbage-collection mechanism)

(1) heap memory overflow
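A minimal, standalone sketch of (1): a single allocation far larger than any typical heap triggers "Java heap space". The array size below is illustrative; with a small heap (for example, running with -Xmx32m), a loop of modest allocations fails the same way:

```java
public class HeapOomDemo {
    static String tryHugeAllocation() {
        try {
            // Requests ~16 GB of heap in one array, far beyond a typical -Xmx.
            long[] huge = new long[Integer.MAX_VALUE - 2];
            return "allocated " + huge.length; // not reached on normal heaps
        } catch (OutOfMemoryError e) {
            // Typically: java.lang.OutOfMemoryError: Java heap space
            return "java.lang.OutOfMemoryError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryHugeAllocation());
    }
}
```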


(2) stack memory overflow
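(2) can be reproduced with plain unbounded recursion, analogous to the deep parse-tree recursion described earlier; the exact frame count at which it fails depends on the thread stack size (-Xss):

```java
public class StackOverflowDemo {
    static long depth = 0;

    // Unbounded recursion: every call pushes another frame onto the
    // thread's stack, like a runaway recursive parse or MR helper.
    static void recurse() {
        depth++;
        recurse();
    }

    static String tryRecurse() {
        try {
            recurse();
            return "no overflow"; // unreachable: the recursion never returns normally
        } catch (StackOverflowError e) {
            return "java.lang.StackOverflowError after " + depth + " frames";
        }
    }

    public static void main(String[] args) {
        // Frame count varies with -Xss; typically tens of thousands.
        System.out.println(tryRecurse());
    }
}
```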


(3) Method area memory overflow


(4) Non-JVM memory overflow
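A standalone sketch of (4): direct (off-heap) buffers are limited by -XX:MaxDirectMemorySize (which defaults to the maximum heap size), not by -Xmx. Holding references so the GC cannot reclaim the buffers eventually exhausts that limit with "Direct buffer memory"; the chunk size and loop cap below are arbitrary:

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class DirectBufferOomDemo {
    static String exhaustDirectMemory() {
        // Keep references: if the buffers were unreachable, the GC triggered
        // inside the allocator could free them and the loop would never fail.
        List<ByteBuffer> hold = new ArrayList<>();
        try {
            for (int i = 0; i < 4096; i++) { // safety cap; the limit is hit long before
                hold.add(ByteBuffer.allocateDirect(1 << 28)); // 256 MB off-heap each
            }
            return "no OOM within cap";
        } catch (OutOfMemoryError e) {
            // Typically: java.lang.OutOfMemoryError: Direct buffer memory
            return "java.lang.OutOfMemoryError: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(exhaustDirectMemory());
    }
}
```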


2. Cause analysis (ultimately, these errors come down to GC)

1) The three core areas of the JVM (Method Area, Heap, Stack)


2) Structure of the JVM Heap (young generation, old generation) and the Method Area (permanent generation)


3) Analyze the whole process of GC through a piece of code:


  
   
   
  public class HelloJVM {
      // At runtime the JVM locates the main entry method in the Method Area via reflection
      public static void main(String[] args) { // the main method itself lives in the Method Area
          /*
           * The local variable student (lowercase) lives on the main thread's Stack;
           * the Student object instance lives in the Heap, which is shared by all threads.
           */
          Student student = new Student("spark");
          /*
           * The student reference (a pointer or a handle) is used to locate the object:
           * a direct pointer points straight at the object in the heap; with a handle,
           * student points at the handle and the handle points at the object.
           * Once the object is found, the method pointer inside the object leads to the
           * concrete method in the Method Area, which is then invoked to do the work.
           */
          student.sayHello();
      }
  }

  class Student {
      // The field name is stored inside the Student object in the Heap;
      // the String object it references is also in the Heap.
      private String name;

      public Student(String name) {
          this.name = name;
      }

      // The sayHello method itself is stored in the Method Area
      public void sayHello() {
          System.out.println("Hello, this is " + this.name);
      }
  }

Interpreting the code from the perspective of Java GC: the Student object created by new Student("spark") is first allocated in Eden in the young generation (a very large object may go straight to the old generation). Before a GC, live objects sit in Eden and the From survivor space. During a minor GC, live objects are copied (the copying algorithm) into the To survivor space; From and To are two survivor spaces of equal size, also called S1 and S2. Each time an object survives a GC its age is incremented by 1, and when the age reaches a threshold (15 by default) the object is promoted to the old generation. The real behavior is more complicated: an object can be promoted before reaching the threshold. During GC the survivor space is examined, and if the total size of objects of the same age is at least half the survivor space, all objects of that age and older are promoted to the old generation (dynamic age determination). Objects in From that do not meet the promotion criteria are copied to To. After copying, To holds the surviving objects from Eden and From; whatever remains in Eden and From is garbage, so both are cleared. While copying, To may fill up; in that case the remaining objects from Eden and From are promoted directly to the old generation. If To is small, it can fill up on the very first copy.
After copying completes, From and To swap names: Eden and the old From are now empty, so after the swap Eden and the new To are empty and the next allocation again goes to Eden. This process repeats continuously. The advantage: most objects live and die in the young generation, and shuttling survivors between From and To avoids triggering full GCs too frequently (a full GC generally occurs when the old generation is full).
When the VM performs a minor GC (young-generation GC), it checks whether the objects about to enter the old generation are larger than the old generation's remaining space; if so, a full GC occurs.
An object is first allocated in Eden. If there is not enough space, a GC is attempted to reclaim some; if a minor GC still does not free enough, the object is placed in the old generation, and if the old generation also lacks space, an OOM is thrown.
Objects larger than a configurable threshold (large objects, arrays, and so on) are allocated directly in the old generation, to avoid repeated memory copying.
The permanent generation can be understood as the method area. Unlike the young and old generations, it does not belong to the Heap. GC can occur there as well: for example, once all instances of a class have been collected and its class loader has also been collected, the class data in the permanent generation becomes eligible for GC.
A full GC is triggered when the old generation fills up.

To learn more about GC, please check: http://blog.csdn.net/aijiudu/article/details/72991993

 


Origin blog.csdn.net/weixin_42273775/article/details/119482302