[JVM] JVM heap memory explained in detail

Foreword

Java heap memory management is one of the main factors affecting application performance.
Heap memory overflow is a very common fault in Java projects. Before you can solve it, you first need to understand how the Java heap works.

1. Heap memory division

First, look at how the Java heap memory is divided, as shown in the figure:
[Figure: JVM memory layout]

  1. JVM memory is divided into heap memory and non-heap memory. Heap memory is divided into the young generation (Young Generation) and the old generation (Old Generation). Non-heap memory is the permanent generation (Permanent Generation).
  2. The young generation is divided into the Eden area and the Survivor area. The Survivor area consists of FromSpace and ToSpace. Eden takes most of the capacity and each Survivor space takes a small share. The default Eden:FromSpace:ToSpace ratio is 8:1:1.
  3. Heap memory usage: the heap stores object instances. The garbage collector manages these objects and reclaims the dead ones according to the GC algorithm in use.
  4. Non-heap memory usage: the permanent generation, which implements the method area, stores long-lived data that the program needs at run time, such as class metadata, methods, constants, and fields.
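The heap and non-heap split described above can be observed from inside a running JVM. The following is a minimal sketch using the standard java.lang.management API; the class name MemoryAreas and the helper countPools are made up for illustration, and the pool names printed depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

// Illustrative helper (hypothetical name): lists the JVM's memory pools by type.
class MemoryAreas {

    /** Counts the memory pools of the given type (HEAP or NON_HEAP). */
    static long countPools(MemoryType type) {
        return ManagementFactory.getMemoryPoolMXBeans().stream()
                .filter(pool -> pool.getType() == type)
                .count();
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        MemoryUsage nonHeap = ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
        System.out.printf("heap used=%d KB, non-heap used=%d KB%n",
                heap.getUsed() / 1024, nonHeap.getUsed() / 1024);

        // Pool names (Eden, Survivor, Old Gen, ...) vary by collector and JDK version.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.printf("%-16s %s%n", pool.getType(), pool.getName());
        }
    }
}
```

On a typical HotSpot JVM the heap pools correspond to Eden, Survivor, and the old generation, which matches the division listed above.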

In JDK 1.8, the permanent generation was removed and replaced by Metaspace. Like the permanent generation, Metaspace is an implementation of the method area. The biggest difference between them is that Metaspace is not allocated inside the JVM's own memory areas but in native (local) memory.
Note that Metaspace has two parameters:

  • MetaspaceSize (-XX:MetaspaceSize): the initial Metaspace threshold; when usage first reaches it, a GC is triggered to unload classes, and the JVM then adjusts the threshold
  • MaxMetaspaceSize (-XX:MaxMetaspaceSize): the upper limit of the Metaspace size, which keeps a misbehaving application from occupying too much physical memory
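To see how the two parameters show up at run time, a sketch like the one below can read the Metaspace pool through the standard MemoryPoolMXBean API. The pool name "Metaspace" is what HotSpot on JDK 8 and later reports; treat it as an assumption on other JVMs. The class name MetaspaceUsage and the helper findPool are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.Optional;

// Illustrative helper (hypothetical name): reports Metaspace usage and its limit.
class MetaspaceUsage {

    /** Looks up a memory pool by its exact name. */
    static Optional<MemoryPoolMXBean> findPool(String name) {
        return ManagementFactory.getMemoryPoolMXBeans().stream()
                .filter(pool -> pool.getName().equals(name))
                .findFirst();
    }

    public static void main(String[] args) {
        // "Metaspace" is the HotSpot pool name (assumption for other JVMs).
        Optional<MemoryPoolMXBean> pool = findPool("Metaspace");
        if (!pool.isPresent()) {
            System.out.println("No pool named Metaspace on this JVM");
            return;
        }
        MemoryUsage usage = pool.get().getUsage();
        // getMax() returns -1 when no maximum is defined (no MaxMetaspaceSize set).
        String max = usage.getMax() < 0 ? "unlimited" : (usage.getMax() >> 20) + " MB";
        System.out.printf("Metaspace used=%d KB, committed=%d KB, max=%s%n",
                usage.getUsed() >> 10, usage.getCommitted() >> 10, max);
    }
}
```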

2. Why remove the permanent generation?

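Reason for removing the permanent generation: the change was made as part of merging the HotSpot JVM with the JRockit VM, because JRockit has no permanent generation.
With Metaspace, the permanent generation OOM (java.lang.OutOfMemoryError: PermGen space) no longer occurs. Metaspace itself can still run out (java.lang.OutOfMemoryError: Metaspace), for example when MaxMetaspaceSize is set too low or classes leak.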

3. The concept of generation

Newly created objects are first placed in the Eden area of the young generation. When Eden is full, a Minor GC is triggered and the surviving objects are copied to one Survivor space (Survivor0). At the next Minor GC, the survivors from Eden and from Survivor0 are copied together to the other Survivor space (Survivor1), Survivor0 is emptied, and the two spaces swap roles. This guarantees that one Survivor space is always empty. Objects that survive enough Minor GCs (see -XX:MaxTenuringThreshold) are moved to the old generation.
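The aging process in this paragraph can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how HotSpot is implemented: objects are plain names, the caller says which ones are still reachable, and space limits are ignored. The class GenerationalHeapSketch and all of its members are invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of young-generation aging (hypothetical class, not a real JVM API).
class GenerationalHeapSketch {
    private final int tenuringThreshold;
    private final List<String> eden = new ArrayList<>();
    private List<String> from = new ArrayList<>();  // survivor space currently holding objects
    private List<String> to = new ArrayList<>();    // survivor space kept empty between GCs
    private final List<String> old = new ArrayList<>();
    private final Map<String, Integer> ages = new HashMap<>();

    GenerationalHeapSketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    /** New objects always start in Eden with age 0. */
    void allocate(String name) {
        eden.add(name);
        ages.put(name, 0);
    }

    /** Copies live objects from Eden and the from-space into the to-space or the old generation. */
    void minorGc(Set<String> reachable) {
        List<String> scanned = new ArrayList<>(eden);
        scanned.addAll(from);
        for (String obj : scanned) {
            if (!reachable.contains(obj)) {
                ages.remove(obj);               // dead object: simply not copied
                continue;
            }
            int age = ages.merge(obj, 1, Integer::sum);
            if (age >= tenuringThreshold) {
                old.add(obj);                   // survived enough Minor GCs: promote
            } else {
                to.add(obj);
            }
        }
        eden.clear();
        from.clear();
        List<String> emptied = from;            // swap roles: one survivor space stays empty
        from = to;
        to = emptied;
    }

    List<String> survivors() { return new ArrayList<>(from); }

    List<String> oldGeneration() { return new ArrayList<>(old); }
}
```

With a threshold of 2, an object that stays reachable sits in a Survivor space after the first minorGc call and is promoted to the old generation by the second.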

The old generation stores long-lived objects. When it fills up, a Major GC is triggered, which in practice usually means a Full GC. During a Full GC, all application threads are stopped until the collection completes. Applications with strict response-time requirements should therefore keep Major GCs to a minimum to avoid response timeouts.

  • Minor GC: cleans up the young generation
  • Major GC: cleans up the old generation
  • Full GC: cleans up the entire heap, that is, the young generation and the old generation, plus the permanent generation (Metaspace since JDK 1.8)

Every kind of GC stops all application threads (stop-the-world) for at least part of its work.

4. Why generations?

Objects are classified by how long they are likely to survive, and long-lived objects are kept in a dedicated area, which reduces garbage scanning time and GC frequency. Each class of objects can then be collected with the algorithm that suits it best, playing to each algorithm's strengths and avoiding its weaknesses.

5. Why is survivor divided into two survivor spaces of equal size?

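Mainly to solve fragmentation. At each Minor GC the survivors are copied compactly into the empty Survivor space, so the free space stays contiguous. If memory were badly fragmented, that is, live objects were scattered across non-contiguous memory and no contiguous free block was large enough for a new object, a GC would be triggered even though enough memory was free in total.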

6. Common parameters of JVM heap memory

Parameter              Description
-Xms                   Initial size of the heap memory, with a unit suffix such as m or g
-Xmx (MaxHeapSize)     Maximum allowed size of the heap memory, generally no more than 80% of physical memory
-XX:PermSize           Initial size of the non-heap memory (permanent generation); for typical applications 200m initial and 1024m maximum is enough. Ignored since JDK 1.8
-XX:MaxPermSize        Maximum allowed size of the non-heap memory (permanent generation). Ignored since JDK 1.8
-XX:NewSize            Initial size of the young generation memory
-XX:MaxNewSize         Maximum allowed size of the young generation memory
-Xmn                   Shorthand that sets the initial and maximum young generation size to the same value
-XX:SurvivorRatio=8    Capacity ratio of the Eden area to one Survivor space in the young generation; the default is 8, that is, 8:1
-Xss                   Thread stack memory size
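As a worked example of -XX:SurvivorRatio, the arithmetic below splits a young generation into Eden and two Survivor spaces. It is a sketch of the nominal ratio only: HotSpot aligns the real sizes and may resize them adaptively, so actual values can differ. The class YoungGenSizing and the 300 MB figure are assumptions chosen for illustration.

```java
// Illustrative arithmetic (hypothetical class): nominal Eden/Survivor sizes for a given ratio.
class YoungGenSizing {

    /** Returns {eden, survivor0, survivor1} in bytes; Eden : Survivor = survivorRatio : 1. */
    static long[] split(long youngBytes, int survivorRatio) {
        long survivor = youngBytes / (survivorRatio + 2);  // ratio parts for Eden plus two survivors
        long eden = youngBytes - 2 * survivor;
        return new long[] {eden, survivor, survivor};
    }

    public static void main(String[] args) {
        long young = 300L * 1024 * 1024;  // assumed young generation size, as if -Xmn300m
        long[] parts = split(young, 8);   // default -XX:SurvivorRatio=8
        System.out.printf("eden=%d MB, each survivor=%d MB%n", parts[0] >> 20, parts[1] >> 20);

        // maxMemory() reflects the -Xmx setting, minus space the JVM reserves for itself.
        System.out.printf("max heap seen by this JVM: %d MB%n",
                Runtime.getRuntime().maxMemory() >> 20);
    }
}
```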

7. Garbage Collection Algorithm (GC, Garbage Collection)

In the figures below, red marks dead (unreachable) objects and green marks live objects.

  • Mark-Sweep
    The collection has two phases, mark and sweep. It first identifies the recyclable objects (in practice by marking everything reachable, so whatever is left unmarked is garbage) and, once marking is complete, reclaims them all in one pass. The drawback is that it leaves non-contiguous memory fragments. With too much fragmentation, a later allocation of a large object may fail to find enough contiguous memory and force another GC.
    [Figure: mark-sweep algorithm]

  • Copying
    Memory is divided into two blocks of equal capacity, and only one of them is used at a time. When that block is used up, the surviving objects are copied to the other block and the used block is then cleared in one step. Each collection therefore reclaims a whole half of the memory, and fragmentation is not a concern, which makes the algorithm simple and efficient. The disadvantage is that only half of the memory is usable at any time, so it needs twice the space.
    [Figure: copying algorithm]

  • Mark-Compact
    This also has two phases: first identify the recyclable objects by marking, then move all surviving objects toward one end and clear the memory beyond the boundary. This avoids the fragmentation problem of the mark-sweep algorithm and also the space overhead of the copying algorithm.
    In general, few objects survive a GC in the young generation, so the copying algorithm is used there and the collection costs only the copying of a small number of survivors. In the old generation, the object survival rate is high and there is no extra space to fall back on, so the mark-sweep or mark-compact algorithm has to be used.
    [Figure: mark-compact algorithm]
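The three algorithms above differ only in what happens after marking, which the following toy sketch tries to show. It assumes a heap modeled as an array of named slots (null means free) and a reference graph given as a map; real collectors work on raw memory and object headers. The class GcAlgorithmsSketch and its method names are invented for illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy versions of the three algorithms (hypothetical class, not JVM code).
class GcAlgorithmsSketch {

    /** Mark phase: collects every object reachable from the roots. */
    static Set<String> mark(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> live = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (live.add(obj)) {  // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, Collections.emptyList()));
            }
        }
        return live;
    }

    /** Mark-sweep: frees dead slots in place; survivors do not move, so holes remain. */
    static String[] sweep(String[] heap, Set<String> live) {
        String[] result = heap.clone();
        for (int i = 0; i < result.length; i++) {
            if (result[i] != null && !live.contains(result[i])) {
                result[i] = null;
            }
        }
        return result;
    }

    /** Copying: moves survivors into a second, empty space of the same size. */
    static String[] copy(String[] fromSpace, Set<String> live) {
        String[] toSpace = new String[fromSpace.length];
        int next = 0;
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;
            }
        }
        return toSpace;
    }

    /** Mark-compact: slides survivors to the start of the same space. */
    static void compact(String[] heap, Set<String> live) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            heap[i] = null;
            if (obj != null && live.contains(obj)) {
                heap[next++] = obj;  // next <= i, so no live slot is overwritten
            }
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("A", Arrays.asList("C"));  // A references C; B and D are unreachable
        Set<String> live = mark(refs, Arrays.asList("A"));
        String[] heap = {"A", "B", "C", "D"};
        System.out.println("mark-sweep:   " + Arrays.toString(sweep(heap, live)));
        System.out.println("copying:      " + Arrays.toString(copy(heap, live)));
        compact(heap, live);
        System.out.println("mark-compact: " + Arrays.toString(heap));
    }
}
```

In this run, mark-sweep leaves [A, null, C, null], with a hole between the survivors, while copying and mark-compact both end with [A, C, null, null]. Copying gets there at the cost of a second space, and mark-compact at the cost of moving objects in place.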

8. Garbage Collector

  • Serial collector (Serial)
    An older, single-threaded collector. While it collects, the application's worker threads must pause until the collection is complete.

  • Parallel collector (Parallel)
    Multiple garbage collection threads work in parallel, which is more efficient on multi-core CPUs. The application threads still wait while the collection runs.

  • CMS collector (Concurrent Mark Sweep)
    The CMS collector is designed to shorten the pause time of the application. It is implemented based on the mark-sweep algorithm. The whole process is divided into 4 steps, including:

    • Initial Mark
    • Concurrent Mark
    • Remark
    • Concurrent Sweep

    Of these, only the initial mark and remark steps need to pause the application threads. The initial mark only marks the objects directly reachable from the GC Roots, so it is very fast. The concurrent mark phase traces the object graph from those objects while the application keeps running. The remark phase corrects the mark records of objects that changed because the user program kept running during concurrent marking. Its pause is slightly longer than the initial mark but far shorter than the concurrent mark phase.
    Because the two longest phases, concurrent mark and concurrent sweep, run alongside the user threads, the memory reclamation of the CMS collector as a whole is concurrent with the application, which greatly reduces pause time.

  • G1 collector (Garbage First)
    The G1 collector divides the heap memory into multiple independent regions (Regions) of equal size and supports a predictable pause-time model, because it can avoid collecting the entire heap at once. G1 tracks the value of the garbage accumulated in each Region (the space that would be reclaimed and the time the reclamation would take), maintains a priority list in the background, and, within the allowed collection time, reclaims the most valuable Regions first. This gives the highest possible collection efficiency in a limited time.
    The G1 working process has 4 steps:

    • Initial Mark
    • Concurrent Mark
    • Final Mark
    • Screening recovery (Live Data Counting and Evacuation)

    The initial mark is the same as in CMS: it marks the objects directly reachable from the GC Roots. Concurrent marking traces live objects starting from the GC Roots. This stage takes a long time, but it runs concurrently with the application threads. The final mark corrects the mark records that changed because the user program kept running during concurrent marking. Finally, the screening and recovery stage sorts the Regions by recovery value and cost and reclaims them according to the GC pause time the user expects.
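The "garbage first" idea of reclaiming the most valuable Regions within a pause budget can be sketched as a simple greedy selection. This is a deliberately simplified model, not G1's actual policy: the value formula (reclaimable bytes per millisecond), the Region fields, and the class GarbageFirstSketch are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of region selection (hypothetical class, not the real G1 heuristics).
class GarbageFirstSketch {

    static final class Region {
        final String name;
        final long reclaimableBytes;  // space that collecting this region would free
        final double costMillis;      // estimated time to collect this region

        Region(String name, long reclaimableBytes, double costMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.costMillis = costMillis;
        }

        /** Assumed value measure: bytes reclaimed per millisecond of pause. */
        double value() {
            return reclaimableBytes / costMillis;
        }
    }

    /** Picks regions in descending value order while they still fit the pause budget. */
    static List<Region> chooseCollectionSet(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort(Comparator.comparingDouble((Region r) -> r.value()).reversed());

        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region region : sorted) {
            if (spent + region.costMillis <= pauseBudgetMillis) {
                chosen.add(region);
                spent += region.costMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = new ArrayList<>();
        regions.add(new Region("A", 80, 10));
        regions.add(new Region("B", 50, 2));
        regions.add(new Region("C", 10, 5));
        for (Region region : chooseCollectionSet(regions, 12)) {
            System.out.println("collect region " + region.name);
        }
    }
}
```

With a 12 ms budget this selects B and then A, and skips C because collecting it would exceed the budget. The budget plays the role of the user's expected GC pause time.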

9. Garbage Collector Parameters

Parameter                                Description
-XX:+UseSerialGC                         Serial collector
-XX:+UseParallelGC                       Parallel collector
-XX:ParallelGCThreads=8                  Number of parallel collector threads, that is, how many threads perform garbage collection at the same time; generally equal to the number of CPUs
-XX:+UseParallelOldGC                    Use parallel collection for the old generation
-XX:+UseConcMarkSweepGC                  CMS collector (concurrent collector)
-XX:+UseCMSCompactAtFullCollection       Compact the memory space at Full GC to prevent excessive fragmentation
-XX:CMSFullGCsBeforeCompaction=0         How many Full GCs run before a compaction is performed; 0 means compact at every Full GC
-XX:CMSInitiatingOccupancyFraction=80    Start a CMS collection when the old generation is 80% used, to prevent excessive Full GCs
-XX:+UseG1GC                             G1 collector
-XX:MaxTenuringThreshold=0               Number of GCs an object must survive in the young generation before it enters the old generation; 0 means it enters the old generation directly
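To confirm which collector the flags above actually selected, the standard GarbageCollectorMXBean API reports the active collectors together with their collection counts and accumulated times. The sketch below assumes nothing beyond java.lang.management. The class name ActiveCollectors is made up, and the collector names printed (for example "G1 Young Generation") depend on the JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative helper (hypothetical name): shows which collectors this JVM is running.
class ActiveCollectors {

    /** Names of the collectors registered in this JVM. */
    static List<String> names() {
        return ManagementFactory.getGarbageCollectorMXBeans().stream()
                .map(GarbageCollectorMXBean::getName)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "not available".
            System.out.printf("%s: collections=%d, total time=%d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
    }
}
```

Typically two beans are listed, one for young-generation collections and one for old-generation or full collections, which lines up with the Minor GC and Major GC distinction made earlier.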

10. Why does the heap memory overflow?

Objects that survive GC in the young generation are eventually copied to the old generation. When the old generation runs short of space, the JVM performs a complete garbage collection (Full GC). If there is still no room after that GC for the objects being copied from the Survivor area, an OOM (Out of Memory) error occurs.

11. Common causes of OOM (Out of Memory) exceptions

  • Insufficient memory in the old generation: java.lang.OutOfMemoryError: Java heap space
  • Insufficient permanent generation memory: java.lang.OutOfMemoryError: PermGen space
  • Code bugs that keep objects referenced, so their memory cannot be reclaimed in time.
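The third cause, a bug that keeps objects reachable, often looks like the sketch below: a long-lived collection that only grows. The example is bounded at 64 MB so it is safe to run. The class LeakSketch is hypothetical, and the suggestion to run it with a small heap such as -Xmx16m to provoke the error is an assumption about your environment.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative leak (hypothetical class): a static list keeps every chunk reachable.
class LeakSketch {
    // Everything added here stays reachable from a GC root, so no GC can reclaim it.
    private static final List<byte[]> CACHE = new ArrayList<>();

    /** Allocates and retains the given number of chunks; returns how many are held. */
    static int retain(int chunks, int chunkBytes) {
        for (int i = 0; i < chunks; i++) {
            CACHE.add(new byte[chunkBytes]);
        }
        return CACHE.size();
    }

    /** Dropping the references is what makes the memory collectable again. */
    static void release() {
        CACHE.clear();
    }

    public static void main(String[] args) {
        try {
            int held = retain(64, 1024 * 1024);  // 64 chunks of 1 MB each
            System.out.println("Retained " + held + " MB without exhausting the heap");
        } catch (OutOfMemoryError e) {
            release();  // free the retained chunks before doing anything else
            System.out.println("Caught: " + e);
        }
    }
}
```

When the maximum heap is smaller than the retained data, the allocation fails with java.lang.OutOfMemoryError: Java heap space, the first message in the list above.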

An OOM can occur in any of these memory areas. When you actually hit one, the exception message tells you which area overflowed.
You can add the parameter -XX:+HeapDumpOnOutOfMemoryError so that the virtual machine writes a heap dump snapshot when a memory overflow exception occurs, for later analysis.
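Whether the heap dump option is really active in a running process can be checked programmatically. The sketch below relies on com.sun.management.HotSpotDiagnosticMXBean, which is specific to HotSpot-based JVMs, so treat its availability as an assumption elsewhere. The class name HeapDumpFlagCheck and the helper vmOption are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Illustrative check (hypothetical class): reads HotSpot VM options at run time.
class HeapDumpFlagCheck {

    /** Returns the current value of a HotSpot VM option as a string. */
    static String vmOption(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        // Throws IllegalArgumentException if the JVM has no option with this name.
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        System.out.println("HeapDumpOnOutOfMemoryError = "
                + vmOption("HeapDumpOnOutOfMemoryError"));
        // HeapDumpPath controls where the dump file is written; empty means the default location.
        System.out.println("HeapDumpPath = " + vmOption("HeapDumpPath"));
    }
}
```

Started with -XX:+HeapDumpOnOutOfMemoryError, the first line reports true; without the flag it reports false.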

Once you are familiar with Java's memory management mechanism and configuration parameters, you can tune the startup options of a Java application. The following is an example configuration:

JAVA_OPTS="-server -Xms512m -Xmx2g -XX:+UseG1GC -XX:SurvivorRatio=6 -XX:MaxGCPauseMillis=400 -XX:G1ReservePercent=15 -XX:ParallelGCThreads=4 -XX:ConcGCThreads=1 -XX:InitiatingHeapOccupancyPercent=40 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:../logs/gc.log"
  • Set the minimum and maximum heap sizes; choose the maximum based on historical utilization
  • Set the garbage collector to G1
  • Enable GC logs for later analysis

Summary

  • Choosing a GC algorithm and collector that suit the application can effectively reduce the time application threads are stopped.
  • Frequent Full GCs increase pause time and CPU usage. Enlarging the old generation reduces how often Full GC runs, but each collection then takes longer, so choose according to the needs of the business.


Origin blog.csdn.net/u011397981/article/details/130714618