"JVM Series" Chapter 5 -- Heap Space and Object Allocation

Java heap

For most applications, the Java heap is the largest piece of memory managed by the Java Virtual Machine. It is a memory area shared by all threads; it is created when the virtual machine starts, and its size is determined at that point. Its sole purpose is to store object instances, and almost all object instances are allocated here.
According to the Java Virtual Machine Specification, the Java heap may occupy physically non-contiguous memory as long as it is logically contiguous, much like disk space.
The heap size is adjustable: an implementation may use a fixed size or make the heap expandable. Current mainstream virtual machines implement an expandable heap (controlled by -Xmx and -Xms).

The Java heap is the main area managed by the garbage collector (GC, Garbage Collection). When a method returns, the objects it created in the heap are not removed immediately; they are reclaimed only when a garbage collection is triggered. For this reason the heap is often called the "GC heap".

From the point of view of memory reclamation, since collectors mostly use generational collection algorithms, the Java heap can be subdivided logically into the young generation, the old generation, and the permanent generation (replaced in JDK 8 by the metaspace, which lives in native memory rather than in the heap).
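
This subdivision can be observed from inside a running JVM. The sketch below uses the standard java.lang.management API to list the memory pools that the current JVM reports. The class name MemoryPools is hypothetical, and the pool names printed depend on the JVM and on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    /** Returns one line per memory pool: type (HEAP or NON_HEAP), pool name, bytes used. */
    static List<String> describe() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();          // may be null for an invalid pool
            long used = (usage == null) ? -1 : usage.getUsed();
            lines.add(pool.getType().name() + "  " + pool.getName() + "  used=" + used);
        }
        return lines;
    }

    public static void main(String[] args) {
        describe().forEach(System.out::println);
    }
}
```

On a HotSpot JVM the HEAP pools typically correspond to Eden, the survivor space, and the old generation, while Metaspace is reported as NON_HEAP.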


Young generation and old generation:

  • Subdivided further, the Java heap consists of the young generation (YoungGen) and the old generation (OldGen). By default the ratio of the young generation to the old generation is 1:2 (-XX:NewRatio=2).

  • The young generation is divided into the Eden space, the Survivor0 space, and the Survivor1 space (sometimes called the from area and the to area).
    In HotSpot, the default ratio of Eden to the two Survivor spaces is 8:1:1 (-XX:SurvivorRatio=8).

  • The young generation is where objects are born, grow, and die. An object is created here, used, and finally reclaimed by the garbage collector.

  • The long-lived objects in the old generation are usually Java objects that were copied out of the survivor area after surviving repeated collections. There are special cases: ordinary objects are allocated in a TLAB; a larger object is allocated by the JVM directly elsewhere in Eden; and an object so large that no sufficiently long contiguous free space exists in the young generation is allocated directly in the old generation. A GC that collects only the young generation is called a Minor GC.

  • A GC of the old generation is called a Major GC or Full GC. Minor GC generally runs far more often than Major GC; that is, the old generation is collected much less frequently than the young generation.
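
The default ratios above translate into concrete sizes by simple arithmetic. The following is a minimal sketch with a hypothetical class name (HeapLayout) and a made-up 600 MB heap; it only mirrors the ratios -XX:NewRatio and -XX:SurvivorRatio describe and does not query the JVM.

```java
class HeapLayout {
    /** Young generation size when young:old = 1:newRatio. */
    static long youngMb(long heapMb, int newRatio) {
        return heapMb / (newRatio + 1);
    }

    /** Old generation size: whatever the young generation does not take. */
    static long oldMb(long heapMb, int newRatio) {
        return heapMb - youngMb(heapMb, newRatio);
    }

    /** Size of ONE survivor space when Eden:S0:S1 = survivorRatio:1:1. */
    static long survivorMb(long youngMb, int survivorRatio) {
        return youngMb / (survivorRatio + 2);
    }

    /** Eden size: the young generation minus both survivor spaces. */
    static long edenMb(long youngMb, int survivorRatio) {
        return youngMb - 2 * survivorMb(youngMb, survivorRatio);
    }

    public static void main(String[] args) {
        long heap = 600;                       // assumed heap size in MB
        long young = youngMb(heap, 2);         // NewRatio=2
        System.out.println("young=" + young + " old=" + oldMb(heap, 2)
                + " eden=" + edenMb(young, 8) + " survivor=" + survivorMb(young, 8));
    }
}
```

For a 600 MB heap this gives a 200 MB young generation, a 400 MB old generation, a 160 MB Eden, and two 20 MB survivor spaces. A real JVM may round these sizes or adjust them at run time, so observed values can differ.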

Java objects stored in the JVM can be divided into two categories:

  • Transient objects with a short life cycle, which are created and die very quickly
  • Objects with a very long life cycle, which in extreme cases live as long as the JVM itself

Almost all Java objects are created in the Eden area, and most of them die in the young generation. (Large objects that do not fit in Eden go directly to the old generation.) Research by IBM shows that 80% of the objects in the young generation "die young".

Note: a Minor GC is triggered only when the Eden area is full; a full survivor area does not trigger one. When the survivor area is full, special rules apply, and objects may be promoted directly to the old generation.

The size of the young generation (both its initial and its maximum value) can be set with the option "-Xmn".

From the perspective of memory allocation, the thread-shared Java heap may contain multiple thread-private allocation buffers (Thread Local Allocation Buffer, TLAB). However the heap is divided, the division does not change what is stored: every area still holds object instances. The purpose of the subdivision is only to reclaim memory more effectively or to allocate it faster.

The heap size limits are set when the JVM starts; you can configure them with the options "-Xmx" and "-Xms".
-Xms10m: minimum (initial) heap memory, equivalent to -XX:InitialHeapSize
-Xmx10m: maximum heap memory, equivalent to -XX:MaxHeapSize
-Xms and -Xmx are usually set to the same value. This improves performance because the JVM does not have to resize the heap after the garbage collector has cleaned it up.
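
To check the effect of these two options, a program can ask the standard Runtime API for the current and the maximum heap size. This is a minimal sketch; the class name HeapSizeProbe is hypothetical, and the reported numbers can differ somewhat from the configured values.

```java
class HeapSizeProbe {
    /** Heap memory currently reserved by the JVM, in bytes. */
    static long currentHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Maximum heap memory the JVM will attempt to use, in bytes. */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    static long toMb(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("current heap: " + toMb(currentHeapBytes()) + " MB");
        System.out.println("max heap:     " + toMb(maxHeapBytes()) + " MB");
    }
}
```

Running it once with "-Xms10m -Xmx10m" and once without any options shows how the limits change.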

Parameter settings for heap space:

Parameter                           Effect
-XX:+PrintFlagsInitial              View the default initial values of all parameters
-XX:+PrintFlagsFinal                View the final values of all parameters (may differ from the initial values)
-Xms or -XX:InitialHeapSize         Initial heap memory (defaults to 1/64 of physical memory)
-Xmx or -XX:MaxHeapSize             Maximum heap memory (defaults to 1/4 of physical memory)
-Xmn                                Size of the young generation (initial and maximum value)
-XX:NewRatio                        Ratio of the old generation to the young generation in the heap
-XX:SurvivorRatio                   Ratio of Eden to one S0/S1 space in the young generation
-XX:MaxTenuringThreshold            Maximum age an object can reach in the young generation before promotion
-XX:+PrintGCDetails                 Print detailed GC logs (related options: ① -XX:+PrintGC ② -verbose:gc)
-XX:HandlePromotionFailure          Whether to enable the space allocation guarantee
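
The current values of such flags can also be read from inside the program. The sketch below assumes a HotSpot-based JVM, where the com.sun.management.HotSpotDiagnosticMXBean interface is available; the class name VmFlags and the "unavailable" fallback string are hypothetical.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

class VmFlags {
    /** Returns the current value of a VM flag, or "unavailable" if it cannot be read. */
    static String valueOf(String flagName) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(flagName).getValue();
        } catch (RuntimeException e) {
            return "unavailable";   // unknown flag, or not a HotSpot JVM
        }
    }

    public static void main(String[] args) {
        String[] names = {"InitialHeapSize", "MaxHeapSize", "NewRatio",
                          "SurvivorRatio", "MaxTenuringThreshold"};
        for (String name : names) {
            System.out.println(name + " = " + valueOf(name));
        }
    }
}
```

Flag names are passed without the "-XX:" prefix. The output is comparable to the corresponding entries printed by -XX:+PrintFlagsFinal.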

If the heap has no memory left to complete an instance allocation and cannot be expanded any further (it has reached the maximum specified by "-Xmx"), an OutOfMemoryError is thrown.

Object allocation

Allocating memory for new objects is a rigorous and complex task. JVM designers must consider not only how and where memory is allocated but also, because the allocation algorithm is closely tied to the reclamation algorithm, whether the GC leaves memory fragmentation behind after it reclaims memory.

Object allocation process:

  • A new object is placed in the Eden area first. This area has a size limit.
  • When Eden is full and the program needs to create another object, the JVM's garbage collector performs a Minor GC on Eden: it destroys the objects in Eden that are no longer referenced and then places the new object in Eden.
  • The objects that survive in Eden are moved to survivor area 0.
  • If garbage collection is triggered again, the objects that were placed in survivor area 0 last time and are still alive are moved to survivor area 1.
  • On the next garbage collection they are moved back to survivor area 0, then to survivor area 1 on the one after that, and so on.
  • When does an object move to the old generation? After a configurable number of collections (-XX:MaxTenuringThreshold=N); the default is 15.
  • The old generation is comparatively quiet. When it runs short of memory, a Major GC is triggered to clean it up.
  • If the old generation still cannot hold the object after the Major GC, an OOM exception occurs.
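
The steps above can be sketched as a toy model. This is an illustration only, not how HotSpot is implemented: the class ToyYoungGen, its fields, and the "live" flag (a stand-in for reachability) are all hypothetical, and the model promotes an object in the collection at which its age reaches the threshold, a detail that real collectors handle differently.

```java
import java.util.ArrayList;
import java.util.List;

class ToyYoungGen {
    static final class Obj {
        final String name;
        int age = 0;           // number of Minor GCs survived
        boolean live = true;   // toy stand-in for "still referenced"
        Obj(String name) { this.name = name; }
    }

    final int tenuringThreshold;           // plays the role of -XX:MaxTenuringThreshold
    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();    // survivor space currently holding objects
    List<Obj> to = new ArrayList<>();      // survivor space kept empty between GCs
    final List<Obj> old = new ArrayList<>();

    ToyYoungGen(int tenuringThreshold) { this.tenuringThreshold = tenuringThreshold; }

    /** New objects always start in Eden. */
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    /** Copies live objects out of Eden and "from", then swaps the survivor roles. */
    void minorGc() {
        copySurvivors(eden);
        copySurvivors(from);
        eden.clear();
        from.clear();
        List<Obj> tmp = from;   // the filled "to" space becomes the new "from"
        from = to;
        to = tmp;
    }

    private void copySurvivors(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) continue;              // dead objects are simply dropped
            o.age++;
            if (o.age >= tenuringThreshold) old.add(o);
            else to.add(o);
        }
    }
}
```

With a threshold of 3, an object that stays live sits in a survivor space after the first and second Minor GC and is found in the old list after the third.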
[Figure: object allocation process diagram]

Memory allocation strategy:
If an object is born in Eden, is still alive after the first Minor GC, and fits in a Survivor space, it is moved to the survivor space and its age is set to 1. Each Minor GC the object survives in the survivor area increases its age by 1. When the age reaches a threshold (15 by default, although the value differs between JVMs and collectors), the object is promoted to the old generation. The allocation principles for objects of different ages are as follows:

  • Objects are allocated in Eden first
  • Large objects are allocated directly in the old generation. Because most newly created objects die young, such a large object may also become garbage quickly; but since Major GC runs less often than Minor GC, it may be reclaimed late, so avoid creating many large objects in a program
  • Long-lived objects are moved to the old generation
  • Dynamic object age determination: if the total size of all objects of the same age in the survivor area exceeds half of the Survivor space, objects of that age or older go directly to the old generation without waiting to reach MaxTenuringThreshold.
  • Space allocation guarantee: -XX:HandlePromotionFailure
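
The dynamic age rule can be written out as a small function. The sketch below is a literal reading of the rule as worded above, not HotSpot's code, and a collector's actual computation may differ in detail. The class name DynamicAge and the array layout (index = age, value = total bytes of that age) are assumptions made for illustration.

```java
class DynamicAge {
    /**
     * bytesByAge[i] holds the total size of survivor objects whose age is i
     * (index 0 is unused). Returns the age from which objects are promoted.
     */
    static int promotionAge(long[] bytesByAge, long survivorCapacity, int maxTenuringThreshold) {
        for (int age = 1; age < bytesByAge.length && age < maxTenuringThreshold; age++) {
            // "greater than half of the Survivor space", written without division
            if (bytesByAge[age] * 2 > survivorCapacity) {
                return age;
            }
        }
        return maxTenuringThreshold;   // no age group is large enough: the normal limit applies
    }
}
```

For a 100-byte survivor space holding 10 bytes of age 1, 60 bytes of age 2, and 5 bytes of age 3, the function returns 2, so objects of age 2 and above would be promoted early.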


Garbage collection (GC)

When the JVM performs GC, it does not collect all three memory areas (young generation, old generation, and method area) together every time; most collections target the young generation. In the HotSpot VM, GCs fall into two types according to the area collected: partial collection (Partial GC) and full heap collection (Full GC).

  • Partial collection: a garbage collection that does not collect the entire Java heap. It is divided into:
    (1) Young generation collection (Minor GC/Young GC): collects only the young generation
    (2) Old generation collection (Major GC/Old GC): collects only the old generation. At present, only the CMS GC collects the old generation separately. Major GC is often confused with Full GC, so you need to check whether the old generation or the whole heap is meant.
    (3) Mixed collection (Mixed GC): collects the entire young generation and part of the old generation. Currently, only the G1 GC behaves this way
  • Full heap collection (Full GC): collects the entire Java heap and the method area.

Minor GC trigger mechanism:

  • A Minor GC is triggered when the young generation runs out of space. "Full" here means that Eden is full; a full Survivor space does not trigger a GC. (Every Minor GC cleans up the young generation's memory.)
  • Because most Java objects are short-lived, Minor GC is very frequent and generally fast.
  • Minor GC triggers STW (stop the world): user threads are suspended and resume only after the garbage collection finishes
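
How often collections run can be watched through the standard GarbageCollectorMXBean interface. The sketch below creates a lot of short-lived garbage and reads the collection counts before and after; the class name GcCounter and the 256 MB figure are arbitrary choices, and whether any collection actually happens depends on the heap size and the collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCounter {
    /** Sum of the collection counts of all collectors the JVM reports. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();   // -1 if undefined for this collector
            if (count > 0) total += count;
        }
        return total;
    }

    /** Allocates roughly `megabytes` MB as short-lived 1 KB arrays; returns a checksum. */
    static long churn(int megabytes) {
        long checksum = 0;
        for (long i = 0; i < megabytes * 1024L; i++) {
            byte[] garbage = new byte[1024];        // unreachable after this iteration
            garbage[0] = (byte) i;
            checksum += garbage[0];
        }
        return checksum;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        churn(256);
        long after = totalCollections();
        System.out.println("collections during churn: " + (after - before));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": count=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running it with a small heap, for example "-Xmx32m", makes young generation collections much more likely to show up in the counts.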

Major GC trigger mechanism:

  • A Major GC is a GC of the old generation; when objects are reclaimed from the old generation, we say that a "Major GC" or "Full GC" has occurred
  • A Major GC is often accompanied by at least one Minor GC (but not always: the collection strategy of the Parallel Scavenge collector can choose to run a Major GC directly). That is, when the old generation runs short of space, the JVM first tries a Minor GC, and if there is still not enough space afterward, it triggers a Major GC
  • A Major GC is generally more than 10 times slower than a Minor GC, and its STW pause is longer.
  • If memory is still insufficient after the Major GC, an OOM exception is reported. (By the time an OOM is raised, a Full GC must have run, because the OOM occurs only when the old generation is out of space.)

Full GC trigger mechanism:

  • System.gc() is called: this suggests that the system perform a Full GC, but the JVM is not obliged to run one
  • The old generation runs out of space
  • The method area runs out of space
  • The average size of the objects promoted to the old generation by Minor GC is larger than the old generation's available memory
  • During copying from Eden and survivor space 0 (From Space) to survivor space 1 (To Space), an object is larger than the available memory of To Space and is moved to the old generation, but the old generation's available memory is also smaller than the object

Note: avoid Full GC as far as possible during development and tuning, so that pause times (STW times) stay short.

The idea behind generations in the heap:
Why does the Java heap need generations? Would it not work without them? Research shows that different objects have different life cycles, and 70%-99% of objects are temporary.

  • Young generation: consists of Eden and two survivor spaces of equal size (also called from/to or s0/s1); the to space is always empty.
  • Old generation: stores objects from the young generation that have survived multiple GCs.

In fact, the heap works perfectly well without generations; their only purpose is to optimize GC performance. Without generations, all objects sit together, and finding the dead ones means scanning every area of the heap on each GC. But many objects die soon after creation. With generations, newly created objects are placed in one region, and the GC collects that region of short-lived objects first, which frees a lot of space.

Allocating memory for objects: TLAB

What is TLAB?

  • TLAB: Thread Local Allocation Buffer. From the perspective of the memory model rather than garbage collection, the Eden area is subdivided further: the JVM gives each thread a private buffer, which is located inside the Eden space.
  • When multiple threads allocate memory at the same time, TLABs avoid a whole class of thread-safety problems and also raise allocation throughput, so this allocation method is called the fast allocation strategy.
  • All JVMs derived from OpenJDK provide the TLAB design.

Why TLAB?

  • The heap is shared between threads, and any thread can access shared data in it
  • Object instances are created very frequently in the JVM, so carving memory out of the heap is not thread-safe in a concurrent environment
  • To keep multiple threads from operating on the same address, mechanisms such as locking are needed, which in turn slows allocation down.


  • Although not every object instance can be allocated in a TLAB, the JVM does prefer TLABs for memory allocation.
  • The developer can enable or disable the TLAB space with the option "-XX:+UseTLAB" / "-XX:-UseTLAB".
  • By default the TLAB space is very small, only 1% of the entire Eden space. The percentage of Eden that the TLAB space occupies can be set with the option "-XX:TLABWasteTargetPercent".
  • Once an allocation in the TLAB fails, the JVM falls back to allocating directly in the Eden space, using a locking mechanism to keep the operation atomic.
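
The difference between the two allocation paths can be shown with a toy model. This is a sketch of the idea only, not the JVM's implementation: the classes ToyEden and ToyTlab are hypothetical, addresses are plain offsets, and the shared path uses a compare-and-set loop as a stand-in for the locking mentioned above.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Toy Eden: one shared bump pointer that all threads advance atomically. */
class ToyEden {
    private final long capacity;
    private final AtomicLong top = new AtomicLong(0);

    ToyEden(long capacity) { this.capacity = capacity; }

    /** Slow, shared path: reserves `size` bytes; returns the start offset or -1 if full. */
    long allocateShared(long size) {
        while (true) {
            long start = top.get();
            if (start + size > capacity) return -1;
            if (top.compareAndSet(start, start + size)) return start;
        }
    }
}

/** Toy TLAB: a private chunk of Eden used by a single thread without synchronization. */
class ToyTlab {
    private final ToyEden eden;
    private final long tlabSize;
    private long cursor = 0;   // next free offset in the current chunk
    private long limit = 0;    // end of the current chunk (0 means no chunk yet)

    ToyTlab(ToyEden eden, long tlabSize) {
        this.eden = eden;
        this.tlabSize = tlabSize;
    }

    long allocate(long size) {
        if (size > tlabSize) {
            return eden.allocateShared(size);        // too large for a TLAB: shared path
        }
        if (cursor + size > limit) {                 // chunk exhausted: fetch a new one
            long start = eden.allocateShared(tlabSize);
            if (start < 0) return -1;                // Eden full (a real JVM would run a GC)
            cursor = start;
            limit = start + tlabSize;                // leftover of the old chunk is wasted here
        }
        long address = cursor;                       // fast path: a plain pointer bump
        cursor += size;
        return address;
    }
}
```

Each thread would own one ToyTlab instance, so most allocations touch only thread-private fields, and the shared pointer is contended only when a chunk is refilled or an object is too large.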

Escape analysis

Is the heap the only option for allocating objects?
With the development of JIT compilers and the gradual maturing of escape analysis, stack allocation and scalar replacement bring some subtle changes, and the rule that every object is allocated on the heap becomes less "absolute".

That objects are allocated in the Java heap is common knowledge. There is a special case, however: if escape analysis (Escape Analysis) shows that an object does not escape its method, the allocation may be optimized into a stack allocation. The object then needs no heap memory and no garbage collection. This is also the most common off-heap storage technique.

Escape Analysis:

  • Moving an object's allocation from the heap to the stack requires escape analysis.
  • Escape analysis is a cross-function, global data flow analysis algorithm that can effectively reduce synchronization overhead and heap allocation pressure in Java programs. With it, the Java HotSpot compiler can analyze the scope in which a new object's reference is used and decide whether the object has to be allocated on the heap.
  • The basic behavior of escape analysis is to analyze the dynamic scope of objects:
    ◉ When an object is defined in a method and used only inside that method, no escape has occurred.
    ◉ When an object defined in a method is referenced from outside the method, it has escaped. For example, it is passed elsewhere as a call argument.
  • Objects that do not escape can be allocated on the stack. A thread's stack consists of stack frames, one per method invocation, and the object's stack space is released when the method finishes and its frame is popped.

An example of escape analysis:
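
A minimal sketch of the distinction, with hypothetical class and method names. It shows which objects escape in the sense described above; whether the JIT compiler actually optimizes a given method depends on the JVM and its settings.

```java
class EscapeExamples {
    static StringBuilder shared;   // a field that other methods and threads can reach

    /** No escape: sb is used only inside this method; only the String result leaves. */
    static String noEscape(String a, String b) {
        StringBuilder sb = new StringBuilder();
        sb.append(a).append(b);
        return sb.toString();
    }

    /** Escape: the StringBuilder itself is returned to the caller. */
    static StringBuilder escapesByReturn(String a, String b) {
        StringBuilder sb = new StringBuilder();
        sb.append(a).append(b);
        return sb;
    }

    /** Escape: the StringBuilder is stored in a static field. */
    static void escapesByField(String a) {
        shared = new StringBuilder(a);
    }
}
```

Returning sb.toString() instead of sb is the usual way to keep the builder local to the method.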

Example of complete escape analysis code:
A quick way to judge whether an object escapes is to check whether the newly created object can be used outside the method.

Parameter setting:
In JDK 7 and later, escape analysis is enabled by default in HotSpot.
On earlier versions, developers can:

  • Turn escape analysis on explicitly with the option "-XX:+DoEscapeAnalysis"
  • View the results of escape analysis with the option "-XX:+PrintEscapeAnalysis"

Summary: if a local variable will do, do not define the variable outside the method.

Using escape analysis, the compiler can optimize the code as follows:

Allocation on the stack:

  • Converting heap allocation to stack allocation: if an object is allocated in a subroutine and no pointer to it ever escapes, the object is a candidate for allocation on the stack instead of the heap.
  • Based on the results of escape analysis at compile time, the JIT compiler may optimize an object that does not escape its method into a stack allocation. Execution then continues in the call stack; finally, when the method returns or the thread ends, the stack space is reclaimed and the local objects are reclaimed with it. No garbage collection is needed.

Synchronization elision:

  • If an object is found to be accessible from only one thread, operations on that object need no synchronization.
  • Thread synchronization is quite costly; it reduces concurrency and performance.
  • When compiling a synchronized block, the JIT compiler can use escape analysis to determine whether the block's lock object is accessible from only one thread and never published to other threads. If so, the JIT compiler removes the synchronization from that code when it compiles the block, which can greatly improve concurrency and performance. This removal is called synchronization elision, also known as lock elimination.
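
A minimal sketch of code that is a candidate for lock elimination. The class and method names are hypothetical, and the second method only illustrates what the compiled code may effectively do; the source and the bytecode still contain the synchronized block.

```java
class LockElision {
    /** The lock object is created here and never leaves this method. */
    static String join(String a, String b) {
        Object lock = new Object();
        synchronized (lock) {        // no other thread can ever see `lock`
            return a + b;
        }
    }

    /** What the JIT-compiled code may effectively execute after elision. */
    static String joinWithoutLock(String a, String b) {
        return a + b;
    }
}
```

Both methods return the same result; the difference is only whether a monitor is acquired at run time.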

Object splitting, or scalar replacement:

  • Some objects can be accessed without existing as one contiguous memory structure, so part (or all) of such an object may be kept in CPU registers instead of memory.
  • If escape analysis during JIT compilation finds that an object is not accessed from outside, the JIT replaces the object with the member variables it contains. This process is called scalar replacement.
  • Scalar: a piece of data that cannot be broken down into smaller pieces. The primitive data types in Java are scalars.
  • Aggregate: by contrast, data that can be decomposed further. Java objects are aggregates, because they can be decomposed into other aggregates and scalars.
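
A minimal sketch of what scalar replacement means, with hypothetical names. The second method is written by hand to show the shape of the code after the aggregate Point has been replaced by its two scalar fields; the JIT performs the equivalent transformation internally, if at all.

```java
class ScalarReplacement {
    /** An aggregate made of two scalars. */
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** The Point never escapes this method, so it is a candidate for replacement. */
    static int sum(int x, int y) {
        Point p = new Point(x, y);
        return p.x + p.y;
    }

    /** Equivalent code after replacement: two local scalars, no object at all. */
    static int sumAfterReplacement(int x, int y) {
        int px = x;
        int py = y;
        return px + py;
    }
}
```

Since px and py are ordinary local variables, they can live on the stack or in registers, which is why no heap allocation is needed.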
