JVM principles and tuning

Garbage Collection
A major feature of the Java language is automatic garbage collection: developers do not need to pay much attention to releasing system resources such as memory. Although automatic garbage collection greatly reduces the developer's workload, it also increases the burden on the software system.
Having a garbage collector is a significant difference between the Java language and the C++ language. In C++, the programmer must handle each memory allocation carefully and must manually release the memory once it is no longer needed. If memory is not released completely, that is, if there are memory blocks that are allocated but never released, the result is a memory leak, which in severe cases can crash the program.
The following lists commonly used garbage collection algorithms and their principles:
Reference Counting
Reference counters are used in Microsoft's COM component technology and Adobe's ActionScript3.
The reference counter is very simple to implement: each object only needs an integer counter. For an object A, whenever another object references A, A's counter is incremented by 1, and when a reference becomes invalid, the counter is decremented by 1. Once A's counter reaches 0, A can no longer be used and can be reclaimed.
But reference counters have a serious problem: they cannot handle circular references. Therefore, this algorithm is not used in Java's garbage collectors.
A simple circular reference can be described as follows: there are two objects, A and B. Object A holds a reference to object B, and object B holds a reference to object A. At this point, neither A's nor B's reference counter is zero, yet no third object in the system references A or B. In other words, A and B are garbage objects that should be collected, but because they reference each other, a reference-counting collector cannot recognize them as garbage, which causes a memory leak.
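The circular reference problem can be reproduced with a toy model. The sketch below is illustrative only: `Node`, `addReference` and `release` are hypothetical names, and the JVM does not manage memory this way. It builds the A and B objects described above, drops the only outside reference, and reports which objects the counters allow to be freed.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting; NOT how the JVM manages memory.
class RefCountDemo {
    // A hypothetical heap object that carries its own reference counter.
    static final class Node {
        final String name;
        int refCount = 0;
        final List<Node> fields = new ArrayList<>(); // outgoing references

        Node(String name) { this.name = name; }

        // Store a reference to target in this object: target gains a referrer.
        void addReference(Node target) {
            fields.add(target);
            target.refCount++;
        }
    }

    // Dropping a reference decrements the counter; at zero the object is
    // freed and the references it holds are dropped in turn.
    static void release(Node target, List<String> freed) {
        target.refCount--;
        if (target.refCount == 0) {
            freed.add(target.name);
            for (Node child : target.fields) {
                release(child, freed);
            }
            target.fields.clear();
        }
    }

    // Builds A <-> B, drops the only external reference, and returns the
    // names of the objects that reference counting managed to free.
    static List<String> freedAfterDroppingCycle() {
        Node a = new Node("A");
        Node b = new Node("B");
        a.refCount++;        // external (root) reference to A
        a.addReference(b);   // A -> B
        b.addReference(a);   // B -> A, closing the cycle
        List<String> freed = new ArrayList<>();
        release(a, freed);   // the root reference goes away
        return freed;        // stays empty: both counters are still 1
    }

    // Same shape without the back edge: everything is freed.
    static List<String> freedAfterDroppingChain() {
        Node a = new Node("A");
        Node b = new Node("B");
        a.refCount++;
        a.addReference(b);
        List<String> freed = new ArrayList<>();
        release(a, freed);
        return freed;
    }

    public static void main(String[] args) {
        System.out.println("cycle freed: " + freedAfterDroppingCycle());
        System.out.println("chain freed: " + freedAfterDroppingChain());
    }
}
```

With the back edge in place nothing is freed, while the plain chain A -> B is released completely.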
Mark-Sweep
The mark-sweep algorithm divides garbage collection into two phases: the mark phase and the sweep phase. A possible implementation is to start from the root nodes in the mark phase and mark every object reachable from them. Unmarked objects are therefore unreferenced garbage objects. Then, in the sweep phase, all unmarked objects are cleaned up. The biggest problem with this algorithm is that it leaves a lot of fragmentation, because the space reclaimed is not contiguous. During heap allocation, especially the allocation of large objects, non-contiguous memory works less efficiently than contiguous memory.
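The two phases can be sketched on a toy object graph. This is a minimal illustration under assumed names (`Obj`, `mark`, `sweep` are hypothetical), with the heap modelled as a plain list rather than real memory, so fragmentation is not shown.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Toy mark-sweep over a list-based "heap"; illustrative only.
class MarkSweepDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>(); // outgoing references
        boolean marked;
        Obj(String name) { this.name = name; }
    }

    // Mark phase: flag every object reachable from the roots.
    static void mark(List<Obj> roots) {
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj o = pending.pop();
            if (o.marked) continue;   // already visited
            o.marked = true;
            pending.addAll(o.refs);
        }
    }

    // Sweep phase: remove unmarked objects and reset marks on survivors.
    // Returns the names of the reclaimed objects.
    static List<String> sweep(List<Obj> heap) {
        List<String> reclaimed = new ArrayList<>();
        Iterator<Obj> it = heap.iterator();
        while (it.hasNext()) {
            Obj o = it.next();
            if (o.marked) {
                o.marked = false;
            } else {
                reclaimed.add(o.name);
                it.remove();
            }
        }
        return reclaimed;
    }

    // A -> B is reachable from the root A; C <-> D is an unreachable cycle.
    static List<String> collectExample() {
        Obj a = new Obj("A");
        Obj b = new Obj("B");
        Obj c = new Obj("C");
        Obj d = new Obj("D");
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<Obj> heap = new ArrayList<>(List.of(a, b, c, d));
        mark(List.of(a));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("reclaimed: " + collectExample());
    }
}
```

Because reachability is computed from the roots, the mutually referencing pair C and D is reclaimed even though each still has an incoming reference.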
Copying
The copying algorithm divides the existing memory space into two parts and uses only one of them at a time. During garbage collection, it copies the surviving objects from the memory block in use to the unused block, then clears all objects in the block that was in use and swaps the roles of the two blocks, which completes the collection.
If there are many garbage objects in the system, the number of live objects that the copying algorithm has to copy is small. Therefore, when garbage collection is really needed, the copying algorithm is very efficient. In addition, because objects are copied together into the new memory space during collection, the reclaimed memory is guaranteed to be free of fragmentation. The disadvantage of this algorithm is that it halves the usable system memory.
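A minimal sketch of the copy-and-swap step, assuming a heap of object slots held in two arrays and liveness supplied as a set of names (both are simplifications made for illustration; `CopyingDemo` is a hypothetical class):

```java
import java.util.Arrays;
import java.util.Set;

// Toy semi-space copying collector; illustrative only.
class CopyingDemo {
    private String[] from;   // the half currently in use
    private String[] to;     // the idle half
    private int top = 0;     // next free slot in `from`

    CopyingDemo(int halfSize) {
        from = new String[halfSize];
        to = new String[halfSize];
    }

    // Allocation only ever touches the half in use.
    boolean allocate(String obj) {
        if (top == from.length) return false; // this half is full
        from[top++] = obj;
        return true;
    }

    // Copy live objects into the idle half, clear the old half, swap roles.
    void collect(Set<String> live) {
        int next = 0;
        for (int i = 0; i < top; i++) {
            if (live.contains(from[i])) {
                to[next++] = from[i];     // survivors end up contiguous
            }
        }
        Arrays.fill(from, null);          // everything left behind is garbage
        String[] tmp = from;
        from = to;
        to = tmp;
        top = next;
    }

    String dump() { return Arrays.toString(from); }

    static String example() {
        CopyingDemo heap = new CopyingDemo(4);
        for (String s : new String[] {"a", "b", "c", "d"}) {
            heap.allocate(s);
        }
        heap.collect(Set.of("b", "d"));
        return heap.dump();
    }

    public static void main(String[] args) {
        System.out.println(example()); // [b, d, null, null]
    }
}
```

The survivors are packed at the start of the new half, which is why no fragmentation remains, and the second array is the memory cost mentioned above.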

The idea of the copying algorithm is used in Java's young-generation serial garbage collector. The young generation is divided into three parts: eden space, from space, and to space. The from space and the to space can be regarded as two blocks of the same size and equal status whose roles are interchangeable for copying. The from and to spaces are also called survivor spaces and are used to store objects that have not been reclaimed.
During garbage collection, the surviving objects in the eden space are copied to the unused survivor space (assume it is to), and the young objects in the survivor space in use (assume it is from) are also copied to the to space (large objects and old objects go directly to the old generation, and if the to space is full, objects also go directly to the old generation). At this point, the objects remaining in the eden space and the from space are garbage and can be cleared directly, and the to space holds the objects that survived this collection. This improved copying algorithm keeps space contiguous while avoiding wasting a large amount of memory.
Mark-Compact
The efficiency of the copying algorithm rests on the premise that there are few surviving objects and many garbage objects. This is usually the case in the young generation, but in the old generation most objects are typically live. If the copying algorithm were still used there, the cost of copying would be high because of the large number of surviving objects.
The mark-compact algorithm is an old-generation collection algorithm that makes some optimizations on top of the mark-sweep algorithm. It also first marks all objects reachable from the root nodes, but afterwards, instead of simply cleaning up unmarked objects, it compacts all live objects toward one end of memory and then clears all the space beyond the boundary. This approach avoids fragmentation and does not need two identical memory spaces, so it is relatively cost-effective.
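The compaction step can be sketched as a single sliding pass. This is a simplified illustration that assumes marking has already produced a `marked` array; a real collector also has to update every reference to a moved object, which is omitted here.

```java
import java.util.Arrays;

// Toy sliding compaction over an array of slots; illustrative only.
class MarkCompactDemo {
    // Slides every marked object toward index 0, clears the rest, and
    // returns the boundary index (the number of live objects).
    static int compact(String[] heap, boolean[] marked) {
        int free = 0; // everything before this index is live and contiguous
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free] = heap[i]; // free <= i, so nothing live is overwritten
                free++;
            }
        }
        Arrays.fill(heap, free, heap.length, null); // clear beyond the boundary
        Arrays.fill(marked, false);                 // reset marks for the next cycle
        return free;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E"};
        boolean[] marked = {true, false, true, false, true};
        int boundary = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
    }
}
```

Unlike the copying algorithm, the pass works inside a single memory area, at the price of moving objects in place.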
Incremental Collecting
During garbage collection, the application is in a state of high CPU consumption, in which all application threads are suspended, all normal work is paused, and the application waits for garbage collection to complete. If garbage collection takes too long, the application is suspended for a long time, which seriously affects user experience or system stability.
The basic idea of the incremental algorithm is that, if processing all garbage at once would pause the system for a long time, the garbage collection thread and the application threads can instead be executed alternately. Each time, the garbage collection thread collects only a small region of memory and then switches back to the application threads. This is repeated until garbage collection is complete. Because application code runs intermittently during garbage collection, the pause time of the system is reduced. However, thread switching and context switching have a cost, so the overall cost of garbage collection rises and system throughput decreases.
Generational Collecting
Objects differ in how long they live, so the best approach is to use, at each stage of an object's life, the algorithm that suits that stage. Generational collection is based on this idea: it divides memory into several blocks and uses a different collection algorithm for each block according to its characteristics, in order to improve the efficiency of garbage collection. Taking the HotSpot virtual machine as an example, all newly created objects are placed in a memory area called the young generation. Objects in the young generation are characterized by being collected quickly, so the more efficient copying algorithm is chosen for the young generation. When an object is still alive after several collections, it is moved into a memory space called the old generation. In the old generation, almost all objects have survived several garbage collections, so they can be assumed to stay in memory for some time, possibly for the entire life cycle of the application. If the copying algorithm were still used for the old generation, a large number of objects would have to be copied. In addition, collecting the old generation is less cost-effective than collecting the young generation, so this approach is not advisable. Following the generational idea, the old generation can be collected with the mark-compact algorithm instead of the algorithm used for the young generation, which improves the efficiency of garbage collection.
Analyzing the garbage collector from different angles, it can be divided into different types.
1. According to the number of threads, it can be divided into serial garbage collector and parallel garbage collector. The serial garbage collector uses only one thread at a time for garbage collection; the parallel garbage collector will start multiple threads at a time for garbage collection at the same time. On CPUs with strong parallelism, the use of parallel garbage collectors can shorten the GC pause time.
2. According to the working mode, it can be divided into concurrent garbage collector and exclusive garbage collector. The concurrent garbage collector works alternately with application threads to minimize application pause times; the exclusive garbage collector (stop-the-world), once running, stops all other threads in the application until the garbage collection process is complete.
3. According to how fragmentation is handled, it can be divided into compacting garbage collector and non-compacting garbage collector. The compacting garbage collector compacts the surviving objects after the collection is completed to eliminate the fragments left by collection; the non-compacting garbage collector does not perform this operation.
4. According to the working memory range, it can be divided into the new generation garbage collector and the old generation garbage collector.
The following indicators can be used to evaluate the quality of a garbage collector.
Throughput: the ratio of the time spent running the application to the total running time of the system during the application's life cycle. Total system running time = application time + GC time. If the system runs for 100 minutes and GC takes 1 minute, the throughput of the system is (100-1)/100=99%.
Garbage collector load: the opposite of throughput; the ratio of the time taken by the garbage collector to the total running time of the system.
Pause time: the time the application is paused while the garbage collector is running. For an exclusive collector, the pause time may be longer. With a concurrent collector the pause time is shorter because the garbage collector and the application run alternately, but system throughput may be lower because a concurrent collector is likely to be less efficient than an exclusive one.
Garbage collection frequency: how often the garbage collector runs. In general, for a given application, the garbage collector should run as infrequently as possible. Increasing the heap space usually reduces the frequency of garbage collection effectively, but may increase the pause time caused by each collection.
Response time: how long it takes, after an object becomes garbage, for the memory it occupies to be released.
Heap allocation: Different garbage collectors may allocate heap memory differently. A good garbage collector should partition the heap memory reasonably.

Memory allocation mechanism in Java

Objects are normally allocated on the heap. An object can also be broken down into atomic types (each representing a single value, such as a primitive type or String) and then allocated on the stack, but this is rare and is not considered here.

  In general, the mechanism of Java memory allocation and reclamation is generational allocation and generational collection. According to how long they survive, objects are divided into the Young Generation, the Old Generation, and the Permanent Generation (also known as the method area). As shown in the following figure (from "Becoming a JavaGC Expert Part I", http://www.importnew.com/1993.html):

    

  Young Generation: When an object is created, memory is first allocated in the young generation (large objects can be created directly in the old generation). Most objects are no longer used soon after they are created, so they quickly become unreachable and are cleaned up by the young generation's GC mechanism (IBM research shows that 98% of objects die quickly). This GC mechanism is called Minor GC or Young GC. Note that a Minor GC does not mean the young generation is out of memory; it only means a GC on the Eden area.

  Memory allocation in the young generation works as follows. The young generation is divided into three areas: the Eden area (named after the Garden of Eden, where Adam and Eve ate the forbidden fruit; a fitting name for the area where memory is first allocated) and two Survivor areas (Survivor 0, Survivor 1). The memory allocation process is (from "Becoming a JavaGC Expert Part I", http://www.importnew.com/1993.html):

    

    The vast majority of newly created objects are allocated in the Eden area, and most of them die soon. The Eden area is a contiguous memory space, so allocating memory in it is extremely fast;
    when the Eden area is full, a Minor GC is performed: dead objects are cleaned up and the surviving objects are copied to one survivor area, Survivor0 (at this point Survivor1 is empty; one of the two Survivors is always empty);
    after that, when the Eden area is full again, another Minor GC is executed: the objects still alive in the Eden area and in Survivor0 are copied to Survivor1, and the Eden area and Survivor0 are emptied (at this point Survivor0 is empty);
    at each following Minor GC the two Survivors swap roles again in the same way;
    once an object has been switched between the two survivor areas enough times (15 by default in the HotSpot virtual machine, controlled by -XX:MaxTenuringThreshold; above this value the object enters the old generation), it is copied to the old generation. Only a small proportion of objects survive this long.

  As can be seen from the above process, the Eden area is a contiguous space and one of the Survivors is always empty. After a GC and copy, one Survivor holds the currently live objects, while the contents of the Eden area and the other Survivor are no longer needed and can be emptied directly. At the next GC the two Survivors swap roles again. Allocating and cleaning up memory in this way is therefore extremely efficient. This garbage collection method is the well-known "stop-and-copy" method (the objects still alive in the Eden area and one Survivor are copied to the other Survivor). This does not mean that stop-and-copy is always efficient: it is efficient only in this situation, and using it in the old generation would perform very poorly.
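The life of an object in the young generation can be traced with a small simulation. It is a sketch under stated assumptions: objects are just names, liveness is passed in as a set, the Survivors swap at every Minor GC, an object is promoted once its age exceeds the threshold, and Survivor overflow is ignored. `YoungGenDemo` and its members are hypothetical names, not HotSpot internals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of Eden plus two Survivors with object ages; illustrative only.
class YoungGenDemo {
    final int maxTenuringThreshold;
    final List<String> eden = new ArrayList<>();
    List<String> from = new ArrayList<>(); // Survivor currently holding objects
    List<String> to = new ArrayList<>();   // Survivor kept empty
    final List<String> old = new ArrayList<>();
    final Map<String, Integer> age = new HashMap<>();

    YoungGenDemo(int maxTenuringThreshold) {
        this.maxTenuringThreshold = maxTenuringThreshold;
    }

    void allocate(String name) {
        eden.add(name);
        age.put(name, 0);
    }

    // Copy live objects from Eden and `from` into `to`, ageing them by one;
    // objects whose age exceeds the threshold are promoted instead.
    void minorGc(Set<String> live) {
        List<String> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (String o : candidates) {
            if (!live.contains(o)) {
                age.remove(o);            // dead: simply not copied
                continue;
            }
            int newAge = age.get(o) + 1;
            if (newAge > maxTenuringThreshold) {
                old.add(o);               // promoted to the old generation
                age.remove(o);
            } else {
                age.put(o, newAge);
                to.add(o);
            }
        }
        eden.clear();                     // Eden and the old `from` are emptied
        from.clear();
        List<String> tmp = from;          // the Survivors swap roles
        from = to;
        to = tmp;
    }

    // Number of Minor GCs a permanently live object needs before promotion.
    static int gcsUntilPromoted(int threshold) {
        YoungGenDemo heap = new YoungGenDemo(threshold);
        heap.allocate("x");
        int gcs = 0;
        while (!heap.old.contains("x")) {
            heap.minorGc(Set.of("x"));
            gcs++;
        }
        return gcs;
    }

    public static void main(String[] args) {
        System.out.println("threshold 2 -> promoted after "
                + gcsUntilPromoted(2) + " minor GCs");
    }
}
```

In this model one Survivor list is always empty between collections, matching the description above.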

  In the Eden area, the HotSpot virtual machine uses two techniques to speed up memory allocation: bump-the-pointer and TLAB (Thread-Local Allocation Buffers). Because the Eden area is contiguous, the core of bump-the-pointer is to track the last object created. When a new object is created, it is only necessary to check whether there is enough memory after the last object, which greatly speeds up allocation. TLAB addresses multi-threading: Eden is divided into several segments and each thread uses its own segment, so threads do not interfere with each other. Combining TLAB with bump-the-pointer ensures that each thread uses its own segment of the Eden area and allocates memory quickly.
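The combination can be sketched as follows. Addresses are plain integers and `Eden`, `Tlab`, `newTlab` and `allocate` are hypothetical names chosen for illustration; HotSpot's real implementation works on raw memory and differs in detail.

```java
// Toy bump-the-pointer allocation with per-thread buffers; illustrative only.
class BumpPointerDemo {
    // A thread-local buffer: a private slice [start, end) of Eden.
    static final class Tlab {
        private final int end;
        private int top; // address just past the last allocated object

        Tlab(int start, int end) {
            this.top = start;
            this.end = end;
        }

        // Bump-the-pointer: check the remaining room, then advance `top`.
        // No locking is needed because only the owning thread uses this buffer.
        int allocate(int size) {
            if (end - top < size) return -1; // buffer exhausted
            int address = top;
            top += size;
            return address;
        }
    }

    // The shared Eden area, which hands out buffers to threads.
    static final class Eden {
        private final int capacity;
        private int top = 0;

        Eden(int capacity) { this.capacity = capacity; }

        // Handing out a buffer is itself a pointer bump, and it is the only
        // step that has to be synchronized between threads.
        synchronized Tlab newTlab(int size) {
            if (capacity - top < size) return null; // Eden is full
            Tlab tlab = new Tlab(top, top + size);
            top += size;
            return tlab;
        }
    }

    public static void main(String[] args) {
        Eden eden = new Eden(1024);
        Tlab first = eden.newTlab(256);   // covers addresses [0, 256)
        Tlab second = eden.newTlab(256);  // covers addresses [256, 512)
        System.out.println(first.allocate(100));  // 0
        System.out.println(first.allocate(100));  // 100
        System.out.println(first.allocate(100));  // -1, only 56 bytes left
        System.out.println(second.allocate(16));  // 256
    }
}
```

When a buffer is exhausted, the thread in this model would request a new one from Eden, and a full Eden corresponds to the point where a Minor GC is needed.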

  Old Generation: If an object survives in the young generation long enough without being cleaned up (that is, it survives several Young GCs), it is copied to the old generation. The old generation is generally larger than the young generation and can store more objects, and GC happens less often in the old generation than in the young generation. When the old generation runs out of memory, a Major GC, also called a Full GC, is performed.
  You can use the -XX:+UseAdaptiveSizePolicy switch to control whether to use the dynamic control strategy. If it is dynamically controlled, the size of each area in the Java heap and the age of entering the old generation are dynamically adjusted.

  If the object is relatively large (such as a long string or a large array) and the Young space is insufficient, the large object will be directly allocated to the old generation (large objects may trigger early GC, so they should be used sparingly, and short-lived large objects should be avoided). Use -XX:PretenureSizeThreshold to control the size of objects that are directly promoted to the old generation. Objects larger than this value will be directly allocated to the old generation.

  Old generation objects may refer to young generation objects. When a Young GC is performed, it might then be necessary to scan the entire old generation to determine which young objects can be collected, which is obviously inefficient. The solution is to maintain a "card table" that tracks the old generation in blocks of 512 bytes and records which blocks contain references from old generation objects to young generation objects. During a Young GC only the card table needs to be checked rather than the whole old generation, so performance is greatly improved.
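A card table can be sketched as an array of flags indexed by address. The sketch assumes that each card covers 512 bytes of the old generation (one reading of the 512-byte figure above) and that a write barrier calls `recordWrite` whenever a reference field in the old generation is updated; `CardTableDemo` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy card table; illustrative only.
class CardTableDemo {
    static final int CARD_SHIFT = 9; // 2^9 = 512 bytes per card (assumption)
    private final boolean[] dirty;

    CardTableDemo(int oldGenBytes) {
        // One flag per 512-byte card, rounding up.
        dirty = new boolean[(oldGenBytes + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT];
    }

    // Write barrier: a reference field at this old-generation offset was
    // updated, so the card containing it must be scanned at the next Young GC.
    void recordWrite(int fieldOffset) {
        dirty[fieldOffset >>> CARD_SHIFT] = true;
    }

    // Young GC: only these cards are scanned, not the whole old generation.
    List<Integer> dirtyCards() {
        List<Integer> cards = new ArrayList<>();
        for (int i = 0; i < dirty.length; i++) {
            if (dirty[i]) cards.add(i);
        }
        return cards;
    }

    // After the scan the flags are reset.
    void clear() {
        java.util.Arrays.fill(dirty, false);
    }

    public static void main(String[] args) {
        CardTableDemo table = new CardTableDemo(4096); // 8 cards
        table.recordWrite(700);   // card 1
        table.recordWrite(1000);  // card 1 again
        table.recordWrite(3000);  // card 5
        System.out.println(table.dirtyCards()); // [1, 5]
    }
}
```

In this example a Young GC would scan 2 of 8 cards, which is where the saving over a full old-generation scan comes from.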

Java GC mechanism

The basic algorithm of the GC mechanism is generational collection, which need not be repeated here. The collection method for each generation is described below.

  

  Young generation:

  The previous section already introduced the main garbage collection method of the young generation. The young generation is cleaned with the "stop-and-copy" algorithm. Its memory is divided into two parts: a larger Eden area, and a smaller Survivor part that is split into two equal halves. At every collection, the objects still alive in the Eden area and one Survivor are copied to the other Survivor, and then Eden and the first Survivor are cleaned up.

  It can also be seen here that in this stop-and-copy algorithm the two parts used for copying are not equal (the traditional stop-and-copy algorithm uses two parts of equal size, which wastes half the memory; the young generation uses one large Eden area and two small Survivor areas to avoid this problem).

   Since most objects are short-lived and do not even survive into a Survivor, the ratio of the Eden area to a Survivor is fairly large. HotSpot defaults to 8:1, so Eden and the two Survivors account for 80%, 10% and 10% of the young generation respectively. If the memory still alive in Survivor+Eden exceeds 10% in a collection, some objects have to be allocated to the old generation. Use the -XX:SurvivorRatio parameter to configure the capacity ratio of the Eden area to a Survivor area. The default is 8, meaning Eden:Survivor1:Survivor2=8:1:1.
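The arithmetic behind the 8:1:1 split can be written out directly. This is a simplified calculation; a real JVM also rounds the sizes to alignment boundaries, which is ignored here, and `youngGenLayout` is a hypothetical helper.

```java
// Derives Eden and Survivor sizes from a SurvivorRatio value; illustrative only.
class SurvivorRatioDemo {
    // Returns {eden, survivor0, survivor1} in bytes for a young generation of
    // youngBytes, where survivorRatio means Eden : one Survivor.
    static long[] youngGenLayout(long youngBytes, int survivorRatio) {
        // The young generation is split into (ratio + 2) equal shares:
        // `ratio` shares for Eden and one share for each Survivor.
        long survivor = youngBytes / (survivorRatio + 2);
        long eden = youngBytes - 2 * survivor;
        return new long[] {eden, survivor, survivor};
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long[] layout = youngGenLayout(100 * mb, 8);
        System.out.println("Eden      = " + layout[0] / mb + " MB"); // 80 MB
        System.out.println("Survivor0 = " + layout[1] / mb + " MB"); // 10 MB
        System.out.println("Survivor1 = " + layout[2] / mb + " MB"); // 10 MB
    }
}
```

Only one Survivor is empty at any time, so with the default ratio about 90% of the young generation is usable, compared with 50% for a plain two-halves copying scheme.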

  Old generation:
  The old generation stores many more objects than the young generation, including many large objects, so cleaning it with the stop-and-copy algorithm would be quite inefficient. The old generation generally uses the mark-compact algorithm: mark the objects that are still alive (still referenced), then move all surviving objects to one end to keep memory contiguous.
     When a Minor GC occurs, the virtual machine checks whether the average size promoted into the old generation by previous collections is greater than the remaining space of the old generation. If it is greater, a Full GC is triggered directly. Otherwise, it checks whether -XX:+HandlePromotionFailure is set (that is, whether guarantee failure is allowed). If it is allowed, only the Minor GC is performed, and a failed promotion can be tolerated at this point; if it is not allowed, a Full GC is still performed. This means that with the option disabled (-XX:-HandlePromotionFailure), every Minor GC also triggers a Full GC, even if the old generation still has plenty of free memory, so it is best not to disable it.

   Method area (permanent generation):

  Permanent generation collection covers two kinds of data: constants in the constant pool and useless class information. Collecting constants is simple: a constant can be collected once nothing references it. For a useless class to be collected, three conditions must hold:

    all instances of the class have been collected;
    the ClassLoader that loaded the class has been collected;
    the Class object of the class is not referenced anywhere (that is, the class cannot be accessed through reflection anywhere).

     Permanent generation collection is not mandatory; whether classes are collected can be set through parameters. HotSpot provides -Xnoclassgc for this.
     Use -verbose:class, -XX:+TraceClassLoading and -XX:+TraceClassUnloading to view class loading and unloading information:

     -verbose:class and -XX:+TraceClassLoading can be used in the Product build of HotSpot;
     -XX:+TraceClassUnloading requires the fastdebug build of HotSpot.

Garbage collector

In the GC mechanism, the garbage collector plays an important role: it is the concrete implementation of GC. The Java virtual machine specification makes no provision for the garbage collector, so the garbage collectors implemented by different vendors differ. The garbage collectors used in HotSpot 1.6 are shown in the figure below (the figure is from "Understanding the Java Virtual Machine: Advanced JVM Features and Best Practices"; a line connecting two collectors in the figure indicates that they can be used together):

  

  

Before introducing the garbage collectors, it should be made clear that the "stop" in the stop-and-copy algorithm used by the young generation means "stop-the-world": while memory is being reclaimed, the execution of all other threads has to be suspended. This is very inefficient. The various young generation collectors are increasingly optimized in this respect, but they only shorten the pause; they do not eliminate it.

    Serial collector: a young generation collector that uses the stop-and-copy algorithm and a single thread for GC while other worker threads are suspended. Use -XX:+UseSerialGC to collect memory with the Serial+Serial Old combination (this is also the default when the virtual machine runs in Client mode).
    ParNew collector: a young generation collector that uses the stop-and-copy algorithm; it is the multi-threaded version of the Serial collector. It uses multiple threads for GC while other worker threads are suspended, and focuses on shortening garbage collection time. Use the -XX:+UseParNewGC switch to collect memory with the ParNew+Serial Old combination; use -XX:ParallelGCThreads to set the number of threads that perform collection.
    Parallel Scavenge collector: a young generation collector that uses the stop-and-copy algorithm and focuses on CPU throughput, that is, time spent running user code / total time. For example, if the JVM runs for 100 minutes, of which 99 minutes are user code and 1 minute is garbage collection, the throughput is 99%. This collector uses the CPU most efficiently and suits background workloads (collectors that focus on shortening garbage collection pauses, such as CMS, keep waiting time very short, so they suit user interaction and improve user experience). Use the -XX:+UseParallelGC switch to collect garbage with the Parallel Scavenge+Serial Old combination (this is also the default in Server mode); use -XX:GCTimeRatio to set the ratio of user execution time to garbage collection time (the default is 99, that is, 1% of the time is used for garbage collection); use -XX:MaxGCPauseMillis to set the maximum GC pause time (this parameter is only valid for Parallel Scavenge).

    Serial Old collector: an old generation collector; single-threaded; uses the mark-compact algorithm (compaction here consists of Sweep and Compact: sweep removes the abandoned objects so that only the surviving objects are left, and compact moves the survivors together). Other worker threads are paused (note that mark-compact collection in the old generation also requires pausing other threads). Before JDK 1.5, the Serial Old collector was used together with Parallel Scavenge.
    Parallel Old collector: an old generation collector; multi-threaded, with a threading mechanism similar to that of Parallel Scavenge; uses the mark-compact algorithm (unlike Serial Old, compaction here consists of Summary and Compact: summary copies the surviving objects to a prepared area instead of cleaning up discarded objects as Sweep does). Other threads still have to be suspended while Parallel Old runs. Parallel Old is useful on multi-core machines. Since its introduction (JDK 1.6), it works well together with Parallel Scavenge and fully realizes the throughput-first goal of the Parallel Scavenge collector. Use the -XX:+UseParallelOldGC switch to collect with the Parallel Scavenge+Parallel Old combination.
    CMS (Concurrent Mark Sweep) collector: an old generation collector dedicated to achieving the shortest collection pause; uses the mark-sweep algorithm and multiple threads. Its advantages are concurrent collection (user threads can work at the same time as GC threads) and short pauses. Use -XX:+UseConcMarkSweepGC to collect memory with ParNew+CMS+Serial Old; ParNew+CMS is used first (see below for the reason), and when memory for user threads runs short, the fallback Serial Old is used for collection.

    CMS collects by marking three times and then sweeping once. Initial mark marks the objects that GC Roots can reach directly (that is, objects with references) and has a very short pause; concurrent mark is the process of tracing references from GC Roots and does not require user threads to pause; remark marks again the part whose marking changed during initial mark and concurrent mark, so its pause is much shorter than the time taken by concurrent mark but slightly longer than that of initial mark. After marking is complete, concurrent sweep begins, without pausing user threads.
    Therefore, during a CMS collection only initial mark and remark need a short pause; concurrent mark and concurrent sweep do not suspend user threads, so the efficiency is very high and CMS suits highly interactive applications.
    CMS also has shortcomings. It consumes additional CPU and memory resources; when CPU and memory are tight and there are few CPUs, it increases the system's burden (by default CMS starts (number of CPUs + 3)/4 threads).
    In addition, user threads keep running during concurrent collection and keep producing garbage, so "floating garbage" may be generated that cannot be cleaned up in this collection and has to wait for the next Full GC. Enough memory must therefore be reserved for user threads during GC. For this reason a collector using CMS does not wait until the old generation is full before triggering a Full GC; it triggers one once occupancy exceeds a threshold (68% by default, about 2/3, set with -XX:CMSInitiatingOccupancyFraction). If user threads do not consume much memory, -XX:CMSInitiatingOccupancyFraction can be increased moderately to reduce the number of GCs and improve performance. If the memory reserved for user threads is not enough, a Concurrent Mode Failure occurs and the fallback is triggered: the Serial Old collector is used, with a long pause. -XX:CMSInitiatingOccupancyFraction should therefore not be set too high.
    Also, CMS uses the mark-sweep algorithm, which causes memory fragmentation. Use -XX:+UseCMSCompactAtFullCollection to set whether defragmentation is performed after a Full GC, and -XX:CMSFullGCsBeforeCompaction to set how many Full GCs without compaction are performed before a Full GC with compaction.
    

    G1 collector: officially released in JDK 1.7. It differs considerably from the existing concepts of young generation and old generation. It is not yet widely used and is not covered here.
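To see which of the collectors listed above a running JVM actually uses, and to estimate the throughput figure from the 99% example, the standard `java.lang.management` API can be queried. This is a rough sketch: `throughputPercent` is a hypothetical helper, and for concurrent collectors the reported collection time is not necessarily pure pause time, so the result is only an approximation.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Lists the active collectors and estimates throughput; illustrative only.
class GcStatsDemo {
    // Throughput = application time / total time, as a percentage.
    static double throughputPercent(long totalMillis, long gcMillis) {
        return 100.0 * (totalMillis - gcMillis) / totalMillis;
    }

    public static void main(String[] args) {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() returns -1 when the value is undefined.
            long time = Math.max(gc.getCollectionTime(), 0);
            System.out.println(gc.getName() + ": " + gc.getCollectionCount()
                    + " collections, " + time + " ms");
            gcMillis += time;
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("approximate throughput: %.2f%%%n",
                throughputPercent(uptimeMillis, gcMillis));
    }
}
```

The collector names printed depend on the JVM version and on the -XX switches described above, so running the program under different switches is a quick way to check which combination is in effect.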


     Note the difference between Concurrent and Parallel:
     Concurrent means that user threads and GC threads execute at the same time (not necessarily in parallel; they may alternate, but overall they run during the same period), and user threads do not need to be paused (in CMS user threads actually still have to be paused, but only very briefly, while the GC thread runs on another CPU);
     Parallel collection means that multiple GC threads work in parallel, but user threads are suspended at this time;
therefore, the Serial collector is single-threaded, the Parallel collectors are parallel, and the CMS collector is concurrent.
