Java GC (garbage collection) mechanism

1. GC overview
 
JVM heap basics
    Why talk about the JVM heap first?
    The JVM heap is where Java objects live. Every object the program creates with the new operator is allocated from it, so the heap holds all the objects used by the running application. When an object is no longer needed, the GC is responsible for reclaiming it (as everyone knows).
JVM heap
    (1) New domain (young generation): stores all newly created objects.
    (2) Old domain (old generation): objects in the new domain that survive a certain number of GC cycles are moved here.
    (3) Permanent domain (permanent generation): stores class and method objects. From a configuration point of view this domain is independent and is not counted as part of the JVM heap. The default size is 4M.
 
The new domain is divided into 3 parts: 1. The first part is called Eden (the Garden of Eden, perhaps because Adam and Eve were the earliest "objects" of human activity?). 2. The other two parts are called survivor spaces (the kindergarten). Here I call one of them A space (From Space) and the other B space (To Space).
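To see these regions on a live JVM, the standard java.lang.management API can list the heap memory pools. The sketch below is only an illustration: the class name HeapPools is made up here, and the pool names it prints depend on the JVM version and the collector in use, so they will not necessarily read "Eden", "Survivor" and "Old".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original article.
class HeapPools {
    /** Returns one line per heap memory pool reported by the running JVM. */
    static List<String> describeHeapPools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue; // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() is -1 when the maximum is undefined
            lines.add(String.format("%-28s used=%d committed=%d max=%d",
                    pool.getName(), usage.getUsed(), usage.getCommitted(), usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeHeapPools()) {
            System.out.println(line);
        }
    }
}
```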


2. A brief introduction to GC
The purpose of GC is clear: find the objects in the heap that are no longer used, and reclaim the space they occupy so that it can be reused. Most garbage collection algorithms share the same idea: treat all objects as a set, or as a tree-like structure, and start from the root of the tree. Every object that can be reached is live; an object that cannot be reached is yesterday's news and should be reclaimed.
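The "start from the root and keep whatever can be found" idea can be sketched as a plain graph traversal. This is a toy model, not how the JVM represents objects: object ids are strings, references are a map, and the names Reachability, findLive and sampleGraph are invented for the illustration.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model: objects are string ids, references are edges in a map.
class Reachability {
    /** Returns every object id reachable from the roots by following references. */
    static Set<String> findLive(List<String> roots, Map<String, List<String>> refs) {
        Set<String> live = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (!live.add(obj)) {
                continue; // already visited, so cycles terminate
            }
            for (String target : refs.getOrDefault(obj, Collections.<String>emptyList())) {
                pending.push(target);
            }
        }
        return live;
    }

    /** A small sample graph: a -> b -> c (and b -> a), plus d -> e with no root. */
    static Map<String, List<String>> sampleGraph() {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("a", Arrays.asList("b"));
        refs.put("b", Arrays.asList("c", "a"));
        refs.put("d", Arrays.asList("e")); // d and e are not reachable from root a
        return refs;
    }

    public static void main(String[] args) {
        Set<String> live = findLive(Arrays.asList("a"), sampleGraph());
        System.out.println("live objects: " + live);
    }
}
```

With root a, the objects d and e are never visited, so a collector built on this idea would treat them as garbage even though d still references e.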
According to Sun's documentation, the new domain of the JVM heap uses a copying algorithm, which was proposed to avoid the overhead of handles and to solve the problem of heap fragmentation. It starts by dividing the heap into one object area and multiple free areas. The program allocates space for objects from the object area. When the object area is full, the copying collector scans the live objects starting from the root set and copies each live object to a free area, packed together so that no holes are left between live objects. The free area then becomes the object area, the original object area becomes a free area, and the program goes on allocating memory in the new object area.
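A minimal sketch of that copy-and-swap step, assuming a toy heap made of two Java lists rather than raw memory. The class CopyingSketch and its members are hypothetical; a real collector moves bytes and updates pointers, while this version only shows that survivors end up packed in the other area and that the two areas trade roles.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Toy copying collector: two lists stand in for the object area and the free area.
class CopyingSketch {
    /** A toy heap object: a name plus references to other toy objects. */
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    List<Obj> objectArea = new ArrayList<>(); // the program allocates from here
    List<Obj> freeArea = new ArrayList<>();   // receives the survivors during a collection

    Obj allocate(String name) {
        Obj o = new Obj(name);
        objectArea.add(o);
        return o;
    }

    /** Copies everything reachable from the roots, then swaps the areas. Returns the copied roots. */
    List<Obj> collect(List<Obj> roots) {
        Map<Obj, Obj> forwarded = new IdentityHashMap<>(); // old object -> its copy
        List<Obj> newRoots = new ArrayList<>();
        for (Obj root : roots) {
            newRoots.add(copy(root, forwarded));
        }
        objectArea.clear();          // whatever was not copied is garbage
        List<Obj> tmp = objectArea;  // swap roles: free area becomes the object area
        objectArea = freeArea;
        freeArea = tmp;
        return newRoots;
    }

    private Obj copy(Obj old, Map<Obj, Obj> forwarded) {
        Obj existing = forwarded.get(old);
        if (existing != null) {
            return existing; // already copied, reuse the copy (handles cycles)
        }
        Obj fresh = new Obj(old.name);
        forwarded.put(old, fresh);
        freeArea.add(fresh); // survivors are placed one after another, with no holes
        for (Obj ref : old.refs) {
            fresh.refs.add(copy(ref, forwarded));
        }
        return fresh;
    }

    /** Allocates two live and two dead objects, collects, and returns the surviving names. */
    static List<String> demo() {
        CopyingSketch heap = new CopyingSketch();
        Obj a = heap.allocate("a");
        heap.allocate("garbage1");
        Obj b = heap.allocate("b");
        heap.allocate("garbage2");
        a.refs.add(b);
        b.refs.add(a);
        heap.collect(Arrays.asList(a));
        List<String> names = new ArrayList<>();
        for (Obj o : heap.objectArea) {
            names.add(o.name);
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("survivors after collection: " + demo());
    }
}
```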
 
Newly created objects are placed in Eden. When Eden is full (too many children), GC starts to work: it first stops the application, then collects garbage and copies all the objects it can still reach to A space. Once A space is full, the GC copies all reachable objects in A space to B space (overwriting whatever was stored there), and when B space is full, the reachable objects in B space are copied back to A space. A and B swap roles in the process.
The reader may ask: copying back and forth, isn't that tiresome? When does it end? Don't worry: once a live object has survived a certain number of GC cycles, it is moved into the old domain, and its kindergarten life in the new domain is over.
Why is the new domain so much trouble? I was confused by this at first and looked up some material. It turns out that most objects created by an application are short-lived. The ideal state for the copying algorithm is that every object moved out of Eden is eventually collected, because these short-lived objects should die within a few GC cycles. The objects moved into the old domain are then only the long-lived ones, which spares the application the cost of copying them back and forth between A and B. In practice this ideal state is hard to reach, because an application inevitably has long-lived objects. The designers of the copying algorithm want objects to stay in the new domain as long as possible so that copying stays small-scale, because compacting the old domain can cost far more than copying within the new domain (the old domain is described below).
The old domain uses a tracing algorithm called the mark-sweep-compact collector. Note the compaction step, which is an expensive operation.
Garbage collection mainly reclaims the memory of the Young Generation and the Old Generation.
YG (Young Generation) stores newly created objects. Objects that survive several collections are moved to OG (Old Generation). Garbage collection of YG is also called Minor GC, and garbage collection of OG is also called Major GC; the two collections do not interfere with each other.
2. GC process:
[older generation][survivor 1][survivor 2][eden]
*young generation=eden + survivor
1. When eden is full, a young GC is triggered.
2. The young GC does two things: first, it removes unused objects; second, it moves objects that are still referenced into a survivor space. Objects that stay in the survivor spaces through a few more GCs are moved to old.
3. When old is full, a full GC is triggered. A full GC is expensive: it collects most of the garbage in both old and young, and user threads are blocked while it runs.
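The young GC steps above can be sketched as a small simulation. Everything here is an assumption made for illustration: the class TenuringSketch, the boolean "referenced" flag that stands in for reachability, the single survivor list, and the promotion threshold of 3 collections are all invented, not taken from the JVM.

```java
import java.util.ArrayList;
import java.util.List;

// Toy generational heap: eden, one survivor list, and old.
class TenuringSketch {
    static final class Obj {
        final String name;
        final boolean referenced; // toy stand-in for "still reachable"
        int age;                  // number of young GCs survived so far
        Obj(String name, boolean referenced) {
            this.name = name;
            this.referenced = referenced;
        }
    }

    final int tenuringThreshold; // hypothetical: promote after this many survived young GCs
    final List<Obj> eden = new ArrayList<>();
    final List<Obj> survivor = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    TenuringSketch(int tenuringThreshold) {
        this.tenuringThreshold = tenuringThreshold;
    }

    void allocate(String name, boolean referenced) {
        eden.add(new Obj(name, referenced));
    }

    /** One young GC: drop unreferenced objects, age the rest, promote the old enough ones. */
    void youngGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(survivor);
        eden.clear();
        survivor.clear();
        for (Obj o : candidates) {
            if (!o.referenced) {
                continue; // unused object: simply not copied anywhere
            }
            o.age++;
            if (o.age >= tenuringThreshold) {
                old.add(o);      // survived enough collections: moved to old
            } else {
                survivor.add(o); // still young: stays in a survivor space
            }
        }
    }

    /** Runs three young GCs over one long-lived and one short-lived object. */
    static String demo() {
        TenuringSketch heap = new TenuringSketch(3);
        heap.allocate("cache", true);
        heap.allocate("temp", false);
        for (int i = 0; i < 3; i++) {
            heap.youngGc();
        }
        return "eden=" + heap.eden.size()
                + " survivor=" + heap.survivor.size()
                + " old=" + heap.old.size();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints eden=0 survivor=0 old=1
    }
}
```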
 
3. Is a larger young generation always better?
Setting the young generation to more than half of the total heap size is inefficient. Setting it too small can cause a bottleneck, because the young generation collector then has to run frequently.
 
4. Summary
The discussion above leads to several conclusions. The following summarizes the experience of others along with my own understanding:
1. The size of the JVM heap determines how long GC runs. If the heap exceeds a certain size, GC takes a long time to run.
2. The longer objects live, the more time GC needs to reclaim them, which slows collection down.
3. Most objects are short-lived, so if an object's lifetime fits within one GC cycle, wonderful!
4. In an application, the rate at which objects are created and released determines the frequency of garbage collection.
5. If a single GC run takes more than 3-5 seconds, it affects the application. If possible, reduce the size of the JVM heap.
6. Experience from seniors: usually, the size of the JVM heap should be 80% of physical memory.
 
5. A case study
jmap -heap 2343
Attaching to process ID 2343, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.0-b16
 
using thread-local object allocation.
Parallel GC with 8 thread(s)
 
Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 4294967296 (4096.0MB)
   NewSize          = 2686976 (2.5625MB)
   MaxNewSize       = -65536 (-0.0625MB)
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2                  (YG:OG size ratio is 1:2)
   SurvivorRatio    = 8
   PermSize         = 21757952 (20.75MB)
   MaxPermSize      = 268435456 (256.0MB)
 
Heap Usage:
PS Young Generation
Eden Space:
   capacity = 1260060672 (1201.6875MB)
   used     = 64868288 (61.86322021484375MB)
   free     = 1195192384 (1139.8242797851562MB)
   5.148028935546367% used
From Space:
   capacity = 85524480 (81.5625MB)
   used     = 59457648 (56.70323181152344MB)
   free     = 26066832 (24.859268188476562MB)
   69.52120375359195% used
To Space:
   capacity = 85852160 (81.875MB)
   used     = 0 (0.0MB)
   free     = 85852160 (81.875MB)
   0.0% used
~~~~~~~~~~~~~~~~~~~~~~~~~~~ The three spaces above show the YG size and usage
PS Old Generation
   capacity = 2291138560 (2185.0MB)
   used = 1747845928 (1666.8757705688477MB)
   free = 543292632 (518.1242294311523MB)
   76.28722062099989% used
~~~~~~~~~~~~~~~~~~~~~~~~~~~OG size and usage
PS Perm Generation
   capacity = 108265472 (103.25MB)
   used = 107650712 (102.6637191772461MB)
   free = 614760 (0.5862808227539062MB)
   99.43217353728436% used
 
Put simply, on this machine YG is about 1G, OG is about 2G, and the maximum heap is 4G.
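The YG/OG split follows from NewRatio = 2 in the heap configuration: in HotSpot, NewRatio is the ratio of old to young, so the young generation gets heap / (NewRatio + 1). The arithmetic is sketched below with made-up helper names; it ignores the alignment and adaptive resizing a real JVM applies, so the result only approximates what jmap reports (Eden + From + To is about 1365MB here).

```java
// Hypothetical helper: rough generation sizes from the maximum heap and NewRatio.
class NewRatioSketch {
    /** Young generation size when old:young = newRatio:1 (no alignment applied). */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    /** Whatever the young generation does not take is left for the old generation. */
    static long oldBytes(long heapBytes, int newRatio) {
        return heapBytes - youngBytes(heapBytes, newRatio);
    }

    public static void main(String[] args) {
        long maxHeap = 4294967296L; // MaxHeapSize from the jmap output
        int newRatio = 2;           // NewRatio from the jmap output
        System.out.printf("young ~ %.2fMB, old ~ %.2fMB%n",
                youngBytes(maxHeap, newRatio) / 1048576.0,
                oldBytes(maxHeap, newRatio) / 1048576.0);
    }
}
```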
GC behavior under this configuration:
jstat -gcutil -h5 2343 4s 100
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
 79.82   0.00  75.34  78.55  99.44   7646 1221.668   398 2052.993 3274.661
  0.00  79.52   0.62  78.63  99.44   7647 1221.782   398 2052.993 3274.775   a YG GC (Minor GC) happened here and took about 0.11s
  0.00  79.52  28.95  78.63  99.44   7647 1221.782   398 2052.993 3274.775
  0.00  79.52  46.34  78.63  99.44   7647 1221.782   398 2052.993 3274.775
 
It can also be seen that a total of 398 Major GCs (the FGC column) have been performed, taking 2052.993 seconds in total. The average time of each Major GC is therefore 2052.993/398=5.16 seconds.
This is a very serious problem: a pause of more than 5 seconds is unacceptable to anyone :)
Likewise, Minor GC ran 7647 times, taking 1221.782 seconds in total, an average of 0.16 seconds each, which is acceptable.
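The same count and total-time figures can be read from inside a running application through the standard GarbageCollectorMXBean interface, and the averages computed the same way. This is a sketch with invented names (GcAverages, averageSeconds, report); the collector names it prints depend on the JVM and the GC in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: per-collector counts, total time and average pause.
class GcAverages {
    /** Average time per collection in seconds, or 0 when nothing has been collected. */
    static double averageSeconds(long count, double totalSeconds) {
        return count > 0 ? totalSeconds / count : 0.0;
    }

    /** One line per collector of the current JVM. */
    static List<String> report() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();              // -1 if undefined
            double seconds = gc.getCollectionTime() / 1000.0;  // reported in milliseconds
            lines.add(String.format("%s: count=%d total=%.3fs avg=%.3fs",
                    gc.getName(), count, seconds, averageSeconds(count, seconds)));
        }
        return lines;
    }

    public static void main(String[] args) {
        // The figures quoted in the text: FGC/FGCT and YGC/YGCT.
        System.out.printf("Major GC avg: %.2fs%n", averageSeconds(398, 2052.993));
        System.out.printf("Minor GC avg: %.2fs%n", averageSeconds(7647, 1221.782));
        for (String line : report()) {
            System.out.println(line);
        }
    }
}
```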
 

Now let's look at the result after changing the configuration:
jmap -heap 14103
Attaching to process ID 14103, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 11.0-b16
 
using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC
 
Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 4294967296 (4096.0MB)
   NewSize          = 536870912 (512.0MB)
   MaxNewSize       = 536870912 (512.0MB)
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 4                  (YG:OG size ratio is 1:4)
   SurvivorRatio    = 8
   PermSize         = 268435456 (256.0MB)
   MaxPermSize      = 268435456 (256.0MB)
 
Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 483196928 (460.8125MB)
   used     = 428284392 (408.4438247680664MB)
   free     = 54912536 (52.368675231933594MB)
   88.63557841162434% used
Eden Space:
   capacity = 429522944 (409.625MB)
   used     = 404788608 (386.0364990234375MB)
   free     = 24734336 (23.5885009765625MB)
   94.24144010337199% used
From Space:
   capacity = 53673984 (51.1875MB)
   used     = 23495784 (22.407325744628906MB)
   free     = 30178200 (28.780174255371094MB)
   43.77499534970238% used
To Space:
   capacity = 53673984 (51.1875MB)
   used = 0 (0.0MB)
   free = 53673984 (51.1875MB)
   0.0% used
~~~~~~~~~~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~YYG size and usage status
concurrent mark-sweep generation:
   capacity = 3758096384 (3584.0MB)
   used = 1680041600 (1602.2125244140625MB)
   free = 2078054784 (1981.7874755859375MB)
   44.70459052494594% used
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~OG Size and usage status
Perm Generation:
   capacity = 268435456 (256.0MB)
   used = 128012184 (122.0819320678711MB)
   free = 140423272 (133.9180679321289MB)
   47.688254714012146% used
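The Eden and survivor capacities in this output follow roughly from NewSize and SurvivorRatio = 8: SurvivorRatio is the ratio of Eden to one survivor space, and the young generation is Eden plus two survivor spaces. A sketch of that arithmetic, with hypothetical helper names and without the alignment the JVM applies (which is why jmap shows 51.1875MB and 409.625MB rather than the exact quotients):

```java
// Hypothetical helper: rough Eden and survivor sizes from the young generation size.
class SurvivorRatioSketch {
    /** Size of one survivor space when eden:survivor = survivorRatio:1. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    /** Eden is what remains after the two survivor spaces are taken out. */
    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long young = 536870912L; // NewSize = MaxNewSize from the jmap output
        int ratio = 8;           // SurvivorRatio from the jmap output
        System.out.printf("eden ~ %.2fMB, each survivor ~ %.2fMB%n",
                edenBytes(young, ratio) / 1048576.0,
                survivorBytes(young, ratio) / 1048576.0);
    }
}
```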
 
GC behavior under this configuration:
jstat -gcutil -h5 14103 4s 100
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT
 47.49   0.00  64.82  46.08  47.69  20822 2058.631    68   22.734 2081.365
  0.00  37.91  38.57  46.13  47.69  20823 2058.691    68   22.734 2081.425   a YG GC (Minor GC) happened here and took 0.06s
 46.69   0.00  15.19  46.18  47.69  20824 2058.776    68   22.734 2081.510
 46.69   0.00  74.59  46.18  47.69  20824 2058.776    68   22.734 2081.510
  0.00  40.29  19.95  46.24  47.69  20825 2058.848    68   22.734 2081.582
 
Major GC average time: 22.734/68 = 0.334 seconds (it was more than 5 seconds before the change)

Minor GC average time: 2058.691/20823 = 0.099 seconds (slightly less than the 0.16 seconds before the change)
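These averages can be computed directly from a line of jstat output. The sketch below assumes the ten-column layout used by this JVM (S0 S1 E O P YGC YGCT FGC FGCT GCT); newer JDKs print different columns, so the field positions would need adjusting there. The class JstatRow is invented for the illustration.

```java
// Hypothetical parser for one data row of `jstat -gcutil` in the 10-column layout.
class JstatRow {
    final long ygc;    // number of young GCs
    final double ygct; // total young GC time, seconds
    final long fgc;    // number of full GCs
    final double fgct; // total full GC time, seconds

    private JstatRow(long ygc, double ygct, long fgc, double fgct) {
        this.ygc = ygc;
        this.ygct = ygct;
        this.fgc = fgc;
        this.fgct = fgct;
    }

    /** Parses a row whose columns are S0 S1 E O P YGC YGCT FGC FGCT GCT. */
    static JstatRow parse(String line) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 10) {
            throw new IllegalArgumentException("expected 10 columns, got " + f.length);
        }
        return new JstatRow(Long.parseLong(f[5]), Double.parseDouble(f[6]),
                Long.parseLong(f[7]), Double.parseDouble(f[8]));
    }

    double averageYoungSeconds() {
        return ygc > 0 ? ygct / ygc : 0.0;
    }

    double averageFullSeconds() {
        return fgc > 0 ? fgct / fgc : 0.0;
    }

    public static void main(String[] args) {
        JstatRow row = parse("0.00 37.91 38.57 46.13 47.69 20823 2058.691 68 22.734 2081.425");
        System.out.printf("Minor GC avg: %.3fs, Major GC avg: %.3fs%n",
                row.averageYoungSeconds(), row.averageFullSeconds());
    }
}
```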

Source: https://blog.csdn.net/jiafu1115/article/details/7024323
