JVM garbage collector (3)

Knowledge points of garbage collection

Reference counting

Add a reference counter to the object. Each time a reference to the object is created somewhere, the counter is incremented by 1; each time a reference becomes invalid, the counter is decremented by 1. When the counter reaches 0, the object is no longer in use and can be reclaimed.
However, if two objects are no longer used but still refer to each other, neither reference count ever drops to 0, so the GC cannot reclaim them.

Advantages: simple implementation and high judgment efficiency.
Disadvantages: It is difficult to handle circular references between objects.
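The circular reference problem can be illustrated with a small toy model. This is only a sketch: the Counted class and the link/release helpers are hypothetical names made up for the illustration, and HotSpot does not manage objects this way.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting. HotSpot does not manage objects this way.
class RefCountDemo {
    // Hypothetical object that tracks how many references point at it.
    static final class Counted {
        int refCount = 0;
        final List<Counted> fields = new ArrayList<>();
    }

    // Record that 'from' now holds a reference to 'to'.
    static void link(Counted from, Counted to) {
        from.fields.add(to);
        to.refCount++;
    }

    // Drop one reference to 'target'; at zero, release everything it refers to.
    static void release(Counted target) {
        target.refCount--;
        if (target.refCount == 0) {
            List<Counted> children = new ArrayList<>(target.fields);
            target.fields.clear();
            for (Counted child : children) {
                release(child);
            }
        }
    }

    // a -> b only: both counts reach 0 once the external references are dropped.
    static int[] chainCounts() {
        Counted a = new Counted();
        Counted b = new Counted();
        a.refCount++;   // external reference to a (for example a local variable)
        b.refCount++;   // external reference to b
        link(a, b);
        release(b);     // b: 2 -> 1
        release(a);     // a: 1 -> 0, which then releases b: 1 -> 0
        return new int[] { a.refCount, b.refCount };
    }

    // a <-> b: each object keeps the other's count at 1 forever.
    static int[] cycleCounts() {
        Counted a = new Counted();
        Counted b = new Counted();
        a.refCount++;   // external reference to a
        b.refCount++;   // external reference to b
        link(a, b);
        link(b, a);
        release(a);     // a: 2 -> 1
        release(b);     // b: 2 -> 1
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] chain = chainCounts();
        int[] cycle = cycleCounts();
        System.out.println("chain: a=" + chain[0] + ", b=" + chain[1]); // chain: a=0, b=0
        System.out.println("cycle: a=" + cycle[0] + ", b=" + cycle[1]); // cycle: a=1, b=1
    }
}
```

In the cycle case nothing outside the pair refers to a or b any more, yet both counters stay at 1, so a pure reference-counting collector would never free them.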

Reachability analysis

When the JVM reclaims objects, it needs to determine whether each object is still in use. This can be determined through GC Roots tracing: starting from the GC Roots objects and following references, any object that cannot be reached is considered unused.

GC Roots objects include the following:

  • Objects referenced by local variables in the stack frames of the virtual machine stack
  • Objects referenced by static fields in the method area
  • Objects referenced by constants in the method area
  • Objects referenced through JNI in the native method stack

Advantages: more precise and rigorous; it can handle circular data structures whose objects reference each other.
Disadvantages: more complicated to implement, and a lot of data has to be analyzed, which takes time. The analysis must run while the reference relationships cannot change, so all Java execution threads are paused (called "Stop The World", a central concern of garbage collection).
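A minimal sketch of the marking idea, assuming a hypothetical Node class as the object model (a real collector walks actual heap objects, not a hand-built graph). It shows that an unreachable cycle is simply never marked:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy reachability analysis over a hand-built object graph.
class ReachabilityDemo {
    // Hypothetical object with outgoing references.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Marks every node reachable from the roots (iterative depth-first traversal).
    static Set<Node> mark(List<Node> roots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<Node, Boolean>());
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node current = pending.pop();
            if (marked.add(current)) {          // first visit only
                for (Node next : current.refs) {
                    pending.push(next);
                }
            }
        }
        return marked;
    }

    // Returns {live marked?, x marked?, y marked?} for a root -> live graph plus an x <-> y cycle.
    static boolean[] demo() {
        Node root = new Node("root");
        Node live = new Node("live");
        Node x = new Node("x");
        Node y = new Node("y");
        root.refs.add(live);
        x.refs.add(y);   // x and y reference each other
        y.refs.add(x);   // but nothing reachable from the root refers to them
        Set<Node> marked = mark(Collections.singletonList(root));
        return new boolean[] { marked.contains(live), marked.contains(x), marked.contains(y) };
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println("live=" + result[0] + ", x=" + result[1] + ", y=" + result[2]);
        // live=true, x=false, y=false
    }
}
```

The graph must not change while mark runs, which is the toy equivalent of the Stop The World requirement described above.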

4 kinds of reference objects

Reference type, usage and description:

  • Strong reference: Object obj = new Object(). As long as a strong reference to the object exists, the JVM would rather throw OutOfMemoryError than reclaim it, even when memory is insufficient.
  • Soft reference: SoftReference softRef = new SoftReference(object). Describes useful but non-essential objects; softly referenced objects are reclaimed when memory is about to overflow.
  • Weak reference: WeakReference weakRef = new WeakReference(object). Describes non-essential objects; weakly referenced objects survive only until the next garbage collection and are reclaimed then, regardless of whether memory is sufficient.
  • Phantom (virtual) reference: PhantomReference phantomRef = new PhantomReference(object, refQueue), where refQueue is a ReferenceQueue. Its only purpose is to receive a system notification when the referenced object is reclaimed by the GC.
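The four reference kinds map onto classes in java.lang.ref. The following is a small sketch using only standard JDK calls; the class and method names of the demo itself are made up for illustration. What a weak or soft reference returns after a collection depends on the JVM and on memory pressure, so no output is claimed for those lines.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

class ReferenceKindsDemo {
    // A weak reference still returns its referent while a strong reference exists.
    static boolean weakSeesStronglyHeldObject() {
        Object strong = new Object();                       // strong reference
        WeakReference<Object> weak = new WeakReference<>(strong);
        return weak.get() == strong;
    }

    // PhantomReference.get() always returns null; the queue is the only signal.
    static Object phantomGet() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(new Object(), queue);
        return phantom.get();
    }

    public static void main(String[] args) {
        SoftReference<byte[]> cache = new SoftReference<>(new byte[1024]);
        System.out.println("soft referent present: " + (cache.get() != null));
        System.out.println("weak sees strongly held object: " + weakSeesStronglyHeldObject());
        System.out.println("phantom get: " + phantomGet());

        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc(); // only a request; the JVM is free to ignore it
        System.out.println("weak referent after gc request: " + weak.get());
    }
}
```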

Garbage collection algorithm

Mark-sweep algorithm

It is divided into two steps, marking and sweeping: first mark all the objects to be reclaimed, then reclaim all the marked objects once marking is complete.

Disadvantages: first, neither marking nor sweeping is very efficient; second, sweeping leaves a lot of memory fragmentation.

Mark-compact algorithm

The marking step of the mark-compact algorithm is the same as in the mark-sweep algorithm, but the next step does not reclaim the dead objects in place. Instead, all surviving objects are moved toward one end, and the memory beyond the end boundary is then cleared directly.
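The compaction step can be sketched on a toy heap. The assumption here is that the heap is an array of slots (null means free) and that marking has already produced a boolean per slot; a real collector also has to update every reference to a moved object, which this sketch leaves out.

```java
import java.util.Arrays;

// Toy compaction phase of mark-compact on an array of slots.
class MarkCompactDemo {
    // Slides marked objects to the low end of the heap, clears the rest,
    // and returns the index of the first free slot (the end boundary).
    static int compact(String[] heap, boolean[] marked) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && marked[i]) {
                heap[free] = heap[i];   // free <= i, so nothing live is overwritten
                free++;
            }
        }
        Arrays.fill(heap, free, heap.length, null); // clear memory beyond the boundary
        return free;
    }

    public static void main(String[] args) {
        String[] heap = { "A", "B", null, "C", "D" };
        boolean[] marked = { true, false, false, true, false };
        int boundary = compact(heap, marked);
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
        // [A, C, null, null, null] boundary=2
    }
}
```

After compaction the free space is one contiguous range, which is why this algorithm avoids the fragmentation left by mark-sweep.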

Copying algorithm

Divide the memory into two blocks of equal capacity and use only one of them at a time. When one block is used up, copy the surviving objects to the other block, and then clean up the previously used block in one pass.

Disadvantages: usable memory is reduced to half of the original.

Generational collection algorithm

Current commercial virtual machines all use generational collection, which divides memory into several areas according to the different life cycles of objects. Generally the Java heap is divided into the new generation and the old generation, so that the most suitable collection algorithm can be chosen according to the characteristics of each generation.

In the new generation most objects die young and only a few survive, so the copying algorithm is preferred: the collection is finished by copying just the small number of surviving objects.
In the old generation the survival rate of objects is high and there is no extra space to guarantee allocation, so the mark-sweep or mark-compact algorithm is used for reclamation.


JDK garbage collector

The default garbage collector

  • JDK 1.7: Parallel Scavenge + Parallel Old
  • JDK 1.8: Parallel Scavenge + Parallel Old (G1 can also be used)
  • JDK 9: G1
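To check which collectors a running JVM actually uses, the standard java.lang.management API can list them. This is a small sketch; the demo class name is made up, and the reported names depend on the JVM and its flags (on HotSpot, for example, the Parallel collectors typically appear as "PS Scavenge" and "PS MarkSweep", and G1 as "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Names of the garbage collectors registered in this JVM.
    static List<String> names() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : names()) {
            System.out.println(name);
        }
    }
}
```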

New generation collectors: Serial, ParNew, Parallel Scavenge

Old generation collectors: Serial Old, Parallel Old, CMS

Collector covering both generations: G1

Collectors connected by a line in the pairing diagram can work together. The upper row shows the new generation collectors and the lower row shows the old generation collectors.

Serial

A single-threaded collector. "Single-threaded" does not only mean that it uses one CPU or one collection thread to do the garbage collection work; more importantly, it means that while it collects, all other worker threads must stop until the collection is complete.

Features: single-threaded, simple and efficient (compared with the single-threaded mode of other collectors). In a single-CPU environment the Serial collector has no thread-interaction overhead, so by concentrating on garbage collection it naturally achieves the highest single-threaded collection efficiency. While it collects, it must suspend all other worker threads until it finishes (Stop The World).

Application scenario: virtual machines running in Client mode.

ParNew

ParNew is the multi-threaded version of the Serial collector. Apart from using multiple threads for collection, its behavior is exactly the same as the Serial collector (control parameters, collection algorithm, Stop The World, object allocation rules, reclamation strategy, and so on). It is the preferred new generation collector in Server mode.

Features: multi-threaded. By default ParNew starts as many collection threads as there are CPUs; in an environment with many CPUs, the -XX:ParallelGCThreads parameter can limit the number of garbage collection threads. It has the same Stop The World problem as the Serial collector.

Application scenario: ParNew is the preferred new generation collector for many virtual machines running in Server mode, because apart from the Serial collector it is the only one that can work with the CMS collector.

Parallel Scavenge

Parallel Scavenge is also a collector that uses the copying algorithm. It differs from other collectors in that its goal is to achieve a controllable throughput.

Features: a new generation collector that uses the copying algorithm and collects in parallel with multiple threads (similar to the ParNew collector).

Throughput calculation formula: time running user code / (time running user code + garbage collection time)
The shorter the garbage collection pause, the better suited the collector is to programs that interact with the user, because a good response time improves the user experience. High throughput, in contrast, uses CPU time efficiently to finish the program's computation as soon as possible, which mainly suits background computation that needs little interaction.

Parallel Scavenge provides two parameters for precise control of throughput, plus a switch for adaptive tuning:

  • -XX:MaxGCPauseMillis: maximum garbage collection pause time (a number of milliseconds greater than 0)
  • -XX:GCTimeRatio: throughput setting (an integer greater than 0 and less than 100). The value N is the ratio of user code time to garbage collection time, so garbage collection is allowed at most 1 / (1 + N) of the total time
  • -XX:+UseAdaptiveSizePolicy: delegates memory tuning to the virtual machine. When this switch is on, there is no need to manually specify detailed parameters such as the size of the new generation, the ratio of Eden to Survivor, or the threshold for promoting objects to the old generation. The virtual machine collects performance monitoring information from the running system and adjusts these parameters dynamically to provide the most suitable pause time or the maximum throughput
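The throughput formula and the meaning of -XX:GCTimeRatio can be checked with a little arithmetic. The sketch below only restates the two formulas; the class and method names are made up for the illustration.

```java
class ThroughputDemo {
    // throughput = user code time / (user code time + garbage collection time)
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    // With -XX:GCTimeRatio=N the target share of time spent in GC is 1 / (1 + N).
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 99 ms of user code and 1 ms of GC give a throughput of 99%.
        System.out.println(throughput(99, 1));
        // GCTimeRatio=19 allows GC about 5% of the time, GCTimeRatio=99 about 1%.
        System.out.println(maxGcFraction(19));
        System.out.println(maxGcFraction(99));
    }
}
```

A larger GCTimeRatio therefore asks for higher throughput, usually at the cost of a larger heap or longer individual pauses.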

Serial Old

Serial Old is the old generation version of the Serial collector. It is also a single-threaded collector and uses the mark-compact algorithm. In JDK 1.5 and earlier it was used together with the Parallel Scavenge collector; it also serves as the fallback for the CMS collector when a Concurrent Mode Failure occurs during concurrent collection.

Features: single-threaded, using the mark-compact algorithm.

Application scenario: mainly used by virtual machines in Client mode; it can also be used in Server mode.

Parallel Old

Parallel Old is the old generation version of the Parallel Scavenge collector. It uses multiple threads and the mark-compact algorithm, and has only been available since JDK 1.6.

Features: multi-threaded, using the mark-compact algorithm.

Application scenario: where high throughput matters and CPU resources are sensitive, give priority to the Parallel Scavenge + Parallel Old combination.

Concurrent Mark Sweep

The CMS (Concurrent Mark Sweep) collector aims at the shortest possible collection pause time.

Characteristics of the CMS collector: based on the mark-sweep algorithm, with concurrent collection and low pauses.

Application scenario: services where response speed matters and the system pause time should be as short as possible to give users a better experience, such as web applications and B/S (browser/server) services.
The whole process is divided into 4 steps:

  1. Initial mark (CMS initial mark) marks the objects that GC Roots can directly reach. It is very fast but still requires Stop The World.
  2. Concurrent mark (CMS concurrent mark) performs GC Roots tracing to find the surviving objects; user threads can run concurrently.
  3. Remark (CMS remark) corrects the mark records of objects whose marks changed because the user program kept running during concurrent marking. This pause is usually somewhat longer than the initial mark but much shorter than the concurrent mark.
  4. Concurrent sweep (CMS concurrent sweep) cleans up garbage objects. In this stage the collector threads run concurrently with the user threads.

Disadvantages

  • It uses the mark-sweep algorithm, so memory fragmentation occurs. As a result, space may not be found for large objects and a Full GC has to be triggered early.
  • It is very sensitive to CPU resources: the fewer CPUs there are, the greater the impact on the program. The default number of collection threads is (number of CPU cores + 3) / 4.
  • It cannot handle floating garbage. During concurrent sweeping the user threads are still running, so memory must be reserved for them. If the reserved space cannot satisfy the user threads, a Concurrent Mode Failure is reported and Serial Old is used for the collection instead.

G1

The G1 algorithm divides the heap into a number of regions. It is still a generational collector: some of these regions make up the new generation, and a new generation collection still pauses all application threads and copies the surviving objects to the old generation or to Survivor regions. The old generation is also divided into many regions. G1 completes its cleanup by copying objects from one region to another, which means that G1 compacts the heap (at least part of it) during normal processing, so it does not suffer from the memory fragmentation problem of CMS. Applicable scenarios: server-side applications.

G1 features

Parallelism and concurrency: G1 can make full use of the hardware in multi-CPU, multi-core environments, using multiple CPUs to shorten Stop The World pauses. Some GC actions that require other collectors to pause Java threads can be performed by G1 concurrently, while the Java program keeps running.
Generational collection: G1 manages the entire Java heap by itself, and handles newly created objects differently from old objects that have survived for some time and through multiple GCs, in order to get better collection results.
Space integration: G1 does not produce memory fragmentation during operation, and provides regular, usable memory after a collection.
Predictable pauses: besides pursuing low pauses, G1 can build a predictable pause time model, which lets the user specify that within any time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection.

Operation steps of G1

Initial mark: only marks the objects that GC Roots can directly reach, and modifies the value of TAMS (Next Top at Mark Start) so that in the next stage the user program, running concurrently, creates new objects in the correct available Regions. (Requires pausing user threads, but takes very little time.)

Concurrent marking: starting from GC Roots, performs reachability analysis on the objects in the heap to find the surviving objects. (Takes a long time, but can run concurrently with the user program.)

Final mark: corrects the part of the mark records that changed because the user program kept running during concurrent marking. Object changes are recorded in each thread's Remembered Set Logs, and the data in the Remembered Set Logs is merged into the Remembered Set. (Requires pausing user threads, but the work itself runs in parallel.)

Screening and reclamation: sorts the Regions by reclamation value and cost, and makes a collection plan according to the GC pause time expected by the user. (This stage could in principle run concurrently with the user program.)

Questions about G1

Why can G1 build a predictable pause time model?

Because it avoids, by design, collecting the entire Java heap at once. G1 tracks how much garbage has accumulated in each Region, maintains a priority list in the background, and each time, within the allowed collection time, first reclaims the Regions with the highest value. This ensures the highest possible collection efficiency within a limited time.
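The priority idea can be sketched as a greedy selection. This is a deliberately simplified model, not G1's real policy: the Region class, its garbage and cost numbers, and the "garbage per millisecond" ranking are all assumptions made for the illustration, whereas real G1 relies on measured statistics and prediction models.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy version of "collect the most valuable regions within the pause budget".
class RegionSelectionDemo {
    // Hypothetical per-region statistics.
    static final class Region {
        final String id;
        final long garbageBytes;   // how much would be freed
        final double costMillis;   // estimated time to collect the region
        Region(String id, long garbageBytes, double costMillis) {
            this.id = id;
            this.garbageBytes = garbageBytes;
            this.costMillis = costMillis;
        }
        double efficiency() { return garbageBytes / costMillis; }
    }

    // Picks regions in order of decreasing efficiency while the budget allows.
    static List<String> choose(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        Comparator<Region> byEfficiency = Comparator.comparingDouble(Region::efficiency);
        sorted.sort(byEfficiency.reversed());
        List<String> chosen = new ArrayList<>();
        double spent = 0;
        for (Region region : sorted) {
            if (spent + region.costMillis <= pauseBudgetMillis) {
                chosen.add(region.id);
                spent += region.costMillis;
            }
        }
        return chosen;
    }

    static List<String> demo() {
        List<Region> regions = Arrays.asList(
                new Region("r1", 800, 4.0),   // efficiency 200
                new Region("r2", 900, 9.0),   // efficiency 100
                new Region("r3", 300, 1.0),   // efficiency 300
                new Region("r4", 100, 5.0));  // efficiency 20
        return choose(regions, 8.0);          // pause budget of 8 ms
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [r3, r1]
    }
}
```

With an 8 ms budget the sketch collects r3 and r1 (5 ms in total) and leaves r2 and r4 for a later cycle, which is the sense in which the pause time stays under the user's control.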

The difference between G1 and other collectors:

The working scope of other collectors is the entire new generation or old generation, and the working scope of the G1 collector is the entire Java heap. When using the G1 collector, it divides the entire Java heap into multiple independent regions of equal size (Region). Although the concepts of the new generation and the old generation are also retained, the new generation and the old generation are no longer isolated from each other. They are all part of a collection of Regions (which do not need to be continuous).

Problems with G1 collector:

Regions cannot be fully isolated from each other. An object allocated in one Region can have a reference relationship with any other object in the Java heap, so when reachability analysis decides whether an object is alive, the entire Java heap would have to be scanned to be accurate. Other collectors have this problem too (it is just more prominent in G1), and it lowers the efficiency of Minor GC.

How does the G1 collector solve the above problems?

Use Remembered Sets to avoid full-heap scans. Each Region in G1 has a corresponding Remembered Set. When the virtual machine detects that the program writes to a field of a Reference type, it triggers a write barrier that temporarily interrupts the write and checks whether the object being referenced lives in a different Region (in generational terms, whether an old generation object references a new generation object). If so, the reference information is recorded, via the CardTable, in the Remembered Set of the Region that owns the referenced object. When memory is reclaimed, adding the Remembered Set to the enumeration range of the GC roots guarantees that nothing is missed without scanning the whole heap.

G1 provides 2 modes

Young GC

  1. Root scan. Static and local objects are scanned
  2. Update RS (Remembered Set). Find references from the old generation to the young generation and update the RS
  3. Process RS. Detect references from the young generation to the old generation
  4. Object copy. Copy surviving objects to the Survivor / old area
  5. Process the reference queues. Soft, weak and phantom references are handled

Mixed GC

  1. Initial marking. This stage only marks the objects that GC Roots can directly reach.
  2. Concurrent marking. This stage starts from the GC Roots and analyzes the reachability of objects in the heap to find the surviving objects. It takes a long time, but can run concurrently with the user program.
  3. Final marking. This stage corrects the part of the mark records that changed during concurrent marking because the user program kept running.
  4. Screening and reclamation. This stage first sorts the Regions by reclamation value and cost, then makes a collection plan according to the GC pause time expected by the user. Because only some of the Regions are reclaimed, the time is under the user's control, and pausing the user threads greatly improves collection efficiency.

Garbage collection mechanism

Objects are preferentially allocated to the Eden area

In most cases, an object is first allocated in the Eden area. When the Eden area does not have enough space, a Minor GC is triggered.

Large objects are allocated directly to the old generation

Large objects are Java objects that need a lot of contiguous memory, such as very long strings and large arrays. Large objects can cause a GC to be triggered early to obtain enough contiguous space even though plenty of memory is still free, so avoid creating them in development where possible.

Long-lived objects will enter the old generation

The virtual machine defines an age counter for each object. If an object is born in the Eden area and survives its first Minor GC, it is moved to the Survivor space and its age increases by 1. When the age reaches a certain threshold (15 by default), the object is promoted to the old generation.

Dynamic age judgment

The virtual machine does not always require an object's age to reach MaxTenuringThreshold before promoting it to the old generation. The dynamic age judgment works as follows:

  • If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space,
  • then objects of that age or older can enter the old generation directly, without waiting until the age required by MaxTenuringThreshold.

Example: Survivor: 10M, MaxTenuringThreshold: 10. The Survivor space currently holds one object A of age 1 (1M), three objects B of age 2 (6M in total), and one object C of age 3 (1M). B and C can enter the old generation directly, without waiting until age 10.

Note: dynamic age judgment is related to TargetSurvivorRatio. Only when the cumulative size of objects in the Survivor area, summed from the youngest age upward, exceeds Survivor * TargetSurvivorRatio (default 50%) are objects of that age and above promoted to the old generation. This keeps copying from the Eden area to the Survivor area efficient.
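The cumulative rule from the note can be written as a short calculation. This is a sketch of the rule as described here, not the JVM's source code; the method name and the array layout (index = age, index 0 unused) are assumptions for the illustration.

```java
class DynamicAgeDemo {
    // sizeByAge[age] is the total size of surviving objects of that age (index 0 is unused).
    // Returns the age from which objects are promoted to the old generation.
    static int tenuringThreshold(long[] sizeByAge, long survivorCapacity,
                                 int targetSurvivorRatio, int maxTenuringThreshold) {
        long desired = survivorCapacity * targetSurvivorRatio / 100;
        long total = 0;
        for (int age = 1; age < sizeByAge.length; age++) {
            total += sizeByAge[age];            // accumulate from the youngest age upward
            if (total > desired) {
                return Math.min(age, maxTenuringThreshold);
            }
        }
        return maxTenuringThreshold;            // limit never exceeded: keep the configured age
    }

    public static void main(String[] args) {
        // The example above, in MB: age 1 -> 1M, age 2 -> 6M, age 3 -> 1M, Survivor 10M.
        long[] sizeByAge = { 0, 1, 6, 1 };
        System.out.println(tenuringThreshold(sizeByAge, 10, 50, 10)); // 2
    }
}
```

The cumulative size reaches 7M at age 2, which exceeds 10M * 50% = 5M, so objects of age 2 and above (B and C) are promoted, matching the example.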

Space guarantee

The purpose is to minimize the frequency of Full GC.

Note: -XX:-HandlePromotionFailure no longer has any effect after JDK 6 Update 24. As long as the largest contiguous free space in the old generation is greater than either the total size of the new generation objects or the average size of the objects promoted in previous collections, a Minor GC is performed; otherwise a Full GC is performed.

About GC

Minor GC

Minor GC cleans up the new generation space. Minor GCs are very frequent and very fast, because most objects die young.
Trigger condition: the new generation cannot allocate space for a new object, for example when the Eden area is full

Major GC

Major GC mainly cleans up the old generation and is much slower than Minor GC.
Trigger condition: the old generation is full

Full GC

Full GC cleans up the new generation and the old generation (and the method area). A Full GC does not necessarily perform a Minor GC first, but the JVM can be configured to run a Minor GC before it: many references exist between old generation and new generation objects, so collecting the new generation first can speed up the old generation collection. For example, when the old generation uses CMS, setting -XX:+CMSScavengeBeforeRemark makes a Minor GC run before the CMS remark.

Full GC trigger conditions:

  1. System.gc() is called explicitly
  2. Insufficient space in the old generation
  3. Insufficient space in the method area
  4. After a Minor GC, the surviving objects are larger than the available space in the old generation. For example, the objects that survive in Eden + From are copied to To; if they exceed the size of the To area they are moved to the old generation, and if the old generation has less free space than those objects need, a Full GC is triggered.

Some setting parameters of JVM

Heap area settings

  1. -Xmx: maximum heap size
  2. -Xms: initial heap size
  3. -XX:NewSize: sets the minimum size of the new generation
  4. -XX:MaxNewSize: sets the maximum size of the new generation
  5. -Xmn: new generation size; the old generation size is the heap size minus the new generation size. Supported since JDK 1.4 and equivalent to setting NewSize = MaxNewSize = Xmn
  6. -XX:NewRatio: sets the ratio of the new generation to the old generation. If it is 3, the ratio of the new generation to the old generation is 1:3.
  7. -XX:SurvivorRatio: sets the ratio between the Eden area and the two Survivor areas in the new generation. If it is 3, the ratio of Eden to the two Survivor areas is 3:1:1
  8. -XX:MaxTenuringThreshold: the number of collections an object must survive in the new generation before it is promoted to the old generation; if it is 0, objects go directly to the old generation
  9. -XX:PretenureSizeThreshold=XX: the size threshold above which an object is allocated directly in the old generation. This parameter is only valid for the Serial and ParNew collectors, not for Parallel Scavenge, which generally does not need it. If the parameter is required, consider the ParNew + CMS combination
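The two ratio parameters are easy to misread, so here is the arithmetic as a sketch. It assumes plain integer division and ignores the alignment and adaptive resizing a real JVM applies, so actual sizes will differ slightly; the class and method names are made up for the illustration.

```java
import java.util.Arrays;

class HeapLayoutDemo {
    // Returns {young, old, eden, eachSurvivor} in the same unit as heapSize.
    static long[] layout(long heapSize, int newRatio, int survivorRatio) {
        long young = heapSize / (newRatio + 1);        // new : old = 1 : newRatio
        long old = heapSize - young;
        long survivor = young / (survivorRatio + 2);   // eden : s0 : s1 = survivorRatio : 1 : 1
        long eden = young - 2 * survivor;
        return new long[] { young, old, eden, survivor };
    }

    public static void main(String[] args) {
        // A 4000 MB heap with -XX:NewRatio=3 and -XX:SurvivorRatio=3
        System.out.println(Arrays.toString(layout(4000, 3, 3)));
        // [1000, 3000, 600, 200]
    }
}
```

With NewRatio=3 the new generation gets a quarter of the heap, and with SurvivorRatio=3 its 1000 MB split into 600 MB of Eden plus two 200 MB Survivor areas, the 3:1:1 ratio described in the list.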

Non-heap area settings

  1. -XX:PermSize, -XX:MaxPermSize: set the size of the method area (permanent generation) before 1.8
  2. -XX:MetaspaceSize, -XX:MaxMetaspaceSize: set the initial and maximum size of the metaspace in 1.8 and later
  3. -Xss: sets the stack size of each thread

Collector settings

  1. -XX:+UseSerialGC: use the Serial collector; the corresponding old generation collector is Serial Old. Applicable scenarios: user desktop applications
  2. -XX:+UseParNewGC: use ParNew, the multi-threaded version of the Serial collector. Applicable scenarios: the preferred new generation collector in Server mode
  3. -XX:+UseParallelGC: use the Parallel Scavenge collector. Applicable scenarios: background computation that needs little interaction
  4. -XX:+UseParallelOldGC: use the Parallel Old collector for the old generation. Applicable scenarios: throughput-focused, CPU-sensitive workloads
  5. -XX:+UseConcMarkSweepGC: use the CMS collector for the old generation. Applicable scenarios: Internet sites or web servers

Garbage collection statistics

  1. -XX:+PrintGC
  2. -XX:+PrintGCDetails
  3. -XX:+PrintGCTimeStamps
  4. -Xloggc:filename
  5. -XX:+PrintTenuringDistribution: prints the age distribution of objects in the Survivor area

Remarks: This article refers to the courseware of Cloud Analysis College and to https://www.cnblogs.com/chenpt/p/9803298.html


Origin www.cnblogs.com/burg-xun/p/12711998.html