"In-Depth Understanding of the Java Virtual Machine" reading notes: garbage collectors

Garbage collector

  • Overview
    • Collectors connected by a line in the collector diagram can be used in combination

    • Parallel Scavenge and G1 do not use the traditional GC collector code framework; the other collectors share part of the framework code
  • Concurrency and parallelism
    • Parallel: multiple garbage collection threads work at the same time, while the user threads remain in a waiting state.
    • Concurrent: user threads and garbage collection threads execute during the same period (not necessarily in parallel; they may alternate). The user program keeps running while the garbage collector runs on another CPU.
  • Serial collector
    • The most basic and oldest collector; before JDK 1.3.1 it was the only option for new generation collection
    • Single-threaded collector: only one collection thread performs garbage collection, and all other worker threads must be suspended while it runs.
    • The default new generation collector in Client mode; simple and efficient, well suited to single-CPU environments.
    • Schematic diagram of the serial collector's workflow

  • ParNew collector
    • Multi-threaded version of Serial collector
      • Difference: multiple threads perform garbage collection
      • Same as Serial
        • Control parameters: -XX:SurvivorRatio, -XX:PretenureSizeThreshold, -XX:HandlePromotionFailure, etc.
        • Collection algorithm
        • STW
        • Object allocation rules
        • Recycling strategy
    • The preferred new generation collector in Server mode; it can work together with the CMS collector.
    • It is the default new generation collector once -XX:+UseConcMarkSweepGC is set, and can also be forced with -XX:+UseParNewGC
    • On a single CPU it is not necessarily better than Serial, but as the number of CPUs grows it makes more effective use of system resources during GC.
      • By default the number of collection threads equals the number of CPUs
      • -XX:ParallelGCThreads limits the number of garbage collection threads
    • Workflow diagram of ParNew collector

  • Parallel Scavenge collector
    • A new generation collector that uses the copying algorithm with parallel multi-threading; known as the "throughput first" collector
    • Its goal is a controllable throughput (Throughput = time running user code / (time running user code + garbage collection time))
    • It uses CPU time efficiently to finish the program's computation as soon as possible, so it mainly suits background computation that needs little interaction.
    • Main parameters
      • -XX:MaxGCPauseMillis
        • Controls the maximum garbage collection pause time
        • A number of milliseconds greater than 0; shorter GC pauses are bought at the expense of throughput and new generation space
      • -XX:GCTimeRatio
        • Sets the throughput target; the default is 99
        • An integer greater than 0 and less than 100 that determines the share of total time spent in garbage collection, in effect the inverse of throughput
        • If set to N, the maximum share of time allowed for GC is 1 / (1 + N); the default of 99 allows 1%
      • -XX:+UseAdaptiveSizePolicy
        • GC adaptive sizing policy (GC Ergonomics)
        • There is no need to set details by hand, such as the size of the new generation, the Eden to Survivor ratio, or the size threshold for objects promoted to the old generation. The virtual machine collects performance monitoring data while the system runs and adjusts these parameters dynamically to give the most suitable pause time or the maximum throughput
        • Simply set the basic memory parameters and use -XX:MaxGCPauseMillis or -XX:GCTimeRatio as the virtual machine's optimization goal.
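
The arithmetic behind the throughput definition and -XX:GCTimeRatio can be checked with a few lines of Java. This is only a sketch of the formulas; the class and method names are made up for illustration and are not part of any JVM API.

```java
// Illustrative arithmetic only; ThroughputSketch and its methods are hypothetical names.
final class ThroughputSketch {

    // Throughput = user code time / (user code time + GC time).
    static double throughput(double userMillis, double gcMillis) {
        return userMillis / (userMillis + gcMillis);
    }

    // With -XX:GCTimeRatio=N, at most 1 / (1 + N) of the total time may go to GC.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        System.out.println("GCTimeRatio=99 -> " + maxGcFraction(99));
        System.out.println("GCTimeRatio=19 -> " + maxGcFraction(19));
        System.out.println("99 ms user, 1 ms GC -> " + throughput(99, 1));
    }
}
```

With the default of 99 the GC budget is 1% of total time, and with 19 it is 5%.
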
  • Serial Old Collector
    • The old generation version of the Serial collector, using the "mark-compact" algorithm
    • Mainly used by the virtual machine in Client mode
    • Roles in Server mode
      • In JDK 1.5 and earlier, paired with the Parallel Scavenge collector
      • As the fallback for the CMS collector, used when a Concurrent Mode Failure occurs during concurrent collection.
    • Schematic diagram of the serial old collector workflow

  • Parallel Old Collector
    • The old generation version of the Parallel Scavenge collector, using multi-threading and the "mark-compact" algorithm.
    • Mainly used with Parallel Scavenge.
  • CMS (Concurrent Mark Sweep) collector
    • A collector that aims for the shortest collection pause time.
    • Based on the "mark-sweep" algorithm.
    • Process
      • Schematic diagram of the workflow of the CMS collector

      • Initial mark (CMS initial mark)-STW
        • Marks the objects directly reachable from GC Roots
        • The pause is short
      • Concurrent mark (CMS concurrent mark)
        • GC Roots Tracing
        • Application threads keep running during concurrent marking: some objects are promoted from the new generation to the old generation, some old generation references change, and some objects are allocated directly in the old generation. The cards covering the affected objects are marked dirty so that they can be rescanned in the re-marking phase.
      • Pre-cleaning
        • Marks surviving objects in the old generation ahead of time, so that the STW pause of the re-marking phase is as short as possible.
        • The targets of this phase are the old generation objects affected by application threads during concurrent marking: (1) old generation objects whose card is dirty; (2) old generation objects referenced from the survivor spaces (from and to). This phase therefore also has to scan both the new generation and the old generation.
      • Interruptible pre-cleaning
        • Like pre-cleaning, it reduces the workload of the re-marking phase.
        • Before entering the re-marking phase, it tries to wait for a Minor GC, to keep the re-marking pause as short as possible.
        • In addition, CMS tries to schedule the re-marking pause for when Eden is about 50% full, roughly halfway to the next Minor GC. This also avoids two consecutive pauses within a short time.
        • Two parameters control this
          • Interruptible pre-cleaning only runs if the space used in Eden is larger than "CMSScheduleRemarkEdenSizeThreshold"; the default value of this parameter is 2M;
          • It is aborted once the usage rate of Eden exceeds "CMSScheduleRemarkEdenPenetration"; the default value of this parameter is 50%.
        • Once started, interruptible pre-cleaning may loop several times; besides the above, there are two further exits from this stage
          • CMSMaxAbortablePrecleanLoops is set (non-zero) and the number of loops exceeds it; the default value of this parameter is 0;
          • The time spent in interruptible pre-cleaning exceeds CMSMaxAbortablePrecleanTime; the default value of this parameter is 5000 milliseconds.
        • Interruptible pre-cleaning may end without a Minor GC having occurred. If the re-marking phase starts at that point, the new generation holds many live objects and the STW pause becomes longer. CMS therefore also provides CMSScavengeBeforeRemark, which forces a Minor GC before re-marking.
      • Remark (CMS remark)-STW
        • Corrects the marks of objects whose marking changed because the user program kept running during concurrent marking
        • Scans new generation objects + GC Roots + the old generation objects in the cards marked dirty earlier.
        • The pause is slightly longer than the initial mark and shorter than the concurrent mark
      • Concurrent sweep (CMS concurrent sweep)
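
The loop-count and time exits of interruptible pre-cleaning listed above can be written as a small predicate. This is an illustration under assumptions, not HotSpot code: the class, method, and parameter names are hypothetical, the strict "exceeds" comparisons follow the wording of these notes, and a loop limit of 0 is treated as "no limit" because the limit only applies when CMSMaxAbortablePrecleanLoops is set.

```java
// Hypothetical sketch of the loop/time exits; not taken from the HotSpot source.
final class PrecleanExitSketch {
    static final int DEFAULT_MAX_LOOPS = 0;        // CMSMaxAbortablePrecleanLoops default
    static final long DEFAULT_MAX_TIME_MS = 5000;  // CMSMaxAbortablePrecleanTime default

    // Returns true when interruptible pre-cleaning should stop looping.
    static boolean shouldExit(int loopsDone, long elapsedMs, int maxLoops, long maxTimeMs) {
        boolean loopLimitHit = maxLoops != 0 && loopsDone > maxLoops; // only when the limit is set
        boolean timeLimitHit = elapsedMs > maxTimeMs;
        return loopLimitHit || timeLimitHit;
    }

    public static void main(String[] args) {
        // Defaults: the loop count never ends the phase, only the 5000 ms limit does.
        System.out.println(shouldExit(1000, 100, DEFAULT_MAX_LOOPS, DEFAULT_MAX_TIME_MS));
        System.out.println(shouldExit(3, 6000, DEFAULT_MAX_LOOPS, DEFAULT_MAX_TIME_MS));
        // With a loop limit of 10, the eleventh loop ends the phase.
        System.out.println(shouldExit(11, 100, 10, DEFAULT_MAX_TIME_MS));
    }
}
```
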
    • Disadvantages
      • Very sensitive to CPU resources. The default number of collection threads is (number of CPUs + 3) / 4.
      • Cannot handle floating garbage.
        • A "Concurrent Mode Failure" may occur and lead to another Full GC.
        • Floating garbage is garbage produced by the still-running user program after marking; it cannot be collected in the current cycle.
        • Memory has to be reserved for the user program, so collection starts before the old generation is full. The trigger is -XX:CMSInitiatingOccupancyFraction: by default 68% of the old generation in JDK 1.5 and 92% in JDK 1.6.
        • If the reserved memory cannot satisfy the user program's allocations, a "Concurrent Mode Failure" occurs and the Serial Old collector is started temporarily to collect the old generation.
      • Produces a lot of memory fragmentation.
        • If no contiguous space large enough for an allocation can be found, a Full GC is triggered early.
        • -XX:+UseCMSCompactAtFullCollection
          • On by default
          • When CMS cannot avoid a Full GC, it runs a defragmentation (compaction) pass. This pass cannot be concurrent.
        • -XX:CMSFullGCsBeforeCompaction
          • The default is 0, that is, every Full GC compacts
          • Sets how many Full GCs run without compaction before one runs with compaction.
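
The effect of -XX:CMSFullGCsBeforeCompaction can be pictured as a simple counter. The sketch below only models the schedule described in the bullet above; the class and method names are hypothetical and it does not reproduce HotSpot's implementation.

```java
// Hypothetical model of the Full GC compaction schedule; not HotSpot code.
final class CompactionScheduleSketch {
    private final int fullGcsBeforeCompaction; // value of -XX:CMSFullGCsBeforeCompaction
    private int uncompactedSoFar = 0;

    CompactionScheduleSketch(int fullGcsBeforeCompaction) {
        this.fullGcsBeforeCompaction = fullGcsBeforeCompaction;
    }

    // Called once per Full GC; returns true if this Full GC should compact.
    boolean nextFullGcCompacts() {
        if (uncompactedSoFar >= fullGcsBeforeCompaction) {
            uncompactedSoFar = 0;
            return true;
        }
        uncompactedSoFar++;
        return false;
    }

    public static void main(String[] args) {
        CompactionScheduleSketch everyTime = new CompactionScheduleSketch(0);
        CompactionScheduleSketch everyThird = new CompactionScheduleSketch(2);
        for (int i = 1; i <= 6; i++) {
            System.out.println("Full GC " + i
                    + ": n=0 compacts=" + everyTime.nextFullGcCompacts()
                    + ", n=2 compacts=" + everyThird.nextFullGcCompacts());
        }
    }
}
```

With the default of 0 every Full GC compacts; with a value of 2, two Full GCs run without compaction and the third compacts.
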
    • CMS concurrent cycle failures
      • Concurrent mode failure: user threads keep running during the concurrent cycle. If an application thread requests more old generation space than has been reserved (the guarantee fails), a concurrent mode failure is triggered and the CMS concurrent cycle is replaced by a Full GC, which stops all application threads, collects garbage, and compacts the space. If UseCMSInitiatingOccupancyOnly is set and CMSInitiatingOccupancyFraction is 70, the reserved space is 30% of the old generation.
      • Promotion failure: when the new generation performs a Minor GC, the CMS guarantee mechanism checks whether the old generation has enough space for the objects to be promoted. If it finds there is not enough, a concurrent mode failure is reported. If it judges the space sufficient but fragmentation makes the allocation impossible, a promotion failure is reported.
      • The permanent generation (Metaspace in Java 8) is exhausted. By default CMS does not collect the permanent generation; once it is exhausted, a Full GC is triggered.
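
Two of the numbers in the CMS notes are plain integer arithmetic: the default thread count of (number of CPUs + 3) / 4, and the space left in reserve for a given CMSInitiatingOccupancyFraction. A minimal sketch, with hypothetical class and method names:

```java
// Illustrative arithmetic only; CmsSizingSketch is not a JVM API.
final class CmsSizingSketch {

    // Default number of CMS collection threads: (number of CPUs + 3) / 4, using integer division.
    static int defaultCmsThreads(int cpus) {
        return (cpus + 3) / 4;
    }

    // Percentage of the old generation still free when the concurrent cycle is triggered.
    static int reservedPercent(int initiatingOccupancyFraction) {
        return 100 - initiatingOccupancyFraction;
    }

    public static void main(String[] args) {
        for (int cpus : new int[] {1, 2, 4, 8, 16}) {
            System.out.println(cpus + " CPUs -> " + defaultCmsThreads(cpus) + " CMS thread(s)");
        }
        System.out.println("Fraction 70 -> " + reservedPercent(70) + "% reserved");
    }
}
```

On machines with fewer than four CPUs the single collection thread takes a large share of the processing power, which is why CMS is described as sensitive to CPU resources.
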
  • G1 collector
    • A collector aimed at server-side applications
    • Characteristics
      • Parallel and concurrent
        • Makes full use of multi-CPU, multi-core hardware, using multiple CPUs to shorten STW pauses.
        • Some GC actions that require STW in other collectors can still run concurrently in G1.
      • Generational collection
      • Space compaction
        • Viewed as a whole it is based on "mark-compact"; viewed locally (between two regions) it is based on the "copying" algorithm.
        • With either algorithm, G1 produces no memory fragmentation while it runs.
      • Predictable pauses
        • Builds a predictable pause time model that lets users specify explicitly that, within any time slice of M milliseconds, garbage collection takes no more than N milliseconds
    • Memory layout
      • Region
        • The whole heap is divided into partitions (Regions) of equal size, and objects are allocated partition by partition.
        • The new generation and the old generation survive only as logical concepts. They are no longer physically separated; each is a collection of Regions.
        • A partition is not tied to one generation and can switch between the young generation and the old generation as needed.
        • The partition size (1MB to 32MB, and a power of 2) can be set at startup with -XX:G1HeapRegionSize=n. By default the heap is divided into about 2048 partitions.
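
The constraints on the partition size (a power of 2 between 1MB and 32MB, aiming for about 2048 partitions) can be combined into one small function. This is a sketch under assumptions: these notes do not say how HotSpot rounds, so rounding down to a power of 2 is a guess, and all names are hypothetical.

```java
// Hypothetical derivation of a default region size from the heap size; rounding rule is assumed.
final class RegionSizeSketch {
    static final long MB = 1024L * 1024L;
    static final long MIN_REGION = 1 * MB;
    static final long MAX_REGION = 32 * MB;
    static final long TARGET_REGIONS = 2048;

    static long regionSize(long heapBytes) {
        // Aim for about 2048 regions, but never go below 1MB.
        long size = Math.max(heapBytes / TARGET_REGIONS, MIN_REGION);
        // Round down to a power of 2 (assumption).
        size = Long.highestOneBit(size);
        // Never go above 32MB.
        return Math.min(size, MAX_REGION);
    }

    public static void main(String[] args) {
        long[] heapsInGb = {1, 4, 6, 128};
        for (long gb : heapsInGb) {
            long size = regionSize(gb * 1024 * MB);
            System.out.println(gb + "GB heap -> " + (size / MB) + "MB regions");
        }
    }
}
```
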
      • Card
        • Each partition is divided into cards of 512 bytes, the smallest granularity at which heap memory is tracked; all cards are recorded in the Global Card Table.
        • An allocated object occupies several physically contiguous cards.
        • References into the objects of a partition are found through the recorded cards (see RSet).
        • Each collection processes the cards of the chosen partitions.
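
Because a card is 512 bytes (2^9), mapping an address to its card is a subtraction and a shift. The sketch below shows that mapping and how many contiguous cards an object covers; the heap base address and all names are hypothetical.

```java
// Illustrative card arithmetic for 512-byte cards; names and addresses are made up.
final class CardIndexSketch {
    static final int CARD_SHIFT = 9;                // 2^9 = 512 bytes per card
    static final long CARD_SIZE = 1L << CARD_SHIFT;

    // Index of the card that contains the given address.
    static long cardIndex(long heapBase, long addr) {
        return (addr - heapBase) >>> CARD_SHIFT;
    }

    // Number of physically contiguous cards covered by an object.
    static long cardsSpanned(long heapBase, long objStart, long objSize) {
        long first = cardIndex(heapBase, objStart);
        long last = cardIndex(heapBase, objStart + objSize - 1);
        return last - first + 1;
    }

    public static void main(String[] args) {
        long base = 0x1000L; // hypothetical heap base
        System.out.println(cardIndex(base, base + 511));
        System.out.println(cardIndex(base, base + 512));
        System.out.println(cardsSpanned(base, base + 500, 100));
    }
}
```
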
    • The basis of the predictable pause time model
      • Collection is planned so that a garbage scan of the entire heap is avoided.
      • G1 tracks the value of the garbage in each Region (the space that would be reclaimed and the empirical time needed to reclaim it), maintains a priority list, and collects the most valuable Regions first within the allowed collection time.
    • Collection Set (CSet)
      • The set of partitions to be collected. Live data in the CSet is moved to another available partition during GC. Partitions in the CSet may come from Eden space, Survivor space, or the old generation. The CSet occupies less than 1% of the whole heap.
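
The idea of collecting the most valuable Regions first within an allowed pause time can be sketched as a greedy selection. This is a simplified model and not G1's real policy: the Region class, the value formula (reclaimable bytes per predicted millisecond), and the sample numbers are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical greedy model of choosing a CSet under a pause budget; not G1's policy code.
final class CSetSelectionSketch {

    static final class Region {
        final String name;
        final long reclaimableBytes;   // space expected to be freed
        final double predictedMillis;  // empirical time needed to collect this region

        Region(String name, long reclaimableBytes, double predictedMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.predictedMillis = predictedMillis;
        }

        // Assumed "value" of a region: bytes reclaimed per millisecond of pause.
        double value() {
            return reclaimableBytes / predictedMillis;
        }
    }

    static List<Region> choose(List<Region> candidates, double pauseBudgetMillis) {
        List<Region> byValue = new ArrayList<>(candidates);
        byValue.sort((a, b) -> Double.compare(b.value(), a.value())); // highest value first
        List<Region> cset = new ArrayList<>();
        double used = 0;
        for (Region r : byValue) {
            if (used + r.predictedMillis <= pauseBudgetMillis) {
                cset.add(r);
                used += r.predictedMillis;
            }
        }
        return cset;
    }

    public static void main(String[] args) {
        List<Region> candidates = Arrays.asList(
                new Region("A", 100, 10), new Region("B", 300, 20), new Region("C", 50, 1));
        for (Region r : choose(candidates, 25)) {
            System.out.println(r.name);
        }
    }
}
```

With a 25 ms budget the sketch picks C and B (21 ms in total) and leaves A for a later collection.
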
    • Remembered Set (RSet): avoids a full heap scan
      • Each Region has a Remembered Set.
      • When the JVM detects that the program writes a Reference-type value, a write barrier briefly interrupts the write and checks whether the referenced object is in a different Region.
      • If it is, the reference is recorded through the CardTable in the Remembered Set of the Region that holds the referenced object.
      • During garbage collection, the Remembered Set is added to the scope of GC root enumeration, which avoids a full heap scan.
      • The RSet is in effect a hash table: the key is the start address of another Region, and the value is a set whose elements are card table indices.
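
The hash-table shape of the RSet and the cross-Region check made by the write barrier can be modeled in a few lines. This is a toy model under assumptions: addresses are plain long values starting at 0, the region size is fixed, and every name is hypothetical; a real JVM does this inside generated barrier code.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model of per-Region remembered sets; not G1's data structure.
final class RememberedSetSketch {
    static final int CARD_SHIFT = 9;   // 512-byte cards
    private final long regionSize;     // hypothetical fixed region size in bytes

    // RSet owner (region start) -> (referencing region start -> card indices in that region)
    private final Map<Long, Map<Long, Set<Long>>> rsets = new HashMap<>();

    RememberedSetSketch(long regionSize) {
        this.regionSize = regionSize;
    }

    long regionStart(long addr) {
        return addr - (addr % regionSize);
    }

    // Models the write barrier check: the field at fieldAddr now points to targetAddr.
    void onReferenceWrite(long fieldAddr, long targetAddr) {
        long from = regionStart(fieldAddr);
        long to = regionStart(targetAddr);
        if (from == to) {
            return; // same Region: nothing to remember
        }
        rsets.computeIfAbsent(to, k -> new HashMap<>())
             .computeIfAbsent(from, k -> new TreeSet<>())
             .add(fieldAddr >>> CARD_SHIFT);
    }

    // The RSet of the Region starting at the given address (empty if nothing points into it).
    Map<Long, Set<Long>> rsetOf(long regionStartAddr) {
        Map<Long, Set<Long>> found = rsets.get(regionStartAddr);
        if (found == null) {
            found = new HashMap<>();
        }
        return found;
    }

    public static void main(String[] args) {
        RememberedSetSketch heap = new RememberedSetSketch(1L << 20); // 1MB regions
        heap.onReferenceWrite(0x100200L, 0x300000L); // cross-Region reference
        heap.onReferenceWrite(0x100200L, 0x100400L); // same Region, ignored
        System.out.println(heap.rsetOf(0x300000L));
    }
}
```

When the Region at 0x300000 is collected, only the cards listed in its RSet need to be scanned for incoming references, instead of the whole heap.
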
    • Snapshot-At-The-Beginning (SATB)
      • SATB is a way to keep concurrent GC correct, and G1's concurrent marking is based on it. It is a marking algorithm designed by Taiichi Yuasa for incremental mark-sweep garbage collectors, and its optimization mainly targets the concurrent marking phase of mark-sweep collectors. According to R: CMS's incremental update design forces the remark phase to rescan all thread stacks and the entire young gen as roots, whereas G1's SATB design only needs to scan the remaining satb_mark_queue during the remark phase.
      • The SATB algorithm creates an object graph that is a logical "snapshot" of the heap. The marking data structures include two bitmaps: the previous bitmap and the next bitmap.
      • The previous bitmap holds the most recently completed marking information. The next bitmap is created and updated during the concurrent marking cycle. As time passes, the previous bitmap becomes more and more outdated; at the end of the concurrent marking cycle, the next bitmap overwrites the previous bitmap.
      • Steps
        • The concurrent cycle consists of initial marking, concurrent marking and final marking.
        • During initial marking, the NTAMS field is set to the current top of each partition. Objects allocated after the concurrent cycle starts are placed above TAMS and are treated as implicitly live, while objects below TAMS must be marked explicitly.
        • Between Bottom and Top there is a PTAMS pointer. It is the position of NTAMS at the end of the previous marking cycle, that is, the boundary between the area that had to be marked explicitly and the area that was implicitly live.
        • After the concurrent cycle begins, the explicitly marked objects between Bottom and PTAMS are recorded in the previous bitmap.
        • The objects between PTAMS and Top are implicitly live and are also recorded in the previous bitmap.
        • At the end of final marking, all objects below NTAMS have been marked.
        • Objects allocated during concurrent marking go into the space above NTAMS and are recorded as implicitly live in the next bitmap. When a concurrent marking cycle completes, the next bitmap overwrites the previous bitmap and is then cleared.
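
The role of NTAMS can be shown with a toy partition: anything allocated at or above NTAMS after marking starts is live without being marked, and anything below it is live only if it has been marked. This is a sketch under assumptions; addresses are plain numbers, a set of addresses stands in for the next bitmap, and all names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model of one partition during a marking cycle; not G1's implementation.
final class TamsSketch {
    final long bottom;
    long top;     // next free address in the partition
    long ntams;   // Next Top At Mark Start
    final Set<Long> nextBitmap = new HashSet<>(); // stands in for the next bitmap

    TamsSketch(long bottom, long top) {
        this.bottom = bottom;
        this.top = top;
        this.ntams = bottom;
    }

    // Initial marking: NTAMS is set to the current top of the partition.
    void startMarking() {
        ntams = top;
        nextBitmap.clear();
    }

    // Bump-pointer allocation; objects allocated during marking land above NTAMS.
    long allocate(long size) {
        long addr = top;
        top += size;
        return addr;
    }

    // Explicit marking is only needed below NTAMS.
    void mark(long addr) {
        if (addr < ntams) {
            nextBitmap.add(addr);
        }
    }

    boolean isLive(long addr) {
        if (addr >= ntams && addr < top) {
            return true; // implicitly live: allocated after marking started
        }
        return nextBitmap.contains(addr);
    }

    public static void main(String[] args) {
        TamsSketch region = new TamsSketch(0, 100);
        region.startMarking();
        long fresh = region.allocate(50);
        System.out.println("new object live: " + region.isLive(fresh));
        System.out.println("old object live before mark: " + region.isLive(40));
        region.mark(40);
        System.out.println("old object live after mark: " + region.isLive(40));
    }
}
```
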
    • Steps
      • Initial mark (Initial Marking)-STW
        • Marks the objects directly reachable from GC Roots and sets the NTAMS field (Next Top at Mark Start) to the current top of each partition
        • The pause is short
      • Concurrent mark
        • GC Roots Tracing
        • Concurrent marking traces all live objects and records them in a bitmap. Objects above TAMS are considered implicitly live, so only the objects below TAMS need to be traversed.
        • To cope with references that change during marking, SATB takes a snapshot at the start, assumes the snapshot does not change, and traces according to it.
        • If an object's reference is changed in the meantime, the pre-write barrier logs the old value of the reference in an SATB buffer.
        • When the buffer is full it is added to a global list; G1's concurrent marking threads process the global list periodically.
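
The pre-write barrier and its buffers can be sketched as follows. This is a toy model under assumptions: a real JVM keeps one SATB buffer per thread and emits the barrier in compiled code, whereas this sketch uses a single buffer, a tiny made-up capacity, and hypothetical names throughout.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of SATB logging; not G1's implementation.
final class SatbQueueSketch {
    private final int bufferCapacity;              // hypothetical buffer size
    private boolean markingActive = false;
    private List<Object> buffer = new ArrayList<>();
    private final Deque<List<Object>> globalList = new ArrayDeque<>();

    SatbQueueSketch(int bufferCapacity) {
        this.bufferCapacity = bufferCapacity;
    }

    void setMarkingActive(boolean active) {
        markingActive = active;
    }

    // Called before a reference field is overwritten; oldValue is the reference about to be lost.
    void preWriteBarrier(Object oldValue) {
        if (!markingActive || oldValue == null) {
            return; // nothing to log outside marking or for null references
        }
        buffer.add(oldValue);
        if (buffer.size() == bufferCapacity) {
            globalList.add(buffer);       // hand the full buffer to the marking threads
            buffer = new ArrayList<>();
        }
    }

    // What a concurrent marking thread would do periodically: take one full buffer, if any.
    List<Object> takeFullBuffer() {
        List<Object> full = globalList.poll();
        if (full == null) {
            full = new ArrayList<>();
        }
        return full;
    }

    int fullBuffers() {
        return globalList.size();
    }

    int pendingInCurrentBuffer() {
        return buffer.size();
    }

    public static void main(String[] args) {
        SatbQueueSketch satb = new SatbQueueSketch(2);
        satb.setMarkingActive(true);
        satb.preWriteBarrier("a");
        satb.preWriteBarrier("b");
        satb.preWriteBarrier("c");
        System.out.println("full buffers: " + satb.fullBuffers());
        System.out.println("still pending: " + satb.pendingInCurrentBuffer());
        System.out.println("drained: " + satb.takeFullBuffer());
    }
}
```

The entries still pending in a partly filled buffer are what the final mark pause has to process.
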
      • Final mark (Final Marking)-STW
        • Corrects the marks of objects whose marking changed because the user program kept running during concurrent marking.
        • G1 processes the remaining SATB log buffers and all updated references, and finds any live objects that are still unmarked.
        • The JVM records the object changes of this period in each thread's Remembered Set Logs, which are finally merged into the Remembered Set.
      • Screening and collection (Live Data Counting and Evacuation)
        • Regions are first sorted by collection value and cost, and a collection plan is made according to the GC pause time the user expects.
        • This phase could run concurrently with the user program, but since only some Regions are collected and the time is user-controllable, pausing user threads greatly improves collection efficiency.
    • Garbage collector related parameters

Origin blog.csdn.net/jiangxiayouyu/article/details/105614266