[JVM] 4. From Java novice to expert: what garbage is in the virtual machine, and garbage collection algorithms

In Java, memory is managed automatically by the virtual machine. The virtual machine sets aside a region of memory as the space from which the program's allocation requests are served. Creating objects is still something the programmer specifies explicitly, but releasing them is transparent to the programmer: instead of freeing memory by hand, the programmer leaves reclamation to the garbage collector.

In the virtual machine, the process of releasing the space occupied by objects that are no longer used is called garbage collection (Garbage Collection, GC). The program module responsible for it is called the garbage collector (Garbage Collector).

Since the virtual machine already handles garbage for us automatically, why do we still need to understand GC and memory allocation?

When we need to troubleshoot memory overflows and memory leaks, or when garbage collection becomes the bottleneck that keeps a system from reaching higher concurrency, we have to monitor and tune the virtual machine's automatic memory management. This is also essential knowledge for JVM tuning and troubleshooting.

This article focuses on what garbage is and on garbage collection algorithms. First we need to pin down what garbage actually is. Can one powerful algorithm be designed to solve every garbage collection problem? Certainly not: each of the algorithms described below has its own strengths and weaknesses. The best policy is to apply them flexibly according to the scenario.

I hope you read on with the following questions in mind; you will get more out of it.

  1. What is garbage?
  2. How is garbage reclaimed?
  3. Is there a silver-bullet garbage collection algorithm that handles all garbage the same way?
  4. How are GCs classified? (Minor GC, Major GC, Full GC)
  5. What is Stop-the-world?
  6. How can scanning the whole heap be avoided?

Garbage collection algorithms.png

1 Garbage collection

Almost all object instances in the Java world live in the heap. Before reclaiming the heap, the garbage collector must first determine which objects are still "alive" and which are already "dead" (objects that can no longer be used by any means). Garbage collection reclaims memory that has been allocated but is no longer in use, so that it can be allocated again. In the context of the Java virtual machine, garbage is the heap space occupied by dead objects.

So how do we determine whether an object is alive or dead?

1.1 Reference counting algorithm

Add a reference counter to each object. Whenever a reference to the object is created, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 can no longer be used. In other words, every operation that updates a reference has to be intercepted so that the target object's counter is incremented or decremented accordingly.

Digression: I was once interested in iOS development and found a company where I could practice, learning as I went, and built a stock market simulation app. It was written in Objective-C, a language that originally managed memory with this reference counting algorithm and later gained automatic memory management as well. The more technologies I work with, the more I find that their underlying principles have a lot in common.

Disadvantages of the reference counting algorithm:

  • It needs extra space to store the counters, and updating them is tedious.
  • It cannot handle objects with circular references.

The inability to handle circular references is the major flaw of reference counting.
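
To make the circular reference problem concrete, here is a minimal toy model in Java. It is only a sketch: the Node class and the addRef and dropRef helpers are hypothetical names invented for this illustration, and the JVM does not manage memory this way.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of reference counting. All names here are hypothetical;
// this is not how the JVM manages memory.
class RefCountDemo {
    static final class Node {
        final String name;
        int count = 0;                       // number of incoming references
        boolean freed = false;
        final List<Node> fields = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Record a new reference from 'from' to 'to'.
    // from == null models a reference held by a local variable (a root).
    static void addRef(Node from, Node to) {
        if (from != null) from.fields.add(to);
        to.count++;
    }

    // Drop a reference; free the target once its counter reaches zero.
    static void dropRef(Node from, Node to) {
        if (from != null) from.fields.remove(to);
        to.count--;
        if (to.count == 0 && !to.freed) {
            to.freed = true;
            // Freeing an object also drops every reference it holds.
            for (Node child : new ArrayList<>(to.fields)) dropRef(to, child);
        }
    }

    public static void main(String[] args) {
        Node a = new Node("a"), b = new Node("b");
        addRef(null, a);   // local variable -> a
        addRef(null, b);   // local variable -> b
        addRef(a, b);      // a.field -> b
        addRef(b, a);      // b.field -> a, which closes the cycle
        dropRef(null, a);  // both local variables go out of scope
        dropRef(null, b);
        // Each object is still referenced by the other, so neither is freed.
        System.out.println(a.name + " count=" + a.count + " freed=" + a.freed);
        System.out.println(b.name + " count=" + b.count + " freed=" + b.freed);
    }
}
```

After both local references are dropped, each counter is still 1 because the two objects keep each other alive, so the program can never use them again and yet they are never freed.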

1.2 Reachability analysis algorithm

Reachability means that an object can be referenced, directly or indirectly, from at least one variable in the program; such an object is said to be reachable. More precisely, an object is considered reachable only if it meets one of the following two conditions:

  • It is itself a root object. A root (Root) is a location outside the heap from which heap objects can be accessed. The JVM treats a set of objects as roots, including objects referenced by static (global) variables of classes and objects referenced from the stack, such as the local variables and parameters of the current stack frames.
  • It is referenced by a reachable object.

The basic idea of this algorithm is to take a set of objects called "GC Roots" as starting points and search downward from these nodes. The path traversed by the search is called a reference chain (Reference Chain). When no reference chain connects an object to any GC Root (that is, the object is not reachable from the GC Roots), the object is proven to be unusable.

Reachability analysis algorithm.jpeg

What are GC Roots? They can be understood as references from outside the heap that point into the heap.

In the Java language, the following can serve as GC Roots:

  • Objects referenced from the virtual machine stack (the local variable table of a stack frame).
  • Objects referenced by static fields of classes in the method area.
  • Objects referenced by constants in the method area.
  • Objects referenced by JNI (that is, native methods) in the native method stack.
  • Java threads that have been started and have not yet stopped.

Reachability analysis solves the circular reference problem that reference counting cannot. For example, even if objects a and b refer to each other, as long as neither can be reached from the GC Roots, reachability analysis will not add them to the set of live objects.
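
The traversal can be sketched in a few lines of Java. This is an illustrative model only: the Obj class and the mark method are hypothetical names, and the object graph is an ordinary Java data structure rather than real heap memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Illustrative model of reachability analysis over a hand-built object graph.
class ReachabilityDemo {
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();   // outgoing references
        Obj(String name) { this.name = name; }
    }

    // Returns every object reachable from the given roots (the live set).
    static Set<Obj> mark(List<Obj> roots) {
        Set<Obj> live = Collections.newSetFromMap(new IdentityHashMap<Obj, Boolean>());
        Deque<Obj> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Obj current = pending.pop();
            // Follow the references of an object only the first time it is seen,
            // so cycles do not cause endless traversal.
            if (live.add(current)) {
                pending.addAll(current.refs);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Obj root = new Obj("root"), x = new Obj("x");
        Obj a = new Obj("a"), b = new Obj("b");
        root.refs.add(x);   // root -> x
        a.refs.add(b);      // a <-> b form a cycle that no root points to
        b.refs.add(a);
        Set<Obj> live = mark(Collections.singletonList(root));
        for (Obj o : Arrays.asList(root, x, a, b)) {
            System.out.println(o.name + " reachable=" + live.contains(o));
        }
    }
}
```

Here root and x end up in the live set, while a and b are left out even though they reference each other, which is exactly the case reference counting cannot handle.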

The definition and classification of references in Java (strong, soft, weak, and phantom references) will be covered in detail in a separate article. The topic is a little tedious, but it comes up often in interviews at many companies.

Although reachability analysis itself is simple, many problems remain to be solved in practice. For example, in a multi-threaded environment other threads may update references in objects that have already been visited, causing false positives (a reference is set to null, so an object that is now dead is still treated as live) or false negatives (a reference is set to an object that has not been visited, so a live object is missed). False positives are tolerable: at worst the Java virtual machine loses some opportunities to reclaim garbage. False negatives are a serious problem, because the garbage collector may reclaim the memory of objects that are still referenced. Once such a reclaimed object is accessed through the original reference, the Java virtual machine is likely to crash.

2 Garbage collection algorithms

Having covered what garbage is in Java, we now turn to how to reclaim it efficiently.

2.1 Mark-sweep algorithm

The mark-sweep (Mark-Sweep) algorithm has two phases:

  • Mark phase: mark all objects that can be reclaimed.
  • Sweep phase: reclaim all marked objects and release the space they occupy.

The algorithm has the following disadvantages:

  1. Memory fragmentation. Because each object in the Java virtual machine's heap must occupy contiguous memory, an extreme case can arise in which the total free memory is sufficient but the allocation still fails. When no sufficiently large contiguous block can be found, another garbage collection has to be triggered early.
  2. Low allocation efficiency. With one contiguous block of free memory, allocation can be done by pointer bumping. With a free list, the Java virtual machine has to walk the list entries one by one to find free memory that can hold the newly created object.

Schematic of the mark-sweep algorithm:
Mark-sweep algorithm.png
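
The two disadvantages can be illustrated with a small allocator sketch. The class names, the first-fit strategy, and the use of plain integers as addresses are all assumptions made for this example; they do not describe HotSpot's actual allocator.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch contrasting free-list allocation with pointer bumping.
// Addresses are plain ints and all names are hypothetical.
class FreeListDemo {
    // A free block covering [start, start + size).
    static final class Block {
        int start, size;
        Block(int start, int size) { this.start = start; this.size = size; }
    }

    // Free-list allocator: the state a heap is left in after a sweep.
    static final class FreeListAllocator {
        final List<Block> free = new ArrayList<>();

        // First fit: walk the list until one block is large enough.
        // Returns the start address, or -1 if no single block fits.
        int allocate(int size) {
            for (int i = 0; i < free.size(); i++) {
                Block b = free.get(i);
                if (b.size >= size) {
                    int address = b.start;
                    b.start += size;
                    b.size -= size;
                    if (b.size == 0) free.remove(i);
                    return address;
                }
            }
            return -1;
        }

        int totalFree() {
            int total = 0;
            for (Block b : free) total += b.size;
            return total;
        }
    }

    // Bump allocator: one contiguous free region, allocation is a single addition.
    static final class BumpAllocator {
        private int top = 0;
        private final int limit;
        BumpAllocator(int limit) { this.limit = limit; }

        int allocate(int size) {
            if (top + size > limit) return -1;
            int address = top;
            top += size;
            return address;
        }
    }

    public static void main(String[] args) {
        FreeListAllocator fragmented = new FreeListAllocator();
        fragmented.free.add(new Block(0, 16));
        fragmented.free.add(new Block(32, 16));
        fragmented.free.add(new Block(64, 16));
        System.out.println("total free = " + fragmented.totalFree());
        System.out.println("allocate(32) from free list = " + fragmented.allocate(32));

        BumpAllocator contiguous = new BumpAllocator(48);
        System.out.println("allocate(32) by bumping = " + contiguous.allocate(32));
    }
}
```

With 48 bytes free in three separate 16-byte holes, the free-list allocator cannot satisfy a 32-byte request and returns -1, while the bump allocator with the same amount of contiguous space succeeds at address 0.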

2.2 Copying algorithm

The copying algorithm works as follows:

  • Divide the area: the memory area is divided, in a fixed ratio, into an Eden space, which serves as the "main battlefield" for allocation, and two survivor spaces (Survivor spaces: two equally sized regions called from and to).
  • Copy: at collection time, before the "battlefield" is cleaned, the objects that are still alive in Eden and in the survivor space currently in use are copied to the other survivor space.
  • Clear: since the previous phase has ensured that all live objects have been safely relocated, the "battlefield" can now be cleaned by releasing Eden and the survivor space that was just evacuated.
  • Promote: if, during the copy phase, the target survivor space cannot hold all of the surviving objects, the excess objects are promoted directly to the old generation.

Copying algorithm.png

The algorithm solves the memory fragmentation problem, but it uses heap space very inefficiently, because part of the space must always be kept empty as the copy target. When the object survival rate is high, more copying is required and efficiency drops.
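
The copy, clear, and promote steps can be sketched as follows. This is a deliberately simplified model: objects are strings, liveness is supplied as a ready-made set, the survivor capacity is counted in objects, and the minorCollect name is hypothetical. A real collector also takes object age into account when promoting, which this sketch ignores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simplified model of a copying collection over Eden and two survivor spaces.
class CopyingDemo {
    // Copies live objects from eden and the 'from' survivor space into a new
    // 'to' space holding at most toCapacity objects. Survivors that do not fit
    // are promoted to oldGen. Returns the filled 'to' space.
    static List<String> minorCollect(List<String> eden, List<String> from,
                                     Set<String> live, int toCapacity,
                                     List<String> oldGen) {
        List<String> to = new ArrayList<>();
        for (List<String> space : Arrays.asList(eden, from)) {
            for (String obj : space) {
                if (!live.contains(obj)) continue;   // dead objects are simply not copied
                if (to.size() < toCapacity) {
                    to.add(obj);                     // copy: survivors end up contiguous
                } else {
                    oldGen.add(obj);                 // promote: the 'to' space is full
                }
            }
            space.clear();                           // clear: release the whole space at once
        }
        return to;                                   // 'to' becomes 'from' for the next collection
    }

    public static void main(String[] args) {
        List<String> eden = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        List<String> from = new ArrayList<>(Arrays.asList("e", "f"));
        Set<String> live = new HashSet<>(Arrays.asList("b", "d", "e"));
        List<String> oldGen = new ArrayList<>();
        List<String> to = minorCollect(eden, from, live, 2, oldGen);
        System.out.println("to=" + to + " oldGen=" + oldGen
                + " eden=" + eden + " from=" + from);
    }
}
```

The cost of the collection is proportional to the number of survivors rather than to the size of the space, which is why the approach pays off when few objects survive.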

2.3 Mark-compact algorithm

The algorithm is divided into two phases:

  • Mark phase: mark all objects that can be reclaimed.
  • Compact phase: move the objects that survived the mark phase to one end of the space, then release the remaining space.

The marking process is the same as in the mark-sweep algorithm, but the next step does not reclaim the dead objects directly. Instead, all surviving objects are moved toward one end, and the memory beyond the end boundary is then cleared in one go.

This solves the memory fragmentation problem and also avoids the copying algorithm's drawback of being able to use only half of the memory area. It looks very good, but it changes memory more heavily: the address in every reference to a surviving object has to be updated, so its efficiency is much worse than that of the copying algorithm.
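
The extra work of updating references can be seen in a small sketch of sliding compaction. The heap is modelled as parallel arrays in which each object holds at most one reference, stored as the index of its target; this layout and the compact method are assumptions made for the illustration, and the marks are taken as given.

```java
import java.util.Arrays;

// Sketch of sliding compaction with forwarding addresses.
// payload[i] is object i's data; ref[i] is the index it points to, or -1.
class MarkCompactDemo {
    // Compacts the marked objects toward index 0 and returns the new top,
    // that is, the first free index. Assumes live objects only reference
    // other live objects, as a correct mark phase guarantees.
    static int compact(String[] payload, int[] ref, boolean[] marked) {
        int n = payload.length;
        int[] forward = new int[n];
        int top = 0;
        // Pass 1: compute the new address of every live object.
        for (int i = 0; i < n; i++) {
            forward[i] = marked[i] ? top++ : -1;
        }
        // Pass 2: rewrite each reference held by a live object to the new address.
        for (int i = 0; i < n; i++) {
            if (marked[i] && ref[i] >= 0) ref[i] = forward[ref[i]];
        }
        // Pass 3: slide live objects down. forward[i] <= i, so nothing live is overwritten.
        for (int i = 0; i < n; i++) {
            if (marked[i]) {
                payload[forward[i]] = payload[i];
                ref[forward[i]] = ref[i];
            }
        }
        // Everything from 'top' onward is now one contiguous free region.
        for (int i = top; i < n; i++) {
            payload[i] = null;
            ref[i] = -1;
        }
        return top;
    }

    public static void main(String[] args) {
        String[] payload = {"A", "x", "B", "y", "C"};
        int[] ref = {2, -1, 4, -1, 0};              // A -> B -> C -> A
        boolean[] marked = {true, false, true, false, true};
        int top = compact(payload, ref, marked);
        System.out.println("top=" + top);
        System.out.println(Arrays.toString(payload));
        System.out.println(Arrays.toString(ref));
    }
}
```

Three passes over the heap are needed here, compared with the single pass of the copying sketch, which reflects the efficiency cost described above.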

Schematic of the mark-compact algorithm:
Mark-compact algorithm.png

2.4 Generational collection algorithm

The generational collection algorithm does not introduce any new idea; it simply divides memory into several regions according to how long objects live. The Java heap is generally divided into a young generation and an old generation, so that each generation can use the collection algorithm best suited to its characteristics.
JVM heap generations.png

In the young generation, each garbage collection finds that a large number of objects have died and only a few survive, so the copying algorithm is chosen: the collection completes at the cost of copying only a small number of live objects. In the old generation, the object survival rate is high and there is no extra space to act as an allocation guarantee, so it must be reclaimed with the mark-sweep or mark-compact algorithm.

3 HotSpot algorithm

3.1 Root enumeration

Reachability analysis searches for reference chains starting from the GC Roots. The nodes that can serve as GC Roots are found mainly in global references (for example, constants or static fields of classes) and in execution contexts (for example, the local variable tables of stack frames). GC Roots are described in detail in the reachability analysis section above.

3.2 Safepoint

Safepoint: the program cannot stop to start a GC at just any point in its execution; it can pause only when it reaches a safepoint. Safepoints must be chosen neither so sparsely that the GC waits too long, nor so densely that they add excessive runtime overhead.

The original purpose of a safepoint is not to stop other threads but to find a stable execution state, one in which the Java virtual machine's stacks will not change. In that state the garbage collector can perform reachability analysis "safely". For example, a thread that is running native code can be treated as being at a safepoint: as long as it does not leave the safepoint, the Java virtual machine can carry out garbage collection while the native code keeps running.

Safepoints are chosen essentially by asking "does this instruction let the program run for a long time?" The most obvious characteristic of "long-running" code is the reuse of instruction sequences, for example method calls, loop jumps, and exception jumps, so instructions with these functions generate safepoints.

Another question for safepoints is how to make all threads (excluding threads executing JNI calls) "run" to the nearest safepoint and then stop when a GC occurs.

Two solutions:

  • Preemptive suspension (Preemptive Suspension)

    Preemptive suspension does not require the code a thread is executing to cooperate. When a GC occurs, all threads are first interrupted; if a thread is found to have been interrupted somewhere other than a safepoint, it is resumed so that it can "run" to a safepoint. Almost no virtual machine now pauses threads this way in response to GC events.

  • Voluntary suspension (Voluntary Suspension)

    The idea of voluntary suspension is that when the GC needs to interrupt threads, it does not operate on them directly; it simply sets a flag. Each thread polls this flag as it executes and suspends itself when it finds the flag set. The polling locations coincide with the safepoints, plus the places where object creation requires memory allocation.
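
Voluntary suspension can be imitated at the application level with a shared flag that worker threads poll. The sketch below is only an analogy: the SafepointPollDemo class, its poll and stopTheWorld methods, and the choice of a monitor for parking are all assumptions made for this example, and a real virtual machine implements its poll far more cheaply inside generated code.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.locks.LockSupport;

// Application-level analogy of safepoint polling; all names are hypothetical.
class SafepointPollDemo {
    private final Object lock = new Object();
    private volatile boolean pauseRequested = false;
    private int parked = 0;                 // guarded by lock
    private final int workerCount;

    SafepointPollDemo(int workerCount) { this.workerCount = workerCount; }

    // Called by workers at loop back-edges: the "safepoint poll".
    void poll() throws InterruptedException {
        if (!pauseRequested) return;        // fast path: a single volatile read
        synchronized (lock) {
            parked++;
            lock.notifyAll();               // tell the requester this thread has stopped
            while (pauseRequested) lock.wait();
            parked--;
        }
    }

    // Called by the "GC" thread: set the flag, wait until every worker has
    // parked itself, run the action, then let the workers continue.
    void stopTheWorld(Runnable action) throws InterruptedException {
        synchronized (lock) {
            pauseRequested = true;
            while (parked < workerCount) lock.wait();
            try {
                action.run();               // every worker is parked at this point
            } finally {
                pauseRequested = false;
                lock.notifyAll();
            }
        }
    }

    // Runs two workers, pauses them once, and reports whether the shared
    // counter stayed unchanged for the duration of the pause.
    static boolean runDemo() {
        SafepointPollDemo safepoint = new SafepointPollDemo(2);
        AtomicLong counter = new AtomicLong();
        AtomicBoolean done = new AtomicBoolean(false);
        boolean[] stable = new boolean[1];
        Runnable worker = () -> {
            try {
                while (!done.get()) {
                    counter.incrementAndGet();
                    safepoint.poll();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        Thread t1 = new Thread(worker), t2 = new Thread(worker);
        t1.start();
        t2.start();
        try {
            safepoint.stopTheWorld(() -> {
                long before = counter.get();
                LockSupport.parkNanos(50_000_000L);   // give workers a chance to misbehave
                stable[0] = (before == counter.get());
            });
            done.set(true);
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return stable[0];
    }

    public static void main(String[] args) {
        System.out.println("counter stable during pause: " + runDemo());
    }
}
```

The requester never touches the worker threads directly. It only sets the flag and waits, and each worker stops itself at the next place where it polls, which is the essence of the voluntary approach.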

3.3 Safe region

A safe region is a stretch of code in which reference relationships do not change. Starting a GC anywhere within this region is safe. A safe region (Safe Region) can be seen as an extended safepoint.

4 Further knowledge

4.1 GC classification

Minor GC:

  • Targets the young generation.
  • Refers to garbage collection that takes place in the young generation. Because most Java objects are short-lived (born in the morning and dead by evening), Minor GC is very frequent and usually fast.
  • Trigger: the Eden space is full.

Major GC:

  • Targets the old generation.
  • Refers to GC that takes place in the old generation. A Major GC is often accompanied by at least one Minor GC (though not always: the Parallel Scavenge collector has a collection strategy that selects a Major GC directly). A Major GC is generally more than 10 times slower than a Minor GC.
  • Trigger: a Minor GC promotes objects into the old generation, and if the old generation does not have enough space at that point, a Major GC is triggered.

Full GC:

  • Cleans up the entire heap. In a sense, a Full GC can be regarded as a Minor GC combined with a Major GC.
  • Triggers: a call to System.gc(); insufficient space in the old generation; failure of the space allocation guarantee.
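
One way to observe these collections on a running JVM is through the standard java.lang.management API, as in the sketch below. The GcStatsDemo class and the describeCollectors method are names made up for this example. The collectors listed and their names depend on the JVM and on the collector in use (for example, G1 Young Generation and G1 Old Generation when G1 is active), and System.gc() is only a request that the JVM may ignore.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Prints the collectors of the running JVM with their collection counts and times.
class GcStatsDemo {
    static String describeCollectors() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": collections=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime())
              .append(System.lineSeparator());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Allocate many short-lived arrays so the young generation fills up.
        for (int i = 0; i < 100_000; i++) {
            byte[] junk = new byte[1024];
        }
        System.gc();   // a request for a full collection, not a guarantee
        System.out.print(describeCollectors());
    }
}
```

Comparing the counts of the young-generation and old-generation collectors before and after a workload gives a rough picture of how often each kind of GC ran.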

4.2 Stop-the-world

While a GC is in progress, all Java threads of execution must be stopped. This is Stop-the-world.

Reachability analysis must run against a snapshot that guarantees consistency. "Consistency" here means that for the whole duration of the analysis, the executing system appears frozen at a single point in time; object reference relationships must not keep changing while the analysis is under way. If this is not satisfied, the accuracy of the analysis results cannot be guaranteed.

Stop-the-world is implemented through the safepoint mechanism. When the Java virtual machine receives a Stop-the-world request, it waits until all threads have reached a safepoint before allowing the thread that requested Stop-the-world to do its exclusive work.

4.3 Card table

Consider this scenario: an object in the old generation may reference an object in the young generation. When marking live objects, all objects in the old generation would then have to be scanned, because if an old-generation object holds a reference to a young-generation object, that reference also acts as a GC Root. Doesn't that amount to scanning the whole heap? That costs too much.

HotSpot's solution is a technique called the card table (Card Table). It divides the whole heap into cards of 512 bytes each and maintains a table that stores one flag per card. The flag indicates whether the corresponding card might contain a reference to a young-generation object. If it might, the card is considered dirty.

During a Minor GC, we do not need to scan the entire old generation. Instead, we look for dirty cards in the card table and add the objects in those cards to the GC Roots of the Minor GC. After all dirty cards have been scanned, the Java virtual machine clears their flags.

To ensure that every card that may contain a reference to a young-generation object is marked dirty, the Java virtual machine needs to intercept every write to a reference-type instance variable and update the corresponding flag.

The card table avoids scanning the entire old generation, which can greatly improve GC efficiency.
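
The mechanism described above can be sketched as a byte array plus a small write barrier. The 512-byte card size comes from the text. Everything else is an assumption made for illustration: the class and method names, the use of int offsets as addresses, and the flag values 0 for clean and 1 for dirty.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a card table; names and flag values are hypothetical.
class CardTableDemo {
    static final int CARD_SHIFT = 9;            // 2^9 = 512 bytes per card
    static final byte CLEAN = 0, DIRTY = 1;     // flag values chosen for this sketch
    private final byte[] cards;

    CardTableDemo(int oldGenBytes) {
        // One flag per 512-byte card, rounding up.
        cards = new byte[(oldGenBytes + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT];
    }

    // Write barrier: runs whenever a reference field at this offset is written.
    // It marks the card unconditionally, so a dirty card only "might" contain
    // a reference to a young-generation object.
    void onReferenceWrite(int fieldOffset) {
        cards[fieldOffset >>> CARD_SHIFT] = DIRTY;
    }

    // Minor GC: return the dirty cards that must be scanned, then clear their flags.
    List<Integer> drainDirtyCards() {
        List<Integer> dirty = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == DIRTY) {
                dirty.add(i);
                cards[i] = CLEAN;
            }
        }
        return dirty;
    }

    public static void main(String[] args) {
        CardTableDemo table = new CardTableDemo(4096);   // 8 cards
        table.onReferenceWrite(1000);   // falls in card 1 (bytes 512 to 1023)
        table.onReferenceWrite(1030);   // falls in card 2 (bytes 1024 to 1535)
        table.onReferenceWrite(1100);   // card 2 again
        System.out.println("cards to scan: " + table.drainDirtyCards());
        System.out.println("after clearing: " + table.drainDirtyCards());
    }
}
```

Only two of the eight cards need to be scanned in this example, and the price is one extra store on every reference write.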

Origin www.cnblogs.com/heyonggang/p/11410828.html