[Study notes] Understanding the Java Virtual Machine in Depth: garbage collectors and memory allocation strategies

Is the object dead?

Determining whether an object is still alive:

Reference counting: a reference counter is added to each object. Whenever a new reference to the object is created, the counter is incremented by 1; when a reference becomes invalid, the counter is decremented by 1. An object whose counter is 0 can no longer be used.

Mainstream Java virtual machines do not use reference counting to manage memory, mainly because it is hard to handle objects that reference each other in a cycle.

Example figure (not reproduced here): the heap is on the left, the stack on the right.
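The cycle problem can be sketched with a toy counter. This is only an illustration: the `Node` class and its `refCount` field are hypothetical bookkeeping invented here, and the JVM neither uses nor exposes such counters.

```java
// Toy model of reference counting; not how HotSpot manages memory.
final class RefCountCycleDemo {
    // Hypothetical object with a manually maintained reference counter.
    static final class Node {
        int refCount = 0;
        Node field;                                  // a reference held by this object

        void pointTo(Node target) {
            if (field != null) field.refCount--;     // the old reference becomes invalid
            field = target;
            if (target != null) target.refCount++;   // a new reference is created
        }
    }

    // Returns both counters after every reference from the stack has been dropped.
    static int[] countersAfterDroppingRoots() {
        Node a = new Node();
        Node b = new Node();
        a.refCount++;            // the local variable "a" refers to the first object
        b.refCount++;            // the local variable "b" refers to the second object

        a.pointTo(b);            // a.field = b
        b.pointTo(a);            // b.field = a

        a.refCount--;            // simulate "a = null" on the stack
        b.refCount--;            // simulate "b = null" on the stack

        // The program can no longer reach either object,
        // yet neither counter has dropped to 0 because of the cycle.
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counters = countersAfterDroppingRoots();
        System.out.println(counters[0] + " " + counters[1]);
    }
}
```

A pure reference-counting collector would therefore never reclaim these two objects, which is the situation the notes describe.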

 

Reachability analysis: starting from a set of objects called "GC Roots", the collector searches downward through the references (for example, with a depth-first traversal). The paths it follows are called reference chains. When no reference chain connects an object to any GC Root, the object is proven to be unusable.

Objects that can serve as GC Roots include the following:

  • Objects referenced from the virtual machine stack (the local variable table of a stack frame).
  • Objects referenced by static class fields in the method area.
  • Objects referenced by constants in the method area.
  • Objects referenced by JNI (native methods) in the native method stack.
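As a rough sketch of reachability analysis, the following toy code walks an object graph from a list of roots. The `Obj` class and the method names are hypothetical; a real collector works on the heap layout, not on Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy reachability analysis over a hand-built object graph.
final class ReachabilityDemo {
    // Hypothetical heap object: a name plus outgoing references.
    static final class Obj {
        final String name;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Depth-first traversal from the roots; returns every object on some reference chain.
    static Set<Obj> reachableFrom(List<Obj> gcRoots) {
        Set<Obj> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Obj> stack = new ArrayDeque<>(gcRoots);
        while (!stack.isEmpty()) {
            Obj current = stack.pop();
            if (marked.add(current)) {               // first visit: follow its references
                for (Obj next : current.refs) stack.push(next);
            }
        }
        return marked;
    }

    // root -> a -> b is reachable; c <-> d is a cycle with no chain to the root.
    static List<String> demoUnreachable() {
        Obj root = new Obj("root");
        Obj a = new Obj("a");
        Obj b = new Obj("b");
        Obj c = new Obj("c");
        Obj d = new Obj("d");
        root.refs.add(a);
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);

        Set<Obj> live = reachableFrom(Collections.singletonList(root));
        List<String> dead = new ArrayList<>();
        for (Obj o : Arrays.asList(root, a, b, c, d)) {
            if (!live.contains(o)) dead.add(o.name);
        }
        return dead;
    }

    public static void main(String[] args) {
        System.out.println(demoUnreachable());
    }
}
```

Unlike the counter approach, the cycle between c and d does not keep them alive here, because liveness is decided from the roots outward.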

 

More about references

References are divided into four kinds, in decreasing order of strength: strong, soft, weak, and phantom references.

  • Strong references are everywhere in program code, for example Object obj = new Object(). As long as a strong reference exists, the garbage collector never reclaims the referenced object.
  • Soft references describe objects that are still useful but not essential. Before the system would throw an out-of-memory error, softly referenced objects are included in the reclamation range for a second collection; if that collection still does not free enough memory, the out-of-memory error is thrown. Since JDK 1.2, the SoftReference class implements soft references.
  • Weak references also describe non-essential objects, but they are weaker than soft references: an object that is only weakly referenced survives only until the next garbage collection. When the collector runs, it reclaims objects that are only weakly referenced regardless of whether memory is sufficient. Since JDK 1.2, the WeakReference class implements weak references.
  • Phantom references, also called ghost references, are the weakest kind. Whether an object has a phantom reference has no effect at all on its lifetime, and an object instance cannot be obtained through a phantom reference. Their only purpose is to receive a system notification when the object is reclaimed by the collector. Since JDK 1.2, the PhantomReference class implements phantom references.
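A small sketch of the three reference classes from java.lang.ref. The class and method names of the demo itself are made up for illustration. Whether a weakly referenced object is actually cleared after `System.gc()` depends on the collector, so the last line makes no claim about its output.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

final class ReferenceKindsDemo {
    // While a strong reference exists, soft and weak references still see the object;
    // a phantom reference never hands the object out.
    static boolean behavesAsDescribed() {
        Object strong = new Object();
        SoftReference<Object> soft = new SoftReference<>(strong);
        WeakReference<Object> weak = new WeakReference<>(strong);
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        PhantomReference<Object> phantom = new PhantomReference<>(strong, queue);

        return soft.get() == strong
                && weak.get() == strong
                && phantom.get() == null;    // get() on a phantom reference always returns null
    }

    public static void main(String[] args) {
        System.out.println(behavesAsDescribed());

        Object referent = new Object();
        WeakReference<Object> weak = new WeakReference<>(referent);
        referent = null;                     // only the weak reference remains
        System.gc();                         // a hint only; the JVM may ignore it
        System.out.println(weak.get());      // usually null after a collection, but not guaranteed
    }
}
```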

 

finalize()

Even an object found unreachable by reachability analysis is not necessarily doomed. To truly declare an object dead, it must go through at least two marking passes. If reachability analysis finds no reference chain connecting the object to the GC Roots, the object is marked for the first time and then filtered; the filter condition is whether it is necessary to execute the object's finalize() method. If the object does not override finalize(), or the virtual machine has already called its finalize() method, the virtual machine treats both cases as "not necessary to execute". If execution is judged necessary, the object is placed in a queue called the F-Queue, and a Finalizer thread executes it later. The GC then marks the objects in the F-Queue a second time, so finalize() is an object's last chance to escape death: it survives only if finalize() re-attaches it to a reference chain.

The system calls any object's finalize() method at most once.
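The two marking passes can be modelled as a small decision function. This is a toy model under stated assumptions: `ToyObject` and its flags are hypothetical, and the code never calls the real finalize() (which is deprecated in current JDKs).

```java
// Toy model of the two-pass marking described above.
final class TwoPassMarkingSketch {
    // Hypothetical stand-in for a heap object.
    static final class ToyObject {
        boolean reachable;                  // is there a reference chain to the GC Roots?
        final boolean overridesFinalize;    // does the class override finalize()?
        final boolean rescuesItself;        // does finalize() re-attach the object?
        boolean finalizeCalled = false;     // finalize() runs at most once

        ToyObject(boolean overridesFinalize, boolean rescuesItself) {
            this.overridesFinalize = overridesFinalize;
            this.rescuesItself = rescuesItself;
        }
    }

    // Returns true if the object is still alive after one collection cycle.
    static boolean survivesCollection(ToyObject o) {
        if (o.reachable) {
            return true;                    // never marked in the first place
        }
        // First mark and filter: is it necessary to execute finalize()?
        boolean mustRunFinalize = o.overridesFinalize && !o.finalizeCalled;
        if (!mustRunFinalize) {
            return false;                   // reclaimed
        }
        // The object would wait in the F-Queue; the Finalizer thread runs finalize() once.
        o.finalizeCalled = true;
        if (o.rescuesItself) {
            o.reachable = true;             // re-attached to a reference chain
        }
        // Second mark: only a re-attached object escapes.
        return o.reachable;
    }

    public static void main(String[] args) {
        ToyObject o = new ToyObject(true, true);
        o.reachable = false;                            // all references dropped
        System.out.println(survivesCollection(o));      // rescued by finalize()
        o.reachable = false;                            // references dropped again
        System.out.println(survivesCollection(o));      // finalize() is not run a second time
    }
}
```

The second call shows why the "at most once" rule matters: the same self-rescue cannot work twice.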

 

Garbage collection algorithms

Mark-sweep algorithm (the most basic collection algorithm):

First, all objects that need to be reclaimed are marked; once marking is complete, all marked objects are reclaimed together. The marking step uses the reachability analysis described above.

Drawbacks: 1) efficiency: neither marking nor sweeping is very efficient; 2) space: sweeping leaves a large number of non-contiguous memory fragments. With too much fragmentation, a later allocation of a large object may fail to find enough contiguous memory and trigger another garbage collection early.
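The fragmentation drawback can be seen in a toy heap made of fixed-size slots. The slot model and all names below are assumptions for illustration; here the mark array records the survivors, and every other slot is swept.

```java
// Toy mark-sweep over an array of slots: null means a free slot.
final class MarkSweepFragmentationDemo {
    // Reclaims every slot whose object did not survive marking; nothing is moved.
    static String[] sweep(String[] heap, boolean[] survives) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (!survives[i]) after[i] = null;
        }
        return after;
    }

    static int freeSlots(String[] heap) {
        int free = 0;
        for (String slot : heap) {
            if (slot == null) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String slot : heap) {
            run = (slot == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    // Every second object dies, so the free space ends up scattered.
    static String[] demoHeapAfterSweep() {
        String[] heap = { "a", "b", "c", "d", "e", "f", "g", "h" };
        boolean[] survives = { true, false, true, false, true, false, true, false };
        return sweep(heap, survives);
    }

    public static void main(String[] args) {
        String[] after = demoHeapAfterSweep();
        System.out.println("free slots: " + freeSlots(after));
        System.out.println("largest contiguous run: " + largestFreeRun(after));
    }
}
```

Half of the heap is free after the sweep, but an object that needs two adjacent slots still cannot be placed.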

 

A more detailed breakdown of the heap

  • Young generation
    Eden: newly created objects are placed here, and the garbage collector visits it frequently
    Survivor: the survivor spaces
  • Old generation
    Tenured Gen: holds long-lived objects that have been promoted

 

Copying algorithm:

  Available memory is divided into two halves of equal size, and only one of them is used at a time. When that half runs out, the surviving objects are copied to the other half, and the used half is then cleared in one go. Because a whole half is reclaimed each time, allocation never has to deal with fragmentation: memory is allocated simply by moving the heap-top pointer in order, which is simple and efficient. The cost is that usable memory shrinks to half of the original, which is rather high.
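A minimal sketch of the two-halves scheme with pointer-bump allocation. The class is hypothetical, objects are modelled as strings occupying one slot each, and the set of live objects is passed in rather than computed.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy semi-space heap: only one half is in use at any time.
final class SemiSpaceSketch {
    private String[] from;      // the half currently in use
    private String[] to;        // the reserved half
    private int top = 0;        // bump pointer: next free slot in "from"

    SemiSpaceSketch(int halfSize) {
        from = new String[halfSize];
        to = new String[halfSize];
    }

    // Allocation only moves the pointer; returns false when the half is full.
    boolean allocate(String obj) {
        if (top == from.length) return false;
        from[top++] = obj;
        return true;
    }

    // Copies the live objects to the other half, clears the used half, swaps roles.
    void collect(Set<String> live) {
        int newTop = 0;
        for (int i = 0; i < top; i++) {
            if (live.contains(from[i])) to[newTop++] = from[i];
        }
        Arrays.fill(from, null);            // the used half is cleared in one go
        String[] tmp = from;
        from = to;
        to = tmp;
        top = newTop;                       // survivors are packed, no fragmentation
    }

    int used() { return top; }

    public static void main(String[] args) {
        SemiSpaceSketch heap = new SemiSpaceSketch(4);
        for (String s : new String[] { "a", "b", "c", "d" }) heap.allocate(s);
        System.out.println(heap.allocate("e"));                 // the half is full
        heap.collect(new HashSet<>(Arrays.asList("b", "d")));   // only b and d are alive
        System.out.println(heap.used());
        System.out.println(heap.allocate("e"));                 // room again after copying
    }
}
```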

  Improvement: commercial virtual machines now use this algorithm to collect the young generation. About 98% of young-generation objects die shortly after they are created, so memory does not need to be split 1:1. Instead, it is divided into one larger Eden space and two smaller Survivor spaces, and only Eden and one Survivor space are in use at any time. On collection, the live objects in Eden and in the Survivor space in use are copied to the other Survivor space in one pass, and then Eden and the Survivor space just used are cleared. HotSpot's default Eden : Survivor ratio is 8:1. When the Survivor space is not large enough, other memory (the old generation) is relied on as an allocation guarantee.
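The effect of the 8:1 ratio is simple arithmetic: with one Eden and two Survivor spaces, only one Survivor space sits idle. The helper below is a hypothetical illustration of that calculation, not a JVM API.

```java
final class SurvivorRatioMath {
    // Fraction of the young generation available for objects when the
    // Eden : Survivor ratio is edenToSurvivorRatio : 1 and there are two Survivor spaces.
    static double usableFraction(int edenToSurvivorRatio) {
        double total = edenToSurvivorRatio + 2.0;   // Eden + two Survivor spaces
        double inUse = edenToSurvivorRatio + 1.0;   // Eden + the one Survivor space in use
        return inUse / total;
    }

    public static void main(String[] args) {
        // With the default 8:1 ratio, 90% of the young generation is usable
        // and only 10% is held in reserve, instead of 50% for a plain 1:1 split.
        System.out.println(Math.round(usableFraction(8) * 100) + "%");
    }
}
```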

 

Mark-compact algorithm

When the object survival rate is high, the copying algorithm has to perform many copy operations and becomes inefficient. More importantly, because survival rates in the old generation are high, extra space would be needed as an allocation guarantee, so the old generation generally cannot use this algorithm.

For the old generation, the marking phase is the same as in mark-sweep, but the follow-up step moves all surviving objects toward one end and then directly clears the memory beyond the end boundary.
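A toy version of the compaction step, using an array of slots as the heap. The names are hypothetical and the mark array is given rather than computed; the point is only that the survivors end up packed at one end.

```java
import java.util.Arrays;

// Toy mark-compact: surviving objects slide toward index 0.
final class MarkCompactSketch {
    // Moves surviving objects to the front in place and clears everything
    // beyond the boundary. Returns the boundary (index of the first free slot).
    static int compact(String[] heap, boolean[] survives) {
        int next = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && survives[i]) {
                heap[next++] = heap[i];     // next <= i, so no survivor is overwritten
            }
        }
        for (int i = next; i < heap.length; i++) {
            heap[i] = null;                 // clear the memory beyond the boundary
        }
        return next;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "b", "c", "d", "e", "f", "g", "h" };
        boolean[] survives = { true, false, true, false, true, false, true, false };
        int boundary = compact(heap, survives);
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
    }
}
```

After compaction the four free slots are contiguous, which avoids the fragmentation left behind by a plain sweep.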

 

 

Generational collection algorithm

  The young generation uses the copying algorithm; the old generation uses the mark-compact algorithm.

 

Garbage collectors

 

 

 The figure shows seven collectors that work on different generations. A line between two collectors means they can be used together, and the region a collector sits in shows whether it is a young-generation or an old-generation collector. HotSpot implements this many collectors because no perfect collector exists; one simply chooses the collector best suited to a particular application.

 

Serial collector

  • The most basic collector, with the longest history.
  • Single-threaded: the application runs for a while, all user threads stop while garbage is collected, and then the application resumes.
  • Suited to desktop (client) applications.

 

ParNew collector

A multi-threaded version of the Serial collector.

 

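 Concurrency and parallelism in the context of garbage collectors:

  • Parallel: multiple garbage collection threads work in parallel, while the user threads remain in a waiting state.
  • Concurrent: user threads and garbage collection threads execute at the same time (possibly by alternating). The user program keeps running while the garbage collector runs on another CPU.

Parallel Scavenge collector

Throughput: time spent running user code / (time spent running user code + garbage collection time)

  • A multi-threaded collector
  • Its goal is a controllable throughput
  • -XX:MaxGCPauseMillis: maximum garbage collection pause time, in milliseconds
  • -XX:GCTimeRatio: sets the throughput target; a value in the range (0, 100)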
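The throughput formula, and the way -XX:GCTimeRatio is usually read, can be worked through numerically. The helper names are hypothetical. The 1 / (1 + N) relationship between the flag value N and the permitted share of GC time is the commonly documented interpretation and is stated here as an assumption, not taken from these notes.

```java
final class ThroughputMath {
    // Throughput = user code time / (user code time + garbage collection time).
    static double throughput(double userCodeMs, double gcMs) {
        return userCodeMs / (userCodeMs + gcMs);
    }

    // Assumed reading of -XX:GCTimeRatio=N: GC may take at most 1 / (1 + N) of total time.
    static double maxGcTimeFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 99 ms of user code and 1 ms of GC give a throughput of 99%.
        System.out.println(throughput(99, 1));
        // Under the assumed reading, N = 19 allows up to 5% GC time and N = 99 up to 1%.
        System.out.println(maxGcTimeFraction(19));
        System.out.println(maxGcTimeFraction(99));
    }
}
```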

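CMS collector (Concurrent Mark Sweep)

A collector whose goal is the shortest possible collection pause time.

It is based on the mark-sweep algorithm.

Four steps: initial mark, concurrent mark, remark, concurrent sweep

The initial mark and remark steps require a "stop the world" pause. The initial mark only marks the objects that GC Roots can reach directly, so it is very fast. The concurrent mark phase is the GC Roots tracing process. The remark phase corrects the mark records of objects whose marking changed because the user program kept running during concurrent marking; its pause is usually somewhat longer than the initial mark but far shorter than the concurrent mark.

Because the collector threads can work alongside user threads during the two longest phases, concurrent mark and concurrent sweep, the CMS collector's memory reclamation as a whole runs concurrently with the user threads.

Advantages: concurrent collection, low pauses

Disadvantages:

  • It consumes a lot of CPU resources, which slows the application down and lowers total throughput.
  • It cannot handle floating garbage. User threads are still running during CMS's concurrent sweep phase, so new garbage keeps being produced. This garbage appears after the marking process, so CMS cannot handle it in the current collection and has to leave it for the next GC.
  • A "Concurrent Mode Failure" may occur and trigger another Full GC. Because user threads keep running during collection, enough memory must be reserved for them, so CMS cannot wait until the old generation is almost full before collecting, as other collectors do. If the memory CMS reserves cannot satisfy the program's needs, a "Concurrent Mode Failure" occurs.
  • It produces a large amount of space fragmentation, because it is based on the mark-sweep algorithm.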

G1 collector (Garbage-First)

One of the most advanced results in collector technology today. It targets server-side applications.

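Advantages:

  • Parallelism and concurrency: G1 takes full advantage of multi-CPU, multi-core hardware, using multiple CPUs (or CPU cores) to shorten stop-the-world pauses. For some GC actions that would require other collectors to pause Java threads, G1 can still let the Java program keep running by doing the work concurrently.
  • Generational collection (regions): as with other collectors, the generational concept is retained in G1. Although G1 can manage the entire GC heap on its own without cooperating with other collectors, it handles newly created objects differently from old objects that have been alive for some time and survived several GCs, in order to get better collection results.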

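  • Space compaction: unlike CMS's mark-sweep algorithm, G1 as a whole is based on the mark-compact algorithm, while locally (between two Regions) it is based on the copying algorithm. Either way, both algorithms mean that G1 produces no memory fragmentation while it runs, and regular free memory is available after a collection. This helps programs run for a long time: allocating a large object will not trigger the next GC early because no contiguous memory can be found.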

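  • Predictable pauses: this is another major advantage of G1 over CMS. Reducing pause time is a shared concern of G1 and CMS, but besides pursuing low pauses G1 can also build a predictable pause-time model, letting the user specify that within a time slice of M milliseconds no more than N milliseconds may be spent on garbage collection. This is almost a characteristic of real-time Java (RTSJ) garbage collectors.

Collectors before G1 collect either the whole young generation or the whole old generation; G1 no longer does. With G1, the memory layout of the Java heap differs greatly from that of other collectors: the whole heap is divided into multiple independent regions (Regions) of equal size. The concepts of young and old generation are retained, but the two generations are no longer physically separated; each is a collection of Regions that need not be contiguous.

G1 can build a predictable pause-time model because it deliberately avoids collecting the entire Java heap at once. G1 tracks the value of the garbage accumulated in each Region (the amount of space a collection would free and an empirical estimate of the time it would take), maintains a priority list in the background, and each time, within the allowed collection time, reclaims the Regions with the highest value first (hence the name Garbage-First). Dividing memory into Regions and reclaiming them by priority ensures that G1 achieves the highest possible collection efficiency within a limited time.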

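G1's operation can be roughly divided into the following steps:

  • Initial marking: marks the objects that GC Roots can reach directly; threads must be paused, but the pause is very short
  • Concurrent marking: performs reachability analysis on the heap starting from the GC Roots to find the live objects; this phase takes longer but can run concurrently with the user program
  • Final marking: corrects the mark records that changed because the user program kept running during concurrent marking
  • Screening and reclamation: ranks the Regions by reclamation value and cost, and builds a collection plan according to the GC pause time the user expects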
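The "highest value first within a time budget" idea can be sketched as a greedy selection. This is a deliberately simplified illustration, not G1's actual heuristics: the `Region` fields, the value formula (reclaimable bytes per millisecond of estimated cost) and the greedy loop are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified Garbage-First style selection; not the real G1 policy.
final class GarbageFirstSketch {
    // Hypothetical per-Region bookkeeping.
    static final class Region {
        final String name;
        final long reclaimableBytes;      // space a collection would free
        final double estimatedCostMs;     // empirical estimate of the collection time

        Region(String name, long reclaimableBytes, double estimatedCostMs) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.estimatedCostMs = estimatedCostMs;
        }

        double value() { return reclaimableBytes / estimatedCostMs; }
    }

    // Picks Regions in descending order of value while the estimated pause stays within budget.
    static List<String> chooseRegions(List<Region> regions, double pauseBudgetMs) {
        List<Region> byValue = new ArrayList<>(regions);
        byValue.sort((x, y) -> Double.compare(y.value(), x.value()));
        List<String> chosen = new ArrayList<>();
        double spentMs = 0;
        for (Region r : byValue) {
            if (spentMs + r.estimatedCostMs <= pauseBudgetMs) {
                chosen.add(r.name);
                spentMs += r.estimatedCostMs;
            }
        }
        return chosen;
    }

    // Four Regions and a 6 ms budget: B (value 30) and A (value 20) fit, C and D do not.
    static List<String> demo() {
        List<Region> regions = Arrays.asList(
                new Region("A", 80, 4.0),
                new Region("B", 30, 1.0),
                new Region("C", 50, 5.0),
                new Region("D", 10, 2.0));
        return chooseRegions(regions, 6.0);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```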

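Memory allocation and reclamation strategies

Objects are allocated in Eden first: in most cases, objects are allocated in the Eden space of the young generation. When Eden does not have enough space for an allocation, the virtual machine triggers a Minor GC.

Young-generation GC (Minor GC): a garbage collection that takes place in the young generation. Because most Java objects die shortly after they are created, Minor GCs are very frequent and usually fast.
Old-generation GC (Major GC / Full GC): a GC that takes place in the old generation. A Major GC is often accompanied by at least one Minor GC (though not always: the Parallel Scavenge collector's collection policy includes an option to go straight to a Major GC). A Major GC is generally more than 10 times slower than a Minor GC.

Large objects go straight into the old generation: the virtual machine provides the -XX:PretenureSizeThreshold parameter, and objects larger than this value are allocated directly in the old generation. The purpose is to avoid large amounts of memory copying between Eden and the two Survivor spaces.

Long-lived objects enter the old generation: the virtual machine defines an age counter for every object. If an object is born in Eden, survives its first Minor GC, and fits into a Survivor space, it is moved to the Survivor space and its age is set to 1. Each time the object survives another Minor GC in the Survivor space, its age increases by 1; when its age reaches a certain threshold (15 by default), it is promoted to the old generation. The promotion age threshold can be set with the -XX:MaxTenuringThreshold parameter.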

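Dynamic age determination: to adapt better to the memory behavior of different programs, the virtual machine does not always require an object's age to reach MaxTenuringThreshold before promoting it. If the total size of all objects of the same age in the Survivor space is greater than half of the Survivor space, objects whose age is greater than or equal to that age can enter the old generation directly, without waiting for the age required by MaxTenuringThreshold.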
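The rule as worded above can be written as a short function. This sketch follows the wording of these notes literally; the method name and the per-age size table are hypothetical, and a real virtual machine may compute the threshold differently.

```java
final class DynamicAgeSketch {
    // bytesByAge[age] is the total size of Survivor objects of that age (index 0 unused).
    // Returns the age from which objects are promoted to the old generation.
    static int effectiveThreshold(long[] bytesByAge, long survivorBytes, int maxTenuringThreshold) {
        for (int age = 1; age < bytesByAge.length && age < maxTenuringThreshold; age++) {
            // Objects of this one age already occupy more than half of the Survivor space.
            if (2 * bytesByAge[age] > survivorBytes) {
                return age;
            }
        }
        return maxTenuringThreshold;
    }

    public static void main(String[] args) {
        long survivorBytes = 100;
        // Age 3 alone holds 60 of 100 bytes, so objects aged 3 or more are promoted early.
        System.out.println(effectiveThreshold(new long[] { 0, 10, 20, 60, 5 }, survivorBytes, 15));
        // No single age exceeds half of the space, so the configured maximum applies.
        System.out.println(effectiveThreshold(new long[] { 0, 10, 20, 30 }, survivorBytes, 15));
    }
}
```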

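Space allocation guarantee: before a Minor GC, the virtual machine first checks whether the largest contiguous free space in the old generation is greater than the total size of all objects in the young generation. If so, the Minor GC is guaranteed to be safe. If not, the virtual machine checks whether the HandlePromotionFailure setting allows the guarantee to fail. If it does, the virtual machine goes on to check whether the largest contiguous free space in the old generation is greater than the average size of the objects promoted to the old generation in previous collections. If it is greater, a Minor GC is attempted, even though it is risky; if it is smaller, or HandlePromotionFailure does not allow the risk, a Full GC is performed instead.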
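The decision described above reads naturally as a small function. The enum, parameter names and sizes below are hypothetical; the sketch only mirrors the order of the checks in the paragraph.

```java
final class AllocationGuaranteeSketch {
    enum Decision { SAFE_MINOR_GC, RISKY_MINOR_GC, FULL_GC }

    static Decision beforeMinorGc(long oldGenMaxContiguousFree,
                                  long youngGenTotalObjectBytes,
                                  boolean handlePromotionFailure,
                                  long averagePromotedBytes) {
        // Even if every young object survived, the old generation could take them all.
        if (oldGenMaxContiguousFree > youngGenTotalObjectBytes) {
            return Decision.SAFE_MINOR_GC;
        }
        // Otherwise gamble only if allowed and past promotions suggest enough room.
        if (handlePromotionFailure && oldGenMaxContiguousFree > averagePromotedBytes) {
            return Decision.RISKY_MINOR_GC;
        }
        return Decision.FULL_GC;
    }

    public static void main(String[] args) {
        System.out.println(beforeMinorGc(500, 300, false, 100));   // enough room for everything
        System.out.println(beforeMinorGc(200, 300, true, 100));    // relies on the promotion average
        System.out.println(beforeMinorGc(200, 300, false, 100));   // risk not allowed
        System.out.println(beforeMinorGc(50, 300, true, 100));     // average promotion would not fit
    }
}
```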

Origin www.cnblogs.com/mcq1999/p/12098552.html