(Reprint) JVM memory model and garbage collection

Reprinted from the WeChat official account Java Advanced Architecture (Java-jiagou): "After reading this article, even my grandmother understood the JVM memory model and garbage collection!"

6. Memory model

6.1 Memory model and run-time data areas

While executing a Java program, the Java virtual machine divides the memory it manages into several different data areas.

The main goal of the Java memory model is to define the access rules for each variable in a program, i.e. the low-level details of how the virtual machine stores variables into memory and reads them back out of memory.

The main memory and working memory discussed here are not the same level of memory division as the Java heap, stack, and method area; the two are essentially unrelated. If they must be related to each other, then going by the definitions of variables, main memory, and working memory, main memory corresponds mainly to the object instance data in the Java heap, and working memory corresponds to part of the virtual machine stack.

The run-time data areas were described at length above, but data storage is concentrated in the heap and the method area (non-heap), so memory design also focuses on these two areas (note that both are shared by all threads).

The virtual machine stack, the native method stack, and the program counter are private to each thread.

6.2 Diagram

Memory consists of a non-heap area and a heap area.

The heap is divided into two parts: the Old area and the Young area.

The Young area is divided into two parts: the Survivor area (S0 + S1) and the Eden area. Eden : S0 : S1 = 8 : 1 : 1

S0 and S1 are the same size and are also called From and To.
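
To see how a running JVM actually divides memory into heap and non-heap areas, the standard java.lang.management API can list the memory pools. This is a minimal sketch; the class name MemoryPools is made up for illustration, and the pool names printed depend on the JVM version and the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    // Returns the names of the memory pools of the given type in the running JVM.
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Heap pools:     " + poolNames(MemoryType.HEAP));
        System.out.println("Non-heap pools: " + poolNames(MemoryType.NON_HEAP));
    }
}
```

On a generational collector the heap pools usually include one pool each for Eden, Survivor, and the old generation, which matches the division described above.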

From the earlier introduction to the heap, we know that ordinary objects and arrays are allocated in heap memory. The question is: the heap has so many areas, so in which area is an object created?

6.3 Where objects are created

Under normal circumstances, a newly created object is allocated in the Eden area; some especially large objects are allocated directly in the Old area.

For example, objects A, B, and C are created in the Eden area. Eden's memory is limited, say 100 MB. When those 100 MB are used up, or usage reaches a set threshold, Eden's memory must be cleaned up, i.e. garbage collected. This GC is called a Minor GC; Minor GC refers to GC of the Young area.

After the GC, some objects have been cleared and some are still alive. The surviving objects are copied to the Survivor area, and the Eden area is then emptied.

6.4 The Survivor area in detail

As described above, the Survivor area is divided into S0 and S1, also called From and To.

At any point in time, only one of S0 and S1 holds data; the other is empty.

Continuing with the GC above: suppose that at the start only Eden and From contain objects, and To is empty.

A GC now runs. The age of each object in From is incremented by 1. We already know that live objects in Eden are copied to To; the live objects in From have two possible destinations.

If an object's age has reached the preset age threshold, it is moved to the Old area; objects that have not reached the threshold are copied to To.

At this point Eden and From have been emptied (the collected objects are gone, and the objects that were not collected have each gone to their own destination).

From and To then swap roles: the previous From becomes To, and the previous To becomes From.

In other words, the Survivor area named To is guaranteed to be empty in every case.

Minor GC repeats this process until the To area fills up, at which point all the objects are copied to the old generation.
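
The From/To cycle described above can be sketched as a toy model in Java. This is an illustration only, not how HotSpot is implemented: the class MinorGcSketch, its fields, and the reachable flag are all hypothetical, the model assumes an Eden survivor starts at age 1, and it does not model the To area filling up.

```java
import java.util.ArrayList;
import java.util.List;

class MinorGcSketch {
    static final class Obj {
        final String name;
        int age;                    // number of Minor GCs survived so far
        boolean reachable = true;   // stands in for "still referenced"
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();
    final int threshold;            // age at which an object moves to Old

    MinorGcSketch(int threshold) { this.threshold = threshold; }

    // New objects are created in Eden.
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    void minorGc() {
        // Live objects in From age by one, then go to Old or to To.
        for (Obj o : from) {
            if (!o.reachable) continue;
            o.age++;
            if (o.age >= threshold) old.add(o); else to.add(o);
        }
        // Live objects in Eden are copied to To (assumed to start at age 1).
        for (Obj o : eden) {
            if (!o.reachable) continue;
            o.age = 1;
            to.add(o);
        }
        // Eden and From are now empty; From and To swap roles.
        eden.clear();
        from.clear();
        List<Obj> tmp = from;
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        MinorGcSketch heap = new MinorGcSketch(3);
        Obj a = heap.allocate("A");
        Obj b = heap.allocate("B");
        b.reachable = false;        // B becomes garbage before the first GC
        for (int gc = 1; gc <= 3; gc++) {
            heap.minorGc();
            System.out.println("after GC " + gc + ": A.age=" + a.age
                    + " from=" + heap.from.size() + " old=" + heap.old.size());
        }
    }
}
```

After every call to minorGc the list named to is empty, which is the invariant the text states.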

6.5 The Old area in detail

From the analysis above, the objects in the Old area are generally either relatively large objects or objects whose age has exceeded a certain threshold.

The Old area also undergoes GC; a GC of the Old area is called a Major GC. Objects that are still referenced survive each Major GC, and objects that are no longer reachable are reclaimed.

6.6 Understanding the life of an object

I am an ordinary Java object. I was born in the Eden area, where I met a little brother who looked just like me, and we played in Eden for quite a long time. One day Eden got too crowded, and I was forced to move to the "From" area of the Survivor area. From then on I drifted, sometimes in Survivor's "From" area and sometimes in Survivor's "To" area, with no fixed home. When I turned 18, my father said I was an adult and should go out and make my way in society.

So I went to the old generation. There are many people there, all quite old, and I got to know a lot of them. I lived in the old generation for 20 years (one more year of age for each GC) and was then reclaimed. (Isn't this example easy to understand?)

6.7 Frequently asked questions

How should Minor GC, Major GC, and Full GC be understood?

  • Minor GC: the young generation

  • Major GC: the old generation

  • Full GC: the young generation + the old generation

Why is a Survivor area needed? Isn't Eden alone enough?

  • Without a Survivor area, every Minor GC in Eden would send the surviving objects straight to the old generation. The old generation would fill up quickly and trigger a Major GC (because a Major GC is generally accompanied by a Minor GC, this can also be regarded as triggering a Full GC).

  • The old generation's memory is much larger than the young generation's, so a Full GC takes much longer than a Minor GC.

  • What is the harm of a long execution time? Frequent Full GCs take a long time and hurt the execution and response speed of large programs. You might suggest enlarging or shrinking the old generation.

  • If the old generation is enlarged, it takes more live objects to fill it. This lowers the frequency of Full GC, but as the old generation grows, each Full GC takes longer once it does occur.

  • If the old generation is shrunk, each Full GC takes less time, but live objects fill the old generation sooner and Full GC becomes more frequent.

So the point of the Survivor area is to reduce the number of objects sent to the old generation and thereby reduce how often Full GC occurs. The Survivor area acts as a pre-screen: only objects that are still alive in the young generation after 16 Minor GCs are sent to the old generation.

Why are two Survivor areas needed?

The greatest benefit is that this solves fragmentation. Why not a single Survivor area? From the previous question we know that a Survivor area is necessary. Suppose there is only one Survivor area and simulate the process:

Newly created objects go into Eden. Once Eden is full, a Minor GC is triggered and the live objects in Eden are moved to the Survivor area. The cycle continues, and the problem appears the next time Eden fills up: at this Minor GC both Eden and the Survivor area hold live objects. If Eden's live objects are simply placed into the Survivor area, the memory occupied by the two groups of objects is clearly not contiguous, which causes memory fragmentation.

With two Survivor spaces, one is always empty and the other, non-empty one is free of fragmentation.

Why is Eden : S0 : S1 = 8 : 1 : 1 in the young generation?

Usable memory in the young generation : memory reserved by the copying algorithm = 9 : 1

Within the usable memory, Eden : S0 = 8 : 1

That is, in the young generation Eden : S0 : S1 = 8 : 1 : 1

P70 Copying algorithm

Modern commercial virtual machines all use this collection algorithm to reclaim the young generation. Dedicated research by IBM shows that about 98% of objects in the young generation "are born in the morning and die by evening", so memory does not need to be divided in a 1:1 ratio. Instead, the young generation is divided into one larger Eden space and two smaller Survivor spaces, and Eden and one of the Survivor spaces are used at a time. During collection, the objects still alive in Eden and that Survivor space are copied in one pass to the other Survivor space, and then Eden and the Survivor space just used are cleaned out. The default size ratio of Eden to Survivor in the HotSpot VM is 8:1, so the usable memory in the young generation is 90% of its total capacity (80% + 10%). If more than 10% of the young generation's objects survive a collection, the other Survivor space is not large enough to hold them, and those objects go directly into the old generation through the allocation guarantee mechanism.
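
The 8 : 1 : 1 arithmetic can be worked through in a few lines of Java. This is only a sketch of the calculation; the class and method names are made up, and sizes are given in MB for readability.

```java
class YoungGenLayout {
    // Size of one Survivor space, assuming Eden : S0 : S1 = ratio : 1 : 1.
    static long survivorSize(long youngSize, int ratio) {
        return youngSize / (ratio + 2);
    }

    // Eden gets whatever the two Survivor spaces leave over.
    static long edenSize(long youngSize, int ratio) {
        return youngSize - 2 * survivorSize(youngSize, ratio);
    }

    // Only Eden plus one Survivor space is usable at a time;
    // the other Survivor space is reserved as the copy target.
    static long usableSize(long youngSize, int ratio) {
        return edenSize(youngSize, ratio) + survivorSize(youngSize, ratio);
    }

    public static void main(String[] args) {
        long youngMb = 100;
        int ratio = 8;
        System.out.println("Eden     = " + edenSize(youngMb, ratio) + " MB");
        System.out.println("Survivor = " + survivorSize(youngMb, ratio) + " MB each");
        System.out.println("Usable   = " + usableSize(youngMb, ratio) + " MB");
    }
}
```

For a 100 MB young generation this gives 80 MB for Eden, 10 MB for each Survivor space, and 90 MB usable, matching the 80% + 10% figure above.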

7. Garbage collection

Garbage collection of heap memory was mentioned earlier: Minor GC for the Young area, Major GC for the Old area, and Full GC for both the Young and Old areas. But how do we determine that an object is garbage? Does it need to be collected? How is it collected? These questions need to be explored in detail.

Java manages memory and collects garbage automatically. Without an understanding of how garbage collection works, problems are hard to troubleshoot and resolve once they arise. Automatic garbage collection looks for objects in the Java heap and classifies them, identifying which objects are in use and which will no longer be used, and then removes the latter from the heap.

P61 Garbage collection in each run-time data area

The program counter, the virtual machine stack, and the native method stack are created and destroyed with their thread. Stack frames are pushed and popped in an orderly way as methods are entered and exited. How much memory each stack frame is allocated is essentially known once the class structure is determined (the JIT compiler may apply some optimizations at run time, but in the conceptual model discussed in this chapter it can be treated as known at compile time). Memory allocation and reclamation in these areas is therefore deterministic, and reclamation needs little consideration there, because the memory is naturally reclaimed when the method or the thread ends. The Java heap and the method area are different: the implementing classes of one interface may need different amounts of memory, and different branches of one method may need different amounts of memory. Only while the program is running do we know which objects will be created. Allocation and reclamation of this part of memory is dynamic, and this is the memory the garbage collector is concerned with.

P68 Reclaiming the method area

Many people think the method area (the permanent generation in the HotSpot virtual machine) has no garbage collection. The Java Virtual Machine Specification does say that a virtual machine is not required to implement garbage collection in the method area, and the "cost-effectiveness" of garbage collection there is generally low: in the heap, especially in the young generation, one garbage collection in a typical application can generally reclaim 70% to 95% of the space, while the efficiency of permanent generation collection is far lower.

Garbage collection in the permanent generation mainly reclaims two things: obsolete constants and useless classes. Reclaiming obsolete constants is very similar to reclaiming objects in the Java heap. Take literals in the constant pool as an example: if the string "abc" has entered the constant pool but no String object in the current system is "abc", in other words no String object references the "abc" constant in the constant pool and nothing else references this literal, then if memory reclamation occurs at this point and it is necessary, the "abc" constant is "invited out" of the constant pool by the system. Symbolic references to other classes (interfaces), methods, and fields in the constant pool are handled similarly.

7.1 How to determine that an object is garbage?

To perform garbage collection, we must first know which objects are garbage.

7.1.1 Reference counting

As long as something in the application holds a reference to an object, the object is not garbage. If no pointer references an object, it is garbage.
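
A minimal sketch of reference counting, assuming a toy Node type with a single outgoing reference (all names here are hypothetical). It also shows the well-known weakness of the approach, which the text above does not spell out: two objects that reference each other keep non-zero counts even after nothing else can reach them.

```java
class RefCountSketch {
    static final class Node {
        final String name;
        int refCount;   // how many references currently point at this node
        Node next;      // a single outgoing reference keeps the sketch small
        Node(String name) { this.name = name; }
    }

    // Points from.next at target and keeps both reference counts up to date.
    static void setNext(Node from, Node target) {
        if (from.next != null) from.next.refCount--;
        from.next = target;
        if (target != null) target.refCount++;
    }

    // Under reference counting, an object with no references is garbage.
    static boolean isGarbage(Node n) {
        return n.refCount == 0;
    }

    public static void main(String[] args) {
        Node a = new Node("a");
        Node b = new Node("b");
        a.refCount++;        // a local variable (a root) refers to a
        setNext(a, b);       // a -> b
        setNext(b, a);       // b -> a, forming a cycle
        a.refCount--;        // the root reference goes away
        // Nothing outside the cycle refers to a or b, yet neither count is zero.
        System.out.println("a garbage? " + isGarbage(a) + ", b garbage? " + isGarbage(b));
    }
}
```

Reachability analysis, described next, does not have this problem because it only follows references that start from GC Roots.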

7.1.2 Reachability analysis

Starting from the GC Root objects, search along references to see whether an object is reachable.

The following can serve as GC Roots: class loaders, Threads, the local variable table of the virtual machine stack, static members, constant references, variables in the native method stack, and so on.

  • Objects referenced in the virtual machine stack (the local variable table of a stack frame).

  • Objects referenced by static class fields in the method area.

  • Objects referenced by constants in the method area.

  • Objects referenced by JNI (i.e. what are generally called native methods) in the native method stack.
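
Reachability analysis is a graph traversal that starts at the GC Roots. The sketch below assumes the object graph is given as a map from an object's name to the names it references; the class ReachabilitySketch and this representation are illustrative assumptions, not how a JVM stores references.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ReachabilitySketch {
    // Returns every object that can be reached from the given roots.
    static Set<String> reachable(Map<String, List<String>> refs, Collection<String> roots) {
        Set<String> found = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (found.add(obj)) {   // first visit: follow its outgoing references
                pending.addAll(refs.getOrDefault(obj, Collections.<String>emptyList()));
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Map<String, List<String>> refs = new HashMap<>();
        refs.put("A", Arrays.asList("B"));   // A -> B
        refs.put("D", Arrays.asList("E"));   // D -> E
        refs.put("E", Arrays.asList("D"));   // E -> D, a cycle no root points into
        // A and C play the role of objects referenced directly by GC Roots.
        Set<String> live = reachable(refs, Arrays.asList("A", "C"));
        System.out.println("reachable: " + live);
        System.out.println("D is garbage: " + !live.contains("D"));
    }
}
```

D and E reference each other but are not reachable from any root, so they count as garbage here.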

7.2 Garbage collection algorithms

Once an object has been identified as garbage, the next question is how to reclaim it. That requires an algorithm; the common garbage collection algorithms follow.

7.2.1 Mark-Sweep

Mark: find the objects in memory that need to be reclaimed and mark them.

In this phase all objects in the heap are scanned to determine which ones need to be reclaimed, which is time-consuming.

Sweep: clear the objects marked for reclamation and release the corresponding memory.

Disadvantages:

(1) Both the mark and the sweep process are time-consuming and inefficient.

(2) It produces a large number of discontiguous memory fragments. Too much fragmentation may mean that when the program later needs to allocate a large object, no sufficiently large contiguous block can be found, and another garbage collection has to be triggered early.
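
The fragmentation problem can be illustrated with a toy heap of fixed-size slots. This sketch assumes the mark phase has already produced a garbage array saying which slots to reclaim; the class and method names are hypothetical.

```java
import java.util.Arrays;

class MarkSweepSketch {
    // Sweep: free every occupied slot that was marked for reclamation, in place.
    static void sweep(boolean[] occupied, boolean[] garbage) {
        for (int i = 0; i < occupied.length; i++) {
            if (occupied[i] && garbage[i]) {
                occupied[i] = false;
            }
        }
    }

    static int freeSlots(boolean[] occupied) {
        int free = 0;
        for (boolean slot : occupied) {
            if (!slot) free++;
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestFreeRun(boolean[] occupied) {
        int best = 0;
        int run = 0;
        for (boolean slot : occupied) {
            run = slot ? 0 : run + 1;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        boolean[] occupied = new boolean[8];
        Arrays.fill(occupied, true);                 // the heap starts full
        boolean[] garbage = {false, true, false, true, false, true, false, true};
        sweep(occupied, garbage);
        System.out.println("free slots:       " + freeSlots(occupied));
        System.out.println("largest free run: " + largestFreeRun(occupied));
    }
}
```

Half of the heap is free after the sweep, but no two free slots are adjacent, so an object that needs two contiguous slots still cannot be allocated.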

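7.2.2 Copying

Memory is divided into two equal areas, and only one of them is used at a time.

When that area is used up, the objects still alive are copied to the other area, and the used area is then cleared in one pass.

Disadvantage: lower space utilization.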
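
A toy version of the copying algorithm, assuming the two halves are plain arrays and the set of live objects is already known (both assumptions, as are all the names):

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class CopyingSketch {
    // Copies every live object to the start of toSpace, clears fromSpace in one
    // pass, and returns the number of objects copied (the next free index).
    static int collect(String[] fromSpace, String[] toSpace, Set<String> live) {
        int next = 0;
        for (String obj : fromSpace) {
            if (obj != null && live.contains(obj)) {
                toSpace[next++] = obj;
            }
        }
        Arrays.fill(fromSpace, null);
        return next;
    }

    public static void main(String[] args) {
        String[] fromSpace = {"A", "B", "C", "D"};
        String[] toSpace = new String[4];
        Set<String> live = new HashSet<>(Arrays.asList("A", "C"));
        int used = collect(fromSpace, toSpace, live);
        System.out.println("to-space:   " + Arrays.toString(toSpace) + ", used=" + used);
        System.out.println("from-space: " + Arrays.toString(fromSpace));
    }
}
```

The survivors end up packed together with no gaps, but at any time only one of the two arrays holds objects, which is the space cost noted above.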

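7.2.3 Mark-Compact

The copying algorithm has to perform more copy operations when the object survival rate is high, and its efficiency drops. More importantly, if you do not want to waste 50% of the space, extra space is needed as an allocation guarantee for the extreme case in which 100% of the objects in the used memory survive, so the old generation generally cannot use this algorithm directly. In Mark-Compact, the marking process is the same as in the Mark-Sweep algorithm, but the next step does not clean up the reclaimable objects directly. Instead, all live objects are moved toward one end, and the memory beyond the end boundary is then cleared directly.

Compared with the copying algorithm, this process does without a "reserved area".

All live objects are moved toward one end, and the memory outside the boundary is cleared.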
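
A toy sketch of the compact step, assuming the heap is an array of slots and the mark phase has already produced a live array (hypothetical names throughout):

```java
import java.util.Arrays;

class MarkCompactSketch {
    // Slides every live object toward index 0, clears everything past the
    // boundary, and returns the boundary (the first free index).
    static int compact(String[] heap, boolean[] live) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] != null && live[i]) {
                heap[boundary++] = heap[i];
            }
        }
        for (int i = boundary; i < heap.length; i++) {
            heap[i] = null;
        }
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = {"A", "B", "C", "D", "E"};
        boolean[] live = {true, false, true, false, true};
        int boundary = compact(heap, live);
        System.out.println(Arrays.toString(heap) + ", boundary=" + boundary);
    }
}
```

Unlike the copying sketch, no second area is needed: the survivors are moved within the same array and all free space ends up contiguous after the boundary.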

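7.3 Generational collection algorithm

Three garbage collection algorithms were introduced above, so which one is actually used in heap memory?

P72 Generational collection algorithm

To make garbage collection more efficient, the JVM divides memory into several blocks according to how long objects live; the heap is divided into the young generation and the old generation. Each generation can then use the collection algorithm that best suits its characteristics. In the young generation, every collection finds that a large number of objects have died and only a few survive, so the copying algorithm is chosen: collection only costs the copying of a small number of survivors. In the old generation, the object survival rate is high and there is no extra space to act as an allocation guarantee, so the Mark-Sweep or Mark-Compact algorithm must be used.

Summary:

Young area: copying algorithm (after allocation, objects may have a short life cycle, so copying is efficient in the Young area)

Old area: Mark-Sweep or Mark-Compact (objects in the Old area live longer, so there is no need to copy them back and forth; it is better to mark and then clean up)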

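7.4 Garbage collectors

If collection algorithms are the methodology of memory reclamation, garbage collectors are its concrete implementation.

7.4.1 Serial collector

The Serial collector is the most basic collector and has the longest history. It was once (before JDK 1.3.1) the only choice for young generation collection in the virtual machine.

It is a single-threaded collector. This means not only that it uses a single CPU or a single collection thread to do garbage collection, but more importantly that it must pause all other threads while it collects.

Advantages: simple and efficient, with very high single-threaded collection efficiency

Disadvantage: all threads must be paused during collection

Algorithm: copying

Scope: young generation

Use: the default young generation collector in Client mode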

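7.4.2 ParNew collector

This collector can be understood as the multi-threaded version of the Serial collector.

Advantage: more efficient than Serial on multiple CPUs.

Disadvantages: all application threads are paused during collection; less efficient than Serial on a single CPU.

Algorithm: copying

Scope: young generation

Use: the preferred young generation collector for virtual machines running in Server mode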

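7.4.3 Parallel Scavenge collector

The Parallel Scavenge collector is a young generation collector. It also uses the copying algorithm and is also a parallel, multi-threaded collector. It looks the same as ParNew, but Parallel Scavenge pays more attention to `system throughput`.

  • Throughput = time running user code / (time running user code + garbage collection time)

  • For example, if the virtual machine runs for 100 minutes in total and garbage collection takes 1 minute, throughput = (100 - 1) / 100 = 99%.

  • The higher the throughput, the less time is spent on garbage collection, so user code can make full use of CPU resources and finish the program's computation as soon as possible.

-XX:MaxGCPauseMillis controls the maximum garbage collection pause time,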

-XX:GCTimeRatio directly sets the throughput.
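
The throughput formula is easy to check in code. The sketch below also shows how HotSpot's -XX:GCTimeRatio=N relates to it, going by the flag's documented meaning that the target ratio of GC time to application time is 1 : N; the class and method names are made up for illustration.

```java
class ThroughputSketch {
    // Throughput = user code time / (user code time + GC time).
    static double throughput(double userTime, double gcTime) {
        return userTime / (userTime + gcTime);
    }

    // With -XX:GCTimeRatio=N the goal is GC time : application time = 1 : N,
    // so at most 1 / (1 + N) of the total time should be spent in GC.
    static double maxGcFraction(int gcTimeRatio) {
        return 1.0 / (1 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 100 minutes in total, 1 of them spent in GC, as in the example above.
        System.out.println("throughput      = " + throughput(99, 1));
        System.out.println("GCTimeRatio=99 -> GC fraction " + maxGcFraction(99));
        System.out.println("GCTimeRatio=19 -> GC fraction " + maxGcFraction(19));
    }
}
```

A GCTimeRatio of 99 therefore corresponds to the 99% throughput of the worked example, and a value of 19 corresponds to a 5% GC time goal.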

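7.4.4 Serial Old collector

The Serial Old collector is the old generation version of the Serial collector. It is also single-threaded; the difference is that it uses the Mark-Compact algorithm. It runs the same way as the Serial collector.

7.4.5 Parallel Old collector

The Parallel Old collector is the old generation version of the Parallel Scavenge collector. It uses multiple threads and the Mark-Compact algorithm for garbage collection.

Throughput first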

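7.4.6 CMS collector

The CMS (Concurrent Mark Sweep) collector aims for the `shortest collection pause time`.

It uses the Mark-Sweep algorithm, and the whole process has 4 steps:

(1) Initial mark (CMS initial mark)

Marks the objects directly associated with GC Roots. Stop The World, but very fast.

(2) Concurrent mark (CMS concurrent mark)

Performs GC Roots tracing.

(3) Remark (CMS remark)

Corrects the marks that changed during concurrent marking because the user program kept running. Stop The World.

(4) Concurrent sweep (CMS concurrent sweep)

Because the collector threads can work alongside user threads during concurrent mark and concurrent sweep, the CMS collector's memory reclamation as a whole runs concurrently with user threads.

Advantages: concurrent collection, low pauses

Disadvantages: produces a large amount of space fragmentation; the concurrent phases reduce throughput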

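7.4.7 G1 collector

P84 G1 collector

The G1 collector officially became a commercially usable collector in JDK 7. Compared with the previous collectors, G1 has the following characteristics:

  • Parallel and concurrent

  • Generational collection (the concept of generations is retained)

  • Space compaction (overall it is a Mark-Compact algorithm and does not cause space fragmentation)

  • Predictable pauses (where it improves on CMS: the user can explicitly specify that within any time slice of M milliseconds, no more than N milliseconds may be spent on garbage collection)

With the G1 collector, the memory layout of the Java heap differs greatly from that of other collectors. G1 divides the whole Java heap into multiple independent regions (Region) of equal size. The concepts of young and old generation are retained, but they are no longer physically separated; each is a collection of Regions (which need not be contiguous).

The work can be divided into the following steps:

  • Initial marking (Initial Marking)

Marks the objects directly associated with GC Roots and updates the value of TAMS; user threads must be paused

  • Concurrent marking (Concurrent Marking)

Performs reachability analysis from GC Roots to find live objects; runs concurrently with user threads

  • Final marking (Final Marking)

Corrects the data that changed during concurrent marking because the user program kept running; user threads must be paused

  • Live data counting and evacuation (Live Data Counting and Evacuation)

Sorts the Regions by collection value and cost, and makes a collection plan according to the GC pause time the user expects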
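
The last step, ranking Regions and planning within a pause budget, can be sketched as a simple greedy selection. This is an illustration of the idea only and makes no claim about G1's real heuristics: the Region fields, the value formula (reclaimable bytes per millisecond of cost), and the greedy strategy are all assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class RegionSelectionSketch {
    static final class Region {
        final String name;
        final long reclaimableBytes;   // garbage that collecting this region frees
        final double costMillis;       // estimated pause needed to collect it
        Region(String name, long reclaimableBytes, double costMillis) {
            this.name = name;
            this.reclaimableBytes = reclaimableBytes;
            this.costMillis = costMillis;
        }
        // Hypothetical "collection value": bytes freed per millisecond of pause.
        double value() { return reclaimableBytes / costMillis; }
    }

    // Picks the most valuable regions whose total cost fits the pause budget.
    static List<Region> plan(List<Region> regions, double pauseBudgetMillis) {
        List<Region> sorted = new ArrayList<>(regions);
        sorted.sort((a, b) -> Double.compare(b.value(), a.value()));
        List<Region> chosen = new ArrayList<>();
        double spent = 0;
        for (Region r : sorted) {
            if (spent + r.costMillis <= pauseBudgetMillis) {
                chosen.add(r);
                spent += r.costMillis;
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Region> regions = Arrays.asList(
                new Region("R1", 800, 4.0),
                new Region("R2", 300, 1.0),
                new Region("R3", 900, 9.0));
        for (Region r : plan(regions, 5.0)) {
            System.out.println("collect " + r.name);
        }
    }
}
```

With a 5 ms budget the plan collects R2 and then R1 and skips R3: R3 holds the most garbage but is the most expensive to collect per byte freed.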

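7.4.8 Classification of garbage collectors

  • Serial collectors -> Serial and Serial Old

Only one garbage collection thread runs, and user threads are paused. `Suitable for embedded devices with relatively little memory`.

  • Parallel collectors [throughput first] -> Parallel Scavenge, Parallel Old

Multiple garbage collection threads work in parallel, but user threads still wait. `Suitable for weakly interactive scenarios such as scientific computing and background processing`.

  • Concurrent collectors [pause time first] -> CMS, G1

User threads and garbage collection threads execute at the same time (not necessarily in parallel; they may alternate), and the garbage collection threads do not pause user threads while they run. `Suitable for scenarios with response time requirements, such as the Web`.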

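7.4.9 Frequently asked questions

Throughput and pause time

Pause time -> the time for which the garbage collector interrupts the application's execution while `performing` garbage collection

Throughput -> time running user code / (time running user code + garbage collection time)

The shorter the pause time, the better suited the collector is to programs that interact with users; good response speed improves the user experience.

High throughput uses CPU time efficiently and finishes the program's computation as soon as possible; it mainly suits background computation that does not need much interaction.

Summary: these two metrics are also the criteria for judging how good a garbage collector is, and tuning really comes down to observing these two variables.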

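How to choose a suitable garbage collector

* First adjust the heap size and let the server choose for itself

* If memory is less than 100 MB, use the serial collector

* On a single core with no pause time requirement, use the serial collector or let the JVM choose

* If pauses longer than 1 second are acceptable, choose the parallel collector or let the JVM choose

* If response time matters most and pauses must not exceed 1 second, use a concurrent collector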

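About the G1 collector

Available since JDK 7, very mature in JDK 8, and the default garbage collector in JDK 9; it applies to both the young and the old generation.

When should the G1 collector be used?

(1) More than 50% of the heap is occupied by live objects

(2) Object allocation and promotion rates vary greatly

(3) Garbage collection takes a relatively long time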

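How to enable the garbage collector you need

There is no need to worry about the JVM parameter settings here yet; the next chapter covers them in detail.

(1) Serial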

-XX:+UseSerialGC

-XX:+UseSerialOldGC

(2) Parallel (throughput first):

-XX:+UseParallelGC

-XX:+UseParallelOldGC

(3) Concurrent collectors (response time first)

-XX:+UseConcMarkSweepGC

-XX:+UseG1GC
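
To check which collectors a JVM is actually running with after one of these flags is passed, the standard java.lang.management API can be queried. This is a minimal sketch; the class name ActiveCollectors is made up, and the collector names reported depend on the JVM version and the flags used.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class ActiveCollectors {
    // Returns one line per collector registered in the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName() + " (collections so far: " + gc.getCollectionCount() + ")");
        }
        return names;
    }

    public static void main(String[] args) {
        for (String line : collectorNames()) {
            System.out.println(line);
        }
    }
}
```

Running it once with -XX:+UseSerialGC and once with -XX:+UseG1GC should print different collector names, which is a quick way to confirm that a flag took effect.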

Origin www.cnblogs.com/gocode/p/memory-model-of-jvm-and-garbage-collection.html