[Repost] JVM structure and GC mechanism explained

This article is divided into four parts: JVM structure, memory allocation, garbage collection algorithms, and garbage collectors. Let's go through them one by one.

I. JVM structure

The Java Virtual Machine Specification describes the basic structure of the JVM roughly as follows: [Figure: basic structure of the JVM]

As the left-hand figure shows, the JVM consists of four parts:

1. Class loader (ClassLoader): loads the classes that are needed into the JVM, either when the JVM starts or while a class is running. (The right-hand figure shows the whole path from a Java source file into the JVM; for more on the class loading mechanism, see http://blog.csdn.net/tonytfjing/article/details/47212291 )

2. Execution engine: executes the bytecode instructions contained in class files (how the execution engine works is not covered here; this part only introduces the JVM structure).

3. Memory area (also called the runtime data area): the memory allocated for the JVM's operations while it runs. The runtime memory area can be divided into five main regions:

Method area (Method Area): stores class structure information, including the constant pool, static variables, constructors, and so on. Although the JVM specification describes the method area as a logical part of the heap, it also goes by the alias non-heap, so don't confuse the two. The method area also contains the runtime constant pool.

Java heap (Heap): where Java instances, that is, objects, are stored. This is the main area managed by GC (explained later). From what they store, it is easy to see that the heap and the method area are shared by all Java threads.

Java stack (Stack): a Java stack is always tied to a thread. Whenever a thread is created, the JVM creates a corresponding Java stack for it. The stack holds multiple stack frames: each method that runs creates a frame, which stores the local variable table, the operand stack, the method's return value, and so on. Each method call, from invocation to completion, corresponds to one frame being pushed onto and popped off the Java stack. The Java stack is therefore thread-private.

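Program counter (PC Register): holds the address of the instruction the current thread is executing. Because JVM programs run on multiple threads that take turns being scheduled, each thread needs its own counter recording where it was interrupted, so that it can resume in the same state when it is switched back in. The program counter is therefore also thread-private.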

Native method stack (Native Method Stack): plays much the same role as the Java stack, except that it serves the native methods used by the JVM.

4. Native method interface: mainly calls native methods implemented in C or C++ and returns their results.
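Some of these areas can be observed from inside a running program. The following is a minimal sketch, assuming a standard JDK with the `java.lang.management` API available; the class name `RuntimeAreasDemo` and its helper methods are made up for illustration. It reads the heap and non-heap usage that the JVM reports (the non-heap figure includes the method area) and shows that every nested method call adds one frame to the calling thread's own stack.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class RuntimeAreasDemo {
    // Heap usage: the shared area where object instances are stored.
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    // Non-heap usage: the JVM reports the method area under this heading.
    static MemoryUsage nonHeap() {
        return ManagementFactory.getMemoryMXBean().getNonHeapMemoryUsage();
    }

    // Calls itself `extra` more times, then reports how many frames are on
    // the calling thread's stack. Each nested call adds exactly one frame.
    static int stackDepth(int extra) {
        if (extra > 0) {
            return stackDepth(extra - 1);
        }
        return Thread.currentThread().getStackTrace().length;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("heap used (bytes):     " + heap().getUsed());
        System.out.println("non-heap used (bytes): " + nonHeap().getUsed());
        System.out.println("frames added by 5 nested calls: "
                + (stackDepth(5) - stackDepth(0)));

        // A new thread gets its own stack, independent of the main thread's.
        Thread worker = new Thread(() ->
                System.out.println("worker thread depth: " + stackDepth(0)));
        worker.start();
        worker.join();
    }
}
```

The exact byte counts and the worker thread's depth depend on the JVM and its settings; only the difference of 5 frames is fixed by the code.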

II. Memory allocation

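Before looking at garbage collection, I think you first need to understand how the JVM allocates memory, then how it identifies which memory is garbage that should be reclaimed, and only then how it reclaims it.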

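Java allocates memory differently from C/C++. First, a C/C++ program requests memory with malloc, which may have to make a system call to obtain it from the operating system; system calls run in kernel space and require a mode switch each time, which has a cost. The Java virtual machine instead reserves one large block of memory up front, and every new allocates from (and is later released back into) that block. This reduces the number of system calls and saves some overhead, much like a memory pool. Second, once that block exists, how memory inside it is allocated and reclaimed is a matter for the GC mechanism.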

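Memory requests in Java generally fall into two kinds: static and dynamic. This is easy to understand. Memory whose size can be determined at compile time is static memory: it is fixed and allocated in one go, as for a variable of type int. Dynamic allocation means the amount of storage needed is known only while the program runs, as for the memory of a Java object. As described above, the Java stack, the program counter, and the native method stack are all thread-private: they are created with the thread and destroyed with it, and a stack frame is discarded when its method ends, so its memory is reclaimed as a matter of course. Allocation and reclamation in these areas are therefore deterministic and need no attention from us. The Java heap and the method area are different: only while the program is running do we know which objects will be created, so this memory is allocated and reclaimed dynamically. This is the memory that "garbage collection" usually refers to.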

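In short, stack memory is allocated sequentially and with a fixed size, so there is no reclamation problem. The heap allocates memory for Java object instances on demand and in varying sizes, so both allocation and reclamation have to be managed there.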
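The pool-like behaviour described above can be glimpsed through the standard `Runtime` API. This is only a rough sketch: the class name `HeapPoolDemo` and its methods are illustrative, and the numbers it prints vary from run to run because the collector may run at any time.

```java
final class HeapPoolDemo {
    // Bytes of the currently reserved heap that are occupied (approximate).
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    // Allocates `chunks` arrays of `chunkBytes` each with `new`; the arrays
    // are served from the heap that the JVM manages.
    static byte[][] allocate(int chunks, int chunkBytes) {
        byte[][] data = new byte[chunks][];
        for (int i = 0; i < chunks; i++) {
            data[i] = new byte[chunkBytes];
        }
        return data;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("max heap (bytes):     " + rt.maxMemory());
        System.out.println("reserved now (bytes): " + rt.totalMemory());
        System.out.println("used before (bytes):  " + usedBytes());
        byte[][] data = allocate(16, 1 << 20);   // 16 chunks of 1 MiB
        System.out.println("used after (bytes):   " + usedBytes());
        System.out.println("chunks kept alive:    " + data.length);
    }
}
```

On HotSpot the initial and maximum size of this reserved space can be set with `-Xms` and `-Xmx`, for example `java -Xms64m -Xmx256m HeapPoolDemo`.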

III. Garbage detection and collection algorithms

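A garbage collector generally has to do two things: detect garbage and reclaim it. How is garbage detected? There are generally the following approaches: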

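Reference counting: give each object a reference counter. Whenever something references the object, the counter is incremented by 1; whenever a reference becomes invalid, it is decremented by 1. An object whose counter drops to 0 can be reclaimed.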

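Here is the problem. Suppose two objects A and B reference each other and nothing else references them. These two objects can no longer be accessed, so they are what we call garbage. But because they reference each other, their counts are not 0 and they cannot be reclaimed. Hence another approach: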
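The cycle problem is easy to reproduce in a toy model. The sketch below is not how any JVM works internally; `RefCountDemo`, `Node`, `retain`, `release`, and `link` are hypothetical names for a hand-rolled reference counter. A simple chain A -> B is freed when the last outside reference goes away, while the pair A <-> B keeps both counters at 1 forever.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class RefCountDemo {
    static final class Node {
        final String name;
        int refCount = 0;
        boolean freed = false;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    // Something (a variable or another object) starts referencing `target`.
    static void retain(Node target) { target.refCount++; }

    // A reference to `target` goes away; at zero the object is freed and
    // the references it held are released in turn.
    static void release(Node target) {
        if (--target.refCount == 0) {
            target.freed = true;
            for (Node r : target.refs) release(r);
        }
    }

    // `from` stores a reference to `to`.
    static void link(Node from, Node to) { from.refs.add(to); retain(to); }

    // A -> B with one outside reference to A: both end up freed.
    static boolean[] chain() {
        Node a = new Node("A"), b = new Node("B");
        retain(a);
        link(a, b);
        release(a);
        return new boolean[] { a.freed, b.freed };
    }

    // A <-> B: dropping the outside references leaves both counts at 1.
    static boolean[] cycle() {
        Node a = new Node("A"), b = new Node("B");
        retain(a); retain(b);
        link(a, b); link(b, a);
        release(a); release(b);
        return new boolean[] { a.freed, b.freed };
    }

    public static void main(String[] args) {
        System.out.println("chain freed: " + Arrays.toString(chain())); // [true, true]
        System.out.println("cycle freed: " + Arrays.toString(cycle())); // [false, false]
    }
}
```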

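Reachability analysis: start searching from a set of root objects; any object that cannot be reached from them is garbage. The root set generally includes objects referenced from the Java stacks, objects referenced from the constant pool in the method area, objects referenced from native methods, and so on.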

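In short, when the JVM performs garbage collection it checks whether each object in the heap can still be reached, directly or indirectly, from these root objects. Objects that cannot are reclaimed by the garbage collector. The common collection algorithms are as follows: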
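Reachability analysis amounts to a graph traversal. As a minimal sketch, assume the heap is modelled as a map from an object's name to the names it references; the class `ReachabilityDemo` and the sample objects R, A, B, and C are invented for illustration. Note that the mutually referencing pair A and B, which defeats reference counting, is correctly reported as garbage here.

```java
import java.util.*;

final class ReachabilityDemo {
    // Returns every object reachable from the roots by following references.
    static Set<String> reachable(Map<String, List<String>> heap,
                                 Collection<String> roots) {
        Set<String> seen = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String obj = pending.pop();
            if (seen.add(obj)) {               // first visit only
                List<String> refs = heap.get(obj);
                if (refs != null) pending.addAll(refs);
            }
        }
        return seen;
    }

    // Everything in the heap that is not reachable is garbage.
    static Set<String> garbage(Map<String, List<String>> heap,
                               Collection<String> roots) {
        Set<String> dead = new TreeSet<>(heap.keySet());
        dead.removeAll(reachable(heap, roots));
        return dead;
    }

    // R -> C is reachable from the root R; A <-> B only reference each other.
    static Map<String, List<String>> sampleHeap() {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("R", Arrays.asList("C"));
        heap.put("C", Collections.<String>emptyList());
        heap.put("A", Arrays.asList("B"));
        heap.put("B", Arrays.asList("A"));
        return heap;
    }

    public static void main(String[] args) {
        // prints "garbage: [A, B]"
        System.out.println("garbage: " + garbage(sampleHeap(), Arrays.asList("R")));
    }
}
```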

1. Mark-sweep (Mark-Sweep)

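As the name suggests, the algorithm has two phases: mark and sweep. It marks all objects that need to be reclaimed and then reclaims them together. This is the most basic algorithm; the later collection algorithms all build on it.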

Drawbacks: it is inefficient, and it leaves a large amount of fragmentation behind. [Figure: effect of mark-sweep]
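The fragmentation is easy to see in a toy model. In the sketch below the heap is just an array of slots and `MarkSweepDemo`, `Obj`, and the object names are hypothetical. One assumption differs from the wording above: the sketch marks the objects that survive (those reachable from the root) and sweeps everything unmarked, which produces the same result.

```java
import java.util.ArrayList;
import java.util.List;

final class MarkSweepDemo {
    static final class Obj {
        final String name;
        boolean marked;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Mark phase: flag everything reachable from the given object.
    static void mark(Obj o) {
        if (o == null || o.marked) return;
        o.marked = true;
        for (Obj r : o.refs) mark(r);
    }

    // Sweep phase: free unmarked slots in place, reset marks on survivors.
    static void sweep(Obj[] heap) {
        for (int i = 0; i < heap.length; i++) {
            if (heap[i] == null) continue;
            if (heap[i].marked) heap[i].marked = false;
            else heap[i] = null;
        }
    }

    // One character per slot; '.' is a free slot.
    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) sb.append(o == null ? "." : o.name);
        return sb.toString();
    }

    static String demo() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"),
            d = new Obj("D"), e = new Obj("E");
        Obj[] heap = { a, b, c, d, e };
        a.refs.add(c);
        c.refs.add(e);      // root -> A -> C -> E; B and D are unreachable
        mark(a);            // A is the only root
        sweep(heap);
        return layout(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "A.C.E"
    }
}
```

The two free slots are not adjacent, so an object needing two contiguous slots could not be placed even though two slots are free.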

2. Copying (Copying)

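This algorithm divides the memory space into two equal regions and uses only one of them at a time. During garbage collection it traverses the region in use and copies the objects still in use into the other region. Because it handles only live objects, the copying cost is relatively small, and because objects are laid out afresh as they are copied, there is no fragmentation problem. The drawback is equally obvious: it needs twice the memory space. [Figure: effect of copying]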
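A simplified sketch of the idea follows, with two equal arrays standing in for the two regions. `CopyingDemo` and its members are hypothetical names, and since the slots hold ordinary Java references the sketch skips the pointer-forwarding step a real copying collector needs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CopyingDemo {
    static final class Obj {
        final String name;
        boolean copied;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Copies `o` and everything it references into `to`, packing survivors
    // from index `next` onward. Returns the next free index.
    static int copy(Obj o, Obj[] to, int next) {
        if (o == null || o.copied) return next;
        o.copied = true;
        to[next++] = o;
        for (Obj r : o.refs) next = copy(r, to, next);
        return next;
    }

    // One character per slot; '.' is a free slot.
    static String layout(Obj[] space) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : space) sb.append(o == null ? "." : o.name);
        return sb.toString();
    }

    static String demo() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"),
            d = new Obj("D"), e = new Obj("E");
        Obj[] from = { a, b, c, d, e };
        Obj[] to = new Obj[from.length];   // the idle half, same size
        a.refs.add(c);
        c.refs.add(e);                     // root -> A -> C -> E
        copy(a, to, 0);                    // only live objects are touched
        Arrays.fill(from, null);           // the whole old region is freed at once
        return layout(from) + " -> " + layout(to);
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "..... -> ACE.."
    }
}
```

Survivors end up contiguous in the second region, at the price of keeping half of the space idle.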

3. Mark-compact (Mark-Compact)

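This algorithm combines the advantages of mark-sweep and copying. It also has two phases: the first marks all referenced objects starting from the root nodes; the second traverses the whole heap, clears the unmarked objects, and "compacts" the surviving objects into one part of the heap, laid out in order. It avoids the fragmentation problem of mark-sweep as well as the space problem of copying. [Figure: effect of mark-compact]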
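The two phases can be sketched on the same kind of toy heap, again with hypothetical names (`MarkCompactDemo`, `Obj`): marking is unchanged from mark-sweep, and the second pass slides each marked object down to the lowest free slot.

```java
import java.util.ArrayList;
import java.util.List;

final class MarkCompactDemo {
    static final class Obj {
        final String name;
        boolean marked;
        final List<Obj> refs = new ArrayList<>();
        Obj(String name) { this.name = name; }
    }

    // Phase 1: mark everything reachable from the given object.
    static void mark(Obj o) {
        if (o == null || o.marked) return;
        o.marked = true;
        for (Obj r : o.refs) mark(r);
    }

    // Phase 2: walk the heap once, drop unmarked objects, and slide each
    // survivor down to the lowest free slot. Returns the number of survivors.
    static int compact(Obj[] heap) {
        int free = 0;
        for (int i = 0; i < heap.length; i++) {
            Obj o = heap[i];
            if (o == null) continue;
            heap[i] = null;
            if (o.marked) {
                o.marked = false;
                heap[free++] = o;   // free <= i, so nothing unvisited is overwritten
            }
        }
        return free;
    }

    // One character per slot; '.' is a free slot.
    static String layout(Obj[] heap) {
        StringBuilder sb = new StringBuilder();
        for (Obj o : heap) sb.append(o == null ? "." : o.name);
        return sb.toString();
    }

    static String demo() {
        Obj a = new Obj("A"), b = new Obj("B"), c = new Obj("C"),
            d = new Obj("D"), e = new Obj("E");
        Obj[] heap = { a, b, c, d, e };
        a.refs.add(c);
        c.refs.add(e);      // root -> A -> C -> E; B and D are unreachable
        mark(a);
        compact(heap);
        return layout(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo());   // prints "ACE.."
    }
}
```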

(The text and figures for 1, 2, and 3 are taken from http://pengjiaheng.iteye.com/blog/520228, with thanks to the original author.)

4. Generational collection

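This is the garbage collection algorithm commonly used by today's commercial virtual machines. The generational strategy rests on one fact: different objects have different lifetimes. Objects with different lifetimes can therefore be collected in different ways to make collection more efficient.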

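Why use a generational strategy? A running Java program creates a large number of objects. Because objects have different responsibilities and functions, they also have different lifetimes. Some live long, such as the Session object in an HTTP request, threads, and Socket connections. Others live briefly, such as String objects, some of which, being immutable, can be reclaimed after a single use. Imagine that object lifetimes were not distinguished and every collection covered the whole heap: each collection would take relatively long, and the work of scanning long-lived objects would be wasted. This calls for divide and conquer, which here means adapting to circumstances: divide objects into generations, place objects with different lifetimes in different generations, and use a different collection method for each.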

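How are objects divided? By lifetime, into the young generation (Young Generation), the old generation (Old Generation), and the permanent generation (Permanent Generation). The permanent generation mainly holds class information, so it has little to do with collecting Java objects; the young and old generations are the ones that matter for collection. Here is a vivid analogy: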

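"Suppose you are an ordinary Java object. You are born in Eden, where you have many little brothers and sisters much like you. Think of Eden as a kindergarten in which everyone plays for a long time. Eden cannot keep you forever, so when you are a little older you are sent off to school; let's call everything from primary school through high school the Survivor space. At first you are in the 'From' part of the Survivor space; in the upper grades you move to the 'To' part, and because your grades are unstable you get shuffled back and forth quite a bit. When you turn 18 and graduate from high school, it is time to go out into the world, so you move to the old generation, which is crowded too. You live in the old generation for 20 years (one year per GC) and finally die of old age and are reclaimed by GC. One thing was left out: in the old generation you met a classmate named Edward (the handsome vampire from Twilight). He and his family never die, so they live in the permanent generation."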

The individual regions can be viewed with the VisualGC plugin for VisualVM. [Figure: VisualGC view, openjdk 1.7]

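Young generation: where all new objects are created. It is divided into three parts: the Eden space and two Survivor spaces (From and To). When Eden fills up with objects, a Minor GC is performed and all surviving objects are moved to one of the Survivor spaces (say From). A Minor GC also examines the objects that survived earlier collections and moves them to the other Survivor space (say To). This way, one Survivor space is always empty. Objects that still survive after many GC cycles are moved to the old generation, usually by means of an age threshold that an object in the young generation must reach before it is eligible for promotion. Note that the two Survivor spaces are symmetric and have no fixed order; From and To are relative.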

Old generation: objects that have survived N collections in the young generation without being cleared are placed in the old generation. You could call them battle-hardened survivors; they are objects with relatively long lifetimes. The old and permanent generations no longer use the move-back-and-forth approach of the young generation, which is ill-suited to these veterans of many collections. A Full GC, which reclaims the entire heap, is usually triggered when the old generation's memory is full.

Permanent generation: stores static data such as Java classes and methods. The permanent generation has no significant effect on garbage collection.

[Figure: effect of generational collection]

I left generational collection for last because it draws on the algorithms described before it. The young generation uses the copying algorithm; the old generation uses the mark-compact (Mark-Compact) algorithm.
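The life of an object in the young generation can be simulated in a few lines. This is a sketch under stated assumptions, not HotSpot's implementation: `GenerationalDemo`, `Obj`, the `live` flag (which stands in for reachability from the roots), and the threshold `MAX_AGE = 3` are all made up for illustration. On HotSpot the real threshold is governed by `-XX:MaxTenuringThreshold`.

```java
import java.util.ArrayList;
import java.util.List;

final class GenerationalDemo {
    static final int MAX_AGE = 3;   // hypothetical promotion threshold

    static final class Obj {
        final String name;
        int age = 0;
        boolean live = true;        // stands in for "reachable from the roots"
        Obj(String name) { this.name = name; }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> from = new ArrayList<>();
    List<Obj> to = new ArrayList<>();
    final List<Obj> old = new ArrayList<>();

    // New objects are always created in Eden.
    Obj allocate(String name) {
        Obj o = new Obj(name);
        eden.add(o);
        return o;
    }

    // Minor GC: copy survivors of Eden and From into To (or promote them once
    // they are old enough), empty Eden and From, then swap the Survivor roles.
    void minorGc() {
        List<Obj> candidates = new ArrayList<>(eden);
        candidates.addAll(from);
        for (Obj o : candidates) {
            if (!o.live) continue;          // dead objects are simply not copied
            o.age++;
            if (o.age >= MAX_AGE) old.add(o);
            else to.add(o);
        }
        eden.clear();
        from.clear();
        List<Obj> tmp = from;               // To becomes From; the other is empty
        from = to;
        to = tmp;
    }

    public static void main(String[] args) {
        GenerationalDemo heap = new GenerationalDemo();
        Obj session = heap.allocate("session");
        heap.allocate("temp").live = false;   // dies before the first collection
        for (int gc = 1; gc <= MAX_AGE; gc++) {
            heap.minorGc();
            System.out.println("after GC " + gc + ": from=" + heap.from.size()
                    + " old=" + heap.old.size() + " age=" + session.age);
        }
    }
}
```

After each collection one of the two Survivor lists is empty, and the long-lived object reaches the old generation on the third collection.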

IV. The garbage collector

Garbage collection algorithms are the methodology of memory reclamation; garbage collectors are the implementations of that methodology. The collectors offered by different vendors and different JVM versions may differ. The reference here is the HotSpot virtual machine of JDK 1.7, as described in the book "In-Depth Understanding of the Java Virtual Machine". There is already a blog post that summarizes the garbage collectors well, so I will not repeat it; see: http://blog.csdn.net/java2000_wl/article/details/8030172
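To find out which collectors a particular JVM is actually running, the standard `java.lang.management` API can be queried. The sketch below uses only that API; the class name `CollectorsDemo` and the output format are illustrative, and the collector and pool names printed depend on the JVM vendor, version, and startup flags.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class CollectorsDemo {
    // One line per collector: its name, the memory pools it manages, and
    // how often and for how long it has run so far.
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName()
                    + " pools=" + Arrays.toString(gc.getMemoryPoolNames())
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describeCollectors()) {
            System.out.println(line);
        }
    }
}
```

Running it with different HotSpot options, for example `-XX:+UseSerialGC` or `-XX:+UseG1GC`, should list different collectors, typically one for the young generation and one for the old.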

Summary

I don't think learning Java requires understanding its underlying implementation, but I do think that the more you know about the JVM and GC, the better you will understand Java, and that will certainly help in your future study and work. After all, our goal is not to be mere painters or porters of code, but real engineers!
