JVM Series - Garbage Collection and Its Algorithms in Detail

Introduction

  As we all know, Java memory is managed automatically by the Java virtual machine (JVM). Unlike in C and C++, Java developers do not need to pair every new operation with a delete/free, and memory leaks and out-of-memory errors are less likely. But precisely because of this, a Java developer who does not understand how the JVM works will find troubleshooting very difficult once a memory problem does occur. In addition, program performance is closely tied to the GC (garbage collector): when tuning a system, optimizing at the code level is sometimes not enough, and we need some understanding of the principles behind garbage collection algorithms. How well someone understands the JVM is also an important indicator that distinguishes senior programmers. This post mainly covers theory (the next post will use examples to explain garbage collectors, common tools, and JVM tuning). After all, theory is the foundation of practice, so we should proceed step by step and not be impatient. Let us step through the JVM's door and learn the secrets behind a running Java program.

Java memory areas

  Before explaining garbage collection, we first need a basic understanding of Java's memory areas. While a Java program runs, memory is divided into several main areas, as shown in the following figure:
[Figure: Memory areas]
  Friends who have read <<In-Depth Understanding of the Java Virtual Machine>> will find this picture familiar. Let's briefly go through each memory area.
  Method area: shared between threads; stores class information, constants, static variables, and other such data. The method area is logically a part of the heap, but we usually treat it as a separate part. The constant pool we are familiar with also lives in the method area.
  Heap: the largest memory area in the JVM, shared by all threads. Almost all object instances and arrays are allocated on the heap. Note "almost", not all; for ease of understanding you can treat it as all here (with the development of JIT compilation and escape analysis this has changed subtly, which I also learned only recently).
  VM stack: this is what we usually call "the stack". It is thread-private, born with the thread and destroyed with it. Each method invocation creates a stack frame, which stores local variables (primitive types, object references), dynamic linking, the method exit, and other information.
  Program counter: thread-private; can be understood as a memory area that records the current execution position of the program.
  Native method stack: similar to the VM stack, but it stores information related to native methods.
  Now that we have a basic understanding of the JVM memory areas, let's move on to garbage collection.
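  As a rough way to see some of these areas on a live JVM, the sketch below lists the memory pools that the standard java.lang.management API reports. The class name MemoryPoolLister is made up for illustration, and the pool names printed depend on the JVM version and the collector in use, so no particular output is assumed. Thread stacks and program counters are not exposed as pools here; only heap and non-heap pools appear.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the original article.
final class MemoryPoolLister {

    /** Returns one "TYPE  name" line per memory pool reported by the running JVM. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getType() is either HEAP or NON_HEAP.
            lines.add(pool.getType().name() + "  " + pool.getName());
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

  Pools of type HEAP correspond to the heap described above, while NON_HEAP pools cover areas such as class metadata and compiled code.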

Garbage collection

  From the previous section we already know that, apart from the heap (with the method area considered part of the heap), the lifetime of the other areas equals the lifetime of the thread. How much memory they need is essentially known once the class structure is finalized, so memory allocation and reclamation for them needs little thought. The heap is different: only while the program is running does the JVM know which objects need to be created and how much storage to allocate, so this part of memory is dynamic. When we talk about JVM garbage collection, we usually mean the heap.
  So on what basis does the collector decide which objects can be reclaimed? In other words: how do we determine whether an object is alive or dead?
  Many people would answer: reference counting. As the name suggests, this approach is simple and crude. Give each object a reference counter; each time a new reference to the object is created, the counter is incremented by 1; each time a reference becomes invalid, the counter is decremented by 1; when the counter reaches 0, the object can be reclaimed. In fact, though, mainstream JVMs do not use reference counting to manage memory, mainly because it has difficulty handling circular references.
  Let's look at the following piece of code:

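public class testGc {
	// Holds a reference to another testGc instance; used to build the cycle.
	public Object instance = null;

	public static void test() {
		testGc a = new testGc();
		testGc b = new testGc();
		// Make the two objects refer to each other.
		a.instance = b;
		b.instance = a;
		// Drop the only outside references; both objects are now unreachable.
		a = null;
		b = null;
		// Request a collection (the JVM treats this as a hint, not a command).
		System.gc();
	}

	public static void main(String[] args) {
		test();
	}
}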

  This is a classic example of circular references. a and b refer to each other, and neither object will ever be used again, but because they reference each other, their reference counts never drop to zero. If the JVM used reference counting, these two objects could never be reclaimed by the GC. That is not what happens, though: the GC log (don't worry for now about how to enable it; that will be explained later) clearly shows that they were reclaimed successfully.
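  To make the counting argument concrete, here is a minimal simulation of what a reference-counting collector would see for the example above. It is only a sketch: the RefCountDemo class, the Node type, and the setField helper are all hypothetical, and the counters are maintained by hand since the JVM does not expose any such counter.

```java
// Hypothetical simulation of reference counting; the real JVM does not work this way.
final class RefCountDemo {

    /** A toy object that tracks how many references currently point at it. */
    static final class Node {
        int refCount = 0;
        Node instance; // mirrors the `instance` field of testGc
    }

    /** Simulates `holder.instance = target` with counter bookkeeping. */
    static void setField(Node holder, Node target) {
        if (holder.instance != null) {
            holder.instance.refCount--;
        }
        holder.instance = target;
        if (target != null) {
            target.refCount++;
        }
    }

    /** Replays the cycle scenario and returns the final counters of a and b. */
    static int[] runCycle() {
        Node a = new Node();
        a.refCount++; // the local variable a
        Node b = new Node();
        b.refCount++; // the local variable b
        setField(a, b); // a.instance = b
        setField(b, a); // b.instance = a
        // Simulate `a = null; b = null;`: each local reference goes away.
        a.refCount--;
        b.refCount--;
        return new int[] { a.refCount, b.refCount };
    }

    public static void main(String[] args) {
        int[] counts = runCycle();
        System.out.println("a=" + counts[0] + ", b=" + counts[1]);
    }
}
```

  Both counters end at 1 rather than 0, because each object is still referenced by the other's instance field. A pure reference-counting collector would therefore keep both objects forever.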
  In fact, not only Java but also C# determines whether an object is alive with the reachability analysis algorithm. The basic idea is to take a set of objects called "GC roots" as starting points and search downward from these nodes; the paths traversed are called reference chains. When an object has no reference chain connecting it to any GC root, the object is proven reclaimable, as shown below:

[Figure: Garbage collection]
  As you can see at a glance, objects d, e, and f in the figure are judged to be reclaimable. With that covered, here are some of the main garbage collection algorithms.
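  The reachability idea can be sketched as a plain graph traversal. The example below is a toy model, not how a real collector is implemented: objects are just strings, the reference graph is a map, and the sample graph (a root pointing to a, which points to b and c, plus an unreachable cycle d, e, f) is an assumption made for illustration rather than a copy of the figure.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of reachability analysis.
final class ReachabilityDemo {

    /** Assumed sample graph: root -> a -> {b, c}; d -> e -> f -> d is an isolated cycle. */
    static Map<String, List<String>> sampleGraph() {
        return Map.of(
                "root", List.of("a"),
                "a", List.of("b", "c"),
                "d", List.of("e"),
                "e", List.of("f"),
                "f", List.of("d"));
    }

    /** Returns every object that can be reached by following references from the roots. */
    static Set<String> reachable(Map<String, List<String>> refs, List<String> roots) {
        Set<String> seen = new HashSet<>();
        Deque<String> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            String obj = work.pop();
            if (seen.add(obj)) { // first visit: follow its outgoing references
                work.addAll(refs.getOrDefault(obj, List.of()));
            }
        }
        return seen;
    }

    /** Returns the objects that no reference chain connects to any root. */
    static Set<String> garbage(Map<String, List<String>> refs, List<String> roots) {
        Set<String> all = new HashSet<>(refs.keySet());
        refs.values().forEach(all::addAll);
        all.removeAll(reachable(refs, roots));
        return all;
    }

    public static void main(String[] args) {
        System.out.println(garbage(sampleGraph(), List.of("root")));
    }
}
```

  Note that d, e, and f still reference one another, yet they are reported as garbage because nothing connects them to a root. This is exactly the case reference counting cannot handle.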

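Garbage collection algorithms

  Reclaiming garbage efficiently naturally depends on algorithms. Below are the main ideas behind several garbage collection algorithms.

Mark-sweep algorithm

  First is the most basic one, the mark-sweep algorithm. As the name tells us, it has two phases: mark and sweep. Objects that have no reference chain to the GC roots are marked, and once marking is finished they are swept one by one. The algorithm is easy to understand; the process is shown in the figure below:
[Figure: Mark-sweep]
  A simple algorithm naturally has shortcomings, mainly two: 1. Marking and sweeping are both inefficient. This is easy to see: objects are marked one by one and then swept one by one. 2. It tends to produce a lot of memory fragmentation. Reclaimed memory is interleaved with memory occupied by live objects, leaving non-contiguous fragments; when a larger object needs to be allocated, a GC has to be triggered because no sufficiently large contiguous space can be found.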
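  The fragmentation problem is easy to reproduce in a toy model. In the sketch below the heap is simply an array of cells, each holding an object name or null, and the set of marked objects is assumed to come from a prior marking phase. All names (MarkSweepDemo, sweep, largestFreeRun) are hypothetical.

```java
import java.util.Set;

// Hypothetical toy heap: one array cell per object, null means free.
final class MarkSweepDemo {

    /** Sweep phase: frees every cell whose object was marked as unreachable. */
    static String[] sweep(String[] heap, Set<String> markedForRemoval) {
        String[] after = heap.clone();
        for (int i = 0; i < after.length; i++) {
            if (after[i] != null && markedForRemoval.contains(after[i])) {
                after[i] = null; // reclaimed in place; survivors do not move
            }
        }
        return after;
    }

    /** Counts all free cells. */
    static int totalFree(String[] heap) {
        int free = 0;
        for (String cell : heap) {
            if (cell == null) {
                free++;
            }
        }
        return free;
    }

    /** Length of the longest run of adjacent free cells. */
    static int largestFreeRun(String[] heap) {
        int best = 0;
        int run = 0;
        for (String cell : heap) {
            run = (cell == null) ? run + 1 : 0;
            best = Math.max(best, run);
        }
        return best;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "x", "b", "y", "c", "z" };
        String[] after = sweep(heap, Set.of("x", "y", "z"));
        System.out.println("free cells: " + totalFree(after));
        System.out.println("largest contiguous run: " + largestFreeRun(after));
    }
}
```

  With this sample heap, three cells are free after the sweep but no two of them are adjacent, so an object needing two contiguous cells cannot be placed even though half of the heap is empty.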

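Copying algorithm

  The copying algorithm emerged to solve the efficiency problem. It divides memory into two blocks of equal size and uses only one at a time. When that block is used up, the live objects are copied to the other block, and the used space is then cleared in one go (no need to clean up object by object, which improves efficiency). This repeats, with each collection covering only half of the region, so there is no fragmentation problem either. The catch is that only half of the memory is usable at any time, which is a very high price.
  In practice this algorithm is used to collect the JVM's young generation (JVM memory is divided into the young generation, the old generation, and the permanent generation. The young generation can be understood as objects created recently; the old generation as objects that have stubbornly survived multiple garbage collections; the permanent generation as long-lived objects, which are not absolutely permanent but are rarely collected). Research shows that 98% of the objects in the young generation die quickly, so there is no need to split memory 1:1. Instead, the young generation is divided into one larger Eden space and two survivor spaces (from survivor and to survivor). During a collection, the live objects in Eden and one survivor space are copied into the other survivor space, and then Eden and the used survivor space are cleared. The default Eden-to-survivor ratio is 8:1:1, so only 10% of the memory is wasted. When the survivor space is not large enough, the allocation guarantee mechanism places the objects that survived this collection into the old generation. If the old generation is also not enough, a full GC is triggered; if that is still not enough, all that is left is to wait for an OOM. If your program runs full GC frequently, a memory leak is likely. The copying algorithm is shown in the figure below (please bear with it; I really did not feel like drawing):
[Figure: Copying algorithm]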
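  A minimal sketch of the two ideas in this section follows: copying live objects from one space into another, and the arithmetic behind the 8:1:1 split. As before, the heap is modeled as an array of names, the live set is assumed to be known already, and CopyingDemo with its methods is hypothetical.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical toy model of a copying collector.
final class CopyingDemo {

    /**
     * Copies the live objects from the from-space into a new to-space,
     * packed from index 0, then clears the whole from-space in one step.
     */
    static String[] copyLive(String[] from, Set<String> live) {
        String[] to = new String[from.length];
        int next = 0;
        for (String obj : from) {
            if (obj != null && live.contains(obj)) {
                to[next++] = obj;
            }
        }
        Arrays.fill(from, null); // released all at once, not object by object
        return to;
    }

    /** Fraction of the young generation usable for allocation: Eden plus one survivor space. */
    static double usableFraction(int edenParts, int survivorParts) {
        return (double) (edenParts + survivorParts) / (edenParts + 2 * survivorParts);
    }

    public static void main(String[] args) {
        String[] from = { "a", "x", "b", "y" };
        String[] to = copyLive(from, Set.of("a", "b"));
        System.out.println(Arrays.toString(to));
        System.out.println(usableFraction(8, 1));
    }
}
```

  The survivors end up contiguous at the start of the to-space, which is why fragmentation does not arise. With an 8:1:1 split, Eden and one survivor space are in use at any time, that is 9 parts out of 10, which matches the 10% overhead quoted above.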

Mark-compact algorithm

  The mark-compact algorithm marks objects the same way the mark-sweep algorithm does, but before reclaiming it first moves the live objects to one end and then clears the memory beyond the boundary. The process is shown below:
[Figure: Mark-compact algorithm]
  When the object survival rate is high, the copying algorithm is no longer suitable. First, copying a large number of live objects on every collection is inefficient. Second, a high survival rate means more memory must be wasted, and there may not even be enough memory left for the allocation guarantee. So for the old generation, the mark-compact algorithm is a better fit.
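  The compaction step can be sketched on the same kind of toy heap used for the other algorithms. This is an illustration under simplifying assumptions (every object occupies one cell, and the live set is already known); MarkCompactDemo and compact are hypothetical names.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical toy model of the compaction phase.
final class MarkCompactDemo {

    /**
     * Slides live objects toward index 0 in place, clears everything past
     * the last survivor, and returns the boundary (number of live cells).
     */
    static int compact(String[] heap, Set<String> live) {
        int boundary = 0;
        for (int i = 0; i < heap.length; i++) {
            String obj = heap[i];
            if (obj != null && live.contains(obj)) {
                heap[boundary++] = obj; // boundary never runs ahead of i
            }
        }
        Arrays.fill(heap, boundary, heap.length, null);
        return boundary;
    }

    public static void main(String[] args) {
        String[] heap = { "a", "x", "b", "y", "c", "z" };
        int boundary = compact(heap, Set.of("a", "b", "c"));
        System.out.println(Arrays.toString(heap) + " boundary=" + boundary);
    }
}
```

  Unlike the copying approach, no second space is needed, and unlike mark-sweep, the free cells end up as one contiguous block after the boundary. The cost is that survivors have to be moved.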

Generational collection algorithm

  Today's commercial virtual machines all use the generational collection algorithm, which is really a combination of the copying algorithm and the mark-compact algorithm: the young generation, where object survival rates are low, uses the copying algorithm, while the old generation, where survival rates are high, uses the mark-compact algorithm. The generational memory model is as follows:
[Figure: Generational model diagram]
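  One way to observe the generational split on a running JVM is to ask which collectors it has registered and which memory pools each one manages, using the standard java.lang.management API. This is only a sketch: CollectorLister is a made-up name, and the collector and pool names depend entirely on the JVM version and the collector selected, so no particular output is assumed.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original article.
final class CollectorLister {

    /** Returns one line per registered collector with the memory pools it manages. */
    static List<String> describeCollectors() {
        List<String> lines = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            lines.add(gc.getName() + " manages " + Arrays.toString(gc.getMemoryPoolNames()));
        }
        return lines;
    }

    public static void main(String[] args) {
        describeCollectors().forEach(System.out::println);
    }
}
```

  On a JVM running a generational collector, you would typically expect to see one collector responsible for the young-generation pools and another that also covers the old generation.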

Summary

  That's it for this post. Only with a real understanding of this theory can we keep learning what comes next. Next time I will cover garbage collectors and JVM tuning. Thanks for reading, bye!

Note: This article draws heavily on <<In-Depth Understanding of the Java Virtual Machine>>.


Origin blog.csdn.net/m0_37719874/article/details/103552397