V8 engine: garbage collection and memory leak analysis (Chrome V8 engine optimization strategy)

Disclaimer: in case of infringement, please contact me. We grow and learn together. STAY CURIOUS. STAY HUMBLE. https://blog.csdn.net/saucxs/article/details/90272073

V8 implements a precise (exact) GC, and its GC algorithm uses a generational collection mechanism. Accordingly, V8 divides its memory (the heap) into two parts: the new generation and the old generation.

 

I. Introduction

V8's garbage collection mechanism: JavaScript uses garbage collection to manage memory automatically. Garbage collection is a double-edged sword. Its benefit is that it greatly simplifies a program's memory-management code, reduces the burden on the programmer, and reduces the memory leaks that long-running programs tend to suffer.

However, garbage collection also means that the programmer cannot control memory. ECMAScript does not expose any garbage collector interface. We cannot force a garbage collection, nor can we intervene in memory management.

Memory management: in the browser, a Chrome V8 instance does not live long (who keeps a page open for days or months?), and it runs on the user's machine. If a problem such as a memory leak does occur, it affects only that one end user. No matter how much memory a V8 instance uses, the memory is released when the page is closed, so little management is needed (although some large web applications do still need to manage memory). With Node on a server, however, memory needs attention: once a leak occurs, the whole service will eventually grind to a halt, because servers are not restarted frequently.

 

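II. Chrome memory limit

2.1 There is a limit

Chrome limits the amount of memory that can be used (about 1.4 GB on 64-bit systems and about 1.0 GB on 32-bit systems; the exact defaults depend on the V8 version). This means that some very large in-memory objects cannot be manipulated directly.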
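In Node, which embeds V8, the limit that is actually in effect can be inspected at runtime. The following is a minimal sketch that assumes a Node.js environment: it uses the built-in `v8` module's `getHeapStatistics()` and `process.memoryUsage()`, while the helper name `heapLimitInMB` is only illustrative. The values it prints depend on the machine and the Node version.

```javascript
// Assumes Node.js; `v8` is a built-in module there.
const v8 = require("v8");

// Hypothetical helper: report V8's current heap size limit in megabytes.
function heapLimitInMB() {
  const stats = v8.getHeapStatistics();
  return stats.heap_size_limit / (1024 * 1024);
}

// Heap memory currently used by JavaScript objects, in megabytes.
const usedMB = process.memoryUsage().heapUsed / (1024 * 1024);

console.log(`heap limit: ${heapLimitInMB().toFixed(0)} MB`);
console.log(`heap used:  ${usedMB.toFixed(1)} MB`);
```

In Node the old-generation part of this limit can usually be raised at startup with the `--max-old-space-size` flag (a value in MB), for example `node --max-old-space-size=4096 app.js`, where `app.js` is a placeholder.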

2.2 Why there is a limit

On the surface, Chrome limits memory because V8 was originally designed as a browser JavaScript engine, where scenarios that need a lot of memory are unlikely. The deeper reason lies in the limitations of V8's garbage collection mechanism. To make sure the JavaScript application logic and the garbage collector see a consistent view of the heap, V8 pauses the application logic while it collects garbage and resumes it only after the collection has finished. This behavior is known as "stop-the-world". With a 1.5 GB heap, a small garbage collection in V8 takes more than 50 ms, and a non-incremental collection can take more than 1 second. The browser would then stop responding to the user for that second and appear frozen, and any running animation would visibly stutter.

 

III. Chrome V8 heap composition

The V8 heap is actually not made up only of the old generation and the new generation. The heap can be divided into several different areas:

1. New-generation memory area: most objects are allocated here. This area is small, but it is garbage collected very frequently.

2. Old-generation pointer area: part of the old generation. It holds objects that may contain pointers to other objects; most objects promoted from the new generation are moved here.

3. Old-generation data area: part of the old generation. It holds only raw data objects, which have no pointers to other objects.

4. Large-object area: holds objects whose size exceeds the limits of the other areas. Each object gets its own memory region, and garbage collection does not move large objects.

5. Code area: code objects, that is, objects containing JIT-compiled instructions, are allocated here. It is the only memory area with execute permission.

6. Cell area, property cell area, and map area: hold Cells, PropertyCells, and Maps. Each of these areas stores elements of a single size and has a simple structure.

Each area consists of a set of memory pages. A memory page is the smallest unit in which V8 requests memory. Except for the larger pages of the large-object area, every page is 1 MB in size and aligned to 1 MB. Besides the stored objects, a page contains a header with metadata and identification information, and a bitmap that marks which objects are live. Each page also has a slot buffer, allocated in a separate memory region, that records a set of objects which may point to objects stored in that page. The garbage collector only collects the new-generation memory area, the old-generation pointer area, and the old-generation data area.

 

IV. Chrome V8's garbage collection mechanism

4.1 How to determine what to reclaim

Deciding which memory needs to be reclaimed and which does not is the most fundamental problem a garbage collector has to solve. We can consider an object live if and only if it is pointed to by a root object or by another live object. Root objects are always live; they are the objects referenced by the browser or by V8. Objects pointed to by local variables also count as root objects, because the scope object they belong to is treated as a root. The global object (`global` in Node, `window` in the browser) is naturally a root object. Browser DOM elements are also root objects.
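The liveness rule above amounts to a graph traversal starting from the roots. The following is a minimal sketch on a toy object graph, not V8's real heap layout: the `{ name, refs }` object shape and the function name `findLiveObjects` are assumptions made for illustration.

```javascript
// Toy object graph: every object is { name, refs: [...] }.
// An object is live if it is reachable from some root.
function findLiveObjects(roots) {
  const live = new Set();
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (live.has(obj)) continue; // already visited
    live.add(obj);
    for (const child of obj.refs) stack.push(child);
  }
  return live;
}

const a = { name: "a", refs: [] };
const b = { name: "b", refs: [a] };
const orphan = { name: "orphan", refs: [a] }; // points at a live object, but nothing points at it
const globalObject = { name: "global", refs: [b] };

const live = findLiveObjects([globalObject]);
console.log([...live].map((o) => o.name)); // [ 'global', 'b', 'a' ]
```

Note that `orphan` is garbage even though it references a live object: liveness flows from the roots outward, never backward.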

 

4.2 How to distinguish pointers from data

A garbage collector has to decide which words are data and which are pointers. Many collection algorithms move objects in memory (compaction, to reduce fragmentation), so pointers often have to be rewritten.

There are three main ways to identify pointers:
1. Conservative approach: every word-aligned value on the heap is treated as a pointer, so some data is mistaken for pointers. Some plain numbers become false pointers that appear to reference live objects, which causes memory leaks (the object a false pointer refers to may be dead, yet it is kept because something, namely the false pointer, still points to it). It also means that no memory region can be moved.
2. Compiler hints: in a statically typed language, the compiler can tell us where the pointers are within each class, so once we know which class an object was instantiated from, we know all of its pointers. This is how the JVM implements garbage collection, but the approach does not suit dynamic languages such as JS.
3. Tagged pointers: this approach reserves a bit at the end of each word to mark whether the word is a pointer or data. It needs compiler support, but it is simple to implement and performs well. This is the approach V8 uses. V8 stores all data in 32-bit words: for integers the lowest bit stays 0, while for pointers the lowest two bits are 01.
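The tagging scheme described in point 3 can be modeled with ordinary bitwise operations. This is only a toy model under the assumptions stated in the text (32-bit words, lowest bit 0 for integers, lowest two bits 01 for pointers); the function names are hypothetical, and the real engine does this in C++ on raw machine words.

```javascript
const INTEGER_TAG_MASK = 0b1;  // lowest bit
const POINTER_TAG_MASK = 0b11; // lowest two bits
const POINTER_TAG = 0b01;

// Shift left by one so the lowest bit is 0; 31 bits remain for the value.
function tagSmallInteger(value) {
  return value << 1;
}

// Addresses are assumed to be 4-byte aligned, so their low two bits are free.
function tagPointer(address) {
  return address | POINTER_TAG;
}

function isSmallInteger(word) {
  return (word & INTEGER_TAG_MASK) === 0;
}

function isPointer(word) {
  return (word & POINTER_TAG_MASK) === POINTER_TAG;
}

function untagSmallInteger(word) {
  return word >> 1; // arithmetic shift keeps the sign
}

function untagPointer(word) {
  return word & ~POINTER_TAG_MASK;
}

console.log(isSmallInteger(tagSmallInteger(42))); // true
console.log(isPointer(tagPointer(0x1000)));       // true
```

Because the check is a single mask-and-compare, the collector can classify every word it scans without consulting any type information.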

 

4.3 V8's collection strategy

Many automatic garbage collection algorithms have emerged over time, but because objects have different lifetimes, no single algorithm suits every situation. V8 therefore uses a generational strategy and divides memory into two generations: the new generation and the old generation.

Objects in the new generation are short-lived; objects in the old generation are long-lived or permanently resident in memory. Using a different collection algorithm for each generation improves garbage collection efficiency. Objects are initially allocated in the new generation; when an object there meets certain conditions (described later), it is moved to the old generation (promotion).

 

V. New-generation algorithm

Objects in the new generation are generally short-lived and are collected with the Scavenge GC algorithm. In its concrete implementation, Scavenge mainly uses a copying approach: Cheney's algorithm.

The new-generation space is divided into two parts, From space and To space. At any time one of them is in use and the other is idle. Newly allocated objects are placed in From space; when From space fills up, a new-generation GC starts. The algorithm checks From space for surviving objects and copies them to To space; objects that are no longer alive are destroyed. When copying is complete, From space and To space swap roles, and the GC is finished.
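The copy-and-swap cycle described above can be sketched on a toy heap. This is an illustration of the Cheney-style copying idea under simplifying assumptions, not V8's implementation: objects are plain `{ value, refs }` records, the two spaces are arrays, and a `Map` stands in for forwarding pointers. All names are hypothetical.

```javascript
// Toy semispace collector. Returns the new roots and the swapped spaces.
function scavenge(roots, fromSpace) {
  const toSpace = [];          // survivors are appended here
  const forwarded = new Map(); // old object -> its copy in To space

  function copy(obj) {
    if (!forwarded.has(obj)) {
      // refs still point at the old objects; they are fixed during the scan.
      const clone = { value: obj.value, refs: obj.refs };
      forwarded.set(obj, clone);
      toSpace.push(clone);
    }
    return forwarded.get(obj);
  }

  const newRoots = roots.map(copy);

  // Cheney scan: To space doubles as the work queue (breadth-first).
  for (let scan = 0; scan < toSpace.length; scan++) {
    toSpace[scan].refs = toSpace[scan].refs.map(copy);
  }

  fromSpace.length = 0; // whatever was not copied is garbage
  // Swap roles: the filled To space becomes the next From space.
  return { roots: newRoots, fromSpace: toSpace, toSpace: fromSpace };
}

const leaf = { value: "leaf", refs: [] };
const root = { value: "root", refs: [leaf] };
const dead = { value: "dead", refs: [leaf] }; // unreachable from the roots

const result = scavenge([root], [leaf, root, dead]);
console.log(result.fromSpace.map((o) => o.value)); // [ 'root', 'leaf' ]
```

The cost of a collection is proportional to the number of survivors rather than to the size of the space, which is why a copying collector suits a generation in which most objects die young.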

 

VI. Old-generation algorithm

Objects in the old generation generally live long and are numerous. Two algorithms are used: mark-sweep and mark-compact.

Before discussing the algorithms, let us look at the circumstances under which objects end up in old-generation space:

1. Whether an object in the new generation has already survived a Scavenge. If it has, it is moved from new-generation space to old-generation space.

2. Objects occupy more than 25% of To space. In this case, so that memory allocation is not affected, the object is moved from new-generation space to old-generation space.
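The two promotion conditions above can be written as a single predicate. This is a sketch with hypothetical bookkeeping: the `survivedScavenge` flag on each object and the `{ used, capacity }` description of To space are assumptions for illustration, not fields that V8 exposes.

```javascript
const TO_SPACE_PROMOTION_RATIO = 0.25; // the 25% threshold from the text

// Decide whether a surviving object goes to the old generation
// instead of being copied into To space.
function shouldPromote(obj, toSpace) {
  // Condition 1: the object has already survived one Scavenge.
  if (obj.survivedScavenge) return true;
  // Condition 2: To space is already more than 25% full.
  return toSpace.used / toSpace.capacity > TO_SPACE_PROMOTION_RATIO;
}

console.log(shouldPromote({ survivedScavenge: true }, { used: 0, capacity: 100 }));   // true
console.log(shouldPromote({ survivedScavenge: false }, { used: 30, capacity: 100 })); // true
console.log(shouldPromote({ survivedScavenge: false }, { used: 10, capacity: 100 })); // false
```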

The old generation is quite complex and contains several spaces:

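enum AllocationSpace {
  // TODO(v8:7464): Actually map this space's memory as read-only.
  RO_SPACE,    // Immutable (read-only) object space
  NEW_SPACE,   // New-generation space used by the copying GC algorithm
  OLD_SPACE,   // Old-generation space for resident objects
  CODE_SPACE,  // Old-generation space for code objects
  MAP_SPACE,   // Old-generation space for map objects
  LO_SPACE,    // Old-generation space for large objects
  NEW_LO_SPACE,  // New-generation space for large objects

  FIRST_SPACE = RO_SPACE,
  LAST_SPACE = NEW_LO_SPACE,
  FIRST_GROWABLE_PAGED_SPACE = OLD_SPACE,
  LAST_GROWABLE_PAGED_SPACE = MAP_SPACE
};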

In the old generation, the mark-sweep algorithm is started first in the following situations:

1. A space has no free blocks left

2. The objects in a space exceed a certain limit

3. A space cannot guarantee that objects from the new generation can be moved into the old generation

Mark-Sweep marks the objects to be reclaimed and, during collection, directly frees the address space they occupy.

Mark-Compact is similar in spirit to the Cheney algorithm used for new-generation collection: live objects are moved to one side, the objects to be reclaimed end up on the other side, and the region holding the objects to be reclaimed is then collected as a whole.

In this phase the collector traverses all objects in the heap, marks the live ones, and destroys all unmarked objects once marking has finished. Marking a large heap may take several hundred milliseconds, which causes performance problems. To address this, in 2011 V8 switched from stop-the-world marking to incremental marking. During incremental marking, the GC breaks the marking work into smaller chunks and lets the JS application logic run for a while between chunks, so that the application does not stall. In 2018 GC technology made another major breakthrough called concurrent marking, which lets the GC scan and mark objects while JS keeps running.

Sweeping objects leaves the heap fragmented. When fragmentation exceeds a certain limit, the compaction algorithm starts. During compaction, live objects are moved toward one end until all of them have been moved, and then the memory that is no longer needed is cleaned up.
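The difference between sweeping and compacting is easiest to see on a toy heap. The sketch below assumes a heap modeled as an array of slots holding `{ name, refs, marked }` objects; it illustrates the two ideas and is not how V8 lays out or walks its pages. All function names are hypothetical.

```javascript
// Mark phase: flag everything reachable from the roots.
function mark(roots) {
  const stack = [...roots];
  while (stack.length > 0) {
    const obj = stack.pop();
    if (obj.marked) continue;
    obj.marked = true;
    stack.push(...obj.refs);
  }
}

// Mark-Sweep: free unmarked slots in place, which leaves holes (fragmentation).
function sweep(heap) {
  for (let i = 0; i < heap.length; i++) {
    if (heap[i] !== null && !heap[i].marked) heap[i] = null;
  }
}

// Mark-Compact: slide live objects toward the start, then drop the tail.
function compact(heap) {
  let free = 0;
  for (let i = 0; i < heap.length; i++) {
    if (heap[i] !== null && heap[i].marked) heap[free++] = heap[i];
  }
  heap.length = free;
}

const x = { name: "x", refs: [], marked: false };
const y = { name: "y", refs: [x], marked: false };
const junk1 = { name: "junk1", refs: [], marked: false };
const junk2 = { name: "junk2", refs: [x], marked: false };
const heap = [junk1, x, junk2, y];

mark([y]);
sweep(heap);
console.log(heap.map((o) => (o ? o.name : "(free)"))); // [ '(free)', 'x', '(free)', 'y' ]

compact(heap);
console.log(heap.map((o) => o.name)); // [ 'x', 'y' ]
```

Sweeping is cheap but leaves gaps between survivors; compacting costs extra moves and pointer updates, which fits the text's description of running it only once fragmentation passes a limit.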

 

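VII. Memory leaks and optimization

7.1 What is a memory leak?

A memory leak means that heap memory a program has allocated is, for some reason, not released or cannot be released. This wastes system memory and can slow the program down or even crash the system.

7.2 Common memory leak scenarios

7.2.1 Caches

JS developers like to use an object's keys and values to cache the results of function calls. The more keys the cache holds, the more long-lived objects there are, so the garbage collector does a lot of useless work on these objects when scanning and compacting.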
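One common way to keep such a cache from growing without bound is to cap its size and evict the least recently used entry. The following is a sketch of that idea, not something the original article prescribes; the helper name `memoizeWithLimit` and its parameters are hypothetical. It relies on the fact that a `Map` iterates its keys in insertion order.

```javascript
// Memoize `fn`, but keep at most `maxEntries` results.
function memoizeWithLimit(fn, maxEntries) {
  const cache = new Map();
  return function (key) {
    if (cache.has(key)) {
      const value = cache.get(key);
      cache.delete(key); // re-insert to mark the key as most recently used
      cache.set(key, value);
      return value;
    }
    const value = fn(key);
    cache.set(key, value);
    if (cache.size > maxEntries) {
      const oldestKey = cache.keys().next().value; // least recently used
      cache.delete(oldestKey);
    }
    return value;
  };
}

let calls = 0;
const square = memoizeWithLimit((n) => {
  calls++;
  return n * n;
}, 2);

square(2);
square(3);
square(2); // cache hit, no new computation
square(4); // evicts key 3, the least recently used
square(3); // recomputed because it was evicted
console.log(calls); // 4
```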

7.2.2 Scope not released (closures)

var leakArray = [];
exports.leak = function () {
    leakArray.push("leak" + Math.random());
}

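Because of the module cache, the scope formed when a module is compiled and executed is never released. Every call to leak makes the module-local variable leakArray grow, and it is never freed.

Closures can keep a function's internal variables resident in memory so that they cannot be released.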

 

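7.2.3 Unnecessary global variables

Declaring too many global variables keeps them resident in memory; the memory can only be released when the process ends.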

 

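7.2.4 Stale DOM references

// A module-level reference: it keeps the element alive
// even after the element has been removed from the document.
const button = document.getElementById('button');

function click() {
    button.click();
}

// Remove the button element from the document. The node is detached,
// but the `button` variable above still references it, so it stays in memory.
function removeBtn() {
    document.body.removeChild(document.getElementById('button'));
}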

 

7.2.5 Timers not cleared

// Vue's mounted or React's componentDidMount
componentDidMount() {
    setInterval(function () {
        // ...do something
    }, 1000)
}

When a timer is created during the lifecycle initialization of a Vue or React page but is not cleared after leaving the page, memory leaks.
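The usual remedy is to keep the timer id and clear it in the matching teardown hook. The sketch below uses a framework-free stand-in class so that it can run on its own; the class `Ticker` is hypothetical, and its method names simply mirror React's `componentDidMount` and `componentWillUnmount`. In Vue the corresponding teardown hook would be `beforeDestroy` (Vue 2) or `beforeUnmount` (Vue 3).

```javascript
// Stand-in for a component; not tied to any framework.
class Ticker {
  constructor() {
    this.timerId = null;
    this.ticks = 0;
  }

  componentDidMount() {
    // Keep the id so the timer can be cancelled later.
    this.timerId = setInterval(() => {
      this.ticks++;
    }, 1000);
  }

  componentWillUnmount() {
    // Without this call the callback, and everything it closes over,
    // stays reachable for as long as the timer keeps firing.
    clearInterval(this.timerId);
    this.timerId = null;
  }
}

const ticker = new Ticker();
ticker.componentDidMount();
ticker.componentWillUnmount();
```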

 

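7.2.6 Event listeners not removed

componentDidMount() {
    window.addEventListener("scroll", function () {
        // do something...
    });
}

An event listener is bound during the lifecycle initialization of the page but is not removed after leaving the page, which likewise causes a memory leak.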
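Removing a listener requires passing the same function reference that was registered, so the handler has to be stored rather than written inline. The sketch below is framework-free and hypothetical: the class `ScrollWatcher` only mirrors React's lifecycle method names, and it falls back to a plain `EventTarget` when no `window` exists so that it can also run outside a browser (this assumes a runtime that provides the global `EventTarget` and `Event`).

```javascript
class ScrollWatcher {
  constructor(target) {
    this.target = target; // `window` in a browser
    this.count = 0;
    // Store the handler so the very same reference can be removed later.
    this.onScroll = () => {
      this.count++;
    };
  }

  componentDidMount() {
    this.target.addEventListener("scroll", this.onScroll);
  }

  componentWillUnmount() {
    this.target.removeEventListener("scroll", this.onScroll);
  }
}

const target = typeof window !== "undefined" ? window : new EventTarget();
const watcher = new ScrollWatcher(target);

watcher.componentDidMount();
target.dispatchEvent(new Event("scroll")); // handled
watcher.componentWillUnmount();
target.dispatchEvent(new Event("scroll")); // ignored, the listener is gone
console.log(watcher.count); // 1
```

An anonymous function passed directly to `addEventListener`, as in the snippet above, cannot be removed this way because no reference to it is kept.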

 

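7.3 Fixing memory leaks

7.3.1 Dereferencing

Keeping memory usage to a minimum gives the page better performance. The best way to optimize memory usage is to keep only the data that the executing code needs. Once data is no longer useful, it is best to release the reference by setting its value to null. This practice is called dereferencing.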

function createPerson(name){
    var localPerson = new Object();
    localPerson.name = name;
    return localPerson;
}

var globalPerson = createPerson("Nicholas");

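// Manually release the reference held by globalPerson
globalPerson = null;

Dereferencing a value does not mean that the memory it occupies is reclaimed automatically. The real purpose of dereferencing is to take the value out of the execution context so that the garbage collector can reclaim it the next time it runs.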

 

7.3.2 Provide a way to manually clear variables

var leakArray = [];
exports.clear = function () {
    leakArray = [];
}

 

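7.3.3 Other methods

1. Inner functions that the business logic does not need as closures can be refactored out of the enclosing function, which removes the closure.

2. Avoid creating too many long-lived objects, or break large objects into several smaller ones.

3. Avoid overusing closures.

4. Remember to clear timers and remove event listeners.

5. In Node.js, use streams or Buffers to process large files; these are not bound by the Node.js memory limit.

6. Use external tools such as Redis to cache data.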

 

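VIII. Summary

JS is a programming language with automatic garbage collection. In browsers, garbage is reclaimed mainly by mark-and-sweep; in Node.js it is reclaimed mainly through generational collection, Scavenge, mark-sweep, incremental marking, and related algorithms. In day-to-day development, some easily overlooked coding habits can cause memory leaks, so pay close attention to your coding conventions.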

 


 
