Do you really understand the JS garbage collection mechanism?

Table of contents

Foreword

Stack memory management

JS Garbage Collection Mechanism

Mark and Sweep

Marking phase

Sweep phase

Features of Mark and Sweep

Advantages

Shortcomings

Reference Counting

Reference counter maintenance

Tracking of reference counts

Garbage collection trigger

Recycling objects

Features of reference counting

Advantages

Shortcomings

Generational Collection

Old Generation Recycling

New Generation Recycling

Features of Generational Collection

Advantages

Shortcomings

Memory leaks

Memory leak scenarios

Useless object references

Circular references

Misuse of global variables

Unreleased resources

Summary

Related code


Foreword

Garbage collection is an important part of memory management in JavaScript. Developers do not allocate and free memory by hand: the engine allocates memory when values are created and releases it automatically once they can no longer be used. This lightens the developer's burden and lowers the risk of memory errors, although, as the later sections show, it does not rule out memory leaks entirely.

It works by identifying objects that are no longer needed and reclaiming the memory they occupy so that the space can be reused for other objects.

This article introduces what JavaScript garbage collection is and why it matters, then discusses memory management, the main kinds of JS garbage collection mechanisms, and how to avoid memory leaks and optimize performance.

Stack memory management

In the previous article, I gave a preliminary introduction to the concepts of heap and stack. To quote that article:

Stack memory stores function calls, variable declarations, and some small values such as booleans and small integers. Their life cycle is controlled by function calls and returns and by variable scope. When a function is called or a variable is created, the corresponding call frame or variable is pushed onto the stack; when the function returns or the scope ends, it is popped off the stack.

Heap memory stores values such as strings, objects, arrays, and functions. Their life cycle is controlled by the JavaScript garbage collection mechanism: when these values are no longer needed, the garbage collector destroys them.

In a nutshell, the heap stores dynamically allocated objects, while the stack stores primitive values and references to objects in the heap. This is a simplified model; where an engine actually places a given value is an implementation detail.

In other words, garbage collection applies only to heap memory. The JavaScript engine allocates and releases this memory automatically, so developers do not do it explicitly, and objects that are no longer in use are reclaimed to make room for new ones.
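As a small illustration of the difference (a sketch only; the variable names are made up for this example): copying a primitive copies the value itself, while copying an object variable copies only the reference to one shared heap object.

```javascript
// Primitive: the value itself is copied, so the two variables are independent.
let count = 1;
let countCopy = count;
countCopy = 2;
console.log(count, countCopy); // 1 2

// Object: only the reference is copied; both variables point at one heap object.
const user = { name: "Tom" };
const sameUser = user;
sameUser.name = "Jerry";
console.log(user.name); // Jerry

// Once no reference reaches an object any more, it becomes eligible for collection.
let temp = { big: new Array(1000).fill(0) };
temp = null; // the wrapper object and its array are now unreachable
```

Setting temp to null does not free anything by itself; it only removes the last reference, and the engine decides when to reclaim the memory.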

JS Garbage Collection Mechanism

Now for today's topic. This article covers three garbage collection mechanisms. Mark and sweep and reference counting are the two classic ones. Generational collection divides the heap by object age and applies a different tracing algorithm to each part: copying for young objects and mark and sweep for old ones.

Mark and Sweep

Mark and sweep is one of the most common garbage collection mechanisms in JS. Its workflow consists of a marking phase and a sweep phase.

Marking phase

  1. Start from the root objects, such as the global object (window in browsers) and the variables on the scope chain of the running functions
  2. Traverse their properties and references, marking every object that can be reached as live
  3. Recursively traverse the properties and references of each newly marked object, marking any further reachable objects

Sweep phase

  1. Iterate over all objects in the heap.
  2. Treat every object that was not marked as live as garbage.
  3. Free the memory occupied by the garbage objects.
  4. Clear the marks on the surviving objects so that the next cycle starts fresh.

Let's write a class to simulate marking and sweeping:

// Mark and sweep garbage collection
class MarkGC {
  marked = new Set(); // simulates the marking operation
  run(obj) {
    // This reset belongs at the end of a cycle, but then the result could not
    // be inspected, so the marks are cleared before each run instead
    this.marked.clear();
    this.mark(obj);
    this.sweep(obj); // has no actual effect here; kept to make the flow easier to follow
    return this;
  }
  // Check whether a value is an object that has not been marked yet
  // (typeof null is also "object", so null is excluded explicitly)
  checkMark = (obj) =>
    obj !== null && typeof obj === "object" && !this.marked.has(obj);
  mark(obj) {
    const { marked } = this;
    if (this.checkMark(obj)) {
      marked.add(obj);
      Reflect.ownKeys(obj).forEach((key) => this.mark(obj[key]));
    }
  }
  sweep(obj) {
    Reflect.ownKeys(obj).forEach((key) => {
      const it = obj[key];
      if (this.checkMark(it)) {
        delete obj[key];
        this.sweep(it);
      }
    });
  }
}
// Global object (the root)
const globalVar = {
    obj1: { name: "Object 1" },
    obj2: { name: "Object 2" },
    obj3: { name: "Object 3" }
}
const gc = new MarkGC()
gc.run(globalVar) // run garbage collection
console.log(globalVar, gc.marked);
// Delete two properties
delete globalVar.obj3
delete globalVar.obj2
// Run garbage collection again after the deletion
gc.run(globalVar)
console.log(globalVar, gc.marked);

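To walk through the code above: mark and sweep is split into a mark operation and a sweep operation. The mark function starts from the global object and adds it, and every object reachable through its properties, to the marked set. The sweep function then removes properties whose objects were not marked. Because this simulation keeps no separate list of heap objects, everything sweep can see has already been marked, so the effect of collection shows up as obj2 and obj3 disappearing from the marked set on the second run.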
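The class above keeps no list of allocated objects, so its sweep step has nothing left to remove. As a complementary sketch (not how a real engine is implemented), the hypothetical TinyHeap below registers every allocation, which lets the sweep phase walk the whole heap and drop whatever the mark phase did not reach, including two objects that only reference each other. All names here are assumptions made for the illustration.

```javascript
// Hypothetical mini heap: every allocated object is registered here so that
// the sweep phase has a complete list to walk.
class TinyHeap {
  objects = new Set();

  allocate(obj) {
    this.objects.add(obj);
    return obj;
  }

  // Mark phase: collect everything reachable from the given value.
  mark(value, marked = new Set()) {
    if (value === null || typeof value !== "object" || marked.has(value)) {
      return marked;
    }
    marked.add(value);
    for (const key of Reflect.ownKeys(value)) {
      this.mark(value[key], marked);
    }
    return marked;
  }

  // Sweep phase: drop every registered object that was not marked.
  collect(root) {
    const marked = this.mark(root);
    let freedCount = 0;
    for (const obj of this.objects) {
      if (!marked.has(obj)) {
        this.objects.delete(obj);
        freedCount++;
      }
    }
    return freedCount;
  }
}

const heap = new TinyHeap();
const root = { kept: heap.allocate({ name: "kept" }) };

// Two objects that reference each other but are not reachable from the root.
// (In this simulation only `root` counts as a root, not the local variables.)
const a = heap.allocate({ name: "a" });
const b = heap.allocate({ name: "b" });
a.peer = b;
b.peer = a;

const freed = heap.collect(root);
console.log(freed); // 2 (the a <-> b cycle is swept)
console.log(heap.objects.size); // 1 (only "kept" survives)
```

The design point is that liveness is decided by reachability from the root, not by whether anything at all still points at an object.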

Features of Mark and Sweep

Advantages

  • Comprehensive memory reclamation: The mark-and-sweep algorithm reclaims every object that is no longer reachable from the roots, including objects that reference each other in a cycle. The marking phase and the sweep phase together release this memory effectively
  • Flexibility: The mark-and-sweep algorithm does not depend on the implementation of a particular programming language and applies to a variety of languages and environments. It runs at runtime and works from the actual references between objects
  • Predictability: The engine decides when a collection runs, so the work can be scheduled at a suitable moment instead of paying a cost on every allocation or reference change, which helps the responsiveness of the program

Shortcomings

  • Pause time: A basic mark-and-sweep collector has to stop the execution of the program while it marks and sweeps. On a large heap this may pause the program for a noticeable time, affecting its real-time behavior and responsiveness
  • Time and space efficiency: The sweep phase has to traverse the entire heap to find unmarked objects, so its cost grows with the size of the heap rather than with the amount of garbage. Garbage also stays in memory until the next collection runs, which lowers memory utilization in the meantime
  • Fragmentation problem: After objects are cleared, the mark-and-sweep algorithm leaves memory fragments, that is, small, non-contiguous free regions. This can make later allocations harder, increasing the time and complexity of memory allocation
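The fragmentation problem can be pictured with a toy model (purely illustrative; the eight-slot layout and the helper name are assumptions, not how an engine lays out memory). Plenty of slots are free in total, yet no two of them are adjacent.

```javascript
// Hypothetical heap of 8 equal slots after a sweep: true = still in use, false = freed.
const slots = [true, false, true, false, true, false, true, false];

// Total number of free slots.
const totalFree = slots.filter((used) => !used).length;

// Length of the longest run of adjacent free slots.
function largestFreeRun(heapSlots) {
  let best = 0;
  let current = 0;
  for (const used of heapSlots) {
    current = used ? 0 : current + 1;
    best = Math.max(best, current);
  }
  return best;
}

console.log(totalFree); // 4
console.log(largestFreeRun(slots)); // 1
// A request for 2 adjacent slots cannot be satisfied even though 4 slots are free.
```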

Reference Counting

Reference counting keeps a counter on each object that tracks how many references point to it. When the count drops to zero, that is, when nothing references the object any more, the object is considered unused and can be recycled. The method works as follows.

Reference counter maintenance

  1. Every object has a reference counter. It starts at 0 and becomes 1 as soon as the first reference to the new object is stored.
  2. Each time a new reference to the object is created, the counter is incremented.
  3. Each time a reference to the object is reassigned or destroyed, the counter is decremented.

Tracking of reference counts

  1. When an object is referenced by another object, its reference count is incremented.
  2. When the object holding that reference drops it or is itself destroyed, the reference count is decremented.

Garbage collection trigger

  1. In the usual form of the algorithm no heap-wide traversal is needed: the check happens each time a reference counter is decremented.
  2. After the decrement, the value of the counter is checked.
  3. If the counter is zero, the object is no longer referenced and can be recycled.

Recycling objects

  1. When an object is recycled, the memory it occupies is freed.
  2. At the same time, the reference counts of the objects it referenced are decremented.
  3. If any of those counts also reach zero, those objects are recycled too, so the whole process is recursive.

We can again use a short piece of code to simulate reference counting:

// Reference counter
class RefCount {
  constructor() {
    this.count = 0;
  }

  increment() {
    this.count++;
  }

  decrement() {
    this.count--;
  }
}

// Object class
class MyObject {
  constructor() {
    this.refCount = new RefCount();
    this.refCount.increment(); // creating the object adds 1 to the reference count
  }

  addReference() {
    this.refCount.increment(); // a new reference adds 1 to the reference count
  }

  releaseReference() {
    this.refCount.decrement(); // dropping a reference subtracts 1 from the reference count
    if (this.refCount.count === 0) {
      this.cleanup(); // clean up once the reference count reaches 0
    }
  }

  cleanup() {
    // Perform the cleanup and release resources
    console.log("Cleanup complete");
  }
}
// Create the object (count is 1)
const obj1 = new MyObject();
// Add a reference (count is 2)
obj1.addReference();
console.log(obj1.refCount);
// Release both references (count falls to 0 and cleanup runs)
obj1.releaseReference();
obj1.releaseReference();
console.log(obj1.refCount);

The RefCount class is a simple counter. Each object created from the MyObject class owns one: addReference increases the number of references, and releaseReference drops a reference, which reduces the number by one. When the number of references falls to 0, the cleanup function runs and releases the resources, which is the point at which the object would be garbage collected.
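The recursive step from the "recycling objects" list, where freeing one object lowers the counts of the objects it referenced, is not covered by the class above. A minimal sketch of it, assuming a hypothetical Counted class that remembers which objects it holds:

```javascript
// Hypothetical counted object: it tracks what it references so that freeing it
// can decrement those objects in turn.
class Counted {
  constructor(name, log) {
    this.name = name;
    this.count = 0;
    this.children = [];
    this.log = log; // shared array recording the order in which objects are freed
  }

  // Store a reference to `child`, which raises the child's count.
  hold(child) {
    child.count++;
    this.children.push(child);
  }

  // Drop one reference to this object; free it when the count reaches zero.
  release() {
    this.count--;
    if (this.count === 0) {
      this.log.push(this.name);
      // Freeing this object removes its references to its children.
      this.children.forEach((child) => child.release());
      this.children = [];
    }
  }
}

const freed = [];
const parent = new Counted("parent", freed);
const child = new Counted("child", freed);
const grandchild = new Counted("grandchild", freed);

parent.count++; // one reference from the program, for example a variable
parent.hold(child);
child.hold(grandchild);

parent.release(); // the last reference to parent goes away
console.log(freed); // [ 'parent', 'child', 'grandchild' ]
```

Releasing the single reference to parent is enough to free the whole chain, because each freed object releases what it was holding.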

Features of reference counting

Advantages

  • Real-time: The reference counting algorithm detects immediately that an object is no longer referenced. As soon as the object's reference count becomes zero, it can be recycled and the memory it occupies released
  • Simple and efficient: The reference counting algorithm is relatively simple to implement. Each object maintains a reference counter, and references are tracked by incrementing and decrementing it
  • Cascading release: When an object's count reaches zero, the counts of the objects it references are decremented in turn, so a whole structure of objects can be reclaimed without scanning the heap. This holds only when the structure contains no circular references (see the shortcomings below)

Shortcomings

  • Circular reference problem: The reference counting algorithm cannot handle circular references. When objects reference each other in a cycle, their reference counts never become zero even if the program no longer uses them, which causes a memory leak
  • Additional overhead: The reference counting algorithm has to maintain a counter for every object, which costs extra memory. Every time a reference to the object changes, the counter has to be updated, which adds runtime overhead
  • Update performance overhead: When references to an object change frequently, for example when many references are added and removed, the constant counter updates may affect the performance of the program
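The circular reference problem can be traced by hand. In the sketch below, the counts live in a hypothetical Map keyed by object name (an assumption made to keep the bookkeeping visible). After the program drops both of its own references, each count is still 1 because of the other object in the cycle.

```javascript
// Hypothetical reference counts kept in a Map, keyed by object name.
const counts = new Map([
  ["a", 0],
  ["b", 0],
]);
const addRef = (name) => counts.set(name, counts.get(name) + 1);
const dropRef = (name) => counts.set(name, counts.get(name) - 1);

addRef("a"); // a variable in the program points at a
addRef("b"); // a variable in the program points at b
addRef("b"); // a.child = b
addRef("a"); // b.child = a

dropRef("a"); // the program drops its variable for a
dropRef("b"); // the program drops its variable for b

// Neither count reaches zero, so a pure reference counter never frees a or b.
console.log(counts.get("a"), counts.get("b")); // 1 1
```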

Generational Collection

Generational collection is a garbage collection strategy that divides memory into generations according to the life cycle of objects and collects each generation with the algorithm that suits it best. It builds on tracing collection (copying and mark and sweep), not on reference counting.

Generational collection rests on one assumption: most objects have a short life cycle, and only a few live for a long time. Based on this assumption, it divides objects into two categories: the Young Generation heap and the Old Generation heap. The young generation heap stores the large number of short-lived objects, while the old generation heap stores long-lived objects

The two generations are collected as follows

Old Generation Recycling

The old generation uses the mark-and-sweep algorithm described above, which suits objects that survive for a long time

New Generation Recycling

The new generation heap is divided into two equal-sized areas: From space and To space

  1. New objects are allocated in the From space
  2. When the From space is full, garbage collection is triggered
  3. Starting from the root objects, all live objects are marked
  4. The surviving objects are copied to the To space
  5. The dead objects left in the From space are discarded
  6. The two spaces swap roles: the To space becomes the new From space and the old From space becomes the new To space, which completes the collection

Next, I simulate the new generation collection process in JS

// New generation collection mechanism
class GenerationalCollection {
  // The From space and To space of the heap
  fromSpace = new Set();
  toSpace = new Set();
  marked = new Set(); // objects found to be reachable in the current cycle
  garbageCollect(obj) {
    this.marked.clear(); // start every cycle without marks
    this.mark(obj); // marking phase
    this.sweep(); // clearing phase
    // Swap the roles of the From and To spaces
    [this.fromSpace, this.toSpace] = [this.toSpace, this.fromSpace];
    return this;
  }
  // typeof null is also "object", so null is excluded explicitly
  isObj = (obj) => obj !== null && typeof obj === "object";
  allocate(obj) {
    this.fromSpace.add(obj);
  }
  mark(obj) {
    if (!this.isObj(obj) || this.marked.has(obj)) return;
    this.marked.add(obj);
    Reflect.ownKeys(obj).forEach((key) => this.mark(obj[key]));
  }
  sweep() {
    const { fromSpace, toSpace, marked } = this;
    fromSpace.forEach((it) => {
      if (marked.has(it)) {
        // Copy marked (live) objects to the To space
        toSpace.add(it);
      }
      // Remove the object from the From space
      fromSpace.delete(it);
    });
  }
}
// Global object (the root)
const globalVar = {
    obj1: { name: "Object 1" },
    obj2: { name: "Object 2" },
    obj3: { name: "Object 3" }
}
const GC = new GenerationalCollection()
// Create objects and allocate them in the From space
GC.allocate(globalVar.obj1)
GC.allocate(globalVar.obj2)
GC.allocate(globalVar.obj3)
console.log(GC.fromSpace, GC.toSpace);
// Make obj3 unreachable from the root, then run garbage collection
delete globalVar.obj3
GC.garbageCollect(globalVar)
console.log(GC.fromSpace, GC.toSpace);

To describe the code above briefly: the allocate function puts an object into the From space, and the mark function marks the root object and everything reachable through its properties. In the sweep function, every object in the From space that is marked is copied to the To space, and all objects are then removed from the From space. Finally, the garbageCollect function exchanges the two spaces to complete the cycle.

Features of Generational Collection

Advantages

  • Improved collection efficiency: Generational collection can optimize for the life cycle of objects. By distinguishing which generation an object belongs to, it applies the most suitable strategy to each generation. Because new generation objects are short-lived, the copying algorithm quickly cleans up most garbage there. Objects in the old generation live longer, and mark and sweep cleans up their garbage more thoroughly.
  • Reduced pause time: Generational collection spreads garbage collection work over different points in time and avoids processing all objects at once. This shortens each individual collection, which reduces pauses and improves responsiveness and user experience.

Shortcomings

  • Multiple generations to maintain: Generational collection has to manage objects in different generations, which increases the complexity of memory management.
  • Memory allocation and copy overhead: The copying algorithm used for the new generation has to copy surviving objects into the other space, which introduces allocation and copy overhead. Moving objects and reorganizing memory during generational collection add further overhead

Memory leaks

A memory leak is a situation in which memory allocated by the program is not released and recovered when it should be, so that memory usage keeps growing.

It is closely related to the garbage collection mechanism, whose purpose is to identify and reclaim unused memory automatically to avoid memory leaks and wasted resources. However, the collector can only reclaim objects that are unreachable. If an object is no longer used but is still referenced from somewhere, the collector cannot identify it as garbage and release its memory. The memory occupied by such leaks grows over time until the system's memory limit is reached.

Memory leak scenarios

Common memory leak scenarios include the following categories

Useless object references

As long as an object is still referenced, the garbage collection mechanism cannot reclaim it, even if it is no longer needed. For example, when an event listener or timer is not properly released, the objects it refers to stay referenced and their memory cannot be freed.

Scenario: a listener added with element.addEventListener is never removed with removeEventListener; a timer started with setInterval or setTimeout is never cleared

Solution: Use removeEventListener, clearInterval, clearTimeout, and similar functions to release them
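A minimal sketch of that cleanup follows. EventTarget stands in for a DOM element so the example also runs outside a browser; the event name and handler are made up for the illustration.

```javascript
// EventTarget stands in for a DOM element here.
const target = new EventTarget();
let calls = 0;

// Keep the handler in a variable: removeEventListener needs the same function reference.
const onPing = () => {
  calls++;
};

target.addEventListener("ping", onPing);
target.dispatchEvent(new Event("ping")); // the handler runs once

target.removeEventListener("ping", onPing);
target.dispatchEvent(new Event("ping")); // the handler no longer runs

console.log(calls); // 1

// Timers follow the same pattern: keep the id and clear it when the work is done.
const timerId = setInterval(() => {
  calls++;
}, 1000);
clearInterval(timerId);
```

Passing an inline arrow function to addEventListener would make the later removal impossible, because removeEventListener would receive a different function object.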

Circular references

When two or more objects reference each other and nothing else references them, a collector based purely on reference counting can never reclaim them, because their counts never drop to zero. The objects form a closed loop and the memory leaks. A mark-and-sweep collector does reclaim such a loop once it is unreachable from the roots, so there a cycle leaks only while something reachable still points into it.

Scenario:

const obj = {}
const obj1 = {}
obj.child = obj1
obj1.child = obj

Solution: Design the reference relationships between objects carefully, avoid unnecessary circular references between objects, and use weak references (such as WeakMap or WeakRef) or break the cycle explicitly when the objects are no longer needed
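Both remedies can be sketched in a few lines. The object shapes and the metadata store below are hypothetical. WeakMap holds its keys weakly, so an entry does not keep its key object alive.

```javascript
// Hypothetical metadata store: an entry here does not keep its key object alive.
const metadata = new WeakMap();

let node = { id: 1 };
metadata.set(node, { createdAt: Date.now() });
const hadEntry = metadata.has(node);
console.log(hadEntry); // true

// Breaking a cycle by hand: clear the back-reference once it is no longer needed.
const parent = { name: "parent" };
const child = { name: "child", parent };
parent.child = child;
child.parent = null; // the loop is gone; parent still refers to child

// After the last strong reference is dropped, the engine may reclaim the key and
// its entry. When that happens is up to the engine and cannot be observed reliably.
node = null;
```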

Misuse of global variables

Global variables exist for the entire life cycle of the application. If they are not managed and released properly, they stay in memory and cannot be recovered by the garbage collection mechanism.

Scenario: A variable is created globally. If it is never reset or cleared during the life cycle of the program or page, it stays reachable and is never processed by the garbage collection mechanism.

Solution: Limit the scope of variables and avoid creating too many global variables. In TS you can use namespaces and modules, which correspond to functions or objects in JS
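One way to limit scope in plain JS is a function that keeps its data private, as in this sketch (createCache and its methods are names invented for the example). The data is reachable only through the returned object and can be dropped explicitly.

```javascript
// Hypothetical module-style scope: the entries live inside the function,
// not on the global object, and are reachable only through the returned API.
function createCache() {
  let entries = new Map();
  return {
    set: (key, value) => entries.set(key, value),
    size: () => entries.size,
    // Replacing the Map makes the old contents unreachable, so they can be collected.
    clear: () => {
      entries = new Map();
    },
  };
}

const cache = createCache();
cache.set("a", new Array(1000).fill(0));
console.log(cache.size()); // 1
cache.clear();
console.log(cache.size()); // 0
```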

Unreleased resources

Resources such as open file handles, network connections, or database connections can cause memory leaks if they are not properly released after use.

Scenario: A network request has a very long timeout. While the request keeps waiting, everything it references stays in memory, which may cause a memory leak

Solution: Disconnect manually or set a timeout once the operation is finished, for example with the request's abort function and timeout attribute. The situation resembles a thread deadlock: without such a limit there is no way to know when the wait will end, which causes performance problems.
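As a sketch of the timeout idea with the standard AbortController: withTimeout is a hypothetical helper, and neverFinishes is a stand-in for a slow request so that the example needs no network access.

```javascript
// Hypothetical helper: runs `task(signal)` and aborts it after `ms` milliseconds.
async function withTimeout(task, ms) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await task(controller.signal);
  } finally {
    clearTimeout(timer); // always release the timer, whether the task succeeded or failed
  }
}

// Stand-in for a slow request: it never settles on its own and rejects when aborted.
// With a real request this could be: (signal) => fetch(url, { signal })
const neverFinishes = (signal) =>
  new Promise((_, reject) => {
    signal.addEventListener("abort", () => reject(new Error("aborted")), {
      once: true,
    });
  });

const outcome = withTimeout(neverFinishes, 20).catch((err) => err.message);
outcome.then((message) => console.log(message)); // aborted
```

The finally block matters as much as the abort: without clearTimeout, every completed call would leave a pending timer behind until it fired.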

Summary

The JavaScript garbage collection mechanism is the key to memory management. It automatically detects and releases unused memory, which improves the performance and reliability of programs. Understanding the kinds of garbage collection, the causes of memory leaks and how to avoid them, and best practices for performance optimization will help you develop efficient JavaScript applications.
That is all for this article. Thank you for reading this far, and I hope you found it useful. If you liked the article, I hope you will support the blogger with a like, a bookmark, and a follow. Thank you very much!

Related code

myCode: Some small cases or projects based on js - Gitee.com

Origin blog.csdn.net/time_____/article/details/131308983