Programs have a "kidney" too. Do you know what it is?

Author: Cheng Shihui

Compiled by: Xiaoyu

Programs come in many forms: an operating system, a hello world, an app, or embedded firmware. In essence, a program is a collection of functional code and resources. Our topic is one of those resources, and the most important one: memory.

If a program were a person, its structure would be the skeleton, its code the flesh, and memory the blood. While a program runs, most instructions go through an execute stage and a write-back stage, and the results they write back ultimately land in memory. Memory accompanies the entire execution of a program, just as blood is always flowing through the body. So what is the "kidney" that cleans the garbage out of a program's memory?

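#include <stdio.h>
#include <stdlib.h>

int *a;        /* uninitialized global: ZI section */
int b = 2;     /* initialized global: RW section */

int main(void)
{
    static int c;               /* uninitialized static: ZI section */
    {
        int d = 9;              /* local variable: on the stack */
        char *e = malloc(10);   /* e is on the stack; the 10 bytes are on the heap */
        printf("d=%d\r\n", d);
    }                           /* e dies here; the heap block is never freed */
    return 0;
}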

This is a very simple piece of C code. Anyone with some background knows where each variable in this program lives in memory.

First, global and static variables are placed in the data segment: initialized ones such as b go in the RW section, while uninitialized ones such as a and c go in the ZI section, which is cleared to zero when the program starts.

Local variables are placed on the stack, such as d and e.

The program also requests a block of memory from the heap with malloc, but never calls free to release it.

From how these regions behave, we know that data-segment variables such as a, b, and c stay resident in memory for the whole lifetime of the program. The stack locals d and e live only inside their enclosing "{}": they are created and destroyed by the push and pop operations on the stack.

When e dies at the closing brace, the block allocated on the heap loses the only pointer to it, which causes a memory leak. In a simple program this is easy to deal with: call free on the block before e goes out of scope.
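A minimal sketch of that fix in C. The function name `print_without_leak` is hypothetical; it does the same work as the inner block of the example, but releases the heap block while the pointer is still alive.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: same work as the inner block of the example,
 * but the heap block is released before the pointer `e` dies.
 * Returns 0 on success, -1 if the allocation failed. */
int print_without_leak(void)
{
    int d = 9;                 /* local variable on the stack */
    char *e = malloc(10);      /* e lives on the stack, the 10 bytes on the heap */
    if (e == NULL) {
        return -1;             /* nothing was allocated, so nothing to free */
    }
    printf("d=%d\r\n", d);
    free(e);                   /* release the block while we can still reach it */
    e = NULL;                  /* do not leave a dangling pointer behind */
    return 0;
}
```

Pairing every malloc with a free like this is easy here because the allocation and the release sit a few lines apart in the same scope.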

In a program with complex logic, however, managing dynamically allocated memory by hand takes a lot of effort. This is why tutorials for higher-level languages such as Java often stress, when comparing themselves with C/C++, that Java has no pointers and has automatic garbage collection, which makes programs safer and their robustness easier to guarantee. (Of course, robust programs can be written in C/C++ too; some things are just less convenient.) This mechanism, which automatically reclaims garbage memory on the program's behalf, is the program's "kidney".

So why has C/C++ never supported automatic garbage collection? The answer lies in how automatic garbage collection works. Let us start with the algorithms that decide whether a piece of memory is garbage. Two are in common use:

A. Reference counting.

Reference counting, as the name implies, keeps a counter inside the structure that describes each memory block. Whenever a pointer or reference is made to point to the block, the counter is incremented. When that pointer or reference is released or changed to point elsewhere, the counter is decremented. When the count drops to 0, the block can be freed and reclaimed.
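The bookkeeping described above can be sketched in a few lines of C. This is only an illustration under stated assumptions: `RcBlock`, `rc_new`, `rc_retain`, `rc_release`, and the `live_blocks` counter are hypothetical names, and the calls are made by hand rather than inserted by a compiler.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical "memory description structure": the reference count
 * sits next to the data it describes. */
typedef struct {
    int refcount;     /* number of live pointers to this block */
    int payload;      /* stand-in for the real data */
} RcBlock;

/* Number of blocks currently allocated; only here so the effect of
 * rc_release can be observed. */
int live_blocks = 0;

/* Allocate a block; the caller receives the first reference. */
RcBlock *rc_new(int payload)
{
    RcBlock *blk = malloc(sizeof *blk);
    if (blk == NULL) {
        return NULL;
    }
    blk->refcount = 1;
    blk->payload = payload;
    live_blocks++;
    return blk;
}

/* A new pointer now refers to the block: increment the count. */
RcBlock *rc_retain(RcBlock *blk)
{
    assert(blk != NULL && blk->refcount > 0);
    blk->refcount++;
    return blk;
}

/* A pointer was dropped or reassigned: decrement, and free at zero. */
void rc_release(RcBlock *blk)
{
    assert(blk != NULL && blk->refcount > 0);
    if (--blk->refcount == 0) {
        free(blk);
        live_blocks--;
    }
}
```

A caller would write `RcBlock *q = rc_retain(p);` wherever a second pointer is created and `rc_release(q);` wherever one goes away; the block is freed by whichever release brings the count to zero.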

Reference counting is arguably the simplest algorithm for deciding whether memory can be reclaimed. The ARC (Automatic Reference Counting) mechanism for Objective-C on Apple's development platform is a typical example of automatic reclamation built on this algorithm. This approach has one dependency and one drawback. The dependency is that the compiler must insert the counting code automatically.

When you develop in Objective-C (OC) with Xcode, the compiler automatically inserts calls such as retain and release, the same counting calls that used to be written by hand. So in essence, this kind of automatic garbage collection has the compiler do what would otherwise be done manually.

For C to implement something similar, the compiler's preprocessing would have to be modified, and a memory-tracking structure would have to be maintained on top of the library functions that allocate and free memory, in order to keep a count for each memory block.

Reference counting also has a major drawback: circular references cause memory leaks. Consider the following code:

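void fun(void)
{
    A *a = [A new];
    A *a1 = a;
    B *b = [B new];
    B *b1 = b;
    a->b = b;   // the A object now holds a reference to the B object
    b->a = a;   // the B object now holds a reference to the A object
}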

When the function returns, the objects that a and b pointed to still refer to each other, but no pointer on the stack or in the data segment can reach either of them. In other words, the program has lost access to both blocks, yet because each holds a reference to the other, neither count can ever return to zero. They are never released, so the memory leaks and becomes garbage.
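The same failure can be reproduced in plain C with a hand-rolled counter. This is a minimal sketch, not how Objective-C implements it: `Node`, `node_new`, `node_link`, `node_release`, and `stranded_after_scope` are hypothetical names, and each node holds at most one counted reference.

```c
#include <stdlib.h>

/* Hypothetical reference-counted node holding one strong reference. */
typedef struct Node {
    int refcount;
    struct Node *other;   /* counted reference to another node, or NULL */
} Node;

int live_nodes = 0;       /* lets the demo observe what was freed */

Node *node_new(void)
{
    Node *n = calloc(1, sizeof *n);
    if (n == NULL) {
        return NULL;
    }
    n->refcount = 1;      /* the caller's local pointer */
    live_nodes++;
    return n;
}

void node_release(Node *n)
{
    if (n == NULL || --n->refcount > 0) {
        return;
    }
    node_release(n->other);   /* drop the reference this node was holding */
    free(n);
    live_nodes--;
}

/* Store a counted reference to `to` in `from` (assumes from->other is NULL). */
void node_link(Node *from, Node *to)
{
    to->refcount++;
    from->other = to;
}

/* Mirrors fun(): create two nodes, link them, drop the local pointers.
 * Returns how many nodes were still allocated afterwards, or -1 on
 * allocation failure. */
int stranded_after_scope(int make_cycle)
{
    int before = live_nodes;
    Node *a = node_new();
    Node *b = node_new();
    if (a == NULL || b == NULL) {
        node_release(a);
        node_release(b);
        return -1;
    }
    node_link(a, b);              /* a -> b */
    if (make_cycle) {
        node_link(b, a);          /* b -> a closes the cycle */
    }
    node_release(a);              /* the local pointers go away, */
    node_release(b);              /* as they do when fun() returns */
    int stranded = live_nodes - before;
    if (stranded > 0) {
        /* Demo-only cleanup. A real program no longer has these
         * pointers at this point; that is exactly the leak. */
        free(a);
        free(b);
        live_nodes -= stranded;
    }
    return stranded;
}
```

Under these assumptions `stranded_after_scope(0)` returns 0, because releasing a cascades into b, while `stranded_after_scope(1)` returns 2, because each node still holds a count on the other.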

B. Reachability analysis.

Reachability analysis, as the name suggests, analyzes whether the program can still "reach" a memory block, that is, whether the program has lost access to it. While a program runs, memory changes constantly, just as blood never stops flowing in the body. But at any given moment, there are roughly two categories of memory that the program is always able to access:

1. The data segment, that is, global and static variables.

2. The live portion of the stack, that is, the local variables currently on the stack.

Reachability analysis relies on a runtime, that is, a runtime environment. The runtime keeps track of the pointer variables and references in the two categories of memory above. Periodically it follows them, then recursively follows the pointers and references found in the blocks they lead to. Taken as a whole, it is traversing a graph, with the pointers and references in these two categories of memory as the entry points. Every memory block reached by the traversal is marked as reachable.

When the program enters a garbage collection cycle, it walks all allocated memory. A block that carries the reachability mark is skipped; a block without the mark can be released and reclaimed. This avoids the problem reference counting has with mutual references, where blocks are unreachable but never freed because their counts cannot return to zero. In the accompanying figure, the blue memory blocks are the ones that get reclaimed.
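The mark and sweep steps can be sketched in C as follows. This is a toy under loud assumptions: `Object`, `gc_alloc`, and `gc_collect` are hypothetical names, each object has a single outgoing reference, and the roots are passed in explicitly because plain C has no runtime that could find them on the stack or in the data segment.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_OBJECTS 16

/* Hypothetical managed object: one outgoing reference plus a mark bit. */
typedef struct Object {
    struct Object *ref;   /* pointer to another managed object, or NULL */
    bool marked;          /* reachability mark set while tracing */
} Object;

Object *heap[MAX_OBJECTS];   /* every block the toy "runtime" handed out */
size_t heap_count = 0;

/* Allocate an object and record it so the sweep can find it later. */
Object *gc_alloc(void)
{
    if (heap_count == MAX_OBJECTS) {
        return NULL;
    }
    Object *obj = calloc(1, sizeof *obj);
    if (obj != NULL) {
        heap[heap_count++] = obj;
    }
    return obj;
}

/* Mark phase for one root: follow references until NULL or until an
 * already-marked object is met (which also stops on cycles). */
static void mark(Object *obj)
{
    while (obj != NULL && !obj->marked) {
        obj->marked = true;
        obj = obj->ref;
    }
}

/* One collection cycle: mark from the roots, then sweep all allocated
 * blocks. Returns the number of blocks freed. */
size_t gc_collect(Object **roots, size_t root_count)
{
    for (size_t i = 0; i < root_count; i++) {
        mark(roots[i]);
    }

    size_t freed = 0;
    size_t kept = 0;
    for (size_t i = 0; i < heap_count; i++) {
        if (heap[i]->marked) {
            heap[i]->marked = false;   /* clear the mark for the next cycle */
            heap[kept++] = heap[i];    /* reachable: keep it */
        } else {
            free(heap[i]);             /* unreachable: reclaim it */
            freed++;
        }
    }
    heap_count = kept;
    return freed;
}
```

Two objects that point only at each other carry no mark after tracing from the roots, so the sweep frees both; this is the case that reference counting could not handle.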

However, reachability analysis depends on a runtime environment, such as Java's virtual machine. Languages such as C/C++, which run without such a runtime, therefore cannot support this garbage-detection algorithm on their own.

So, after all that, our diagnosis for these programming languages is:

OC: Apple gave it a kidney transplant. The kidney is not great, but on the whole it works fine.

Java: The kidney is in excellent shape.

C/C++: No kidney at all, so the programmer has to perform dialysis for it.

So can we give a fine language like C/C++ a "kidney" and let it live a healthier life?

The answer is yes.


Origin blog.csdn.net/u012846795/article/details/108656583