Runtime storage space organization

  The chapter on runtime storage is not a focus of the course, and the teacher covered it only briefly. I am still tidying it up here for future reference.
  The focus of this chapter is to understand the memory allocation strategy used in different situations. There are roughly three: static allocation, dynamic stack allocation, and dynamic heap allocation. Only a very simple form of stack allocation is introduced, and heap allocation is hardly covered at all. In fact, none of this is examined; treat it as background knowledge for understanding how the compiler arranges memory for the running program.
  To learn storage allocation, there are several basic concepts to understand (you don't have to memorize them, but you should at least know what they are~):
[Figure: basic concepts of storage allocation]
  One of the concepts mentioned here is lifetime. When I learned high-level languages I was also told that variables have a scope. What is the difference between the two? The lifetime is the span of time during which a variable exists, and the scope is the region of the program text in which its name is valid. It is like time and space: lifetime is time, scope is space.
  Whether it is a variable or a function, it must have a name. The name is not the value itself; it is bound to the address where the variable is stored, and accessing the variable or function means fetching the value at that address. (I had a doubt here: a pointer in C is an address. If the statement above is correct, then for int a; the name a stands for the address where a is stored, so a should behave like a pointer. How is that different from int *a?) After thinking it over myself: in int a;, a is only a name. The compiler maps it to a storage location, and using a in an expression yields the value stored there. In int *a;, the value stored in a is itself an address, so one more step is needed to reach the data. The idea to keep is simply that a variable's value is found through its address.
  Calling a function requires passing parameters. The parameters written at the call site are called actual parameters, and the parameters declared in the called function are called formal parameters. How is an actual parameter handed over to a formal parameter? There are three main ways: pass by address, pass by value, and pass by name.
  As the name implies, pass by address sends the address of the actual parameter to the called function, which then accesses the parameter through that address. That is, the called function sets aside a unit A to hold the address B; to use the parameter it reads A, obtains B, and follows B to fetch the value C. (Quite a detour!)
  Pass by value sends the value of the actual parameter to the called function. The called function allocates space to store that value and then works with its own copy. Pass by name is heard of less often: at the call, the formal parameters are only put in one-to-one correspondence with the actual parameters. With the first two methods, the content (an address or a value) has already been handed over by the time the call begins.
  With pass by name, the actual parameter is not evaluated until the formal parameter is used in the called function. It feels a bit like a small program that runs each time it is needed, so it is also called a parameter subroutine (often called a thunk). Pass by name is almost never needed, though; the book says it is only used in ALGOL. It is enough to know that it exists.
  Now think about it: when the whole program is compiled and run, what has to be stored? First the target code, second the data, and third the control information used while procedures are called. This is easy to understand: only when the generated target code is stored can the machine execute it instruction by instruction. The data includes the various variables, and the control information mainly records the relationships between procedure calls, that is, who called whom.
  To map these three kinds of content onto something concrete, the activity record (usually called an activation record in English textbooks) is introduced. Its definition: the information needed by one activation of a procedure is organized into a contiguous block of storage, called an activity record. It is often drawn as a table. Taking Pascal as an example, its activity record is as follows:
[Figure: activity record of a Pascal procedure]
  An activity record can be roughly divided into three parts: connection data, formal units, and the local data area. The connection data includes the static link, the dynamic link (both are described in detail later), and the return address. The formal units hold the parameters passed when the function is called. The local data area includes the temporary units (produced by intermediate code generation), the dope vectors (for an array: its first address, size, bounds, and similar information), and the local variables (the variables declared in the procedure).
  Three storage allocation strategies are introduced next: static storage allocation, stack storage allocation, and heap storage allocation. There is a very simple way to understand them (simple, so of course not very rigorous~). Static storage allocation means the space is fixed before the program runs: everything is laid out at compile time and nothing is added at run time. Stack storage allocation means the system allocates the space automatically; the most obvious example is a recursive algorithm, where each call gets its space on entry and the space is released when the call returns. Heap storage allocation is requested explicitly by the user, for example with new in C++, and can be handed back explicitly as well, with delete in C++. The three are described in more detail below:

Static storage allocation:

  Static storage allocation must meet the following three conditions:
[Figure: the three conditions for static storage allocation]
  Static storage allocation includes allocating space for the declared variables. An earlier article explained that intermediate code generation introduces a large number of intermediate variables. The final optimization pass removes most of them, but until then they still need space. Giving each one its own unit would be wasteful, so the following approach is used:
[Figure: how storage is allocated for intermediate variables]
  Sometimes an intermediate variable is used by the very next statement after it is generated and never appears again. A mechanism is set up for this situation:
[Figure: mechanism for intermediate variables that are used only once]
  An example:
[Figure: example of allocating storage for intermediate variables]

Stack storage allocation

  For stack storage allocation, only a simple introduction is given, with C as the example. Stack allocation takes place in the stack area, so the stack has to be managed. The activity record is the basic unit: when a procedure is called its activity record is pushed onto the stack, and when the procedure returns the record is popped.
  What is the difference between static storage and stack storage?
[Figure: comparison of static and stack storage allocation]
  An abstract example:
[Figure: abstract example of the stack as procedures are called]
  In this example, the activity record of a given subroutine does not sit at a fixed position. Where it lands depends on where the top of the stack is at the moment of the call, and records keep being pushed downwards from there.
  What does a C activity record look like in the stack area?
[Figure: layout of a C activity record on the stack]
  The sp in the figure above is the stack pointer, that is, the top pointer (top). A field whose name starts with "old" belongs to the previous level, so the old sp is the caller's sp.

  Besides the simple stack implementation, there is also a stack implementation for languages with nested procedures. What is a nested procedure? It is a procedure declared inside another procedure, as Pascal allows, and called from there (it may go on to call other procedures, as long as the calls are reasonable and do not loop endlessly). So what is the problem? It is the use of variables. With nesting, the inner layer can use the variables of the outer layer. In the extreme case where an inner variable has the same name as an outer one, the inner declaration hides the outer one: a use inside the inner layer refers to the inner variable. Leaving that case aside, we have just seen that each layer is stored on the stack as an activity record. How does the inner layer reach the activity record of the outer layer? This part is mainly about solving that problem!
  There are two ways to solve it: the static chain or the Display table.

  Static chain:

  The idea of the static chain is to keep, in the activity record of each layer, the sp of the layer that directly encloses it, so that an outer variable can be found by following that sp. Each record grows by one fixed-size unit for this sp, so the extra space is static. The activity record is as follows:
[Figure: activity record with a static link]
  Why are there both a static chain and a dynamic chain? The static chain is what was just described and is used to look up variables. The dynamic chain is the old sp from the earlier stack structure and expresses the calling relationship.
  Let's use an example to illustrate:
[Figure: example of a static chain across nested procedures]
  What is wrong with the static chain? If the nesting is very deep and the innermost layer wants to use a variable of the outermost layer, it has to follow the chain layer by layer. That is bound to be slow. Is there any way to speed it up? This is the Display table!

  Display table

  The idea of the Display table is this: the only reason to search layer by layer is to reach the outer variables, so each activity record simply carries a table holding the sp of the activity record of every enclosing layer. An outer variable can then be reached in one step, without walking through the other layers. The number of enclosing layers is not fixed, so the size of the table is not fixed either, which makes this a dynamically sized method. The activity record structure is as follows:
[Figure: activity record with a Display table]
  For example (serial numbers are used here, and the activity records are identified by serial numbers too~):
[Figure: example of Display tables across nested procedures]
  There is an item in the picture above called the global display. It refers to the position where the display section of the upper layer begins. The number of display entries in an activity record reflects the current nesting level. The global display of the first layer is 0 by default.

  To summarize the advantages and disadvantages of static chains and Display tables:

  Static chain. Advantages: fixed and small space overhead. Disadvantages: access is slow.
  Display table. Advantages: fast access. Disadvantages: takes relatively more space.

Heap dynamic allocation

[Figure: definition of heap allocation]
  The definition describes the allocated space as "appropriately large", but in practice the size of a heap allocation is specified by the user.
  With so many free blocks, how is a suitable one chosen? There are three strategies:
[Figure: the three strategies for choosing a free block]
  For heap allocation, the more important point is releasing memory. Blocks that are never released stay occupied, and the free space that remains gets chopped into small pieces, so later requests may fail for lack of a large enough block. This is also a reminder to pair every malloc with a free and every new with a delete.
  Memory allocation has always been a big topic in computer science. Meeting it again in the principles of compilation taught me a lot, but it is still only the tip of the iceberg. Keep going!

Origin blog.csdn.net/gls_nuaa/article/details/111826814