C++ memory allocation

  

The memory occupied by a compiled C/C++ program is divided into the following parts:

  1. Stack area (stack): allocated and released automatically by the compiler. It stores function parameter values, local variable values, and so on. It operates like the stack data structure.

  2. Heap area (heap): generally allocated and released by the programmer. If the programmer does not release it, the OS may reclaim it when the program ends. Note that it is not the same thing as the heap data structure; its allocation scheme resembles a linked list.

  3. Global area (static): also called the static data area, it stores global variables and static variables. Initialized global and static variables are placed in one region, and uninitialized global and static variables are placed in another, adjacent region. The system releases the area after the program ends.

  4. Literal constant area: constant strings are placed here. The system releases it after the program ends.

  5. Program code area: stores the binary code of function bodies.

  The following figure is a schematic diagram of the memory usage of a process on a Unix system:

 

  As can be seen from the figure:

  1. From low address to high address are: code segment, (initialized) data segment, (uninitialized) data segment (BSS), heap, stack, command line parameters and environment variables

  2. The heap grows toward higher memory addresses

  3. The stack grows toward lower memory addresses

  Memory-related problems encountered in actual programming

  1. Requesting memory

  To store data, there are three ways to obtain space:

  First, from the stack (for example, by defining an array directly)

  Second, from the heap (by using malloc or new to allocate memory dynamically)

  Third, by storing the data in files

 

 

First, the stack and heap as data structures

Although the two are often mentioned in one breath, the heap and the stack are actually two distinct data structures.

Both are data structures in which data items are arranged in a defined order.

Stack: like a bucket or box for data, it is a last-in, first-out (LIFO) data structure.

Heap: a tree-based data structure in which every node holds a value and the nodes are partially ordered. "Heap" usually refers to the binary heap. Its defining property is that the root holds the smallest (or largest) value, and the two subtrees of the root are themselves heaps. Because of this property, heaps are often used to implement priority queues. Access to a heap is not restricted to one end the way a stack is. It is like taking a book from a library shelf: although the books are arranged in order, picking one does not require first removing all the books in front of it, as a box (stack) would. The shelf works differently from the box: we can take out the book we want directly.
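The contrast between the two data structures can be sketched with the standard containers. This is only a minimal illustration, assuming std::stack as the last-in, first-out box and std::priority_queue with std::greater as a heap-based priority queue; the helper names drain_stack and drain_min_heap are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <stack>
#include <vector>

// Push every value onto a LIFO stack, then pop them all off.
std::vector<int> drain_stack(const std::vector<int>& values) {
    std::stack<int> s;
    for (int v : values) s.push(v);
    std::vector<int> out;
    while (!s.empty()) {
        out.push_back(s.top());  // most recently pushed element
        s.pop();
    }
    return out;
}

// Push every value into a min-heap, then remove the root repeatedly.
std::vector<int> drain_min_heap(const std::vector<int>& values) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> heap;
    for (int v : values) heap.push(v);
    std::vector<int> out;
    while (!heap.empty()) {
        out.push_back(heap.top());  // smallest remaining element
        heap.pop();
    }
    return out;
}
```

For the input {3, 1, 2}, drain_stack returns 2, 1, 3 (reverse insertion order), while drain_min_heap returns 1, 2, 3 (smallest first).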

Second, the stack and heap in memory allocation

Note: when people say "heap-stack" as a single word, they usually mean just the stack!

Generally, a program is stored in ROM or Flash and must be copied into memory to run, with different regions of memory holding different kinds of information. The stack area sits at relatively high addresses; taking increasing addresses as "upward", the stack grows downward.

Local variables are allocated on the stack. The heap is an upward-growing region used to satisfy the programmer's memory requests. There are also static areas for static and global variables, read-only areas for constants and program code, and some other partitions.

Program memory allocation:
  The memory occupied by a compiled C/C++ program is divided into the following parts:
  1. Stack area (stack): allocated and released automatically by the compiler. It stores function parameter values, local variable values, and so on. It operates like the stack data structure.
  2. Heap area (heap): generally allocated and released by the programmer. If the programmer does not release it, the OS (operating system) may reclaim it when the program ends. Note that it is different from the heap data structure; its allocation scheme resembles a linked list.
  3. Global area (static area): global variables and static variables are stored together. Initialized global and static variables are in one region, and uninitialized global and static variables are in another, adjacent region. Released by the system after the program ends.
  4. Literal constant area: constant strings are placed here. Released by the system after the program ends.
  5. Program code area: stores the binary code of function bodies.

Example:

#include <stdlib.h>
#include <string.h>

int a = 0;    // global initialized area

char *p1;     // global uninitialized area
int main()
{
  int b;                      // stack
  char s[] = "abc";           // stack
  char *p2;                   // stack
  const char *p3 = "123456";  // "123456\0" is in the constant area, p3 is on the stack

  static int c = 0;           // global (static) initialized area
  p1 = (char *)malloc(10);    // the 10-byte and 20-byte blocks are in the heap area
  p2 = (char *)malloc(20);
  strcpy(p1, "123456");       // "123456\0" is in the constant area; the compiler may merge it with the "123456" that p3 points to
  free(p2);                   // every malloc needs a matching free
  free(p1);
}

The following is a detailed explanation of several areas of memory allocation:

Stack: the storage area for variables that the compiler allocates when they are needed and clears automatically when they are not. It usually holds local variables, function parameters, and the like. Within a process, the user stack sits at the top of the user virtual address space, and the compiler uses it to implement function calls. Like the heap, the user stack can grow and shrink dynamically while the program runs.

 

Heap: memory blocks allocated by new. The compiler does not release them; the application controls their release. Generally, each new is paired with one delete. If the programmer does not release a block, the operating system reclaims it after the program ends. The heap can grow and shrink dynamically.

 

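Free store: here the term refers to memory blocks allocated by malloc and similar functions. It is very similar to the heap, but its blocks are released with free. (In common C++ usage the two names are often assigned the other way around, with "free store" for new/delete and "heap" for malloc/free; many implementations back both with the same allocator.)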

 

Global/static storage area: global variables and static variables are allocated in the same block of memory. In C, these variables were further divided into initialized and uninitialized: initialized global and static variables were in one area, and uninitialized ones were in another, adjacent area (the uninitialized storage can be accessed and manipulated through void*, and the system releases it after the program ends). C++ makes no such distinction at the language level, and they are treated as occupying the same memory area. In practice, most toolchains still place them in separate .data and .bss sections.

Constant storage area: a rather special storage area that holds constants, which may not be modified (they can still be changed through illegitimate means, of which there are many, but the result is undefined behavior).
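The difference between a constant and a writable copy of it can be sketched as follows. This is a minimal example, assuming a typical implementation in which string literals live in read-only storage; the function name modified_copy is hypothetical.

```cpp
#include <cassert>
#include <string>

// Returns a modified copy of a string literal without touching the literal.
std::string modified_copy() {
    const char* literal = "abc";  // points at the literal itself; never write through it
    char buffer[] = "abc";        // a separate, writable array initialized from the literal
    buffer[0] = 'X';              // fine: modifies the copy only
    return std::string(buffer) + "/" + literal;
}
```

Calling modified_copy() returns "Xbc/abc": the array copy changed, the literal did not. Declaring the pointer as const char* lets the compiler reject accidental writes to the constant area.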

void f() { int* p=new int[5]; }

  This short statement involves both the heap and the stack. Seeing new, we should first think of a block of heap memory being allocated. What about the pointer p? It occupies stack memory. So the statement means: a pointer p, stored on the stack, points to a block of heap memory. The program first determines how much memory to allocate on the heap, then calls operator new to allocate it, and finally stores the starting address of that memory on the stack.

So how is it released?

Use delete[] p. This tells the compiler that an array is being deleted; VC6 then releases the memory according to the corresponding cookie information.
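The pairing of new[] with delete[] can be sketched like this. It is a minimal example under the assumption that the array holds plain ints; the function names sum_manual and sum_owned are hypothetical. The second function shows one common alternative, std::unique_ptr<int[]>, which calls delete[] automatically.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <numeric>

// Manual management: every new[] must be matched by exactly one delete[].
int sum_manual(std::size_t n) {
    int* p = new int[n];          // p lives on the stack, the array on the heap
    std::iota(p, p + n, 1);       // fill with 1, 2, ..., n
    int total = std::accumulate(p, p + n, 0);
    delete[] p;                   // array form, matching new[]
    return total;
}

// Automatic management: unique_ptr<int[]> calls delete[] in its destructor.
int sum_owned(std::size_t n) {
    std::unique_ptr<int[]> p(new int[n]);
    std::iota(p.get(), p.get() + n, 1);
    return std::accumulate(p.get(), p.get() + n, 0);
}
```

Both functions return 15 for n = 5. The difference is only in who is responsible for the release: the programmer in the first case, the smart pointer's destructor in the second.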

What is the difference between heap and stack?

The main differences are as follows:

  1. Different management methods;

  2. Different space sizes;

  3. Whether fragmentation occurs;

  4. Different growth directions;

  5. Different allocation methods;

  6. Different allocation efficiency.

 

  Management method: the stack is managed automatically by the compiler, with no manual control needed; for the heap, release is controlled by the programmer, which makes memory leaks easy to introduce.
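  The difference in management can be observed by counting live objects. The following is a minimal sketch; the Tracked and Counts types and the function observe_lifetimes are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>

// Counts how many Tracked objects are currently alive.
struct Tracked {
    static int alive;
    Tracked() { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Live-object counts observed at three points in time.
struct Counts {
    int inside_scope;
    int after_scope;
    int after_delete;
};

Counts observe_lifetimes() {
    Counts c{};
    Tracked* heap_obj = nullptr;
    {
        Tracked stack_obj;                 // destroyed automatically at the closing brace
        heap_obj = new Tracked;            // stays alive until delete is called
        c.inside_scope = Tracked::alive;   // both objects exist here
    }
    c.after_scope = Tracked::alive;        // only the heap object remains
    delete heap_obj;                       // forgetting this line would be a leak
    c.after_delete = Tracked::alive;
    return c;
}
```

  observe_lifetimes() reports 2, 1, and 0: the stack object disappears when its scope ends, while the heap object survives until it is deleted explicitly.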

 

  Space size: generally speaking, on a 32-bit system the heap can reach up to 4 GB of address space, so from this point of view heap memory is almost unlimited. The stack, however, generally has a fixed size. For example, under VC6 the default stack size is 1 MB (as far as I remember). This can of course be changed: open the project and go to Project->Setting->Link, select Output in Category, and then set the maximum stack size and the commit size under Reserve. Note: the minimum reserve value is 4 bytes; commit is the amount reserved in the virtual memory page file. Setting it larger makes the stack claim more space, which may increase memory overhead and startup time.

 

  Fragmentation: for the heap, frequent new/delete inevitably leaves the memory space discontinuous, producing many fragments and reducing program efficiency. The stack does not have this problem, because it is a last-in, first-out structure: pushes and pops correspond one to one, so a block can never be popped from the middle of the stack. Before a block is popped, everything pushed after it has already been popped. See a data structures reference for details; we will not go through them here.
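  Fragmentation can be illustrated with a toy model. The sketch below is not a real allocator: it assumes the heap is a row of equal-sized slots, and the names ToyHeap and fragmented_heap are hypothetical. It shows that plenty of memory can be free in total while no large contiguous run is available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model: the heap is a row of equal-sized slots, true = in use.
struct ToyHeap {
    std::vector<bool> used;
    explicit ToyHeap(std::size_t slots) : used(slots, false) {}

    std::size_t total_free() const {
        return static_cast<std::size_t>(std::count(used.begin(), used.end(), false));
    }

    // Length of the longest run of adjacent free slots.
    std::size_t largest_free_run() const {
        std::size_t best = 0, run = 0;
        for (bool u : used) {
            run = u ? 0 : run + 1;
            best = std::max(best, run);
        }
        return best;
    }
};

// Fill all slots, then free every second one (indices 0, 2, 4, ...).
ToyHeap fragmented_heap(std::size_t slots) {
    ToyHeap h(slots);
    std::fill(h.used.begin(), h.used.end(), true);
    for (std::size_t i = 0; i < slots; i += 2) h.used[i] = false;
    return h;
}
```

  With 8 slots, fragmented_heap leaves 4 slots free, yet the largest contiguous run is 1 slot, so a request for 2 adjacent slots cannot be satisfied.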

 

  Growth direction: the heap grows upward, toward higher memory addresses; the stack grows downward, toward lower memory addresses.

 

  Allocation method: the heap is always allocated dynamically; there is no statically allocated heap. The stack has two allocation methods: static and dynamic. Static allocation is done by the compiler, as with local variables. Dynamic allocation is done with the alloca function, but it differs from heap allocation: memory allocated dynamically on the stack is released by the compiler, so we do not need to release it manually.

 

  Allocation efficiency: the stack is a data structure provided by the machine itself, and the computer supports it at the lowest level: a dedicated register holds the stack address, and there are dedicated instructions for pushing and popping. This is what makes the stack efficient. The heap is provided by the C/C++ library, and its mechanism is quite complicated. For example, to allocate a block of memory, the library function searches the heap for available space of sufficient size according to some algorithm (see a data structures or operating systems reference for specifics). If there is not enough space (possibly because of too much fragmentation), it may call a system function to enlarge the program's data segment so that there is a chance of finding enough memory, and then returns. Clearly, the heap is much less efficient than the stack.

  From this we can see that, compared with the stack, the heap is prone to fragmentation because of heavy new/delete use; it is slower because it has no dedicated system support; and it can be more expensive because a request may involve switching between user mode and kernel mode. The stack is therefore the most widely used structure in a program. Even function calls are carried out with the stack: the parameters, return address, EBP, and local variables of a call are all stored on it. We therefore recommend using the stack rather than the heap where possible.

  Despite these benefits, the stack is less flexible than the heap, so when a large amount of memory is needed it is sometimes better to use the heap.

  Whether you use the heap or the stack, guard against out-of-bounds access (unless you cause it deliberately). Crossing a boundary either crashes the program or corrupts the program's heap or stack structures, producing unexpected results. Even if neither problem shows up while your program runs, you still have to be careful: it may crash at any time, and debugging at that point is very difficult :)

There are several more differences between heap and stack:


1. System response after a request
  Stack: as long as the remaining stack space is larger than the requested space, the system provides the memory; otherwise an exception is reported to indicate stack overflow.
  Heap: the operating system keeps a linked list that records free memory addresses. When the system receives a request from a program, it traverses the list to find the first heap node whose space is larger than the requested space, removes that node from the free list, and allocates the node's space to the program. In addition, on most systems the size of the allocation is recorded at the start address of the block, so that the delete statement in the code can release the memory correctly.
  Also, since the node found is not necessarily exactly the requested size, the system automatically puts the excess back into the free list.
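The free-list search described above can be sketched as a first-fit allocator over a simulated address space. This is only an illustration, not how any particular operating system or library implements it; FreeBlock, kNoFit, and first_fit are hypothetical names, and addresses are plain integers rather than real pointers.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// One free region of a simulated address space.
struct FreeBlock {
    std::size_t addr;
    std::size_t size;
};

// Sentinel returned when no free block is large enough.
const std::size_t kNoFit = static_cast<std::size_t>(-1);

// First fit: take the first free block that is large enough. An exact fit is
// removed from the list; otherwise the block is split and the excess stays free.
std::size_t first_fit(std::list<FreeBlock>& free_list, std::size_t request) {
    for (auto it = free_list.begin(); it != free_list.end(); ++it) {
        if (it->size < request) continue;   // too small, keep walking
        std::size_t addr = it->addr;
        if (it->size == request) {
            free_list.erase(it);            // exact fit: node leaves the free list
        } else {
            it->addr += request;            // split: the remainder is still free
            it->size -= request;
        }
        return addr;
    }
    return kNoFit;
}
```

With free blocks {0, 8}, {100, 32}, and {200, 16}, a request for 16 bytes skips the first block, returns address 100, and leaves {116, 16} in the list as the excess.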
   
2. Limits on request size
  Stack: under Windows, the stack is a data structure that extends toward lower addresses and is a contiguous memory region. That is, the address of the top of the stack and the maximum capacity of the stack are predetermined by the system. Under Windows the stack size is 2 MB (some say 1 MB; in any case it is a constant fixed when the program is built). If the requested space exceeds the remaining stack space, an overflow is reported. The space obtainable from the stack is therefore relatively small.
  Heap: the heap is a data structure that extends toward higher addresses and is a non-contiguous memory region. This is because the system uses a linked list to store free memory addresses, which are naturally non-contiguous, and the list is traversed from low addresses to high addresses. The size of the heap is limited by the virtual memory available on the computer. The space obtained from the heap is therefore more flexible and larger.
   
3. Comparison of allocation efficiency
  Stack: allocated automatically by the system, which is faster, but outside the programmer's control.
  Heap: memory allocated by new, which is generally slower and prone to fragmentation, but the most convenient to use.
  In addition, under Windows, memory can also be allocated with VirtualAlloc. That memory is neither on the heap nor on the stack; it is reserved directly in the address space of the process. It is the least convenient to use, but it is fast and the most flexible.
   
4. Contents stored in the stack and heap
  Stack: when a function is called, the first thing pushed is the address of the next instruction in the calling function (the next executable statement after the call), followed by the function's arguments; in most C compilers the arguments are pushed from right to left, followed by the function's local variables. Note that static variables are not pushed onto the stack. The exact layout depends on the calling convention; on x86 with cdecl, for instance, the arguments are pushed before the call instruction pushes the return address.
  When the call ends, the local variables are popped first, then the arguments, and finally the stack-top pointer points to the address stored at the beginning, which is the next instruction in the calling function; the program continues from there.
  Heap: generally, a small header at the start of a heap block records the block's size. The specific contents of the block are arranged by the programmer.
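The idea of recording a block's size at its head can be sketched as a thin wrapper around malloc. This is an assumption-laden illustration, not the layout of any real allocator; Header, sized_alloc, sized_size, and sized_free are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Header placed immediately before the user's bytes. The union pads it to the
// strictest fundamental alignment so the user pointer stays suitably aligned.
union Header {
    std::size_t size;
    std::max_align_t pad;
};

// Allocate n usable bytes and remember n in a hidden header.
void* sized_alloc(std::size_t n) {
    Header* h = static_cast<Header*>(std::malloc(sizeof(Header) + n));
    if (!h) return nullptr;
    h->size = n;
    return h + 1;                 // user memory starts right after the header
}

// Recover the recorded size from a pointer returned by sized_alloc.
std::size_t sized_size(const void* p) {
    return (static_cast<const Header*>(p) - 1)->size;
}

// Release the whole block, header included.
void sized_free(void* p) {
    if (p) std::free(static_cast<Header*>(p) - 1);
}
```

After void* p = sized_alloc(20), sized_size(p) returns 20 even though the caller never stored that number anywhere, which is the same trick that lets a release call know how much memory to give back.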

5. Out-of-bounds memory operations on the heap and stack

1> Heap out-of-bounds access mostly happens when an operation goes beyond the size allocated by calloc/malloc/new or another heap allocation function. As a consequence, a later calloc/malloc/new may fail; most malloc failures that surface as an _int_malloc error (leading to abort) are caused by this;

2> Stack out-of-bounds access mostly happens in array operations, when a subscript exceeds the array's defined length. As a consequence, other variables are overwritten.
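One way to catch both kinds of out-of-bounds access early is to use checked element access. The following is a minimal sketch, assuming std::array for stack storage and std::vector for heap storage; the function names are hypothetical. The at() member throws std::out_of_range for a bad index, whereas operator[] performs no check.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Returns true if reading index i from a stack-based std::array is rejected.
bool stack_access_rejected(std::size_t i) {
    std::array<int, 4> local{};       // storage is part of this stack frame
    try {
        (void)local.at(i);            // at() checks the index, operator[] does not
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Same check for a std::vector, whose elements live on the heap.
bool heap_access_rejected(std::size_t n, std::size_t i) {
    std::vector<int> dynamic(n);
    try {
        (void)dynamic.at(i);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Index 3 of the four-element array is accepted and index 4 is rejected; for a ten-element vector, index 9 is accepted and index 10 is rejected. An exception at the point of the mistake is much easier to debug than silent corruption discovered later.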


To summarize:

The difference between the heap and the stack can be seen with the following analogy:
  Using the stack is like eating at a restaurant: you just order (make the request), pay, and eat (use it), without having to deal with preparatory work such as washing and chopping vegetables or cleanup such as washing dishes and pots. Its advantage is speed, but there is little freedom.
  Using the heap is like cooking your favorite dishes yourself: it is more trouble, but it suits your own taste better and gives you a lot of freedom.

 

Reprinted: https://blog.csdn.net/duan19920101/article/details/50989431
           https://blog.csdn.net/qq792326645/article/details/49783347

 
