Memory management study notes

Memory

A program is stored on the hard disk; it must be loaded into memory to run, because the CPU can only read data and instructions from memory.

Memory is only a place to store instructions and data; it cannot perform any computation itself.

The memory addresses we normally print or see are virtual addresses, not real physical addresses. A virtual address maps to a physical address only after translation by the CPU, and the operating system rearranges the correspondence between virtual and physical addresses each time the program runs.

The virtual address space is the range of virtual addresses that a program is allowed to use. Since the mapping between virtual and physical addresses is managed by the operating system, the size of the virtual address space is also determined by the operating system.
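
A small illustration of the point above: printing a few addresses shows virtual addresses, and with address space layout randomization they usually differ from run to run. This is only a sketch; the exact values are platform dependent.

#include <stdio.h>

int g_var = 0;                       /* global data area */

int main(void)
{
    int local = 0;                   /* stack */
    const char *msg = "hello";       /* string constant area */

    /* These are virtual addresses; the OS may place them differently
       on every run of the program. */
    printf("global : %p\n", (void *)&g_var);
    printf("local  : %p\n", (void *)&local);
    printf("literal: %p\n", (void *)msg);
    return 0;
}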

The width of the data bus determines how much data the CPU can process in a single operation, and the clock frequency determines how many operations the CPU performs per unit of time.

The data bus and the address bus are not the same thing: the data bus transfers data between the CPU and memory, while the address bus locates data in memory.

compilation mode

Current compilers provide two compilation modes: 32-bit and 64-bit.

  1. In 32-bit mode: a pointer or address occupies 4 bytes (32 bits) of memory, so the virtual address space that can theoretically be accessed is 2^32 bytes, i.e., 4GB; the valid virtual address range is 0–0xFFFFFFFF. In other words, no matter how large the physical memory is, a program compiled in 32-bit mode gets at most 4GB of virtual memory. If the remaining physical memory cannot hold the current program, the operating system writes part of the temporarily unused data in memory out to disk.
  2. In 64-bit mode: a pointer or address occupies 8 bytes (64 bits) of memory, so the virtual address space that can theoretically be accessed is 2^64 bytes. Current physical memory cannot reach such a size, so generally only the lower 48 bits (6 bytes) of the virtual address are used, giving a total virtual address space of 2^48 bytes, i.e., 256TB.

A 32-bit operating system can only run 32-bit programs, while a 64-bit operating system can run both 32-bit and 64-bit programs; however, running a 32-bit program on a 64-bit operating system wastes some resources.
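
A quick way to see the difference between the two modes is to check the pointer width; a minimal sketch, assuming a GCC toolchain where the 32-bit target can be selected with -m32:

#include <stdio.h>

int main(void)
{
    /* Prints 4 when compiled with gcc -m32, 8 with gcc -m64. */
    printf("sizeof(void *) = %zu\n", sizeof(void *));
    return 0;
}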

memory alignment / padding

Memory alignment: the first data member is placed at offset 0, and each subsequent member is aligned to the smaller of the alignment factor and its own size.

Memory padding: after the members of a struct or union have been aligned, the struct or union itself must also be aligned, to the smaller of the alignment factor and the size of its largest member. Members are aligned first, then the whole type is padded. The purpose of padding a structure is so that an array of that structure also satisfies memory alignment inside the array.

Advantages of memory alignment: data structures are aligned on natural boundaries whenever possible because memory is read in groups; to access unaligned memory, the processor would need two memory accesses. Memory is addressed in bytes, the CPU accesses memory through the address bus, and several bytes are transferred per access. A 32-bit CPU processes 4 bytes at a time, so it reads 4 bytes from memory per access; a 64-bit CPU reads 8 bytes at a time. Consequently, a 32-bit CPU addresses memory at multiples of 4 and a 64-bit CPU at multiples of 8; it cannot directly access an arbitrary byte address.

So 32-bit targets default to 4-byte alignment and 64-bit targets default to 8-byte alignment. You can specify n-byte alignment with ==#pragma pack(n)==.
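
A small sketch of the effect; the exact sizes are compiler dependent, but on common GCC/MSVC targets the values in the comments hold:

#include <stdio.h>

struct Default {        /* natural alignment */
    char  c;            /* 1 byte + 3 bytes of padding */
    int   i;            /* 4 bytes at offset 4 */
    short s;            /* 2 bytes + 2 bytes of tail padding */
};                      /* sizeof == 12 */

#pragma pack(1)
struct Packed {         /* 1-byte alignment: no padding at all */
    char  c;
    int   i;
    short s;
};                      /* sizeof == 7 */
#pragma pack()

int main(void)
{
    printf("Default: %zu, Packed: %zu\n",
           sizeof(struct Default), sizeof(struct Packed));
    return 0;
}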

A variable is ideally kept within a single addressing step so that its value can be read in one access. If it is stored across two steps, two reads are needed and efficiency drops. Keeping a piece of data within one step and avoiding straddling steps is what memory alignment achieves.

Although memory alignment is related to the hardware, it is the compiler that decides the alignment. If the hardware is 64-bit but the code is compiled in 32-bit mode, alignment is still done on 4-byte boundaries.

paging mechanism

When a program runs, only a small part of its data is used frequently, and reading the entire program into memory would reduce its running efficiency. The paging mechanism divides and maps the virtual address space and the physical address space: the address space is split into parts of equal size, and each such part is called a page. Memory is swapped in and out in units of pages.

The page size is fixed and determined by the hardware, or the hardware supports several page sizes and the operating system chooses one. Almost all current PCs use a page size of 4KB. Each program has its own address space; when a program reads or writes through a virtual address, the address must be translated into an actual physical address. The page table performs this translation from virtual to physical addresses: once the page and the offset within the page are known, the data can be located and the address converted into a physical one.

A page is 4KB = 2^12 bytes, and the virtual address space of a 32-bit system is 4GB = 2^32 bytes, so it contains 1M = 2^20 pages in total. You can therefore define an array of 1M = 2^20 elements, where each element holds the number of a page and is 4 bytes (32 bits) long; the whole array occupies 4MB of memory. This array is called the page table, and it records the numbers of all pages in the address space.

Each page table entry is 4 bytes = 32 bits, but there are only 1M = 2^20 pages, so only 20 bits are needed to hold a page number; the remaining 12 bits are used for the offset within the page. In a 32-bit virtual address, bits 0–11 are the offset within the page and bits 12–31 are the index into the page table array.
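
A minimal sketch of how a 32-bit virtual address splits into a page table index and a page offset with 4KB pages:

#include <stdio.h>

#define PAGE_SHIFT 12                          /* 4KB page: 2^12 bytes */
#define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

int main(void)
{
    unsigned int vaddr  = 0x12345678u;         /* example virtual address        */
    unsigned int index  = vaddr >> PAGE_SHIFT; /* bits 12-31: page table index   */
    unsigned int offset = vaddr & PAGE_MASK;   /* bits 0-11 : offset within page */

    printf("page table index = 0x%05x, page offset = 0x%03x\n", index, offset);
    return 0;
}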

Inside the CPU there is an MMU (memory management unit) that is responsible for mapping virtual addresses to physical addresses. A virtual address issued by the CPU is handed to the MMU first and becomes a physical address after the MMU translates it.

The distribution of program memory in the address space is called the memory model .

A part of the address space is reserved for the kernel and is called kernel space. Windows allocates the high-address 2GB to the kernel, and Linux allocates the high-address 1GB to the kernel.

memory model

Linux memory model (from high addresses to low): kernel space, stack, unallocated space, dynamic link libraries, unallocated space, heap, global data area (global and static variables), constant area (ordinary constants and string constants), code area, reserved area.

The code area, constant area, and global data area are allocated when the program is loaded into memory; they exist for the whole run of the program and can be neither destroyed nor extended. That is why global variables, static variables, and string constants can be accessed from anywhere: they always exist.

When a function is called, its parameters, local variables, return address, and other information are pushed onto the stack, and this information is destroyed when the function finishes. Therefore local variables and parameters are only valid inside the current function and cannot be passed outside it.

The heap is the only memory region under the programmer's control, and it usually occupies most of the virtual address space. Heap memory exists until the program explicitly releases it and does not become invalid when a function returns, so data produced inside a function can be used outside it as long as it is placed on the heap.

Memory for global variables is allocated at compile time and their default initial value is 0. Memory for local variables is allocated when the function is called, and their initial value is indeterminate, left to the compiler.
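
A small sketch of the difference:

#include <stdio.h>

int g;                      /* global data area: zero-initialized by default */
static int s;               /* static: also zero-initialized at load time    */

int main(void)
{
    int local;              /* stack: value is indeterminate until assigned  */
    printf("g = %d, s = %d\n", g, s);   /* prints 0, 0 */
    /* printf("%d\n", local);  reading 'local' here would be undefined behavior */
    (void)local;
    return 0;
}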

kernel mode and user mode

The kernel space stores the kernel code and data of the operating system and is shared by all programs .

A user program calling a system API function is called a system call. When a system call occurs, the user program is suspended and kernel code runs instead, accessing kernel space; this is kernel mode.

User space holds the application's code and data and is private to the program. When the program's own code executes, it is in user mode.

The computer frequently switches between kernel mode and user mode. When the program needs low-level operations such as input/output or memory allocation, it must call an API function provided by the operating system and enter kernel mode; after the operation completes, execution returns to the application's code and the process goes back to user mode.

User mode is to execute application code and access user space; kernel mode is to execute kernel code and access kernel space .

For security and stability, the CPU can run at four different privilege levels, ring0–ring3, providing four levels of protection for data. However, Windows and Linux use only two of them: kernel mode, corresponding to ring0, and user mode, corresponding to ring3.

Why not give the kernel an exclusive 4GB address space of its own?

If the kernel lived in an independent process with its own address space, every system call would require a process switch, which is very expensive: registers have to be pushed and popped, the CPU's data caches are invalidated, and the MMU's page table cache (TLB) is flushed. Because the kernel shares the address space with the user process, a system call only requires a mode switch: registers are pushed and popped, but the caches stay valid.

kernel memory management

**vmalloc()** guarantees that the allocated addresses are contiguous in the virtual address space, while **kmalloc()** guarantees they are contiguous in both the physical and the virtual address space. Only hardware needs physically contiguous memory, because hardware devices sit outside the MMU and know nothing about virtual addresses.

Most kernel code uses kmalloc() rather than vmalloc() to obtain memory, mainly for performance reasons: vmalloc() must create page table entries to map physically discontiguous pages into a contiguous virtual range, and because these pages have to be mapped one by one, vmalloc() causes far more TLB thrashing than directly mapped memory.

Compared with kmalloc(), vmalloc() may sleep, so it cannot be used in contexts where blocking is not allowed.
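
A minimal kernel-style sketch of the two allocators (alloc_demo is a hypothetical helper, not a complete module; error handling is reduced to the bare minimum):

#include <linux/slab.h>     /* kmalloc / kfree */
#include <linux/vmalloc.h>  /* vmalloc / vfree */

static void alloc_demo(void)
{
    /* Physically and virtually contiguous; GFP_KERNEL may sleep, so it
       must not be used in atomic context (use GFP_ATOMIC there). */
    void *k = kmalloc(4096, GFP_KERNEL);

    /* Only virtually contiguous; vmalloc may also sleep, so it cannot be
       used where blocking is not allowed either. */
    void *v = vmalloc(1 << 20);

    if (k)
        kfree(k);
    if (v)
        vfree(v);
}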

slab

To make frequent allocation and recycling of data structures easier, a free list is often used. The free list holds already-allocated data structure blocks that are currently unused. When a new instance is needed, one is taken from the free list and filled with data, without allocating memory; when an instance is no longer needed, it is put back on the free list instead of being freed. The free list thus acts as an object cache that quickly serves frequently used object types.

The problem with free lists in the kernel is that they cannot be controlled globally: when memory is tight, the kernel has no way to tell each free list to shrink its cache and release memory; in fact the kernel does not even know the free lists exist. The Linux kernel therefore provides the slab layer, and the slab allocator acts as a common data structure caching layer.

The slab allocator seeks a balance between the following basic principles:

  1. Frequently used data structures are allocated and freed frequently, so they should be cached.
  2. Frequent allocation and recycling inevitably leads to memory fragmentation. To avoid this, the cached free list is stored contiguously; since released data structures go back onto the free list, no fragmentation results.
  3. Recycled objects can immediately be reused for the next allocation, so for frequent allocation and release the free list improves performance.
  4. If the allocator knows the object size, the page size, and the total cache size, it can make smarter decisions.
  5. If part of the cache is dedicated to a single processor, allocation and release can be done without SMP locking.
  6. If the allocator is NUMA-aware, it can allocate from the same memory node as the requester.
  7. Stored objects can be colored to prevent multiple objects from mapping to the same cache line.

The basic idea of the slab allocator is to keep frequently used kernel objects in a cache, maintained by the system in an initialized, ready-to-use state. The slab allocator has three basic goals:

  1. Reduce the internal fragmentation caused by the buddy algorithm when allocating small blocks of contiguous memory.
  2. Cache frequently used objects to reduce the time spent allocating, initializing, and releasing them.
  3. Adjust objects through coloring so that they make better use of the hardware cache.

The slab allocator maintains a cache for each type of object, which can be regarded as a reserve of objects of the same type. The memory occupied by each cache is divided into multiple slabs, and each slab consists of one or more contiguous page frames containing several objects.

All caches are linked together in a doubly linked list to form the cache_chain. Each kmem_cache structure does not describe individual slabs directly; instead it organizes its slabs through kmem_list3, and the list field of a slab descriptor indicates which of the three slab lists the slab currently belongs to.

Caches fall into two categories: general caches and dedicated caches. General caches are not tied to a specific kernel object; the first of them serves the kmem_cache structure itself and is stored in the cache_cache variable, which is the first element of the cache_chain list. Dedicated (private) caches are created on demand for specific kernel objects.

The key point of the slab layer is to avoid frequent page allocation and release: it calls the page allocation functions only when the given cache has neither a partially used nor an empty slab available.
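
For reference, kernel code reaches the slab layer through the kmem_cache API; a hedged sketch (my_object and cache_demo are made-up names, and error handling is minimal):

#include <linux/errno.h>
#include <linux/slab.h>

struct my_object {
    int  id;
    char name[32];
};

static struct kmem_cache *my_cache;

static int cache_demo(void)
{
    struct my_object *obj;

    /* Create a dedicated cache for my_object; SLAB_HWCACHE_ALIGN asks the
       allocator to align objects on hardware cache lines. */
    my_cache = kmem_cache_create("my_object_cache",
                                 sizeof(struct my_object),
                                 0, SLAB_HWCACHE_ALIGN, NULL);
    if (!my_cache)
        return -ENOMEM;

    obj = kmem_cache_alloc(my_cache, GFP_KERNEL);  /* take one object from the cache  */
    if (obj)
        kmem_cache_free(my_cache, obj);            /* return it to the cache, not the pages */

    kmem_cache_destroy(my_cache);
    return 0;
}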

page cache

Files are generally stored on disk, and the CPU cannot access data on disk directly; the data must first be read into memory. Since reading and writing the disk is much slower than reading and writing memory, to avoid hitting the disk on every file read or write the Linux kernel uses the page cache mechanism to cache file data.

The Linux kernel divides a file into data blocks the size of a page (4KB). When a user reads or writes a block of the file, the kernel first allocates a memory page (a page cache page) and binds it to that block; reading and writing the file then actually reads and writes the file's page cache. So file read and write operations fall into two cases:

  1. When reading data from a file, if the page cache page containing the data already exists, the data is copied from the page cache directly to the user. Otherwise, the kernel first allocates a free memory page, reads the data from the file into that page cache page, and then copies it to the user.
  2. When writing data to a file, if the page cache page for the data already exists, the new data is written directly into the page cache. Otherwise, the kernel first allocates a free memory page, reads the corresponding data from the file into it, and then writes the new data into the page cache.

MMAP

mmap stands for memory map. Its purpose is to map a file into memory so that the file on disk can be operated on indirectly through memory operations on the mapped region.

The traditional way to modify a file takes three steps: read the file into memory, modify the contents in memory, and write the modified contents back to the file. In the kernel, the page cache is associated with the file's data blocks, so when an application reads or writes the file it is actually operating on the page cache.

Looking at the traditional read/write path, it is clear that if user space could read and write the page cache directly, without copying data between the kernel's page cache and the user-space buffer, modifying a file would be faster. This is exactly what mmap achieves.

file -> page cache -> user buffer(read) -> user buffer(write) -> page cache -> file

The mmap system call maps a region of user-space virtual memory to a file; reading and writing the mapped virtual addresses is then equivalent to reading and writing the file.

file -> page cache -> (mmap) -> VMA

Because file reads and writes all go through the page cache, what mmap maps is the file's page cache, not the file itself on disk. This raises a synchronization question: when are modifications in the page cache written back to the file on disk? The Linux kernel does not proactively synchronize an mmap-mapped page cache to disk; the user must trigger it. Synchronization generally happens at four times:

  1. When the msync function is called to synchronize data explicitly.
  2. When the munmap function is called to unmap the file.
  3. When the process exits.
  4. When the system shuts down.

Function prototype: void *mmap(void *addr, size_t length, int prot, int flags, int fd, off_t offset);

Return value: on success, the start address of the created mapping; on failure, the MAP_FAILED macro.

Parameter Description:

  • addr: the starting address of the mapped virtual memory, usually set to NULL so that the system chooses it.
  • length: the length of the file region mapped into memory.
  • prot: the protection of the mapped region; common values are read — PROT_READ, write — PROT_WRITE, read/write — PROT_READ|PROT_WRITE.
  • flags: characteristics of the mapped region. MAP_SHARED — data written to the region is written back to the file and shared with other processes mapping the same file; MAP_PRIVATE — creates a private copy-on-write copy of the mapping, and modifications to the region are not written back to the original file.
  • fd: the file descriptor of the file to be mapped.
  • offset: the file offset, measured from the beginning of the file; it must be an integer multiple of the page size (4K) and is usually 0, meaning map from the start of the file.

When mmap is used, the size of the mapped region must be an integer multiple of the physical page size (page_size), because a page is the minimum granularity of memory and the mapping between the process's virtual address space and memory is done in units of pages. Likewise, mmap's mapping from disk to the virtual address space is done in whole pages.

int fd = open(filepath, O_RDWR, 0644);                            // open the file
void *addr = mmap(NULL, 8192, PROT_WRITE, MAP_SHARED, fd, 4096);  // map the file

Once the mapping is established it persists even if the file is closed, because the mapping refers to the location on disk, not to the file handle. Also, the effective address space available for inter-process communication is not strictly limited by the size of the mapped file, because mapping is done page by page.

munmap

Function prototype: int munmap(void *start, size_t length);

Return value: 0 on success, -1 on failure.

Parameter Description:

  • start: The starting address of the mapping.
  • length: The length of the portion of the file that is mapped into memory.
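
Putting mmap, msync, and munmap together, a minimal sketch of the whole flow (the file name data.txt is hypothetical, and the file is assumed to already exist and be at least a few bytes long):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.txt", O_RDWR);           /* hypothetical file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0) { perror("fstat"); return 1; }

    /* Map the whole file, shared, readable and writable. */
    char *p = mmap(NULL, st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    close(fd);                                   /* the mapping survives the close */

    memcpy(p, "hello", 5);                       /* write through the page cache   */
    msync(p, st.st_size, MS_SYNC);               /* push the dirty pages to disk   */
    munmap(p, st.st_size);                       /* remove the mapping             */
    return 0;
}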

the stack

The amount of memory the stack can use is limited, generally 1MB–8MB; it is fixed at compile time and cannot be changed while the program runs. If a program uses more stack memory than this maximum, a stack overflow occurs. The stack size depends on the compiler, which specifies a maximum value for it: Linux with GCC defaults to 8MB, while VC/VS defaults to 1MB.

A program can contain multiple threads, and each thread has its own stack; the stack size limit applies per thread, not per program.

In Chinese, the stack (栈) is also often called 堆栈 (literally "heap-stack"), while the heap is just the heap (堆); despite the name, 堆栈 refers only to the stack and does not include the heap.

stack frame

When a function call occurs, all the information needed to run the function is pushed onto the stack; this is called a **stack frame** or activation record. A stack frame mainly contains the following:

  1. The function's return address: after the function finishes, execution continues with the statement following the call, so the return address is the address of that next statement in memory.
  2. Parameters and local variables. With optimization options enabled, some compilers pass parameters in registers instead of pushing them onto the stack.
  3. Temporary data generated by the compiler, which is sometimes pushed onto the stack. For example, when the function's return value is large, it is pushed onto the stack first and then passed to the caller; when the return value is small, it is placed directly in a register and passed to the caller.
  4. Registers that need to be preserved, so that when the function returns, the state before the call can be restored and the calling function can continue executing.

An example of a function call is as follows:

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| args | return addr | old ebp | locals, return value | old ebx | old esi | old edi |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                         ^                                                          ^
                         |                                                          |
                        ebp                                                        esp

The ebp register should point to the bottom of the stack frame, but it actually points to the saved old ebp.

When a function call occurs:

  1. Push all or some of the arguments onto the stack; arguments that are not pushed are passed in specific registers.
  2. Push the address of the instruction following the current one (the return address) onto the stack.
  3. Jump to the function body and execute it.

Note: the function stacks generated by different compilers and different compilation modes are not exactly the same. Steps 2 and 3 are performed together by the call instruction; execution of the function begins after the jump into the function body. At the beginning of a function, the typical sequence is:

  1. push ebp: push the old ebp onto the stack.
  2. mov ebp, esp: ebp = esp (the frame base now points at the current top of the stack).
  3. [optional] sub esp, XXX: allocate XXX bytes of temporary space on the stack.
  4. [optional] push XXX: save register XXX if needed (some registers must keep their values across the call, so the function pushes them onto the stack at the start and pops them at the end).

When the function returns, the typical sequence is:

  1. [Optional] pop XXX: Restore saved registers.
  2. mov esp, ebp: restore ESP and reclaim local variable space.
  3. pop ebp: Restore the saved value of ebp from the stack.
  4. ret: Get the return address from the stack and jump to it.

Data addressing: the value of esp changes as data is pushed onto the stack, so data is located through ebp instead. The value of ebp is fixed within the frame, and each piece of data has a fixed offset from ebp, so ebp plus that offset gives the data's address.

void func(int a, int b){
    float f = 28.5;
    int n = 100;
    //TODO:
}

func(15, 92);


+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|92|15| return addr | old ebp | 28.5 | 100 | padding | old ebx | old esi | old edi |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                        ^                                                          ^
                        |                                                          |
                       ebp                                                        esp

Note: GCC has a flag **-fomit-frame-pointer** that omits the frame pointer (ebp), i.e., no frame pointer is used at all and esp is used to locate variables in the frame. This frees up the ebp register for other uses, but the drawbacks are that addressing within the frame becomes slower and, without a frame pointer, the function call trace cannot be located accurately.

function call

When a function call occurs, the caller pushes the actual arguments onto the stack for the callee to use. Both sides need to agree on whether the arguments are pushed from left to right or from right to left.

The caller and the callee must follow the same agreement and understand it consistently. This is called the calling convention, and it generally covers the following:

  1. How function arguments are passed: through the stack or through registers.
  2. The order in which arguments are passed, i.e., pushed onto the stack from left to right or from right to left.
  3. How arguments are popped: after the call ends, all the arguments pushed onto the stack must be popped so that the stack is consistent before and after the call.
  4. How the function name is decorated: the name is modified at compile time, and the calling convention determines how.

The calling convention can be specified in both the function declaration and the function definition; the syntax is:

return_type  calling_convention  function_name(parameters)

The calling convention specified in the function declaration applies to the caller, and the one specified in the function definition applies to the function itself. The default calling convention is __cdecl (written with __attribute__ in GCC).
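
For illustration, hedged declarations showing where the convention keyword sits; the __cdecl/__stdcall keywords are MSVC-style, and the attribute forms are the GCC equivalents on 32-bit x86 targets:

/* MSVC-style: the convention keyword sits between the return type and the name. */
int __cdecl   sum_cdecl  (int a, int b);   /* caller cleans the arguments off the stack */
int __stdcall sum_stdcall(int a, int b);   /* callee cleans the arguments off the stack */

/* GCC on 32-bit x86 expresses the same conventions with attributes. */
int __attribute__((cdecl))   sum_cdecl_gcc  (int a, int b);
int __attribute__((stdcall)) sum_stdcall_gcc(int a, int b);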

Example:

void func(int a, int b){
    int p =12, q = 345;
}
int main(){
    func(90, 26);
    return 0;
}

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
| return addr | old ebp | reserved | old ebx | old esi | old edi |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
                 ^                                              ^
                 |                                              |
                ebp                                            esp

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
| return addr | old ebp | reserved | old ebx | old esi | old edi | 26 | 90 | return addr |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
                 ^                                                                      ^
                 |                                                                      |
                ebp                                                                    esp

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
| return addr | old ebp | reserved | old ebx | old esi | old edi | 26 | 90 | return addr | old ebp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-
                                                                                                   ^
                                                                                                   |
                                                                                                ebp,esp

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| return addr | old ebp | reserved | old ebx | old esi | old edi | 26 | 90 | return addr | old ebp | reserved |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                                                                             ^                ^
                                                                                             |                |
                                                                                            ebp              esp

    ...........

Steps for pushing a function's stack frame:

  1. The main() function's frame is pushed onto the stack.
  2. To execute func(90, 26), first the actual arguments 90 and 26 are pushed onto the stack, then the return address. This work is done by the caller, main(); the ebp pointer does not change, only esp moves.
  3. The func() body begins executing: the value in the ebp register is pushed onto the stack, and the value of esp is assigned to ebp, so that ebp moves from pointing at main()'s frame base to func()'s frame base, completing the frame switch.
  4. Enough memory is reserved for local variables, return values, and so on. This is not a real allocation: an integer is simply subtracted from esp, and that integer is the size of the reserved memory.
  5. The values of the ebx, esi, and edi registers are pushed onto the stack in turn.
  6. The values of the local variables are placed into the reserved memory. There are 4 bytes of space between the first variable and old ebp, and there are gaps of several bytes between variables; in debug mode extra memory is reserved for debugging information, while a release build has no such gaps.

Steps for popping a function's stack frame:

  1. After func() finishes executing, the frame starts to be popped: first the registers edi, esi, and ebx are popped.
  2. To pop local variables, return values, and similar data, the value of ebp is simply assigned to esp so that ebp and esp point to the same location.
  3. old ebp is popped from the stack and assigned to ebp; ebp now points to where it pointed before the func() call, i.e., main()'s old ebp position.
  4. Finally, the next instruction is located via the return address, and the return address and actual arguments are popped; esp now points to main()'s stack top.

During an actual function call, formal parameters do not exist and occupy no memory; only the actual arguments exist in memory, pushed onto the stack by the caller before the function body executes.

The value of an uninitialized local variable is garbage, because allocating memory for local variables merely subtracts an integer from esp to reserve blank memory. Different compilers, in different build modes, treat this blank memory differently.

Popping a function's frame only increases the value of esp so that it points back to earlier data; it does not destroy the data above it. So strictly speaking, it is inaccurate to say that local variables are destroyed the moment the function finishes running.

Stack overflow attack: local arrays are also allocated on the stack, and C does not check for array overflow. If overflowing data overwrites the function's return address, that is a stack overflow error. If the overflow is carefully constructed so that the return address points to malicious code, it becomes a stack overflow attack.
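
A classic illustration of the bug (deliberately unsafe, for study only):

#include <string.h>

static void vulnerable(const char *input)
{
    char buf[16];
    /* strcpy performs no bounds check: if 'input' is longer than 15
       characters plus the terminator, it overwrites the saved ebp and the
       return address that sit just above 'buf' in the stack frame. */
    strcpy(buf, input);
}

int main(void)
{
    /* 32 'A's: far more than buf can hold, so the frame is corrupted. */
    vulnerable("AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA");
    return 0;
}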

dynamic memory allocation

In a process's address space, the code area, constant area, and global data area are allocated when the program starts; this is static memory allocation. Memory in the stack and heap areas can be allocated and released according to actual needs while the program runs; this is dynamic memory allocation.

Stack: The memory in the stack area is allocated and released by the system and is not controlled by the programmer.

Heap: heap memory is completely under the programmer's control; you allocate as much as you want and release it when you choose.

When the program starts, an appropriate amount of memory is allocated for the stack area. When a function call would exceed the allocated memory, the compiler inserts a dynamic stack allocation routine into the function's code, so that the extra memory is allocated when the function is called and not allocated when it is not.

The heap is the only memory region controlled by the programmer. The functions used to allocate and release heap memory are malloc(), calloc(), realloc(), and free().

Things to be aware of:

  1. Every allocation must be paired with a corresponding free(), and memory must not be used again after it has been released.
  2. When allocating memory, it is best not to specify the size with bare numbers; derive it from the type instead.
  3. free(p) cannot change the value of the pointer p; p still points to the freed memory. To prevent accidentally using this memory again, it is recommended to manually set p to NULL.
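
A minimal sketch that follows the three points above (size derived from the type rather than a bare number, one free() per allocation, pointer set to NULL afterwards):

#include <stdlib.h>

int main(void)
{
    int n = 10;
    int *p = malloc(n * sizeof *p);      /* size comes from the type, not a magic number */
    if (p == NULL)
        return 1;

    for (int i = 0; i < n; i++)
        p[i] = i;

    free(p);                             /* every allocation gets exactly one free()     */
    p = NULL;                            /* prevent accidental reuse of the freed memory */
    return 0;
}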

illegal memory operation

If the memory a pointer points to has no access rights, or the pointer points to memory that has already been released, the pointer must not be dereferenced. Such a pointer is called a wild pointer, and using one can lead to a segmentation fault.

If dynamically allocated memory is not released, it remains occupied by the program and is reclaimed by the operating system only when the program ends. This is a memory leak.
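
Small sketches of both mistakes (the unsafe line is left commented out):

#include <stdlib.h>

void bad_examples(void)
{
    /* Wild (dangling) pointer: q still holds the old address after free(). */
    int *q = malloc(sizeof *q);
    free(q);
    /* *q = 1;   <- dereferencing q here is undefined behavior, often a segfault */
    q = NULL;

    /* Memory leak: the only pointer to the block is lost, so the block can
       never be freed; the OS reclaims it only when the process exits. */
    int *leak = malloc(100 * sizeof *leak);
    leak = NULL;
}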

the storage type of variables

C has four keywords that indicate a variable's storage class: auto (automatic), static, register, and extern (external). These keywords control which storage area a variable lives in.
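
A small sketch using all four storage-class keywords (shared_counter is a hypothetical variable assumed to be defined in another source file):

#include <stdio.h>

extern int shared_counter;      /* declared here, defined in another translation unit */

void count_calls(void)
{
    auto int temp = 0;          /* 'auto' is the (implicit) default for local variables */
    register int i;             /* hint: keep 'i' in a CPU register if possible         */
    static int calls = 0;       /* stored in the global data area, survives the return  */

    for (i = 0; i < 3; i++)
        temp += i;

    calls++;
    printf("call %d, temp = %d\n", calls, temp);
}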

register

  • General registers: AX, BX, CX, DX

    • AX: accumulator register, BX: base register, CX: count register, DX: data register
  • Index register: SI, DI

    • SI: source index register, DI: destination index register
  • Stack and base registers: SP, BP

    • SP: stack pointer register, BP: base pointer register
  • EAX, ECX, EDX, EBX: extensions of ax, bx, cx, dx, each 32 bits

  • ESI, EDI, ESP, EBP: extension of si, di, sp, bp, 32 bits

    • eax, ebx, ecx, edx, esi, edi, ebp, and esp are the names of the x86 CPU's general-purpose registers in assembly language; they are 32-bit registers.
  • EAX is the "accumulator", which is the default register for many add and multiply instructions.

  • EBX is the "base address" (base) register, which stores the base address during memory addressing.

  • ECX is a counter (counter), which is the default counter for repeat (REP) prefix instructions and LOOP instructions.

  • EDX is always used to hold the remainder of integer division.

  • ESI/EDI are respectively called "source/destination index registers" (source/destination index), because in many string manipulation instructions, DS:ESI points to the source string, and ES:EDI points to the target string.

  • EBP is the "base pointer"; it stores the base of the current stack frame and is most often used as the "frame pointer" for high-level language function calls.

  • ESP is the dedicated stack pointer, often described as pointing to the top of the stack. The stack top lies at the lower addresses: the more data is pushed, the smaller ESP becomes. On 32-bit platforms, each push decreases ESP by 4 bytes.


Origin blog.csdn.net/qq_41323475/article/details/127856483