Linux - process address space

Table of contents

1. Program address space

1.1 Research Background

1.2 Program Address Space

1.3 Space layout diagram code test

1.4 User Space and Kernel Space

1.5 Comparison between Linux and Windows

1.6 Analysis of virtual address and physical address under Linux

2. Process address space

2.1 Address space concept

2.2 Address space and page table mapping analysis

2.3 Copy-on-write and virtual address re-analysis

2.4 After fork, ret saves different return value analysis

3. Expansion and summary

3.1 Program Internal Address and Instruction Internal Address

3.2 Analysis of CPU Execution Instructions

3.3 Summary—Why Address Spaces?

3.4 Re-understanding suspension


1. Program address space

Text segment (Text): The program's machine code as mapped into memory; it holds the binary code of function bodies.

Initialized data (Data): Global and static variables that are given an explicit initial value in the source.

Uninitialized data (BSS): Global and static variables that have no explicit initial value when the program starts.

Stack (Stack): Holds local and temporary variables. When a function is called, the stack also stores its return address, which controls how the call returns. Stack memory is allocated automatically on entry to a block and released automatically on exit, and it behaves like the stack data structure (last in, first out).

Heap (Heap): Holds dynamically allocated memory, which the programmer must allocate and release explicitly. It is unrelated to the heap data structure; allocators typically track blocks in a way closer to a linked list.

1.1 Research Background

  • kernel 2.6.32
  • 32-bit platform

1.2 Program Address Space

Space layout diagram:

To make the layout concrete, the following code prints one address from each region.

1.3 Space layout diagram code test

Verification one:

#include <stdio.h>
#include <stdlib.h>

int un_g_val;
int g_val = 100;

int main(void) {
    printf("main: %p\n", (void *)main);          // address in the text (code) segment
    printf("init: %p\n", (void *)&g_val);        // address of initialized data
    printf("uninit: %p\n", (void *)&un_g_val);   // address of uninitialized data

    char *p1 = (char *)malloc(16);
    char *p2 = (char *)malloc(16);
    char *p3 = (char *)malloc(16);
    char *p4 = (char *)malloc(16);

    printf("heap1: %p\n", (void *)p1);           // heap addresses
    printf("heap2: %p\n", (void *)p2);
    printf("heap3: %p\n", (void *)p3);
    printf("heap4: %p\n", (void *)p4);

    printf("stack: %p\n", (void *)&p1);          // stack addresses (the pointers themselves)
    printf("stack: %p\n", (void *)&p2);
    printf("stack: %p\n", (void *)&p3);
    printf("stack: %p\n", (void *)&p4);
    return 0;
}

Output:

[customer@VM-4-10-centos 15Lesson]$ ./myproc 
main: 0x40057d
init: 0x60103c
uninit: 0x601044
heap1: 0x1d25010
heap2: 0x1d25030
heap3: 0x1d25050
heap4: 0x1d25070
stack: 0x7ffebfb2c638
stack: 0x7ffebfb2c630
stack: 0x7ffebfb2c628
stack: 0x7ffebfb2c620

Verification two:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int un_g_val;
int g_val = 100;

int main(int argc, char *argv[], char *env[]) {
    printf("code addr: %p\n", (void *)main);
    printf("init global addr: %p\n", (void *)&g_val);
    printf("uninit global addr: %p\n", (void *)&un_g_val);

    static int test = 10;   // static local: stored with initialized data, not on the stack
    char *heap_mem = (char *)malloc(10);
    printf("heap addr: %p\n\n", (void *)heap_mem);

    printf("test stack addr: %p\n", (void *)&test);
    printf("stack addr: %p\n", (void *)&heap_mem);

    // Command-line arguments and environment strings
    for (int i = 0; i < argc; ++i) {
        printf("argv[%d]: %p\n", i, (void *)argv[i]);
    }
    for (int i = 0; env[i]; ++i) {
        printf("env[%d]: %p\n", i, (void *)env[i]);
    }

    return 0;
}

Output:

[customer@VM-4-10-centos adress]$ make
gcc -o myproc myproc.c -std=c99
[customer@VM-4-10-centos adress]$ ./myproc 
code addr: 0x40057d
init global addr: 0x60103c
uninit global addr: 0x601048
heap addr: 0xcd0010

test stack addr: 0x601040
stack addr: 0x7ffcacd073f0
argv[0]: 0x7ffcacd0874a
env[0]: 0x7ffcacd08753
env[1]: 0x7ffcacd08769
env[2]: 0x7ffcacd08781
env[3]: 0x7ffcacd0878c
env[4]: 0x7ffcacd0879c
env[5]: 0x7ffcacd087aa
env[6]: 0x7ffcacd087cd
env[7]: 0x7ffcacd087e0
env[8]: 0x7ffcacd087ee
env[9]: 0x7ffcacd08836
.......................


The output shows that:

  • The text (code) segment sits at the lowest addresses. String literals and other constants are compiled into a read-only area alongside the code.
  • Initialized data lies above the code (initialized globals and initialized static variables). Note that the static local variable test, printed under the label "test stack addr", lands at 0x601040 right next to g_val, not on the stack.
  • Uninitialized data lies above initialized data (uninitialized globals and uninitialized static variables).
  • Next comes the heap, which is allocated from low addresses upward and grows toward higher addresses.
  • The huge gap between the heap and the stack contains, among other things, the shared region (to be studied later).
  • Last comes the stack, which is used from high addresses downward and grows toward lower addresses. The argv and env strings sit above the stack frames.

Are the addresses we see here physical addresses in memory? The answer is no.

When space is requested from the heap, the standard library does not reserve exactly the number of bytes asked for. It reserves some extra bytes to record bookkeeping information about the block (such as its size), sometimes called cookie data. This is why the four 16-byte allocations in Verification one are 0x20 bytes apart, and why free needs only the starting address: it finds the bookkeeping data relative to that address and learns how much space to release.

1.4 User Space and Kernel Space

In a 32-bit environment:

The Linux virtual address space ranges from 0 to 4 GB. The Linux kernel divides this 4 GB into two parts. The highest 1 GB (virtual addresses 0xC0000000 to 0xFFFFFFFF) is reserved for the kernel and is called "kernel space". The lower 3 GB (virtual addresses 0x00000000 to 0xBFFFFFFF) is used by each process and is called "user space". Because every process can enter the kernel through system calls, the Linux kernel is shared by all processes in the system. From the point of view of any single process, therefore, it has a 4 GB virtual address space.

Note: this is the division used by a 32-bit kernel; a 64-bit kernel divides the address space differently.

1.5 Comparison between Linux and Windows

Compiling and running the code above on Windows produces a different layout from the one on Linux.

The following is the output from Visual Studio 2022:

code addr: 002012E9
init global addr: 0020A000
uninit global addr: 0020A150
heap addr: 0062A000

test stack addr: 0020A004
stack addr: 0030FD00
argv[0]: 00629B20
env[0]: 0062ABE8
env[1]: 0062AC38
env[2]: 006225A8
env[3]: 00622600
env[4]: 00622670
env[5]: 006226D8
env[6]: 00622740
env[7]: 006227A0
env[8]: 006227F0
env[9]: 00622840

Therefore, the conclusions above should be assumed to hold only on Linux.

1.6 Analysis of virtual address and physical address under Linux

Code analysis:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int g_val = 100;

int main(void) {
    pid_t ret = fork();
    if (ret < 0) {
        perror("fork");
        exit(-1);
    } else if (ret == 0) {
        // Child: print g_val once per second and overwrite it after five rounds.
        int cnt = 0;
        while (1) {
            printf("I am child. pid = %d, ppid = %d, g_val = %d, &g_val = %p\n\n",
                   getpid(), getppid(), g_val, (void *)&g_val);
            sleep(1);
            ++cnt;
            if (cnt == 5) {
                g_val = 200;
                printf("child change g_val 100 -> 200 success\n");
            }
        }
    } else {
        // Parent: only reads g_val, never writes it.
        while (1) {
            printf("I am father. pid = %d, ppid = %d, g_val = %d, &g_val = %p\n\n",
                   getpid(), getppid(), g_val, (void *)&g_val);
            sleep(1);
        }
    }
    return 0;
}

Output:

[customer@VM-4-10-centos virtual]$ ./myproc 
I am father. pid = 20975, ppid = 32118, g_val = 100, &g_val = 0x60106c

I am child. pid = 20976, ppid = 20975, g_val = 100, &g_val = 0x60106c

I am father. pid = 20975, ppid = 32118, g_val = 100, &g_val = 0x60106c

I am child. pid = 20976, ppid = 20975, g_val = 100, &g_val = 0x60106c

I am father. pid = 20975, ppid = 32118, g_val = 100, &g_val = 0x60106c

I am child. pid = 20976, ppid = 20975, g_val = 100, &g_val = 0x60106c

I am father. pid = 20975, ppid = 32118, g_val = 100, &g_val = 0x60106c

I am child. pid = 20976, ppid = 20975, g_val = 100, &g_val = 0x60106c

I am father. pid = 20975, ppid = 32118, g_val = 100, &g_val = 0x60106c

I am child. pid = 20976, ppid = 20975, g_val = 100, &g_val = 0x60106c

I am father. pid = 20975, ppid = 32118, g_val = 100, &g_val = 0x60106c

child change g_val 100 -> 200 success
I am child. pid = 20976, ppid = 20975, g_val = 200, &g_val = 0x60106c

I am father. pid = 20975, ppid = 32118, g_val = 100, &g_val = 0x60106c

I am child. pid = 20976, ppid = 20975, g_val = 200, &g_val = 0x60106c

I am father. pid = 20975, ppid = 32118, g_val = 100, &g_val = 0x60106c

I am child. pid = 20976, ppid = 20975, g_val = 200, &g_val = 0x60106c

Result analysis:

1. Processes are independent: when the child modifies data, even a global variable, the parent is not affected.

2. The value changes in the child, yet both processes print the same address, and neither affects the other.

The parent and child print the same address but different contents. The following conclusions can be drawn:

  • Since the contents differ, the parent and child are not reading the same variable in memory.
  • Since the address is the same, that address cannot be a physical address.
  • On Linux, such an address is called a virtual address.
  • The addresses seen in C/C++ and almost every other language are virtual addresses. User programs never see physical addresses; those are managed by the OS.

The OS is responsible for the mapping that converts virtual addresses (also called linear addresses) into physical addresses.

2. Process address space

2.1 Address space concept

The addresses printed above were printed after the program had been loaded into memory and become a process. Calling this "the address space of the program" was therefore inaccurate; the correct term is the process address space.

An address space is the range of addresses available to a computing entity, such as a peripheral, a file, a server, or a networked computer. Address spaces may be physical or virtual.

Inside the kernel, an address space is ultimately a data structure, and it is tied to a specific process. Physical memory is reached only by mapping a virtual address first; the mapping is refused for illegal requests, which protects physical memory.

The address space is a kernel data structure, mm_struct, which records at least the boundaries of each region. The process control block (PCB) holds a pointer to this address space structure, as shown in the figure.

2.2 Address space and page table mapping analysis

Each process has its own address space and its own (user-level) page table.

As long as each process's page table maps to a different region of physical memory, processes cannot interfere with one another, which guarantees process independence. Even if two processes use identical virtual addresses, different page tables send them to different physical memory.

2.3 Copy-on-write and virtual address re-analysis

The code and output from section 1.6 (virtual versus physical addresses under Linux) can now be explained:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int g_val = 100;

int main(void) {
    pid_t ret = fork();
    if (ret < 0) {
        perror("fork");
        exit(-1);
    } else if (ret == 0) {
        // Child: print g_val once per second and overwrite it after five rounds.
        int cnt = 0;
        while (1) {
            printf("I am child. pid = %d, ppid = %d, g_val = %d, &g_val = %p\n\n",
                   getpid(), getppid(), g_val, (void *)&g_val);
            sleep(1);
            ++cnt;
            if (cnt == 5) {
                g_val = 200;
                printf("child change g_val 100 -> 200 success\n");
            }
        }
    } else {
        // Parent: only reads g_val, never writes it.
        while (1) {
            printf("I am father. pid = %d, ppid = %d, g_val = %d, &g_val = %p\n\n",
                   getpid(), getppid(), g_val, (void *)&g_val);
            sleep(1);
        }
    }
    return 0;
}

As shown in the figure below, when the fork system call creates a new process, the kernel adds a new process control block, and both parent and child go on to execute the code after fork. At first the child's address space and page table have the same contents as the parent's, so both map to the same physical memory and both refer to the same g_val; that is why the address of g_val is identical in the two processes. When the child tries to modify g_val, the operating system, in order to preserve process independence, detects that the child has located g_val through the page table and wants to write to it. It then allocates new space, copies the old value into it, and updates the child's mapping. From then on the two processes use different physical addresses and are independent of each other.

Parent and child use the same virtual addresses when the child is created. On the first write, the operating system intervenes, allocates new space, and copies the data. The page tables now map the same virtual address to different physical addresses, so the write changes only one process's physical copy while the virtual address stays the same. This strategy is called copy-on-write.

Copy-on-write (COW for short) is an optimization strategy in computer programming. The core idea is that if multiple callers request the same resource (such as memory or data on disk), they all receive a pointer to the same resource. Only when one caller tries to modify the resource does the system make a private copy for that caller, while the resource seen by the other callers stays unchanged. The process is transparent to the other callers. The main advantage is that if no caller modifies the resource, no private copy is ever made, so callers that only read can share a single copy.

2.4 After fork, ret saves different return value analysis

Code:

#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

int main(void) {
    pid_t ret = fork();   // returns twice: 0 in the child, the child's pid in the parent
    if (ret < 0) {
        perror("fork");
        exit(-1);
    } else if (ret == 0) {
        while (1) {
            printf("I am child. pid = %d, ppid = %d\n", getpid(), getppid());
            sleep(1);
        }
    } else {
        while (1) {
            printf("I am father. pid = %d, ppid = %d\n", getpid(), getppid());
            sleep(1);
        }
    }
    return 0;
}

Analysis:

Before fork returns, it has already created the new process control block, so the return happens twice, once in each process. Returning essentially means writing to the variable ret in the parent and in the child, and that write triggers copy-on-write. Parent and child therefore each get their own physical copy of ret. The virtual address is the same, but the values stored at the corresponding physical addresses differ, while at user level both are identified by the same variable (virtual address).

3. Expansion and summary

3.1 Program Internal Address and Instruction Internal Address

Once a program has been compiled into an executable binary file but has not yet been loaded into memory, does it already contain addresses?

[customer@VM-4-10-centos adress]$ objdump myproc -afh

myproc:     file format elf64-x86-64
myproc
architecture: i386:x86-64, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x0000000000400490

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .interp       0000001c  0000000000400238  0000000000400238  00000238  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  1 .note.ABI-tag 00000020  0000000000400254  0000000000400254  00000254  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  2 .note.gnu.build-id 00000024  0000000000400274  0000000000400274  00000274  2**2
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  3 .gnu.hash     0000001c  0000000000400298  0000000000400298  00000298  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  4 .dynsym       00000078  00000000004002b8  00000000004002b8  000002b8  2**3
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  5 .dynstr       00000046  0000000000400330  0000000000400330  00000330  2**0
                  CONTENTS, ALLOC, LOAD, READONLY, DATA
  6 .gnu.version  0000000a  0000000000400376  0000000000400376  00000376  2**1
                  CONTENTS, ALLOC, LOAD, READONLY, DATA

The output of this command includes a VMA (Virtual Memory Address) column:

In fact, addresses already exist. The program has to link against static and dynamic libraries, and library functions are called through their addresses, so the executable contains addresses from the moment it is built.

Do not think of the address space as something only the OS must respect; the compiler must respect it too. When compiling, the compiler already lays the program out into regions (code area, data area, and so on) and assigns an address to every variable and every line of code using the same addressing scheme as the Linux kernel. So once the program is compiled, every section already has a virtual address.

Inside the program, the addresses are still the virtual addresses assigned at build time. Once the program is loaded into memory, each line of code and each variable additionally has a physical address (an external address).

3.2 Analysis of CPU Execution Instructions

How the CPU executes instructions, step by step:

  1. As shown above, when code is compiled into an executable, every line of code and every variable is placed into regions and addressed with the same scheme the Linux kernel uses; these addresses are virtual memory addresses.
  2. When the executable is run, a PCB is created. The process control block refers to the address space data structure, and mm_struct is filled in from the VMA layout recorded at build time.
  3. With the address space in place, the operating system uses the page table to map virtual addresses to physical addresses, and the CPU can then locate the instruction to execute through its physical address.
  4. The instruction the CPU fetches from physical memory itself contains addresses. For example, a jump or function call inside the code encodes its target using the build-time addressing scheme, that is, a virtual address (VMA). Loading the program into memory does not change the contents of instructions, so the addresses the CPU reads inside an instruction are virtual. After executing the current instruction, the CPU jumps to that virtual address in the process address space, the page table maps it to the physical address of the next piece of code, and the cycle repeats.

3.3 Summary—Why Address Spaces?

The first reason:

  • Any illegal access or mapping is detected by the operating system, which terminates the offending process.
  • Whenever a process crashes, the process exits. Besides the mapping itself, page table entries carry permissions such as read and write, and a violation of them is what leads the operating system to kill the process.
  • Because the address space intercepts every access to physical memory, it effectively protects physical memory.
  • Because the OS creates and maintains the address space and the page table, anyone who wants to use them for mapping can only access memory under the supervision of the OS.
  • This also protects all legitimate data in physical memory, including the data of every process and of the kernel.

The second reason:

  • Because of the address space and the page table mapping, data can be loaded at any location in physical memory.
  • As a result, physical memory allocation and process management can be independent of each other.
  • Physical memory allocation belongs to the memory management module of the Linux kernel, and process management belongs to the process management module; the two are decoupled.
  • So when we request space with malloc in C or new in C++, what we are really requesting is virtual address space.
  • If physical memory were handed out at request time and not used immediately, it would be wasted. Instead, a request from the upper layer is satisfied in the address space only, and physical memory may not provide a single byte at that point. Only when the process actually accesses the memory does the OS run the relevant management algorithm, allocate physical memory, and build the page table mapping, after which the access proceeds. The OS does all of this automatically; the user, and even the process, is completely unaware of it. The mechanism involved is the page fault, and this delayed allocation strategy improves the efficiency of the whole machine.

The third reason:

  • Because data can in principle be loaded anywhere in physical memory, almost all code and data end up scattered out of order, which on its own would make access harder to organize. Yet because the page table maps the virtual addresses of the address space to physical addresses, the memory layout looks ordered from the point of view of the process.
  • Address space plus page table thus presents physical memory to the process as an ordered layout.
  • The address space is the "big cake" that the OS draws for the process, a promise rather than real memory. Combined with the above, the code and data a process wants to access need not be in physical memory yet. Likewise, different processes can be mapped to different physical memory, which makes process independence easy to achieve.
  • In short, process independence is achieved through address space plus page table.

Summary:

Because of the address space, every process believes it owns 4 GB of space (in a 32-bit environment) with every region laid out in order, while the page table maps each process to different regions of physical memory, which makes processes independent. (Each process believes it has the memory to itself and is unaware of other processes.)

3.4 Re-understanding suspension

Loading a program essentially means creating a process. Must all of the program's code and data be loaded into memory immediately, with the kernel data structures and mappings set up at once?

The answer is no.

In the most extreme case, only the kernel data structures are created. A process in this situation is in the new state. In theory, programs can be loaded in batches; a large game, for example, is loaded into memory piece by piece.

Loading is essentially swapping in. Since a program can be loaded in batches, code that has been executed and is no longer needed can also be swapped out in batches.

If a process will not run again for a while, for example because it is blocked, its code and data can be swapped out to disk. This state is called suspended.

The page global directory is stored in the pgd field of the memory descriptor mm_struct, which is pointed to by the mm field of the process descriptor task_struct. Each process therefore has its own page table, reachable through its task_struct.

A page table entry can refer not only to a location in memory but also to a location on disk.

Origin blog.csdn.net/IfYouHave/article/details/131057977