[Linux] Process concepts: the program address space (2)

  1. Language-level addresses

When we learned pointers in C, we printed addresses and pictured them as locations in memory.
Consider a program in which a parent process and its child print the same global variable: the addresses match, yet the values differ.
A single physical location cannot hold two values at once, so language-level addresses must be virtual addresses, not real physical addresses.
If one printed address is virtual, they all are, and the space those addresses span is not real memory. It is called the virtual address space.
On Linux, such an address is called a virtual address or a linear address.
Every address we see in C/C++ is a virtual address! The physical address is invisible to the user and is managed by the OS.
The OS, working with the hardware's memory management unit, is responsible for translating virtual addresses into physical addresses.

  2. Understanding the process address space

While a process runs, it believes it has the system's resources all to itself (in fact it does not).
For example, on a 32-bit machine every process sees a full 4 GB range of addresses as its own, even though physical memory is shared among all processes.
So the process address space (virtual address space) is a "pie" that the operating system draws for each process: a promise of resources rather than the resources themselves.
The pies (process address spaces) drawn by the operating system also need to be managed.
The essence of operating system management is: describe first, then manage.
So the operating system kernel has a data structure, mm_struct, to manage the address space of each process.
mm_struct describes the attributes of a process address space.
In essence, the address space is a kernel data structure (mm_struct).

  3. mm_struct

Take a 32-bit machine as an example.
A virtual address is mapped through the page table to a physical address, and the data is then found at that location in memory.
All of this is done for us by the operating system.
Virtual addresses form a continuous range, which is why they are also called linear addresses.
Each process has its own virtual address space, which is mapped to real physical addresses through the page table.
A process never sees physical memory, only its own virtual memory. The operating system maps those virtual addresses to real physical addresses through the process's page table so that the process can access its data.

  4. Why have an address space and page tables

4.1. Indirectly protecting the operating system

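If processes accessed physical memory directly, what would happen on an out-of-bounds or illegal access?
Letting processes access physical memory directly is therefore unsafe: what if one of them modified data in system memory?
And if one of them read data from system memory, wouldn't that leak private information?
So the page table is not only used to map to physical addresses. It can also check whether an access is legal, and when we attempt an illegal memory access, the operating system intercepts it.
So the virtual address space and the page table exist to protect the operating system indirectly.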

4.2. Guaranteeing process independence

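Earlier we wrote a program in which a parent process created a child, and after a while the child modified a global variable. When we printed the variable's value and its address in both processes, a question came up: why are the printed addresses the same while the values differ?
Now we know: the printed address is a virtual address, and it is mapped to a physical address through the page table.
Each process has its own page table, so the same virtual address can map to different physical memory in the parent and in the child. That neatly explains why the printed addresses match while the values differ.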

Copy-on-write

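When a parent process creates a child, the child can be thought of as a copy of the parent. At first, the global variables of the child and the parent are mapped through their page tables to the same physical memory. When either process tries to write to that memory, the operating system finds a new block of physical memory, copies the data into it, updates the page table mapping, and only then lets the process perform the write. This technique is called copy-on-write. It guarantees process independence: the two processes do not affect each other.
So the operating system does a lot of work to guarantee process independence. Through the address space and the page table, different processes are mapped to different physical memory, which makes it easier to decouple the data and code of one process from another.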

4.3. Letting compilation and process execution view the virtual address space the same way

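When an executable program sits on disk, does it already contain addresses? Yes.
Don't assume that only the operating system follows the virtual address space. The compiler has to follow it too!
When the compiler compiles code, it assigns addresses to our code and data according to the virtual address space layout.
That is why we choose between 32-bit and 64-bit compilation when building code, which produces addresses of 4 bytes or 8 bytes. "32-bit" and "64-bit" here mean that code and data are addressed within a 32-bit or a 64-bit virtual address space.
Logical addresses can be implemented in other ways as well, so logical address != virtual address.
On Linux, logical addresses are implemented in nearly the same way as virtual addresses, so a logical address can sometimes be placed directly in the left-hand (virtual address) column of the page table and treated as a virtual address.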

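Summary:

Another role of the virtual address space:
It lets a process view the regions that hold its code, data, and so on from a uniform perspective, which makes them easier to use.
It lets the compiler compile code from the same uniform perspective,
because the rules are the same.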

Origin blog.csdn.net/zxf123567/article/details/129676057