Process address space

In addition to managing its own memory, the kernel also needs to manage user space memory.

The memory of a process in the user space is called the process address space, which is the memory seen by each user space process in the system.

Linux is a virtual memory operating system, so memory is virtualized among all the processes in the system.

To an individual process, it appears as though it alone has access to all of the system's physical memory.

The process address space is the set of virtual addresses that a process can address, forming a flat 32-bit or 64-bit address space.

"Flat" means that the address space is a single contiguous range.

Modern operating systems that use virtual memory generally use a flat address space rather than a segmented memory model.

Although a process can address up to 4 GB of virtual memory (with a 32-bit address space), it does not have permission to access every virtual address.

An interval of legal addresses that the process may access is called a memory area.

Each memory area carries permissions for the associated process, such as readable, writable, and executable.

If the process accesses an address that is not in a valid memory area, or accesses a valid memory area in a way its permissions do not allow, the kernel kills the process with a segmentation fault.
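The check described above can be sketched as a small user-space model. This is only an illustration, not kernel code: the toy_area type, the AREA_* bits, and toy_access_ok() are hypothetical names made up for this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical permission bits for this sketch (not the kernel's VM_* flags). */
enum { AREA_READ = 1, AREA_WRITE = 2, AREA_EXEC = 4 };

/* A toy memory area covering [start, end) with one permission set. */
struct toy_area {
    uintptr_t start;
    uintptr_t end;
    unsigned perms;
};

/*
 * Return 1 if an access of kind `want` at `addr` is legal, 0 if it would
 * fault: either no area contains addr, or the area lacks a permission.
 */
int toy_access_ok(const struct toy_area *areas, size_t n,
                  uintptr_t addr, unsigned want)
{
    for (size_t i = 0; i < n; i++) {
        if (addr >= areas[i].start && addr < areas[i].end)
            return (areas[i].perms & want) == want;
    }
    return 0; /* address lies outside every memory area */
}
```

With a read/execute area at [0x1000, 0x2000), a read at 0x1800 passes the check, while a write to the same address, or any access at 0x2800, is rejected. In the real kernel the rejected cases are the ones that end in a segmentation fault.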

Memory areas can contain various memory objects:

Code segment (text section);
Data segment;
BSS segment;
User space stack;
The code segment, data segment, and BSS segment of each shared library;
Memory-mapped files;
Shared memory segments;
Anonymous memory mappings (for example, memory allocated by malloc()).
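On Linux, the memory areas of a running process can be inspected through /proc/<pid>/maps, which the notes above do not mention. The following is a minimal sketch of parsing one line of that file, assuming the usual column layout (address range, permissions, offset, device, inode, optional pathname). The maps_entry type and parse_maps_line() are hypothetical helpers written for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of /proc/<pid>/maps (only the fields used here). */
struct maps_entry {
    unsigned long start;  /* first address of the area */
    unsigned long end;    /* first address past the area */
    char perms[5];        /* for example "r-xp" */
    char path[256];       /* backing file or label; "" if anonymous */
};

/* Parse one line; return 0 on success, -1 if the line is malformed. */
int parse_maps_line(const char *line, struct maps_entry *out)
{
    unsigned long offset, inode;
    unsigned int dev_major, dev_minor;
    int consumed = 0;

    out->path[0] = '\0';
    if (sscanf(line, "%lx-%lx %4s %lx %x:%x %lu %n",
               &out->start, &out->end, out->perms,
               &offset, &dev_major, &dev_minor, &inode, &consumed) < 7)
        return -1;

    /* Whatever follows the inode column is the optional pathname. */
    if (consumed > 0) {
        strncpy(out->path, line + consumed, sizeof(out->path) - 1);
        out->path[sizeof(out->path) - 1] = '\0';
        out->path[strcspn(out->path, "\n")] = '\0';
    }
    return 0;
}
```

A line with permissions such as "r-xp" and a file path typically corresponds to a code segment, while a line with an empty path is an anonymous mapping.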

Memory descriptor

The memory descriptor, struct mm_struct, represents the address space of a process and is defined in include/linux/mm_types.h.

The mm member in the task_struct structure represents the memory descriptor used by the process.

The mm_struct structure of the process is allocated from the mm_cachep slab cache using the allocate_mm() macro:

/*
 * Allocate and initialize an mm_struct.
 */
struct mm_struct * mm_alloc(void)
{
    struct mm_struct * mm;
    mm = allocate_mm();
    if (mm) {
        memset(mm, 0, sizeof(*mm));
        mm = mm_init(mm, current);
    }
    return mm;
}

During fork(), copy_mm() copies the memory descriptor of the parent process to the child:

static int copy_mm(unsigned long clone_flags, struct task_struct * tsk)

Threads share the memory descriptor of their parent (CLONE_VM is set), so no new memory descriptor is allocated:

    if (clone_flags & CLONE_VM) {
        atomic_inc(&oldmm->mm_users);
        mm = oldmm;
        goto good_mm;
    }

When a process exits, exit_mm() releases its use of the memory descriptor:

static void exit_mm(struct task_struct * tsk);
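The sharing and copying behavior above can be modeled in user space. This is a simplified sketch, not the kernel implementation: toy_mm, toy_copy_mm(), toy_mmput(), and TOY_CLONE_VM are hypothetical stand-ins, and the real kernel uses atomic counters and copies far more state.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_CLONE_VM 0x100  /* stand-in for the kernel's CLONE_VM flag */

struct toy_mm {
    int mm_users;       /* number of tasks sharing this address space */
    unsigned long brk;  /* sample field, so that a copy is observable */
};

/*
 * Return the address space for a new task: the parent's own descriptor
 * when TOY_CLONE_VM is set (a thread), or a private copy otherwise
 * (a forked process). Returns NULL if allocation fails.
 */
struct toy_mm *toy_copy_mm(unsigned long clone_flags, struct toy_mm *oldmm)
{
    if (clone_flags & TOY_CLONE_VM) {
        oldmm->mm_users++;  /* one more user of the same descriptor */
        return oldmm;
    }
    struct toy_mm *mm = malloc(sizeof(*mm));
    if (!mm)
        return NULL;
    *mm = *oldmm;           /* duplicate the parent's contents */
    mm->mm_users = 1;       /* the copy starts with a single user */
    return mm;
}

/* Drop one user; free the descriptor when the last user is gone. */
void toy_mmput(struct toy_mm *mm)
{
    if (--mm->mm_users == 0)
        free(mm);
}
```

In this model a thread only raises the user count of the shared descriptor, while a forked child gets its own descriptor with a count of one. The descriptor is freed only when the last user drops it.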

The mm member in the process descriptor of a kernel thread is NULL. A kernel thread has no user space address space, so allocating a memory descriptor for it would be wasteful.

A kernel thread still needs some of the data in a memory descriptor, such as the page tables, so it borrows the memory descriptor of whichever process ran before it (tracked through the active_mm member of task_struct).
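The borrowing rule can be sketched as follows. The mm and active_mm field names mirror the members of task_struct, but toy_task, toy_addr_space, and toy_switch_mm() are hypothetical and leave out everything else a real context switch does.

```c
#include <assert.h>
#include <stddef.h>

struct toy_addr_space { int id; };

struct toy_task {
    struct toy_addr_space *mm;         /* NULL for a kernel thread */
    struct toy_addr_space *active_mm;  /* address space currently in use */
};

/*
 * Model the address-space part of a context switch from prev to next.
 * A kernel thread (mm == NULL) borrows whatever prev was using.
 */
void toy_switch_mm(const struct toy_task *prev, struct toy_task *next)
{
    if (next->mm == NULL)
        next->active_mm = prev->active_mm;  /* borrow, nothing to load */
    else
        next->active_mm = next->mm;         /* use its own address space */
}
```

When a kernel thread is scheduled after a user process, it keeps running on that process's page tables. This is safe in this model because the kernel thread never touches user space addresses.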

Virtual memory area

The vm_area_struct structure describes a memory area and is defined in include/linux/mm_types.h.

In the Linux kernel, memory areas are often called virtual memory areas (VMAs).

Each vm_area_struct describes a single memory area over a contiguous interval in a given address space.

All addresses within one memory area share the same properties, such as permissions and the set of associated operations.

Different memory areas in the same address space cannot overlap.

The vm_area_struct structure contains the vm_flags member, whose VMA flags describe the behavior of, and provide information about, the pages contained in the memory area. They specify behavior that the kernel is responsible for enforcing, not the hardware.

Common VMA flags include VM_READ, VM_WRITE, VM_EXEC, and VM_SHARED.

The vm_area_struct structure also contains the vm_ops member, which points to the table of operations associated with the memory area:

struct vm_operations_struct {
    void (*open)(struct vm_area_struct * area);
    void (*close)(struct vm_area_struct * area);
    int (*fault)(struct vm_area_struct *vma, struct vm_fault *vmf);
    /* notification that a previously read-only page is about to become
     * writable, if an error is returned it will cause a SIGBUS */
    int (*page_mkwrite)(struct vm_area_struct *vma, struct vm_fault *vmf);
    /* called by access_process_vm when get_user_pages() fails, typically
     * for use by special VMAs that can switch between memory and hardware
     */
    int (*access)(struct vm_area_struct *vma, unsigned long addr,
              void *buf, int len, int write);
#ifdef CONFIG_NUMA
    /*
     * set_policy() op must add a reference to any non-NULL @new mempolicy
     * to hold the policy upon return.  Caller should pass NULL @new to
     * remove a policy and fall back to surrounding context--i.e. do not
     * install a MPOL_DEFAULT policy, nor the task or system default
     * mempolicy.
     */
    int (*set_policy)(struct vm_area_struct *vma, struct mempolicy *new);
    /*
     * get_policy() op must add reference [mpol_get()] to any policy at
     * (vma,addr) marked as MPOL_SHARED.  The shared policy infrastructure
     * in mm/mempolicy.c will do this automatically.
     * get_policy() must NOT add a ref if the policy at (vma,addr) is not
     * marked as MPOL_SHARED. vma policies are protected by the mmap_sem.
     * If no [shared/vma] mempolicy exists at the addr, get_policy() op
     * must return NULL--i.e., do not "fallback" to task or system default
     * policy.
     */
    struct mempolicy *(*get_policy)(struct vm_area_struct *vma,
                    unsigned long addr);
    int (*migrate)(struct vm_area_struct *vma, const nodemask_t *from,
        const nodemask_t *to, unsigned long flags);
#endif
};

The memory descriptor has two members, mmap and mm_rb, for reaching its memory areas. Both refer to the same set of vm_area_struct objects: mmap links them in a list, which is convenient for traversing every area, while mm_rb organizes them in a red-black tree, which is fast for finding the area that contains a given address.

Functions for looking up memory areas:

/* Look up the first VMA which satisfies  addr < vm_end,  NULL if none. */
extern struct vm_area_struct * find_vma(struct mm_struct * mm, unsigned long addr);
extern struct vm_area_struct * find_vma_prev(struct mm_struct * mm, unsigned long addr,
                         struct vm_area_struct **pprev);
static inline struct vm_area_struct * find_vma_intersection(struct mm_struct * mm, unsigned long start_addr, unsigned long end_addr)
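The lookup semantics of these functions can be illustrated with a user-space model. This sketch searches a sorted array rather than the kernel's red-black tree, and toy_vma, toy_find_vma(), and toy_find_vma_intersection() are hypothetical names. Only the rule "first area with addr < vm_end" is taken from the comment above.

```c
#include <assert.h>
#include <stddef.h>

struct toy_vma {
    unsigned long vm_start;  /* first address inside the area */
    unsigned long vm_end;    /* first address after the area */
};

/*
 * Binary search over areas that are sorted by address and do not overlap.
 * Returns the first area with addr < vm_end, or NULL if there is none.
 * Note that addr may lie below the returned area's vm_start.
 */
const struct toy_vma *toy_find_vma(const struct toy_vma *areas, size_t n,
                                   unsigned long addr)
{
    size_t lo = 0, hi = n;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (addr < areas[mid].vm_end)
            hi = mid;        /* candidate; look for an earlier one */
        else
            lo = mid + 1;
    }
    return lo < n ? &areas[lo] : NULL;
}

/* Return the first area that overlaps [start, end), or NULL if none. */
const struct toy_vma *toy_find_vma_intersection(const struct toy_vma *areas,
                                                size_t n,
                                                unsigned long start,
                                                unsigned long end)
{
    const struct toy_vma *vma = toy_find_vma(areas, n, start);
    if (vma && end <= vma->vm_start)
        return NULL;         /* the range ends before the area begins */
    return vma;
}
```

A caller that needs to know whether an address is actually mapped must therefore also check that the address is at or above vm_start of the returned area.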

Create address range

The kernel uses do_mmap() to create a new linear address range and adds the address range to the address space of the process:

static inline unsigned long do_mmap(struct file *file, unsigned long addr,
    unsigned long len, unsigned long prot,
    unsigned long flag, unsigned long offset)

User space reaches the functionality of do_mmap() through the mmap() system call.

Its counterpart, do_munmap(), removes an address range from the specified process address space:

extern int do_munmap(struct mm_struct *, unsigned long, size_t);
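From user space, the same pair of operations is available as mmap() and munmap(). The following is a minimal sketch that creates a private anonymous mapping, uses it, and removes it again. The helper name map_write_unmap() is made up for this example, and the code assumes a Linux or similar POSIX system that provides MAP_ANONYMOUS.

```c
#define _DEFAULT_SOURCE  /* expose MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of private anonymous memory, store `msg` in it,
 * read back its length, then unmap. Returns the length read back,
 * or -1 if mmap() or munmap() fails.
 */
long map_write_unmap(size_t len, const char *msg)
{
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    strncpy(p, msg, len - 1);  /* the fresh mapping is zero-filled */
    long n = (long)strlen(p);

    if (munmap(p, len) != 0)   /* remove the range from the address space */
        return -1;
    return n;
}
```

Between the two calls, the new range appears as an additional memory area in the address space of the calling process.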

Page table

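The pgd member in the memory descriptor points to the page global directory (PGD), the top level of the page tables of the process.

Linux, as the book describes it, uses three levels of page tables to complete address translation: the page global directory (PGD), the page middle directory (PMD), and the page table, whose entries (PTEs) point to physical pages. The kernel can use more levels on architectures that need them.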
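To make the three-level lookup concrete, the sketch below splits a virtual address into the index used at each level plus the offset within the page. The bit widths are assumptions chosen for illustration (4 KiB pages and 9 index bits per level); real layouts depend on the architecture and kernel configuration. All names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for this sketch: 4 KiB pages, 9 index bits per level. */
#define TOY_PAGE_SHIFT 12
#define TOY_PTE_SHIFT  TOY_PAGE_SHIFT
#define TOY_PMD_SHIFT  (TOY_PTE_SHIFT + 9)
#define TOY_PGD_SHIFT  (TOY_PMD_SHIFT + 9)
#define TOY_INDEX_MASK 0x1ffu

struct toy_walk {
    unsigned pgd_index;  /* slot in the page global directory */
    unsigned pmd_index;  /* slot in the page middle directory */
    unsigned pte_index;  /* slot in the page table */
    unsigned offset;     /* byte offset within the physical page */
};

/* Split a virtual address into the pieces used by each lookup step. */
struct toy_walk toy_split_address(uint64_t vaddr)
{
    struct toy_walk w;
    w.pgd_index = (unsigned)((vaddr >> TOY_PGD_SHIFT) & TOY_INDEX_MASK);
    w.pmd_index = (unsigned)((vaddr >> TOY_PMD_SHIFT) & TOY_INDEX_MASK);
    w.pte_index = (unsigned)((vaddr >> TOY_PTE_SHIFT) & TOY_INDEX_MASK);
    w.offset    = (unsigned)(vaddr & ((1u << TOY_PAGE_SHIFT) - 1));
    return w;
}
```

Each index selects one entry in the table found at the previous level, and the final entry supplies the physical page to which the offset is added.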

"Linux Kernel Design and Implementation" Reading Notes-Process Address Space
