Linux Processes in Detail

Process

A program is an executable file stored on external storage (such as a hard disk); a process is a program in execution. Besides its text (code) section and data section, a process generally also includes its open files, pending signals, CPU context, and so on.

Process descriptor

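Linux describes each process with a struct task_struct (include/linux/sched.h). The excerpt below comes from an older kernel (it still refers to the goodness() scheduler loop), so the fields differ from those in current kernels: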

struct task_struct {
    /*
     * offsets of these are hardcoded elsewhere - touch with care
     */
    volatile long state;    /* -1 unrunnable, 0 runnable, >0 stopped */
    unsigned long flags;    /* per process flags, defined below */
    int sigpending;
    mm_segment_t addr_limit;    /* thread address space:
                                    0-0xBFFFFFFF for user-thread
                                    0-0xFFFFFFFF for kernel-thread
                                 */
    struct exec_domain *exec_domain;
    volatile long need_resched;
    unsigned long ptrace;

    int lock_depth;     /* Lock depth */

    /*
     * offset 32 begins here on 32-bit platforms. We keep
     * all fields in a single cacheline that are needed for
     * the goodness() loop in schedule().
     */
    long counter;
    long nice;
    unsigned long policy;
    struct mm_struct *mm;
    int processor;
    ...
};

Linux links all processes together in a doubly linked list.
[Figure 1: all processes linked in a doubly linked list]
The Linux kernel uses a trick to speed up access to the task_struct of the current process: the task_struct is placed at the lowest address of the 8 KB kernel stack area (the stack itself grows down from the top of that area), so the task_struct of the currently running process can be found quickly by masking the esp register.
[Figure 2: task_struct at the low end of the kernel stack area]
The code that fetches the task_struct of the currently running process is as follows:

static inline struct task_struct * get_current(void)
{
     struct task_struct *current;
     __asm__("andl %%esp,%0; ":"=r" (current) : "0" (~8191UL));
     return current;
}

Process status

The state field of the process descriptor holds the current state of the process. The possible states are:

TASK_RUNNING (running) - The process is runnable: it is either executing on a CPU or waiting to be executed (the CPU is occupied by another process).
TASK_INTERRUPTIBLE (interruptible wait) - The process is sleeping until some condition becomes true or a signal arrives; either event wakes it and makes it runnable again.
TASK_UNINTERRUPTIBLE (uninterruptible wait) - The process is sleeping until some condition becomes true, at which point it is woken and becomes runnable; a signal cannot wake it.
TASK_TRACED (traced) - The process is being traced, for example when a debugger controls it through the ptrace() system call.
TASK_STOPPED (stopped) - The process is stopped and cannot execute. A process generally enters TASK_STOPPED when it receives SIGSTOP, SIGTSTP, SIGTTIN, or SIGTTOU.

[Figure 3: transitions between the process states]

Process creation

On Linux, a process is created with the fork() system call. fork() creates a child process that is a copy of the parent. The two are told apart by the return value of fork(): in the parent it returns the process ID of the child, and in the child it returns 0.

Linux uses copy-on-write (COW) when creating a child process: the child initially shares the parent's memory pages, and a page is copied only when the child or the parent writes to it.

When fork() is called, execution traps into kernel space and the sys_fork() function runs. sys_fork() in turn calls do_fork(). The code is as follows (arch/i386/kernel/process.c):

asmlinkage int sys_fork(struct pt_regs regs)
{
     return do_fork(SIGCHLD, regs.esp, &regs, 0);
}

The main job of do_fork() is to allocate a process descriptor and then initialize its fields. This includes calling copy_files() to copy the open files, copy_fs() to copy the filesystem context, copy_sighand() to copy the signal handlers, copy_mm() to copy the process's virtual memory space, and copy_namespace() to copy the namespace. The code is as follows:

int do_fork(unsigned long clone_flags, unsigned long stack_start,
        struct pt_regs *regs, unsigned long stack_size)
{
     ...
     p = alloc_task_struct(); /* allocate a process descriptor */
     ...
     if (copy_files(clone_flags, p))
          goto bad_fork_cleanup;
     if (copy_fs(clone_flags, p))
          goto bad_fork_cleanup_files;
     if (copy_sighand(clone_flags, p))
          goto bad_fork_cleanup_fs;
     if (copy_mm(clone_flags, p))
          goto bad_fork_cleanup_sighand;
     if (copy_namespace(clone_flags, p))
          goto bad_fork_cleanup_mm;
     retval = copy_thread(0, clone_flags, stack_start, stack_size, p, regs);
     ...
     wake_up_process(p);
     ...
}

Note that do_fork() also calls copy_thread(), which mainly sets up the child's CPU execution context, held in its struct thread_struct. The code is as follows:

int copy_thread(int nr, unsigned long clone_flags, unsigned long esp,
    unsigned long unused,
    struct task_struct * p, struct pt_regs * regs)
{
     struct pt_regs * childregs;

     /* Point to the top of the child's kernel stack (see Figure 2) */
     childregs = ((struct pt_regs *) (THREAD_SIZE + (unsigned long) p)) - 1;
     struct_cpy(childregs, regs);  /* copy the parent's stack information (its saved registers) */
     childregs->eax = 0;   /* the child's return value from fork(), i.e. 0 */
     childregs->esp = esp; /* set the new stack pointer */

     p->thread.esp = (unsigned long) childregs;      /* child's current stack address; switch_to() loads esp with it */
     p->thread.esp0 = (unsigned long) (childregs+1); /* child's kernel-mode stack address */
     p->thread.eip = (unsigned long) ret_from_fork;  /* address of the code the child will execute first */

     savesegment(fs,p->thread.fs);
     savesegment(gs,p->thread.gs);

     unlazy_fpu(current);
     struct_cpy(&p->thread.i387, &current->thread.i387);

     return 0;
}

Finally, do_fork() calls wake_up_process() to wake the child process and put it into the running state.

Kernel thread

The Linux kernel has many housekeeping tasks to perform, such as periodically flushing buffered data to disk and reclaiming memory when it runs low. These tasks are carried out by kernel threads. The main difference between a kernel thread and an ordinary process is that a kernel thread has no virtual memory space of its own (no struct mm_struct). Whenever a kernel thread is scheduled, it runs on the virtual memory space of the process that was running before it. This works because a kernel thread runs only in kernel mode, and the kernel portion of the address space is the same in every process.

Kernel threads are created with the kernel_thread() function. The code is as follows:

int kernel_thread(int (*fn)(void *), void * arg, unsigned long flags)
{
    long retval, d0;

    __asm__ __volatile__(
        "movl %%esp,%%esi\n\t"
        "int $0x80\n\t"        /* Linux/i386 system call */
        "cmpl %%esp,%%esi\n\t"  /* child or parent? */
        "je 1f\n\t"     /* parent - jump */
        /* Load the argument into eax, and push it.  That way, it does
         * not matter whether the called function is compiled with
         * -mregparm or not.  */
        "movl %4,%%eax\n\t"
        "pushl %%eax\n\t"      
        "call *%5\n\t"      /* call fn */
        "movl %3,%0\n\t"    /* exit */
        "int $0x80\n"
        "1:\t"
    :"=&a" (retval), "=&S" (d0)
    :"0" (__NR_clone), "i" (__NR_exit),
    "r" (arg), "r" (fn),
    "b" (flags | CLONE_VM)
    : "memory");
    return retval;
}

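Because this function is implemented with inline assembly, it is somewhat hard to follow. The main flow is as follows. It creates a new process by invoking the clone system call (__NR_clone) through int $0x80, and by passing the CLONE_VM flag it tells the kernel that the new process shares an existing virtual memory space instead of getting a copy of its own. After the system call, parent and child are told apart by comparing esp with the value saved in esi beforehand: the parent's esp is unchanged, so it jumps to the end, while the child runs on a different stack, calls fn(arg), and then terminates through the exit system call (__NR_exit).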