System call Learning

A system call is essentially a function call, except that the function being called is a kernel function that runs in kernel mode.

Differences between an API and a system call:

  • Each system call corresponds to one service routine, whereas a single API function may correspond to several system calls.
  • Some user-mode APIs provide their service directly and need no system call at all.

Why not call kernel functions directly? Could the system call step be skipped?

A user-space program cannot execute kernel code directly. The kernel lives in a protected address space that user processes are not allowed to read or write, which greatly improves system security. Going through a system call interface has further benefits: the kernel can check each request for correctness before acting on it; programming becomes easier, because the kernel offers what amounts to a fixed set of interfaces; and the extra layer makes programs more portable.

In practice, a system call works as follows. The application notifies the kernel that it needs a system call by raising a software interrupt: the resulting exception makes the CPU switch to kernel mode and run an exception handler, which is the system call handler. (In early Linux, a system call had to be issued with the int $0x80 assembly instruction, which raises the exception with vector 128; exceptions are interrupts generated inside the CPU.)

The process passes a system call number to the kernel to identify which system call it wants. The system call handler uses that number to locate the matching system call service routine, which does the actual work. The handler itself only performs the transition from user mode to kernel mode and calls the service routine; when the routine returns, the CPU switches from kernel mode back to user mode, which completes the system call.

The overall flow of a system call:

  • The user passes the system call number to the kernel (in the EAX register).

    • Linux has hundreds of system calls and defines a unique number for each one, called the system call number. The numbers are defined in linux/arch/x86/include/asm/unistd_.h:

      #define __NR_io_setup 0
      __SC_COMP(__NR_io_setup, sys_io_setup, compat_sys_io_setup)
      #define __NR_io_destroy 1
      __SYSCALL(__NR_io_destroy, sys_io_destroy)
      #define __NR_io_submit 2
      __SC_COMP(__NR_io_submit, sys_io_submit, compat_sys_io_submit)
      #define __NR_io_cancel 3
      __SYSCALL(__NR_io_cancel, sys_io_cancel)
      ...
      ...
      #define __NR_pwritev2 287
      __SC_COMP(__NR_pwritev2, sys_pwritev2, compat_sys_pwritev2)
      #define __NR_pkey_mprotect 288
      __SYSCALL(__NR_pkey_mprotect, sys_pkey_mprotect)
      #define __NR_pkey_alloc 289
      __SYSCALL(__NR_pkey_alloc, sys_pkey_alloc)
      #define __NR_pkey_free 290
      __SYSCALL(__NR_pkey_free, sys_pkey_free)
      #define __NR_statx 291
      __SYSCALL(__NR_statx, sys_statx)

      The highest number in this excerpt is 291, so it covers 292 system call numbers (0 through 291).

  • The kernel's system call handler uses the system call number as an index into a table to find the matching system call service routine. (If the wrapper routine in user mode is bar(), the kernel has a corresponding sys_bar(), and sys_bar() is the system call service routine.)

    • To associate system call numbers with service routines, the kernel uses the system call table.

  • Return: handled by the syscall_exit_work() function (written in assembly language). Every system call returns an integer, and the value most wrapper routines return depends on the integer returned by the corresponding system call. (A negative value indicates an error condition.)

System call handling

Initialization

For the int $0x80 assembly instruction to work, the kernel calls trap_init() during initialization to set up the IDT (Interrupt Descriptor Table) entry for vector 128.

This can be seen in the trap_init() function in arch/x86/kernel/traps.c:

#ifdef CONFIG_X86_32
	set_system_trap_gate(SYSCALL_VECTOR, &system_call);
	set_bit(SYSCALL_VECTOR, used_vectors);
#endif

Here SYSCALL_VECTOR has the value 0x80. The call fills in the gate descriptor for that vector: its segment selector, offset, type, and DPL fields. The DPL is set to 3 so that user-mode code is allowed to raise the exception.

Entering the handler

The system_call() function implements the system call handler. It saves on the stack the system call number and all the CPU registers that the exception handler may use, then checks whether the system call number is valid. If the number is greater than or equal to NR_syscalls (written nr_syscalls in the listing below), it is invalid: the handler jumps to syscall_badsys and a negative error code is returned.

If the number is valid, the handler calls the service routine that corresponds to the system call number passed in EAX.

ENTRY(system_call)
 RING0_INT_FRAME          
 pushl_cfi %eax                      
 SAVE_ALL
 GET_THREAD_INFO(%ebp)
 testl $_TIF_WORK_SYSCALL_ENTRY,TI_flags(%ebp)
 jnz syscall_trace_entry
 cmpl $(nr_syscalls), %eax
 jae syscall_badsys
 syscall_call:
 call *sys_call_table(,%eax,4)
 movl %eax,PT_EAX(%esp)      
 syscall_exit:
 LOCKDEP_SYS_EXIT
 DISABLE_INTERRUPTS(CLBR_ANY)    
 TRACE_IRQS_OFF
 movl TI_flags(%ebp), %ecx
 testl $_TIF_ALLWORK_MASK, %ecx  # current->work
 jne syscall_exit_work
 restore_all:
 TRACE_IRQS_IRET
 restore_all_notrace:
 movl PT_EFLAGS(%esp), %eax  # mix EFLAGS, SS and CS
 movb PT_OLDSS(%esp), %ah
 movb PT_CS(%esp), %al
 andl $(X86_EFLAGS_VM | (SEGMENT_TI_MASK << 8) | SEGMENT_RPL_MASK), %eax
 cmpl $((SEGMENT_LDT << 8) | USER_RPL), %eax
 CFI_REMEMBER_STATE
 je ldt_ss          
 restore_nocheck:
 RESTORE_REGS 4          
 irq_return:
 INTERRUPT_RETURN

Another way to enter a system call is to execute the sysenter assembly instruction, which provides a faster switch from user mode to kernel mode (this is known as a fast system call).

  • The wrapper routine loads the system call number into EAX and calls __kernel_vsyscall(). That function saves on the user-mode stack the registers that the system call handler will use, copies the user stack pointer into ebp, and executes the sysenter instruction.
  • The CPU switches to kernel mode and the kernel runs the sysenter_entry() assembly function, which performs essentially the same operations as system_call().

How is the service routine found from the system call number?

The system call number in EAX is multiplied by 4 (each table entry is 4 bytes on 32-bit x86) and added to the start address of the system call table, sys_call_table. The slot at that address holds a pointer to the service routine, which the kernel then calls.

The system call table is defined in arch/x86/kernel/syscall_table_32.S:

ENTRY(sys_call_table)
       .long sys_restart_syscall   /* 0 - ... system call, used for r... */
       .long sys_exit              /* 1 */
       .long ptregs_fork           /* 2 */
       .long sys_read              /* 3 */
       .long sys_write             /* 4 */
       .long sys_open              /* 5 */

When the service routine finishes

  • system_call() takes the return value from EAX and stores it in the stack slot where the user-mode value of the EAX register was saved. It then continues at syscall_exit, which terminates the system call handler.
  • When the call was issued with sysenter, sysenter_entry() performs the same operations as system_call(). On return through iret, the instruction pops from the kernel stack the five values that were saved when the CPU switched to kernel mode, and the CPU switches back to user mode. On the sysexit return path, the user stack pointer is taken from ECX.

Passing parameters to system calls

So far we have only seen the system call number passed in the EAX register, but some system calls take several parameters (mmap, for example). Parameters are passed in registers, subject to two rules:

  • A parameter longer than the register width is passed by address.
  • A system call with more than 6 parameters passes a single register that points to a memory area in the process address space holding the parameter values.

The kernel also validates system call parameters. The file-operation system calls used all the time in Linux system programming are an example: the kernel checks the file descriptor it is given, and if the descriptor is not valid, the system call fails and returns a negative value.

Experiment (adding a system call)

With the principle understood, the plan for the experiment is clearer: the goal is to add a system call that collects a log of system calls. The work is still in progress; the main steps are:

  • Add a system call number

  • Add an entry to the system call table

  • Implement the system call

  • Modify the Makefile and add the function declaration

  • Write the code that intercepts system calls, with the interface functions the kernel provides

  • Recompile the kernel

  • Write user-mode code that attaches to the hook function in the kernel

  • Run the user code to print the system call log

Origin blog.csdn.net/weixin_43122409/article/details/88346717