System call process and method under Linux

Table of Contents

1. System call process

2. Three methods of system call

2.1. Library functions provided by glibc

2.2. Using syscall to call directly

2.3. Trapping with the int instruction


A system call is one of a set of interfaces that the operating system provides so that processes running in user mode can interact with hardware devices (such as the CPU, disks, and printers). When a user process makes a system call, the CPU switches to kernel mode through a software interrupt and starts executing the corresponding kernel system call function.

1. System call process

The calling process is described briefly below, using Linux 0.11 as the example. I have not verified whether modern kernels have changed it, but the basic idea should be similar. As shown below:

[Figure: the system call process under Linux]

First, the application calls an API provided by the system directly; this happens in user mode (Ring 3).

The API then stores the corresponding system call number in the eax register (implemented with inline assembly) and executes int 0x80 to trigger the interrupt (also inline assembly). Control enters the interrupt handler, which is written entirely in assembly, and the CPU is now in kernel mode (Ring 0).

The interrupt handler calls the system call function that corresponds to the system call number. The handler sets the ds (data segment) and es (extra segment) registers to point to kernel space. This raises a problem: how can data be passed from user mode to kernel mode? Take open(const char *filename, int flag, ...): the string that filename points to lives in user space, and looking up the same address in kernel space would not find it. The solution is that the interrupt handler sets the fs register to point to user space, so the kernel can reach user data through fs.

The system call function then performs the actual operation, such as opening or writing a file.

When it finishes, it returns to the interrupt handler with the return value stored in the eax register.

The interrupt handler returns to the API with the return value still in eax, and the CPU switches from kernel mode back to user mode.

The API reads the value from eax, checks it, and returns a value that indicates whether the operation succeeded.

Why can so many system calls be invoked through the single int 0x80 interrupt?

In protected mode there are many interrupts, and system calls are bound to interrupt 0x80. To make a system call, the program triggers int 0x80, and the interrupt handler learns from eax which system call is wanted. There are too many system calls to give each one its own interrupt number, so a single interrupt is used to manage them all.

The operating system keeps a table that stores the addresses of the system call functions. The table is an array, so the address of each function can be looked up by index. One interrupt number plus a system call number is therefore enough to manage many system calls.

The above is taken from "Introducing the System Call Process under Linux"

2. Three methods of system call

The following introduces three ways to make a system call under Linux.

2.1. Library functions provided by glibc

glibc is the open source standard C library used under Linux. It is the libc released by GNU, that is, the C runtime library. glibc gives programmers a rich API (Application Programming Interface). Besides user-mode services such as string processing and mathematical operations, its most important job is to wrap the system services that the operating system provides, that is, to wrap system calls. So how do the APIs provided by glibc relate to the kernel's system calls?

  • Usually, each system call corresponds to at least one library function wrapped by glibc. For example, the kernel's sys_open system call for opening a file corresponds to the open function in glibc;
  • Secondly, a single glibc API may invoke several system calls. For example, the printf function provided by glibc may invoke system calls such as sys_open, sys_mmap, sys_write, and sys_close;
  • In addition, several APIs may map to the same system call. For example, the glibc functions malloc, calloc, and free, which allocate and release memory, all use the kernel's sys_brk system call.

For example, we use the chmod function provided by glibc to change the permissions of the /etc/passwd file to 444:

#include <sys/types.h>
#include <sys/stat.h>
#include <errno.h>
#include <stdio.h>

int main()
{
	int rc = 0;
	rc = chmod("/etc/passwd", 0444);
	if (rc == -1)
		fprintf(stderr, "chmod failed, errno = %d\n", errno);
	else
		printf("chmod success!\n");
	return 0;
}

Compiled and run as an ordinary user, the program prints:
chmod failed, errno = 1
The call returned -1, which indicates that the system call failed, and the error code is in errno. The /usr/include/asm-generic/errno-base.h file describes this error code as follows:
#define EPERM 1 /* Operation not permitted */
That is, the caller has no permission to perform the operation: an ordinary user cannot change the permissions of /etc/passwd. The result is as expected.

2.2. Using syscall to call directly

The approach above has several advantages. First, you do not need to know low-level details such as the system call number of chmod; you only need to know the prototype of the API that glibc provides. Second, it is more portable: the program can be ported to another platform, or glibc can be swapped for another C library, with only small changes.
Its disadvantage is that if glibc does not wrap a system call that the kernel provides, that system call cannot be invoked this way. For example, if you add a system call by building a custom kernel, glibc has no wrapper for it. In that case you can call it directly with the syscall function provided by glibc. The function is declared in the unistd.h header file, and its prototype is:

long int syscall (long int sysno, ...)
  • sysno is the system call number; each system call is identified by a unique number. sys/syscall.h provides macro definitions for all available system calls.
  • ... is the variable-length remainder of the argument list: the arguments of the system call itself. Depending on the system call, there can be zero to five of them. If more arguments are passed than the system call takes, the extra ones are ignored.
  • Return value: the return value of this function is the return value of the system call. After a successful call, you can convert it to the appropriate type. If the system call fails, the function returns -1 and stores the error code in errno.

Taking the same change to the /etc/passwd permissions as the example, this time the call goes through syscall directly:

    ...
	//rc = chmod("/etc/passwd", 0444);
	rc = syscall(SYS_chmod, "/etc/passwd", 0444);
    ...

Compiled and run as an ordinary user, the output is the same as in the previous example.

2.3. Trapping with the int instruction

Knowing the whole system call process, we can see that a user-mode program enters kernel mode through the software interrupt instruction int 0x80 (Intel introduced the sysenter instruction with the Pentium II). Parameters are passed in registers: eax holds the system call number, and ebx, ecx, edx, esi, and edi pass up to five parameters in order. When the system call returns, the return value is stored in eax.

Still using the file permission change as the example, the part that makes the system call is written as inline assembly:

#include <stdio.h>
#include <sys/types.h>
#include <sys/syscall.h>
#include <errno.h>

int main()
{
	long rc;
	char *file_name = "/etc/passwd";
	unsigned short mode = 0444;

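	/* eax carries the system call number in and the return value out;
	 * ebx and ecx carry the first two arguments. */
	asm volatile(
		"int $0x80"
		: "=a" (rc)
		: "0" (SYS_chmod), "b" ((long)file_name), "c" ((long)mode)
		: "memory"
	);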

	if ((unsigned long)rc >= (unsigned long)-132) {
		errno = -rc;
		rc = -1;
	}

	if (rc == -1)
		fprintf(stderr, "chmod failed, errno = %d\n", errno);
	else
		printf("success!\n");

	return 0;
}

If the return value in the eax register (stored in the variable rc) is between -1 and -132, it must be interpreted as an error code (132 was the largest error code defined in /usr/include/asm-generic/errno.h at the time of writing). In that case the error code is written to errno and the system call's return value is set to -1; otherwise the value in eax is returned as is.

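Compiled and run as an ordinary user on 32-bit Linux, the program gives the same result as the previous two. In a 64-bit environment, however, it fails with errno = 22 (invalid argument). A likely reason is that SYS_chmod then expands to the x86-64 system call number, while int 0x80 is still dispatched through the 32-bit system call table.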

 


Origin blog.csdn.net/wangquan1992/article/details/108496821