A little trick with signal handlers

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This work (Li Zhaolong's blog post, written by Li Zhaolong) is confirmed by Li Zhaolong. Please credit the source when reposting.

Introduction

I haven't had much to do recently, so I started thinking about the toys I had wanted to play with during my two and a half years of college but never got around to. One interesting idea came to mind almost immediately: how to implement preemptive scheduling in user mode. This was actually a problem from an earlier competition, and it boils down to implementing preemptive user-mode threads. It is a fairly hacky thing, and most people would not touch it even in their spare time.

But from a certain point of view, I can't strictly be counted as an ordinary person right now. After all, what ordinary person lies in bed and sleeps without a trace of guilt? And I really do have time on my hands. So let's get acquainted with this flashy little thing!

In brief

The hardest part of user-mode preemptive scheduling is how to efficiently implement interrupts and then switch to the coroutine that should run next. The latter is the job of the scheduling algorithm, which we set aside for now. So the key question is how to switch contexts inside the interrupt handler. In the LUTF project we use signal handlers to implement interrupts, and we can exploit the way signal handlers are implemented to achieve preemptive scheduling.

That mechanism is as follows [2]:

When a signal handler is about to run, the kernel copies the register context (sigcontext) saved on the process's kernel stack to the user-mode stack and pushes the address of a sigreturn system call stub as the return address. When the signal handler finishes, sigreturn traps into the kernel automatically and copies the user-mode sigcontext back to the kernel stack, which completes signal handling and restores the process's register context.

As we can see, signal handling itself resembles a coroutine, except that the scheduling is comparatively simple.

Of course, the first rule of learning is to ask why about everything: why does the kernel do it this way? Let's look at the Caltech operating systems course slides; the picture below is from CS124 Lec15:

[Figure: slide from CS124 Lec15]

It is written very clearly. The reason is that the kernel stack is cleared during the trip out to the user-mode handler and back into the kernel, so a context saved there would be gone and we could no longer return to the interrupted thread. This is actually a bit similar to how a coroutine is scheduled.

[Figure]
Because the signal handler, when it finally returns, needs the data on the user stack to switch back to the interrupted thread, a system call is required to do this. That system call is sigreturn. The principle is the same as for an ordinary function call: when a function ends, the stack frame is first restored with the help of ebp, and then the saved old eip is used to return to the caller's context. Here, old eip holds the address of the sigreturn stub.
[Figure]
Note that the registers are ebp and esp on 32-bit, and rbp and rsp on 64-bit.

Of course, the course also mentions that this mechanism can be used to implement a user-mode thread library:
[Figure: course slide]

The magic 72 bytes

dog250's blog mentions that offsetting the address of the signal handler's first local variable by 32 bytes gives us rt_sigframe, and a further 40 bytes gives us sigcontext. The latter 40-byte offset is easy to work out: a look at the structure definitions is enough.

Let's take a look at the definition of the structure pushed onto the stack:

#ifdef CONFIG_X86_64

struct rt_sigframe {
    char __user *pretcode;
    struct ucontext uc;
    struct siginfo info;
    /* fp state follows here */
};
...
/* Follow the definitions all the way down to see what rt_sigframe expands to */
// include/uapi/asm-generic/ucontext.h
struct ucontext {
    unsigned long     uc_flags;
    struct ucontext  *uc_link;
    stack_t       uc_stack;
    struct sigcontext uc_mcontext;  // this is what we are looking for!
    sigset_t      uc_sigmask;   /* mask last for extensibility */
};

typedef struct sigaltstack {
    void __user *ss_sp;
    int ss_flags;
    size_t ss_size;
} stack_t;

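We can see that inside rt_sigframe, uc_mcontext lies 40 bytes past the start of uc (8 bytes for uc_flags, 8 for uc_link, and 24 for uc_stack including padding), so a 40-byte offset gives us the start address of sigcontext, which contains all the register information. The 8-byte pretcode in front of uc belongs to the 32 bytes discussed next.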

So how is the 32-byte offset calculated?

The description in dog250's blog is not clear: it only accounts for 16 of the 32 bytes. Since it is still unclear how dog250 arrived at the 32-byte offset, let's verify it.

Environment	Value
System	5.9.6-arch1-1
GCC	10.2.0

Let's print the stack contents inside a signal handler. The code comes from [2]:

#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>

int i, j, k = 0;
unsigned char *stack_buffer;
unsigned char *p;

void sig_start(int signo)
{
	unsigned long a = 0x1234567820000304;

	p = (unsigned char *)&a;
	printf("signo : %p; p : %p\n", (void *)&signo, (void *)&a);
	stack_buffer = (unsigned char *)&a;

	// Print the stack contents in groups of 8 bytes, starting at &a
	k = 0;
	printf("----begin stack----\n");
	for (i = 0; i < 32; i++) {
		for (j = 0; j < 8; j++) {
			printf(" %.2x", stack_buffer[k]);
			k++;
		}
		printf("\n");
	}
	printf("----end stack----\n");

	// Restore the default action so that the next signal terminates the process
	if (signo == SIGINT)
		signal(SIGINT, SIG_DFL);
	if (signo == SIGHUP)
		signal(SIGHUP, SIG_DFL);
	return;
}

int main()
{
	printf("process id is %d  %p %p\n", getpid(), (void *)main, (void *)sig_start);

	signal(SIGINT, sig_start);
	signal(SIGHUP, sig_start);
	unsigned long esp = 0x1234567820000304;
	printf("esp : %p\n", (void *)&esp);
	for (;;);
}

Let's debug it with GDB:

[Figure: GDB session]

We can see that the address of esp is 0x7fffffffdbd0, and that the value stored at the 16-byte upward offset, 0x00007fffffffdbe0, is the original rbp. So is the slot after it the original rip? In theory it is not, because a signal with a registered user-mode handler is not delivered through a call made by user-mode code: the kernel sets up and starts the handler directly.

[Figure]
So the rip here is certainly different. To verify it, we only need to run info r in GDB before triggering the signal.

[Figure: info r output in GDB]

Clearly, that rip value does not appear on the stack, so it is almost certain that the handler is dispatched from the kernel.

[Figure: stack contents]

Let's look at the address space on the stack. We find that the address of the formal parameter is actually lower than that of the local variable a, which is 0x7fffffffd5102.

Let's treat every eight bytes as one frame. The first frame is the local variable a. From the stack dump we can tell that the third frame is ebp (rbp on 64-bit). The expert says the fourth frame is sigreturn. We will verify that shortly, and the expert's account turns out to have a problem there as well. So what is the second frame? At this point it is impossible to guess without reading the assembly. The answer first: it is a stack protection mechanism. Judging from the assembly, its job is to check that the stack is intact so that the function returns to the correct place. It is not the old rip one might imagine.

[Figure: disassembly of sig_start]
The disassembly shows clearly what the mysterious variable in the second frame is. First comes push %rbp, then rsp is copied into rbp. Next, rsp is moved down by 4 frames (as defined earlier), and the value of the signal handler's parameter is stored 20 bytes below rbp. Then comes the highlight: a stack-check value is loaded into rax, and rax is stored 8 bytes below rbp. Isn't that exactly the second frame we mentioned earlier? After all that searching, it was right under our nose, hahaha!

Then rax is cleared, the value of the local variable is loaded into rax, and finally it is stored 16 bytes below rbp.

Next, let's verify that the fourth frame holds the sigreturn address, as the expert said.

[Figure: ten instructions at the signal handler's PC]
The picture above shows ten instructions at the signal handler's PC. We can clearly see the stack-check value saved earlier being loaded into rax and then compared using XOR. If the values differ, __stack_chk_fail@plt is called, which supports the guess that this is a check mechanism. If they are equal, execution continues in order at sig_start+528 and then reaches leaveq, which is equivalent to movq %rbp, %rsp; popq %rbp. Now rsp points to 0x7fffffffd138. Then retq executes. Printing the value stored at 0x7fffffffd138 shows that it is __restore_rt, which in the end must call __kernel_sigreturn [6].
[Figure]

Okay, now we roughly know the ins and outs of this. At this point I just want to say: crystal clear!

Finally, I would like to thank Hu Qingwei, a core member of the group from the 2018 cohort, for his careful guidance. Without him, I might still think that the second frame is old rip.

Closing remarks

In [2] we can see how a simple preemptive scheduler is built, so the approach is technically feasible. His code segfaults on my machine, but the user-mode preemptive scheduling framework our team built on these same 72 bytes runs normally.

However, it is easy to see that implementing user-mode threads this way has the following problems:

It is not portable. Although most Linux distributions can be expected to implement signals this way, no documentation tells us what the offset is. It does at least run on OpenEuler.
It is inefficient. Every user-mode thread switch has to go from kernel mode into the user-mode signal handler, call sigreturn to get back into the kernel, and only then switch to the new thread. One switch therefore costs two transitions from user mode to kernel mode, so there is no efficiency advantage over kernel threads.

And to be honest, I really cannot think of much use for this thing. It may have a slight advantage at creation time, since no task_struct has to be maintained for each thread, and the scheduling logic can be implemented by ourselves without following the kernel's rules. For now, at least, these seem to matter little.

Summary

This article is a bit messy, but a reader who comes away understanding two things, namely how signals are implemented and where the 72-byte offset comes from, has not wasted the time. For the first, see Lec15 of [4]. For the second, after the verification above I think there is no major problem. The only remaining doubt is whether __restore_rt really ends up calling sigreturn.

References: