1. The realization principle of the function stack
2. Coroutine function: CoRoutineFunc
3. Create a coroutine context environment: coctx_make
4. Context switch of the coroutine: co_swap
4.1 Context switch: coctx_swap
When co_resume is called to run a coroutine, it relies on two functions: coctx_make and co_swap. The former creates the coroutine's context and the latter switches contexts. These two functions are the key to the coroutine implementation.
Before looking at how a coroutine context is created and switched, let's first review how the function stack works.
1. The realization principle of the function stack
1.1 Function stack frame
When a program runs, Linux allocates memory for it. This memory is divided into a code segment, a data segment, a heap, a stack, and so on. The heap grows from low addresses toward high addresses, while the stack grows from high addresses toward low addresses. The following figure shows the structure of a stack.
The figure shows two stack frames in detail, one for function A and one for function B. The area marked by an ellipsis at the top of function A's stack frame holds the saved register values of the previous stack frame and the local variables created by function A itself. Below it, parameter n down to parameter 1 are the arguments that function A passes to function B. So how does function B get at them? The answer is to use registers.
The CPU keeps many values in registers while it computes. The number and purpose of the registers differ between hardware architectures. On 32-bit x86, the register %esp holds the pointer to the top of the current stack frame and %ebp holds the pointer to its bottom, so both ends of the current stack frame can be found through %esp and %ebp. Besides these two, there are general-purpose registers (%eax, %edx, etc.) that hold temporary values during execution. The pushl instruction pushes a value onto the stack and decreases %esp by 4 bytes; the popl instruction pops a value off the stack and increases %esp by 4 bytes. In short, %esp always points to the top of the stack.
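To make the effect of pushl and popl on %esp concrete, here is a minimal C sketch that simulates a downward-growing stack. The names sim_stack_t, sim_init, sim_push, and sim_pop are hypothetical helpers invented for this illustration, and esp is modelled as a byte offset into an array rather than a real register.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SIM_STACK_BYTES 64

/* Hypothetical model of a 32-bit stack: esp is a byte offset into mem. */
typedef struct {
    uint8_t  mem[SIM_STACK_BYTES];
    uint32_t esp;   /* offset of the current top of the stack */
} sim_stack_t;

/* An empty stack starts with esp at the highest address. */
void sim_init(sim_stack_t *s) {
    memset(s->mem, 0, sizeof s->mem);
    s->esp = SIM_STACK_BYTES;
}

/* pushl: decrease esp by 4, then store the value at the new top. */
void sim_push(sim_stack_t *s, uint32_t value) {
    assert(s->esp >= 4);                      /* guard against overflow */
    s->esp -= 4;
    memcpy(&s->mem[s->esp], &value, sizeof value);
}

/* popl: load the value at the top, then increase esp by 4. */
uint32_t sim_pop(sim_stack_t *s) {
    uint32_t value;
    assert(s->esp + 4 <= SIM_STACK_BYTES);    /* guard against underflow */
    memcpy(&value, &s->mem[s->esp], sizeof value);
    s->esp += 4;
    return value;
}
```

Pushing two values on a fresh stack moves esp from 64 to 56, and popping them returns the values in reverse order while esp climbs back to 64.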
With this background on registers, we can see how function B obtains the parameters passed by function A. The address of parameter 1 is %ebp + 8, the address of parameter 2 is %ebp + 12, and the address of parameter n is %ebp + 4 + 4 * n. In other words, the parameters are found by offsetting from the stack bottom pointer, and they are there because function A placed them in advance. The return address is stored just below the parameters, at %ebp + 4. It is the address of the instruction to execute next after function B returns.
Now look at function B. At the top of function B's stack frame is the saved %ebp, which is the stack bottom pointer of function A. There is only one %ebp register, so when a new function is entered the old value must be saved, and it is restored when the function returns. Below this saved pointer are the other saved registers and the local variables used by function B itself. Last comes the argument build area: function B is about to call another function, and the arguments are prepared here. As you can see, the stack frames of function B and function A have a similar structure.
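The offset rule can be checked with a small simulation. The sketch below builds the frame for a hypothetical three-argument call and reads each parameter back through the frame pointer; param_offset and read_param_in_callee are names made up for this illustration, and the return address and old %ebp values are placeholders.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of parameter n (n >= 1) from %ebp, per the formula in the text.
 * Offset 0 holds the saved %ebp and offset 4 holds the return address. */
uint32_t param_offset(uint32_t n) {
    assert(n >= 1);
    return 4 + 4 * n;
}

/* Hypothetical model of the stack around a call B(a1, a2, a3).
 * Returns parameter n as B would read it through its frame pointer. */
uint32_t read_param_in_callee(uint32_t a1, uint32_t a2, uint32_t a3, uint32_t n) {
    uint32_t mem[16] = {0};
    uint32_t esp = (uint32_t)sizeof mem;   /* byte offset, empty stack */
    uint32_t ebp;

    assert(n >= 1 && n <= 3);

    /* Function A places the arguments so that parameter 1 ends up lowest. */
    esp -= 4; mem[esp / 4] = a3;
    esp -= 4; mem[esp / 4] = a2;
    esp -= 4; mem[esp / 4] = a1;
    esp -= 4; mem[esp / 4] = 0x1234;       /* placeholder return address */

    /* Function B's prologue: pushl %ebp; movl %esp, %ebp */
    esp -= 4; mem[esp / 4] = 0xAAAA;       /* placeholder old %ebp */
    ebp = esp;

    return mem[(ebp + param_offset(n)) / 4];
}
```

Calling read_param_in_callee(10, 20, 30, 2) walks from the simulated %ebp to offset 12 and finds the second argument.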
1.2 Function call examples
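int swap_add(int *xp, int *yp);   /* declared first so that caller can use it */

int caller()
{
    int arg1 = 534;
    int arg2 = 1057;
    int sum = swap_add(&arg1, &arg2);
    int diff = arg1 - arg2;
    return sum * diff;
}

/* Swap the two pointed-to values and return their sum. */
int swap_add(int *xp, int *yp)
{
    int x = *xp;
    int y = *yp;
    *xp = y;
    *yp = x;
    return x + y;
}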
Next, we analyze this program line by line. We start with the caller function: in the figure below, the assembly code is on the left and the function's call stack is on the right.
First look at the first three lines of code:
pushl %ebp          // save the old %ebp
movl %esp, %ebp     // set %ebp to %esp
subl $24, %esp      // subtract 24 from %esp to allocate stack space
These three lines set up the stack frame. The first line saves the old %ebp, which is the stack frame bottom pointer of the function that called caller. At this point the new stack space has not been allocated yet, but the slot holding the old %ebp becomes the bottom of the new stack frame, so the second line copies %esp (which always points to the top of the stack) into %ebp. The third line moves %esp down by 24 bytes, which allocates stack space for caller. As the figure shows, this space holds caller's local variables arg1 and arg2 and the arguments passed to the next function. Part of the space is unused; it exists for address alignment, does not affect our analysis, and can be ignored.
Once the stack frame is set up, the body of caller runs. caller first creates two local variables (arg1 and arg2). The corresponding assembly code is:
movl $534, -4(%ebp)
movl $1057, -8(%ebp)
Here -4(%ebp) denotes the location %ebp - 4, which is where arg1 sits in the figure, and arg2 sits at %ebp - 8. These two lines store 534 and 1057 in these two locations. The next lines are:
leal -8(%ebp), %eax   // load the address %ebp - 8 into %eax
movl %eax, 4(%esp)    // store the value of %eax at %esp + 4
leal -4(%ebp), %eax   // load the address %ebp - 4 into %eax
movl %eax, (%esp)     // store the value of %eax at %esp
The first line loads the address %ebp - 8, which is the address of arg2, into %eax. The second line stores that address at %esp + 4, the block labeled &arg2 in the figure. Together these two lines prepare the argument &arg2 for swap_add, and the following two lines prepare the argument &arg1 in the same way.
The next line is call swap_add, which calls the function swap_add. The instruction pushes the return address onto the stack and sets the program counter (PC) to the start address of swap_add. The return address here is the address of the code to execute after swap_add returns, that is, the code for int diff = arg1 - arg2. We now enter swap_add. Its code begins as follows:
pushl %ebp          // save the old %ebp
movl %esp, %ebp     // set %ebp to %esp
pushl %ebx          // save %ebx
The first three lines of swap_add's assembly code are similar to caller's. They also save the old frame pointer, but swap_add needs no extra stack space for local variables and uses only one additional register, %ebx. It therefore saves the old value of that register here instead of moving %esp down by a fixed amount.
movl 8(%ebp), %edx    // load the value at %ebp + 8 into %edx
movl 12(%ebp), %ecx   // load the value at %ebp + 12 into %ecx
These two lines fetch the arguments &arg1 and &arg2 that caller stored, that is, the addresses of arg1 and arg2. The actual values of arg1 and arg2 can then be read through these addresses.
movl (%edx), %ebx     // x = *xp
movl (%ecx), %eax     // y = *yp
movl %eax, (%edx)     // *xp = y
movl %ebx, (%ecx)     // *yp = x
These four lines perform the swap. Now look at the following lines:
addl %ebx, %eax       // store the return value in %eax
popl %ebx
popl %ebp
ret
The return value of swap_add is stored in %eax, and caller will read it from this register shortly. The pop instructions at the end of swap_add restore %ebx and %ebp to the values they had in caller. Finally, ret returns to caller: the ret instruction pops the return address from the stack and sets the program counter (PC) to that address.
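The pairing of call and ret can be modelled in a few lines of C. This is only a toy model under stated assumptions: toy_cpu_t, toy_init, toy_call, and toy_ret are hypothetical names, addresses are plain integers, and the length of the call instruction is passed in by hand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical toy machine: a program counter, a stack pointer, a small stack. */
typedef struct {
    uint32_t pc;
    uint32_t esp;          /* byte offset into stack[] */
    uint32_t stack[8];
} toy_cpu_t;

void toy_init(toy_cpu_t *m, uint32_t pc) {
    memset(m->stack, 0, sizeof m->stack);
    m->pc = pc;
    m->esp = (uint32_t)sizeof m->stack;   /* empty stack */
}

/* call: push the address of the next instruction, then jump to the target. */
void toy_call(toy_cpu_t *m, uint32_t target, uint32_t insn_len) {
    assert(m->esp >= 4);
    m->esp -= 4;
    m->stack[m->esp / 4] = m->pc + insn_len;   /* return address */
    m->pc = target;
}

/* ret: pop the return address into the program counter. */
void toy_ret(toy_cpu_t *m) {
    assert(m->esp + 4 <= sizeof m->stack);
    m->pc = m->stack[m->esp / 4];
    m->esp += 4;
}
```

After toy_call the program counter holds the target and the stack top holds the return address; toy_ret undoes both steps. The same push-then-ret idea is what lets a context switch jump to an address of its choosing.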
Back in caller, right after call swap_add, the next lines compute int diff = arg1 - arg2 and save the result in %edx. The last computation is sum * diff, and the corresponding assembly code is imull %edx, %eax. It multiplies the values in %edx and %eax and stores the result in %eax. From the analysis above we know that %eax holds the return value of swap_add, so the product is computed from that return value and the result is again kept in %eax. This value is the return value of caller, so whoever calls caller can also read the return value from this register. The function ends with ret, by which point caller's stack frame has been torn down and the old values of the saved registers have been restored. This completes the analysis of the calls to caller and swap_add.
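As a sanity check on the walkthrough, the sketch below reruns the same logic in C and records the intermediate values. trace_t and caller_traced are hypothetical additions for this illustration; the arithmetic itself follows the original example.

```c
#include <assert.h>
#include <stddef.h>

/* Same logic as swap_add in the example, with a null-pointer guard added. */
int swap_add(int *xp, int *yp) {
    assert(xp != NULL && yp != NULL);
    int x = *xp;
    int y = *yp;
    *xp = y;
    *yp = x;
    return x + y;
}

/* Hypothetical record of the intermediate values inside caller. */
typedef struct {
    int arg1, arg2, sum, diff, result;
} trace_t;

trace_t caller_traced(void) {
    trace_t t;
    t.arg1 = 534;
    t.arg2 = 1057;
    t.sum = swap_add(&t.arg1, &t.arg2);   /* 534 + 1057 = 1591, values swapped */
    t.diff = t.arg1 - t.arg2;             /* 1057 - 534 = 523 */
    t.result = t.sum * t.diff;            /* 1591 * 523 = 832093 */
    return t;
}
```

Because swap_add exchanges the two values before diff is computed, diff is positive (523) and the final result is 832093.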
2. Coroutine function: CoRoutineFunc
static int CoRoutineFunc( stCoRoutine_t *co,void * )
{
    if( co->pfn )
    {
        co->pfn( co->arg );
    }
    co->cEnd = 1; // flag: the coroutine has finished executing
    stCoRoutineEnv_t *env = co->env;
    co_yield_env( env );
    return 0;
}
// The coroutine has finished: shrink the thread environment's call stack by one and switch to the previous coroutine
void co_yield_env( stCoRoutineEnv_t *env )
{
    stCoRoutine_t *last = env->pCallStack[ env->iCallStackSize - 2 ];
    stCoRoutine_t *curr = env->pCallStack[ env->iCallStackSize - 1 ];
    env->iCallStackSize--;
    co_swap( curr, last);
}
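The call stack bookkeeping in co_yield_env can be sketched in isolation. The types and functions below (sketch_co_t, sketch_env_t, sketch_resume, sketch_yield) are hypothetical stand-ins, not libco's API, and the sketch assumes that resuming a coroutine pushes it on top of the call stack; the depth limit is an arbitrary value chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_DEPTH 128   /* arbitrary limit for this sketch */

/* Hypothetical, stripped-down stand-ins for stCoRoutine_t and stCoRoutineEnv_t. */
typedef struct {
    int id;
} sketch_co_t;

typedef struct {
    sketch_co_t *call_stack[SKETCH_MAX_DEPTH];
    int          size;
} sketch_env_t;

/* Assumption: resuming a coroutine pushes it on top of the call stack. */
void sketch_resume(sketch_env_t *env, sketch_co_t *co) {
    assert(env->size < SKETCH_MAX_DEPTH);
    env->call_stack[env->size++] = co;
}

/* Yielding pops the running coroutine and returns the one to switch back to,
 * mirroring how co_yield_env picks `last` before decrementing the size. */
sketch_co_t *sketch_yield(sketch_env_t *env) {
    assert(env->size >= 2);   /* the bottom entry is never popped here */
    sketch_co_t *last = env->call_stack[env->size - 2];
    env->size--;
    return last;
}
```

With a main coroutine at the bottom and coroutines a and b resumed on top of it, the first yield returns a and the second returns the main coroutine, so control always goes back to whoever resumed the finished coroutine.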
3. Create a coroutine context environment: coctx_make
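/* Struct that lays out the memory area for the two arguments of the entry function pfn. Used only on 32-bit; on 64-bit the two arguments are passed directly in registers */
struct coctx_param_t
{
    const void *s1;
    const void *s2;
};
int coctx_make( coctx_t *ctx,coctx_pfn_t pfn,const void *s,const void *s1 )
{
    //make room for coctx_param
    /*
     * The memory behind ctx->ss_sp is allocated on the heap and initialized when the coroutine is created. Its addresses grow from low to high, whereas a stack grows toward low addresses,
     * so to use this region as a hand-made stack frame, the address must first be moved to the highest position, that is, ss_sp + ss_size
     */
    char *sp = ctx->ss_sp + ctx->ss_size - sizeof(coctx_param_t);
    sp = (char*)((unsigned long)sp & -16L); // align to 16 bytes
    /* store the function's arguments on the stack */
    coctx_param_t* param = (coctx_param_t*)sp ;
    param->s1 = s;
    param->s2 = s1;
    memset(ctx->regs, 0, sizeof(ctx->regs));
    ctx->regs[ kESP ] = (char*)(sp) - sizeof(void*); // save the stack top pointer, kESP = 7
    ctx->regs[ kEIP ] = (char*)pfn; // save the function pointer, kEIP = 0
    //------- ss_sp + ss_size
    //|padding | alignment area
    //|s2 |
    //|s1 |
    //|-------- <- original esp
    //|return address |
    //|return address |
    //|-------- <- sp (original esp - sizeof(void*) * 2)
    //| |
    //--------- ss_sp
    return 0;
}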
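The address arithmetic at the start of coctx_make can be tried out on plain integers. In the sketch below, align_down_16 and initial_param_addr are hypothetical helper names; the second function only reproduces the two lines that compute sp and is not part of libco.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round an address down to a multiple of 16, as `sp & -16L` does. */
uintptr_t align_down_16(uintptr_t addr) {
    return addr & ~(uintptr_t)15;
}

/* Hypothetical helper: given the base and size of a heap-allocated stack
 * buffer, return where a parameter block of param_size bytes would be
 * placed, following the arithmetic in coctx_make. */
uintptr_t initial_param_addr(uintptr_t base, size_t size, size_t param_size) {
    assert(size >= param_size + 16);           /* room for block plus alignment slack */
    uintptr_t sp = base + size - param_size;   /* start from the highest address */
    return align_down_16(sp);
}
```

For a 0x100-byte buffer at 0x1000 and an 8-byte parameter block, the unaligned address is 0x10F8 and the aligned result is 0x10F0. The bytes between the block and the end of the buffer are the alignment area shown in the diagram.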
4. Context switch of the coroutine: co_swap
/* The coroutine about to give up the CPU is called the current coroutine; the one about to be switched in is called the pending coroutine */
void co_swap(stCoRoutine_t* curr, stCoRoutine_t* pending_co)
{
    stCoRoutineEnv_t* env = co_get_curr_thread_env();
    // a local variable at the top of the function gives us the stack top pointer (sp)
    char c;
    curr->stack_sp= &c;
    if (!pending_co->cIsShareStack)
    {
        env->pending_co = NULL;
        env->ocupy_co = NULL;
    }
    else
    {
        env->pending_co = pending_co;
        /* find out which coroutine currently occupies the shared stack */
        stCoRoutine_t* ocupy_co = pending_co->stack_mem->ocupy_co;
        /* mark the coroutine about to be switched in as the occupant of the shared stack */
        pending_co->stack_mem->ocupy_co = pending_co;
        /* remember the coroutine being switched out */
        env->ocupy_co = ocupy_co;
        /* save the stack contents of the switched-out coroutine into its coroutine struct */
        if (ocupy_co && ocupy_co != pending_co)
        {
            save_stack_buffer(ocupy_co);
        }
    }
    /* switch the coroutine context */
    coctx_swap(&(curr->ctx),&(pending_co->ctx) );
    // stack buffer may be overwritten, so get again;
    stCoRoutineEnv_t* curr_env = co_get_curr_thread_env();
    stCoRoutine_t* update_ocupy_co = curr_env->ocupy_co;
    stCoRoutine_t* update_pending_co = curr_env->pending_co;
    if (update_ocupy_co && update_pending_co && update_ocupy_co != update_pending_co)
    {
        /* copy the stack contents in save_buffer back onto the shared stack */
        if (update_pending_co->save_buffer && update_pending_co->save_size > 0)
        {
            memcpy(update_pending_co->stack_sp, update_pending_co->save_buffer, update_pending_co->save_size);
        }
    }
}
/* save the coroutine's shared stack contents into its coroutine struct */
void save_stack_buffer(stCoRoutine_t* ocupy_co)
{
    ///copy out
    stStackMem_t* stack_mem = ocupy_co->stack_mem;
    int len = stack_mem->stack_bp - ocupy_co->stack_sp;
    if (ocupy_co->save_buffer)
    {
        free(ocupy_co->save_buffer), ocupy_co->save_buffer = NULL;
    }
    ocupy_co->save_buffer = (char*)malloc(len); //malloc buf;
    ocupy_co->save_size = len;
    memcpy(ocupy_co->save_buffer, ocupy_co->stack_sp, len);
}
After coctx_swap runs, the CPU starts executing the code of the pending coroutine. In other words, the statement executed right after coctx_swap is not stCoRoutineEnv_t* curr_env = co_get_curr_thread_env(); but a statement in the pending coroutine. Pay special attention to this point. So when do the statements after coctx_swap run? Execution continues there once this coroutine is resumed again by a co_resume call elsewhere. The rest is simple: when a coroutine is switched out, its stack contents are copied into a buffer, and when it is switched back in, the buffer contents are copied back onto the stack. This is how a coroutine executes.
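The copy-out and copy-in steps can be exercised without any real context switch. The following is a minimal sketch, assuming a byte array plays the role of the shared stack; sketch_co_t, sketch_save, and sketch_restore are hypothetical names, and unlike save_stack_buffer the sketch checks the result of malloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified coroutine record for a shared-stack scheme. */
typedef struct {
    char  *stack_sp;      /* lowest used address on the shared stack */
    char  *save_buffer;   /* heap copy of the used part of the stack */
    size_t save_size;
} sketch_co_t;

/* Copy the used region [stack_sp, stack_bp) out to a heap buffer.
 * Returns 0 on success and -1 if the allocation fails. */
int sketch_save(sketch_co_t *co, char *stack_bp) {
    assert(stack_bp >= co->stack_sp);
    size_t len = (size_t)(stack_bp - co->stack_sp);

    free(co->save_buffer);                   /* drop any previous copy */
    co->save_buffer = NULL;
    co->save_size = 0;

    co->save_buffer = malloc(len ? len : 1);
    if (co->save_buffer == NULL) {
        return -1;
    }
    co->save_size = len;
    memcpy(co->save_buffer, co->stack_sp, len);
    return 0;
}

/* Copy the saved bytes back to the same addresses on the shared stack. */
void sketch_restore(const sketch_co_t *co) {
    if (co->save_buffer && co->save_size > 0) {
        memcpy(co->stack_sp, co->save_buffer, co->save_size);
    }
}
```

If another coroutine overwrites the shared stack between sketch_save and sketch_restore, only the bytes in [stack_sp, stack_bp) are put back; everything below stack_sp keeps whatever the other coroutine left there.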
4.1 Context switch: coctx_swap
.globl coctx_swap
#if !defined( __APPLE__ )
.type coctx_swap, @function
#endif
coctx_swap:
#if defined(__i386__)
    leal 4(%esp), %eax    // store the address %esp + 4 in %eax
    movl 4(%esp), %esp    // %esp = the value stored at %esp + 4 (parm a)
    leal 32(%esp), %esp   // %esp = %esp + 32; %esp now points to parm a : &regs[7] + sizeof(void*)
    // next, save all register values into the current coroutine's array of 8 registers
    pushl %eax            // esp -> parm a
    pushl %ebp
    pushl %esi
    pushl %edi
    pushl %edx
    pushl %ecx
    pushl %ebx
    pushl -4(%eax)        // return address
    // update the value of %esp
    movl 4(%eax), %esp    // parm b -> &regs[0]
    // pop the register values of the coroutine about to run from memory into the CPU registers
    popl %eax             // ret func addr
    popl %ebx
    popl %ecx
    popl %edx
    popl %edi
    popl %esi
    popl %ebp
    popl %esp
    pushl %eax            // set ret func addr
    xorl %eax, %eax
    ret
#endif
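The order of the pushes and pops determines which register lands in which slot of the regs array. The C model below replays that order on a plain array. Only the indices 0 (kEIP) and 7 (kESP) are stated in the source; the other index names, cpu_t, sketch_save_regs, and sketch_load_regs are assumptions inferred from the instruction order for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Slot indices: 0 and 7 match kEIP and kESP; the rest are inferred. */
enum { R_EIP = 0, R_EBX, R_ECX, R_EDX, R_EDI, R_ESI, R_EBP, R_ESP, R_COUNT };

/* Hypothetical snapshot of the registers handled by the 32-bit path. */
typedef struct {
    uint32_t eip, ebx, ecx, edx, edi, esi, ebp, esp;
} cpu_t;

/* Save half: start one slot past regs[7] and push downward. */
void sketch_save_regs(uint32_t regs[R_COUNT], const cpu_t *cpu) {
    uint32_t *sp = regs + R_COUNT;   /* like leal 32(%esp), %esp */
    *--sp = cpu->esp;                /* pushl %eax */
    *--sp = cpu->ebp;
    *--sp = cpu->esi;
    *--sp = cpu->edi;
    *--sp = cpu->edx;
    *--sp = cpu->ecx;
    *--sp = cpu->ebx;
    *--sp = cpu->eip;                /* pushl -4(%eax): the return address */
    assert(sp == regs);              /* exactly eight slots were filled */
}

/* Restore half: pop upward starting at regs[0]. */
void sketch_load_regs(const uint32_t regs[R_COUNT], cpu_t *cpu) {
    const uint32_t *sp = regs;
    cpu->eip = *sp++;                /* popl %eax, later pushed back for ret */
    cpu->ebx = *sp++;
    cpu->ecx = *sp++;
    cpu->edx = *sp++;
    cpu->edi = *sp++;
    cpu->esi = *sp++;
    cpu->ebp = *sp++;
    cpu->esp = *sp++;
    assert(sp == regs + R_COUNT);
}
```

Saving a snapshot and loading it into a second cpu_t gives back the same values, which is the round trip a coroutine goes through when it is switched out and later resumed.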