Machine code: condition codes, control flow, switch statements, and loops

The previous article covered the basic concepts of machine code. This one summarizes how flow control, conditional tests, and loops are implemented.

Starting from a piece of machine code

In machine code, the jmp instruction is used to jump to another block of code. For example, a piece of machine code may look like this:

decision:
    subq $8, %rsp
    testl %edi, %edi
    je .L2
    call op1
    jmp .L1
.L2:
    call op2
.L1:
    addq $8, %rsp
    ret

Instructions normally execute one after another, but when a jmp is reached, execution continues directly at the target label and everything in between is skipped, just like goto in C.

But wait, what is that je? Before explaining je, we first need to cover condition codes.
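To make the goto comparison concrete, here is a rough C sketch of the same control flow. The original C source of decision is not shown in this article, so this is only a reconstruction from the assembly; the bodies of op1 and op2 are hypothetical stand-ins that just count calls.

```c
/* Hypothetical stand-ins for op1 and op2: they only record that they ran. */
static int op1_calls = 0;
static int op2_calls = 0;
static void op1(void) { op1_calls++; }
static void op2(void) { op2_calls++; }

/* A goto-style rendering of the decision assembly (reconstructed, not the original source). */
void decision(int x) {
    if (x == 0)      /* testl %edi, %edi ; je .L2 */
        goto L2;
    op1();           /* call op1 */
    goto L1;         /* jmp .L1 : skip over the .L2 block */
L2:
    op2();           /* call op2 */
L1:
    return;          /* addq $8, %rsp ; ret */
}
```

Calling decision(0) runs only op2, and calling it with any non-zero argument runs only op1.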

Condition Codes

Condition codes are a set of flag bits that describe the result of the most recent operation. They are stored as single bits, and can be thought of as extra one-bit registers alongside the ones introduced earlier, such as %rax and %rbx.

CF (Carry Flag): carry flag, used for unsigned arithmetic.
SF (Sign Flag): sign flag, used for signed arithmetic.
ZF (Zero Flag): zero flag, indicating whether the result is zero.
OF (Overflow Flag): overflow flag, used for signed arithmetic.

In the GDB debugger, these flags are printed as part of a register named eflags, for example:

eflags 0x246 [ PF ZF IF ] Z set, CSO clear

Arithmetic operations usually set the condition codes implicitly, as a side effect of the operation. Take addq Src, Dest as an example. It computes t = a + b, where a and b are the two operand values, and sets the condition codes according to the following rules:

CF (Carry Flag): set when there is a carry out of the most significant bit (unsigned overflow).
ZF (Zero Flag): set when t == 0.
SF (Sign Flag): set when t < 0 (the result, viewed as signed, is negative).
OF (Overflow Flag): set when two's-complement (signed) overflow occurs, that is:
- when a > 0, b > 0 and t < 0, or
- when a < 0, b < 0 and t >= 0.
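The rules above can be modelled in C. This is only an illustrative sketch of the four flags for a 64-bit addition; the Flags struct and the function name flags_after_add are made up for this example and are not part of any real API.

```c
#include <stdint.h>
#include <stdbool.h>

/* Illustrative model of the four condition codes (names are assumptions). */
typedef struct { bool cf, zf, sf, of; } Flags;

/* Models the flags that addq would set for t = a + b. */
Flags flags_after_add(int64_t a, int64_t b) {
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;
    uint64_t ut = ua + ub;            /* unsigned addition wraps modulo 2^64 */
    bool t_neg = (ut >> 63) != 0;     /* top bit set means t < 0 when viewed as signed */

    Flags f;
    f.cf = ut < ua;                   /* wrapped around: carry out of the top bit */
    f.zf = ut == 0;                   /* t == 0 */
    f.sf = t_neg;                     /* t < 0 */
    f.of = (a > 0 && b > 0 && t_neg)  /* positive + positive gave a negative */
        || (a < 0 && b < 0 && !t_neg);/* negative + negative gave a non-negative */
    return f;
}
```

For instance, adding 1 and -1 sets both CF and ZF but not OF, while adding 1 to INT64_MAX sets SF and OF but not CF.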

Comparison instruction (cmp) and test instruction (test)

Notice that in the sample code at the beginning there is a testl instruction just above je. This brings us to cmp and test.

The cmp a, b instruction compares the two operands a and b. It works as follows:

Compute b - a (the same computation as the sub instruction).
Set the condition codes according to the result, without changing the value of operand b.

The test a, b instruction tests the two operands a and b. It works as follows:

Compute b & a (the same computation as the and instruction).
Set the condition codes according to the result (ZF and SF reflect the result; CF and OF are cleared), without changing the value of operand b.
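As a rough model in C, cmp and test can be seen as computing a result only to look at its flags. The sketch below tracks just ZF and SF; the ZeroSign struct and the function names are invented for illustration.

```c
#include <stdint.h>
#include <stdbool.h>

/* Only ZF and SF are modelled here (struct and names are assumptions). */
typedef struct { bool zf, sf; } ZeroSign;

/* cmpq a, b : flags come from b - a, and the difference itself is discarded. */
ZeroSign after_cmp(int64_t a, int64_t b) {
    uint64_t t = (uint64_t)b - (uint64_t)a;   /* wraps like the hardware does */
    ZeroSign f = { t == 0, (t >> 63) != 0 };
    return f;
}

/* testq a, b : flags come from b & a, and the result itself is discarded. */
ZeroSign after_test(int64_t a, int64_t b) {
    uint64_t t = (uint64_t)b & (uint64_t)a;
    ZeroSign f = { t == 0, (t >> 63) != 0 };
    return f;
}
```

This also explains the testl %edi, %edi idiom in the first example: x & x is just x, so ZF ends up set exactly when x is zero.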

jX jump instructions

Finally we get to je! je is one of the jX instructions, which decide whether to jump based on the condition codes left by an earlier instruction such as cmp or test. The jX instructions are used to implement control structures such as conditional branches and loops. For example:

je: jump when the zero flag (ZF) is set (equal, or zero).
jne: jump when the zero flag (ZF) is not set (not equal, or not zero).
jg: jump when the signed comparison result is "greater than", that is, when ZF is clear and SF equals OF.
jl: jump when the signed comparison result is "less than", that is, when SF differs from OF.
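The four jump conditions can be written as predicates over the flags. This is a small sketch rather than a full model of the processor; the Flags struct and the takes_* names are assumptions made for this example.

```c
#include <stdbool.h>

/* Illustrative model of the condition codes (names are assumptions). */
typedef struct { bool cf, zf, sf, of; } Flags;

bool takes_je(Flags f)  { return f.zf; }                    /* equal / zero */
bool takes_jne(Flags f) { return !f.zf; }                   /* not equal / not zero */
bool takes_jg(Flags f)  { return !f.zf && f.sf == f.of; }   /* signed greater than */
bool takes_jl(Flags f)  { return f.sf != f.of; }            /* signed less than */
```

The signed conditions look at SF and OF together because the sign of b - a is only trustworthy when the subtraction did not overflow; when it did overflow, the sign bit is inverted, and comparing SF with OF corrects for that.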

setX instructions

The logic of setX is very similar to that of jX. A setX instruction sets the low byte (the lowest 8 bits) of the destination register to 0 or 1 according to a combination of condition codes, and leaves the remaining bytes of the destination register unchanged. For example:

sete: set the low byte of the destination register to 1 when the zero flag (ZF) is set, otherwise to 0.
setne: set the low byte of the destination register to 1 when the zero flag (ZF) is not set, otherwise to 0.
setg: set the low byte of the destination register to 1 when the signed comparison result is "greater than", otherwise to 0.
setl: set the low byte of the destination register to 1 when the signed comparison result is "less than", otherwise to 0.

For example, suppose we want to compare two integers (stored in %rax and %rbx) for equality and store the result in the low byte of %rcx (1 if equal, 0 if not). It can be done like this:

cmpq %rax, %rbx    # compare the values in %rax and %rbx
sete %cl           # if they are equal (ZF set), set %cl, the low byte of %rcx, to 1; otherwise to 0

One difference from jX is the operand: setX takes a single destination, which is a byte register such as %cl or %bl (a byte in memory is also allowed). When a register is used, it is always one of these low-byte registers (%al, %r8b, and so on). As mentioned before, a low-byte register is just a name for the low part of a full register: %al is the lowest byte of %rax, and %eax is the lowest 4 bytes of %rax. Because the operand widths differ, take care that each instruction matches the width of its registers. There are also dedicated instructions for moving between registers of different widths, such as movzbl:

movzbl %al, %eax

movzbl is an x86 assembly instruction whose name stands for "move with zero-extend, byte to long". It copies one byte (8 bits) from the source operand to the destination and zero-extends it to a long word (32 bits). On x86-64, writing a 32-bit register such as %eax also clears the upper 32 bits of the corresponding 64-bit register, so after this instruction %rax holds the byte zero-extended to 64 bits.

That was a digression; back to the main topic.

Here is an interesting example of conditional code:
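To tie setX and movzbl together, consider a small C function that returns the result of a comparison. The function name gt is just an example, and the assembly in the comment is what a compiler such as GCC typically produces for code like this on x86-64; the exact output depends on the compiler and its flags.

```c
/* Returns 1 if x > y, otherwise 0.
 *
 * Typical x86-64 output (may vary by compiler and optimization level):
 *     cmpq   %rsi, %rdi     # compare x with y
 *     setg   %al            # low byte of %rax = 1 if x > y, else 0
 *     movzbl %al, %eax      # zero the rest of %rax
 *     ret
 */
int gt(long x, long y) {
    return x > y;
}
```

setg only writes one byte, so the movzbl is what makes the whole return register hold a clean 0 or 1.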

long absdiff (long x, long y) {
    long result;
    if (x > y)
        result = x-y;
    else
        result = y-x;
    return result;
}

The machine code:

absdiff:
    movq %rdi, %rax # x
    subq %rsi, %rax # result = x-y
    movq %rsi, %rdx
    subq %rdi, %rdx # eval = y-x
    cmpq %rsi, %rdi # x:y
    cmovle %rdx, %rax # if <=, result = eval
    ret

At first glance this looks a bit odd. Why not compare x and y first and then jump according to the result? Why is there no jump at all? The reason is that a conditional move instruction is used here: the values for both cases are computed in advance, and cmovle then selects the right one. The main advantage of conditional moves is that they do not disturb the sequential flow of instructions. Modern processors are pipelined, and a conditional branch that is predicted wrongly forces the pipeline to discard work already in progress, which hurts performance. A conditional move needs no transfer of control, so it avoids that penalty. The le suffix in cmovle has the same meaning as in the jX and setX instructions above: less than or equal.

Of course, conditional moves are not the best choice in every case. Both values must be computed before the move executes, so they are a poor fit when the computations are expensive, when one of them may be unsafe, or when a computation has side effects such as modifying a global variable. Here are some examples that are not suitable for conditional moves:

val = Test(x) ? Hard1(x) : Hard2(x); // expensive computations
val = p ? *p : 0; // risky computation: *p must not be evaluated when p is null
val = Test(x) ? FunctionWithSideEffect1(x) : FunctionWithSideEffect2(x); // both side effects would run unnecessarily
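The "compute both, then select" idea can be sketched in C. The first function below mirrors the absdiff assembly step by step, and the second shows the pointer case, where a real branch has to stay. The names absdiff_cmov and read_or_zero are made up for this illustration.

```c
#include <stddef.h>

/* Mirrors the cmov version of absdiff: both results are computed up front. */
long absdiff_cmov(long x, long y) {
    long result = x - y;    /* subq %rsi, %rax : computed unconditionally */
    long eval = y - x;      /* subq %rdi, %rdx : computed unconditionally */
    if (x <= y)             /* cmpq %rsi, %rdi ; cmovle %rdx, %rax */
        result = eval;
    return result;
}

/* Here the two arms cannot both be evaluated first: reading *p when p is
 * NULL would crash, so the check has to guard the load. */
long read_or_zero(const long *p) {
    if (p == NULL)
        return 0;
    return *p;
}
```

In absdiff_cmov both subtractions are cheap and harmless, so evaluating them eagerly costs little. In read_or_zero the load is only safe on one path, which is why a compiler cannot simply compute both values and pick one.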

switch statements and loops

Finally, let's look at switch statements and loops. Here is an example switch statement:

long switch_eg(long x, long y, long z)
{
    long w = 1;
    switch(x) {
        // case statements...
    }
    return w;
}

The machine code is as follows:

switch_eg:
    movq %rdx, %rcx
    cmpq $6, %rdi
    ja .L8
    jmp *.L4(,%rdi,8)

In the assembly code above we can see two kinds of jump: the direct jump ja .L8 and the indirect jump jmp *.L4(,%rdi,8). We are already familiar with direct jumps. The indirect jump relies on a jump table that the compiler generates, holding the address of the code block for each case of the switch statement. These addresses are stored contiguously, so the target is found by adding an offset to the table's starting address. In this example the jump table looks like this:

.section .rodata
.align 8
.L4:
    .quad .L8
    .quad .L3
    .quad .L5
    .quad .L9
    .quad .L8
    .quad .L7
    .quad .L7

In this example, the jump table starts at .L4. Since each address in the table takes 8 bytes, x (stored in %rdi) is multiplied by 8 to get the correct offset. The address stored at .L4 + x*8 is the actual jump target.

In switch_eg, the indirect jump selects the case branch that corresponds to the value of x. It is only valid for 0 ≤ x ≤ 6, because the jump table at .L4 contains only 7 target addresses (for x from 0 to 6). When x is greater than 6, the program takes the default branch by jumping directly to .L8 (this is the ja .L8 instruction). Because ja is an unsigned comparison, a negative x is treated as a very large unsigned value and also goes to .L8.

As for why .L8 corresponds to x = 0 and .L3 to x = 1, this can be worked out by comparing the machine code at each label with the original C code.
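The lookup itself can be modelled in C with an ordinary array. Since the case bodies are not shown in this article, the sketch below does not try to reproduce them; it only returns the name of the label that would be jumped to. The function name dispatch is an assumption for this example.

```c
/* The jump table from the .rodata section, with label names as strings. */
static const char *const jump_table[7] = {
    ".L8",  /* x = 0 */
    ".L3",  /* x = 1 */
    ".L5",  /* x = 2 */
    ".L9",  /* x = 3 */
    ".L8",  /* x = 4 */
    ".L7",  /* x = 5 */
    ".L7"   /* x = 6 */
};

/* Returns the label that switch_eg would jump to for a given x. */
const char *dispatch(long x) {
    if ((unsigned long)x > 6)   /* cmpq $6, %rdi ; ja .L8 (unsigned compare) */
        return ".L8";
    return jump_table[x];       /* jmp *.L4(,%rdi,8) */
}
```

The single unsigned comparison covers both x > 6 and x < 0, which is why the assembly needs only one range check before indexing the table.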

Loops

There are several ways to write a loop in C, but whether it is while, do-while, or for, the resulting machine code is much the same. Whatever the kind of loop, it consists of the same parts: initialization ("init"), a condition, a body, and an update. The same logic written with for, while, or even goto statements will usually compile to nearly identical machine code, so I won't go into more detail here.
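As a small illustration, here is the same loop written twice in C: once as a for loop and once with goto, laid out the way machine code typically arranges it (an initial test, then body, update, and a conditional jump back). The function names are invented for this sketch, and actual compiler output depends on the compiler and optimization level.

```c
/* Sum of 0 + 1 + ... + (n - 1), written as an ordinary for loop. */
long sum_for(long n) {
    long s = 0;
    for (long i = 0; i < n; i++)
        s += i;
    return s;
}

/* The same logic with goto, mirroring the shape of the machine code. */
long sum_goto(long n) {
    long s = 0;
    long i = 0;              /* init */
    if (!(i < n))            /* initial condition test, skips the loop entirely */
        goto done;
loop:
    s += i;                  /* body */
    i++;                     /* update */
    if (i < n)               /* condition: conditional jump back to the top */
        goto loop;
done:
    return s;
}
```

Both functions return the same value for every n, for example 10 when n is 5 and 0 when n is zero or negative.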