Table of contents
1. Load effective address
2. Integer operation instruction
3. Boolean instructions
4. Shift operation
4.1 Arithmetic Left Shift and Logical Left Shift
4.2 Arithmetic Right Shift and Logical Right Shift
5. Special arithmetic operations
1. Load effective address
Instruction | Effect | Description |
leaq S, D | D ← &S | load effective address |
The load effective address instruction leaq is a variant of the movq instruction. On a 64-bit system addresses are 64 bits long, so address computations use the size suffix q, and this article only uses leaq. Its destination operand must be a register.
The leaq instruction is special. Its general form is leaq (register), register, which looks like a read from memory into a register. In fact, leaq never references memory: it computes the address and stores the address itself in the destination. The following program illustrates this
int main()
{
int x = 10;
int *ptr = &x;
return 0;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 f4 0a 00 00 00 movl $0xa,-0xc(%rbp)
4004f8: 48 8d 45 f4 lea -0xc(%rbp),%rax // take the address of x and put it in %rax
4004fc: 48 89 45 f8 mov %rax,-0x8(%rbp)
400500: b8 00 00 00 00 mov $0x0,%eax
400505: 5d pop %rbp
400506: c3 retq
400507: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
40050e: 00 00
4004f8: 48 8d 45 f4 lea -0xc(%rbp),%rax
If this were a mov instruction, it would mean: take the value stored in %rbp as the base address, add the offset -0xc to form an address, fetch the data at that address, and transfer the data to register %rax
In the leaq instruction it means: take the value stored in %rbp as the base address, add the offset -0xc to form an address, and transfer this address itself to the register, which is the behavior of &x
%rbp is the frame pointer register, which holds the base address of the stack frame of the main function
Suppose the value of %rbp is 0x10000, so the computed address is 0x10000 - 0xc = 0xfff4, and the value stored at address 0xfff4 is 10
- A mov instruction transfers 10 to register %rax
- The leaq instruction transfers 0xfff4 to register %rax
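The difference between loading a value and loading an address can also be seen at the C level. The following is a minimal sketch; the function names load_value and address_only are hypothetical, and which instructions a compiler actually emits for them depends on the compiler and its flags.

```c
/* mov-like: form the address of base[1], then read memory at that address. */
long load_value(const long *base)
{
    return base[1];
}

/* lea-like: form the same address but return it without reading memory. */
const long *address_only(const long *base)
{
    return &base[1];
}
```

For an array long cells[2] = {5, 10}, load_value(cells) yields the stored value 10, while address_only(cells) yields the address of cells[1].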
The leaq instruction above performs a simple addition of a base address and an offset. In fact, leaq is not limited to address computation and is also commonly used for ordinary arithmetic, as in the following instruction
leaq 7(%rdx, %rdx, 4), %rax
Assuming that the value of register %rdx is x, this instruction sets %rax to x + 4x + 7 = 5x + 7. The compiler applies the same addressing-mode arithmetic to ordinary C expressions, as in the following code
long scale(long x, long y, long z)
{
long t = x + 4 * y + 12 * z;
return t;
}
/*
long scale(long x, long y, long z)
x in %rdi, y in %rsi, z in %rdx
*/
scale:
leaq (%rdi,%rsi,4), %rax # x + 4*y
leaq (%rdx,%rdx,2), %rdx # z + 2*z = 3*z
leaq (%rax,%rdx,4), %rax # (x+4*y) + 4*(3*z) = x + 4*y + 12*z
ret
Therefore, the leaq instruction can also perform addition and limited multiplication. Note that the scale factor in the addressing mode can only be 1, 2, 4, or 8, so leaq can only multiply by one of these values. This is why the code above cannot finish the calculation in one step with something like leaq (%rax,%rdx,12), %rax: 12 is not a valid scale factor, so the compiler first computes 3*z and then scales it by 4
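The decomposition can be written out in C to check that the three steps really add up to the original expression. This is only a sketch; scale_steps and five_x_plus_7 are hypothetical names, and each line is restricted to the form base + index * {1, 2, 4, 8} + constant that a single leaq can compute.

```c
/* Mirrors the three leaq steps of scale. */
long scale_steps(long x, long y, long z)
{
    long t = x + y * 4;   /* leaq (%rdi,%rsi,4), %rax : x + 4*y */
    long z3 = z + z * 2;  /* leaq (%rdx,%rdx,2), %rdx : 3*z     */
    return t + z3 * 4;    /* leaq (%rax,%rdx,4), %rax : + 12*z  */
}

/* Mirrors leaq 7(%rdx,%rdx,4), %rax : x + 4*x + 7 = 5*x + 7. */
long five_x_plus_7(long x)
{
    return x + x * 4 + 7;
}
```

For x = 1, y = 2, z = 3, scale_steps returns 1 + 8 + 36 = 45, the same as x + 4*y + 12*z.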
2. Integer operation instruction
Instruction | Effect | Description |
INC D | D ← D + 1 | increment |
DEC D | D ← D - 1 | decrement |
NEG D | D ← -D | negate |
NOT D | D ← ~D | complement |
ADD S, D | D ← D + S | add |
SUB S, D | D ← D - S | subtract |
IMUL S, D | D ← D * S | multiply |
Each of these operations comes in four variants selected by an operand-size suffix. For example, ADD has the forms addb (byte), addw (word), addl (double word), and addq (quad word)
The first four instructions inc, dec, neg and not have a single operand that serves as both source and destination, so they are called unary operations. The operand can be a register or a memory location
The last three instructions add, sub, and imul have two operands, and the second operand serves as both a source and the destination, so they are called binary operations
2.1 INC and DEC
The INC (Increment) instruction adds 1 to the operand, and the DEC (Decrement) instruction subtracts 1 from the operand. Neither affects CF
Instruction format
- inc reg/mem
- dec reg/mem
Use the code below to see what the inc and dec instructions do
#include <stdio.h>
int main() {
int x = 10;
// printf("The value of x before the increment: %d\n", x); // 10
__asm__ ( "inc %0\n" : "=r" (x) : "0" (x) );
// printf("The value of x after the increment: %d\n", x); // 11
// printf("The value of x before the decrement: %d\n", x); // 11
__asm__ ( "dec %0\n" : "=r" (x) : "0" (x) );
// printf("The value of x after the decrement: %d\n", x); // 10
return 0;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp)
4004f8: 8b 45 fc mov -0x4(%rbp),%eax
4004fb: ff c0 inc %eax // add 1 to the value in %eax
4004fd: 89 45 fc mov %eax,-0x4(%rbp)
400500: 8b 45 fc mov -0x4(%rbp),%eax
400503: ff c8 dec %eax // subtract 1 from the value in %eax
400505: 89 45 fc mov %eax,-0x4(%rbp)
400508: b8 00 00 00 00 mov $0x0,%eax
40050d: 5d pop %rbp
40050e: c3 retq
40050f: 90 nop
2.2 NEG
NEG (negate): replaces the operand with its two's complement negation, that is, its opposite number. The affected flags are
Carry flag CF, zero flag ZF, sign flag SF, overflow flag OF, auxiliary carry flag AF and parity flag PF (set when the low 8 bits of the result contain an even number of 1 bits).
Instruction format
- neg reg
- neg mem
Use the code below to see what the neg instruction does
int main()
{
int i = 10;
i = -i;
return 0;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp)
4004f8: f7 5d fc negl -0x4(%rbp) // i = -i;
4004fb: b8 00 00 00 00 mov $0x0,%eax
400500: 5d pop %rbp
400501: c3 retq
400502: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
400509: 00 00 00
40050c: 0f 1f 40 00 nopl 0x0(%rax)
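Two's complement negation is the same as inverting every bit and adding 1. The following sketch checks that identity on 32-bit values; neg32 is a hypothetical helper, and unsigned arithmetic is used so that the wraparound is well defined in C.

```c
#include <stdint.h>

/* Two's complement negation: invert all bits, then add 1 (modulo 2^32). */
uint32_t neg32(uint32_t x)
{
    return ~x + 1u;
}
```

neg32(10) gives 0xFFFFFFF6, which is the bit pattern of -10 as a 32-bit int. Zero and 0x80000000 are the two values that negate to themselves.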
2.3 ADD, SUB and IMUL
ADD (addition): adds the source operand to the destination operand of the same size and stores the sum in the destination
SUB (subtraction): subtracts the source operand from the destination operand of the same size and stores the difference in the destination
IMUL (multiplication): multiplies the destination operand by the source operand of the same size and stores the product in the destination
Instruction format
- add source operand, destination operand
- sub source operand, destination operand
- imul source operand, destination operand
The source operand can be an immediate value, a register, or a memory location
The destination operand can be a register or a memory location, but the two operands cannot both be memory locations, and the destination of the two-operand imul must be a register
Use the code below to see what the add, sub, and imul instructions do
int main()
{
int a = 10;
int b = a + 10;
int c = b - 15;
a = a * b;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp) // int a = 10;
4004f8: 8b 45 fc mov -0x4(%rbp),%eax
4004fb: 83 c0 0a add $0xa,%eax // int b = a + 10;
4004fe: 89 45 f8 mov %eax,-0x8(%rbp)
400501: 8b 45 f8 mov -0x8(%rbp),%eax
400504: 83 e8 0f sub $0xf,%eax // int c = b - 15;
400507: 89 45 f4 mov %eax,-0xc(%rbp)
40050a: 8b 45 fc mov -0x4(%rbp),%eax
40050d: 0f af 45 f8 imul -0x8(%rbp),%eax // a = a * b;
400511: 89 45 fc mov %eax,-0x4(%rbp)
400514: 5d pop %rbp
400515: c3 retq
400516: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
40051d: 00 00 00
3. Boolean instructions
C provides the bitwise operators &, |, ^ and ~
They correspond to the following instructions
Instruction | Effect | Description |
AND S, D | D ← D & S | bitwise AND |
OR S, D | D ← D | S | bitwise OR |
XOR S, D | D ← D ^ S | bitwise exclusive OR |
NOT D | D ← ~D | complement |
3.1 AND
The AND instruction performs a bitwise AND between each pair of corresponding bits of the two operands and stores the result in the destination operand
Instruction format
- AND reg/mem/imm, reg/mem
The AND instruction always clears CF and OF (CF=0, OF=0) and sets SF, ZF and PF according to the result stored in the destination operand
Refer to the following code
int main()
{
int x = 10; // 00000000 00000000 00000000 00001010
int y = x & 8; // 00000000 00000000 00000000 00001000
return 0;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp) // int x = 10;
4004f8: 8b 45 fc mov -0x4(%rbp),%eax
4004fb: 83 e0 08 and $0x8,%eax // int y = x & 8;
4004fe: 89 45 f8 mov %eax,-0x8(%rbp)
400501: b8 00 00 00 00 mov $0x0,%eax
400506: 5d pop %rbp
400507: c3 retq
400508: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40050f: 00
3.2 OR
The OR instruction performs a bitwise OR between each pair of corresponding bits of the two operands and stores the result in the destination operand
Instruction format
- OR reg/mem/imm, reg/mem
The OR instruction always clears CF and OF (CF=0, OF=0) and sets SF, ZF and PF according to the result stored in the destination operand
Refer to the following code
int main()
{
int x = 10; // 00000000 00000000 00000000 00001010
int y = x | 8; // 00000000 00000000 00000000 00001010
return 0;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp) // int x = 10;
4004f8: 8b 45 fc mov -0x4(%rbp),%eax
4004fb: 83 c8 08 or $0x8,%eax // int y = x | 8
4004fe: 89 45 f8 mov %eax,-0x8(%rbp)
400501: b8 00 00 00 00 mov $0x0,%eax
400506: 5d pop %rbp
400507: c3 retq
400508: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40050f: 00
3.3 XOR
The XOR instruction performs a bitwise exclusive OR between each pair of corresponding bits of the two operands and stores the result in the destination operand
Instruction format
- XOR reg/mem/imm, reg/mem
The XOR instruction always clears CF and OF (CF=0, OF=0) and sets SF, ZF and PF according to the result stored in the destination operand
Refer to the following code
int main()
{
int x = 10; // 00000000 00000000 00000000 00001010
int y = x ^ 8; // 00000000 00000000 00000000 00000010
return 0;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp) // int x = 10;
4004f8: 8b 45 fc mov -0x4(%rbp),%eax
4004fb: 83 f0 08 xor $0x8,%eax // int y = x ^ 8;
4004fe: 89 45 f8 mov %eax,-0x8(%rbp)
400501: b8 00 00 00 00 mov $0x0,%eax
400506: 5d pop %rbp
400507: c3 retq
400508: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40050f: 00
3.4 NOT
The NOT instruction inverts every bit of its operand
Instruction format
- NOT reg/mem
The NOT instruction does not modify any status flags
Refer to the following code
int main()
{
int x = 10; // 00000000 00000000 00000000 00001010
int y = ~x; // 11111111 11111111 11111111 11110101
return 0;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp) // int x = 10;
4004f8: 8b 45 fc mov -0x4(%rbp),%eax
4004fb: f7 d0 not %eax // ~x;
4004fd: 89 45 f8 mov %eax,-0x8(%rbp) // int y = ~x;
400500: b8 00 00 00 00 mov $0x0,%eax
400505: 5d pop %rbp
400506: c3 retq
400507: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
40050e: 00 00
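The four results used in this section can be checked with a few one-line helpers. This is a sketch only; the names and32, or32, xor32 and not32 are hypothetical, and 32-bit unsigned values are used so that the bit patterns are unambiguous.

```c
#include <stdint.h>

uint32_t and32(uint32_t d, uint32_t s) { return d & s; } /* AND S, D */
uint32_t or32(uint32_t d, uint32_t s)  { return d | s; } /* OR  S, D */
uint32_t xor32(uint32_t d, uint32_t s) { return d ^ s; } /* XOR S, D */
uint32_t not32(uint32_t d)             { return ~d; }    /* NOT D    */
```

With x = 10 (binary 1010) and the constant 8 (binary 1000): x & 8 is 8, x | 8 is 10, x ^ 8 is 2, and ~x is 0xFFFFFFF5.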
4. Shift operation
C has two shift operators, left shift (<<) and right shift (>>). At the instruction level, shifts are further divided into arithmetic and logical shifts
Instruction | Effect | Description |
SAL k, D | D ← D << k | arithmetic shift left |
SHL k, D | D ← D << k | logical shift left (equivalent to SAL) |
SAR k, D | D ← D >> k | arithmetic shift right (fills with the sign bit) |
SHR k, D | D ← D >> k | logical shift right (fills with 0) |
Shift operands
- The first operand is the shift amount k, the number of bit positions to shift
- The second operand is the value to be shifted
Note: the shift amount can be an immediate value or the single-byte register %cl (no other register is allowed)
%cl is 8 bits long and can represent 0~255, so in principle it could request a shift of up to 255 bits. No data type is that wide, so only some of the bits of %cl are used, and how many depends on the width of the data being shifted
In x86-64, a shift instruction operating on a w-bit data value takes its shift amount from the low m bits of %cl, where 2^m = w, and the high bits are ignored. (This rule follows CSAPP; the Intel manual specifies that the count is masked to 5 bits for 8-, 16- and 32-bit operands and to 6 bits for 64-bit operands, so the rule is exact for 32- and 64-bit data.)
For example, suppose %cl is 0xFF
%cl | 1111 1111 |
Then, for different data types
- char data is 8 bits long, so the low three bits of %cl (111) are used and the shift is 7 bits
- short data is 16 bits long, so the low four bits of %cl (1111) are used and the shift is 15 bits
- int data is 32 bits long, so the low five bits of %cl (11111) are used and the shift is 31 bits
- long data is 64 bits long, so the low six bits of %cl (111111) are used and the shift is 63 bits
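The masking rule can be sketched as a single bitwise AND. The helper effective_shift below is hypothetical and is meant only for 32- and 64-bit operand widths, where keeping the low m bits with 2^m = w is the same as computing cl & (w - 1).

```c
/* Shift count actually used when %cl holds cl and the operand is w bits wide.
   Assumes w is 32 or 64, so w - 1 is a mask of the low 5 or 6 bits. */
unsigned effective_shift(unsigned char cl, unsigned w)
{
    return cl & (w - 1u);
}
```

With %cl = 0xFF this gives 31 for a 32-bit operand and 63 for a 64-bit operand. A count of 33 on a 32-bit operand behaves like a count of 1.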
4.1 Arithmetic Left Shift and Logical Left Shift
SAL (Arithmetic Left Shift): shifts the destination operand left, fills the vacated low bits with 0, and places the last bit shifted out of the high end in CF
SHL (Logical Left Shift): equivalent to the SAL instruction
Instruction format
- sal imm8/CL, reg/mem
- shl imm8/CL, reg/mem
Refer to the following code
int main()
{
int x = 10; // 00000000 00000000 00000000 00001010
int y = x << 2; // 00000000 00000000 00000000 00101000
return 0;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp) // int x = 10;
4004f8: 8b 45 fc mov -0x4(%rbp),%eax
4004fb: c1 e0 02 shl $0x2,%eax // x << 2
4004fe: 89 45 f8 mov %eax,-0x8(%rbp) // int y = x << 2;
400501: b8 00 00 00 00 mov $0x0,%eax
400506: 5d pop %rbp
400507: c3 retq
400508: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40050f: 00
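The effect of a left shift on both the result and CF can be modelled in C. This is a sketch under stated assumptions: shl32 is a hypothetical helper, the carry flag is returned through an int pointer, and the count k is restricted to 1..31 so that every C shift in the function is well defined.

```c
#include <stdint.h>

/* Models shl/sal on a 32-bit value for 1 <= k <= 31.
   *cf receives the last bit shifted out of the high end,
   which is bit (32 - k) of the original value. */
uint32_t shl32(uint32_t x, unsigned k, int *cf)
{
    *cf = (int)((x >> (32u - k)) & 1u);
    return x << k;
}
```

Shifting 10 left by 2 gives 40 with CF = 0. Shifting 0x80000001 left by 1 gives 2 with CF = 1, because the top bit is the one that falls out.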
4.2 Arithmetic Right Shift and Logical Right Shift
SHR (Logical Right Shift): shifts the destination operand right, fills the vacated high bits with 0, and places the last bit shifted out of the low end in CF
Instruction format
- shr imm8/CL, reg/mem
SAR (Arithmetic Right Shift): shifts the destination operand right, fills the vacated high bits with copies of the sign bit, and places the last bit shifted out of the low end in CF
Instruction format
- sar imm8/CL, reg/mem
Refer to the following code (arithmetic right shift is the special case here, so only it is demonstrated)
int main()
{
int x1 = 10; // 00000000 00000000 00000000 00001010
int y1 = x1 >> 2; // 00000000 00000000 00000000 00000010
int x2 = -10; // 11111111 11111111 11111111 11110110
int y2 = x2 >> 2; // 11111111 11111111 11111111 11111101
return 0;
}
00000000004004ed <main>:
4004ed: 55 push %rbp
4004ee: 48 89 e5 mov %rsp,%rbp
4004f1: c7 45 fc 0a 00 00 00 movl $0xa,-0x4(%rbp)
4004f8: 8b 45 fc mov -0x4(%rbp),%eax
4004fb: c1 f8 02 sar $0x2,%eax // arithmetic right shift, fills with 0 (sign bit of 10)
4004fe: 89 45 f8 mov %eax,-0x8(%rbp)
400501: c7 45 f4 f6 ff ff ff movl $0xfffffff6,-0xc(%rbp)
400508: 8b 45 f4 mov -0xc(%rbp),%eax
40050b: c1 f8 02 sar $0x2,%eax // arithmetic right shift, fills with 1 (sign bit of -10)
40050e: 89 45 f0 mov %eax,-0x10(%rbp)
400511: b8 00 00 00 00 mov $0x0,%eax
400516: 5d pop %rbp
400517: c3 retq
400518: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
40051f: 00
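The difference between the two right shifts shows up only for negative values. The sketch below models both in portable C; sar32 and shr32 are hypothetical names, and the count k is assumed to be in the range 0..31. Right-shifting a negative signed value is implementation-defined in C, so sar32 shifts the bitwise complement instead, which is never negative.

```c
#include <stdint.h>

/* Arithmetic right shift: vacated high bits are copies of the sign bit. */
int32_t sar32(int32_t x, unsigned k)
{
    return x < 0 ? ~(~x >> k) : x >> k;
}

/* Logical right shift: vacated high bits are filled with 0. */
uint32_t shr32(uint32_t x, unsigned k)
{
    return x >> k;
}
```

For -10 (0xFFFFFFF6) shifted right by 2, sar32 returns -3 (0xFFFFFFFD), while shr32 on the same bit pattern returns 0x3FFFFFFD.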
Because shifting a two's complement number by k bits scales it by a power of 2, shifts can stand in for some integer multiplications and divisions. Consider the following arith function
long arith(long x, long y, long z)
{
long t1 = x ^ y;
long t2 = z * 48;
long t3 = t1 & 0x0F0F0F0F;
long t4 = t2 - t3;
return t4;
}
The corresponding assembly is
/*
long arith(long x, long y, long z)
x in %rdi, y in %rsi, z in %rdx
*/
arith:
xorq %rsi, %rdi # t1 = x ^ y
leaq (%rdx,%rdx,2), %rax # 3*z
salq $4, %rax # t2 = 16 * (3*z) = 48*z
andl $252645135, %edi # t3 = t1 & 0x0F0F0F0F
subq %rdi, %rax # return t2 - t3
ret
Here the compiler uses leaq followed by salq $4, %rax instead of a multiplication instruction to compute 48*z, which is typically cheaper than a multiply
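The same two-step multiplication can be written in C to confirm that it equals 48*z. This is a sketch; mul48_by_shift is a hypothetical name, and unsigned long is used because left-shifting a negative signed value is not well defined in C, while unsigned arithmetic wraps modulo 2^64 just as the registers do.

```c
/* 48*z computed as (z + 2*z) << 4, mirroring the leaq and salq steps. */
unsigned long mul48_by_shift(unsigned long z)
{
    unsigned long t = z + (z << 1); /* leaq (%rdx,%rdx,2), %rax : 3*z   */
    return t << 4;                  /* salq $4, %rax            : 16*3*z */
}
```

For z = 5 the function returns 240, the same as 48 * 5.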
5. Special arithmetic operations
The product of two 64-bit signed or unsigned numbers requires 128 bits to represent. The x86-64 instruction set provides limited support for 128-bit operations. Intel refers to a 16-byte quantity as an oct word.
The following table lists the instructions that generate the full 128-bit product of two 64-bit numbers and the integer division instructions
Instruction | Effect | Description |
imulq S | R[%rdx]:R[%rax] ← S × R[%rax] | signed full multiplication |
mulq S | R[%rdx]:R[%rax] ← S × R[%rax] | unsigned full multiplication |
cqto | R[%rdx]:R[%rax] ← SignExtend(R[%rax]) | convert to oct word |
idivq S | R[%rdx] ← R[%rdx]:R[%rax] mod S; R[%rax] ← R[%rdx]:R[%rax] ÷ S | signed division |
divq S | R[%rdx] ← R[%rdx]:R[%rax] mod S; R[%rax] ← R[%rdx]:R[%rax] ÷ S | unsigned division |
The two registers %rdx (64-bit) and %rax (64-bit) together form a 128-bit oct word, with %rdx holding the high 64 bits and %rax the low 64 bits. mulq sets CF and OF when the high half of the product is nonzero and clears them otherwise; imulq sets them when the high half is not simply the sign extension of the low half
Unsigned multiplication (mulq) and signed multiplication (imulq) are both available as single-operand instructions. One factor must be placed in register %rax, the other is given as the source operand of the instruction, and the product is placed in registers %rdx and %rax
%rdx (high 64 bits) | %rax (low 64 bits) |
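The same %rdx:%rax pair serves as the 128-bit dividend for idivq and divq, with the quotient left in %rax and the remainder in %rdx. At the C level this corresponds to computing a quotient and a remainder together, as in the sketch below. The function name remdiv is illustrative, and the claim that a compiler lowers it to cqto followed by a single idivq is an assumption about typical x86-64 code generation, not something this code can show.

```c
/* Quotient and remainder of signed 64-bit division.
   On x86-64 a compiler would typically sign-extend x into %rdx:%rax
   with cqto and issue one idivq, reading the quotient from %rax and
   the remainder from %rdx. */
void remdiv(long x, long y, long *qp, long *rp)
{
    *qp = x / y;
    *rp = x % y;
}
```

C division truncates toward zero, so remdiv(-7, 2, &q, &r) stores q = -3 and r = -1.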
The following is an example of unsigned full multiplication; see section 3.5.5 of the CSAPP book for details
#include <inttypes.h>
typedef unsigned __int128 uint128_t;
void store_uprod(uint128_t *dest, uint64_t x, uint64_t y)
{
*dest = x * (uint128_t)y;
}
/*
void store_uprod(uint128_t *dest, uint64_t x, uint64_t y)
dest in %rdi, x in %rsi, y in %rdx
*/
store_uprod:
movq %rsi, %rax # copy x to multiplicand
mulq %rdx # multiply by y
movq %rax, (%rdi) # store lower 8 bytes at dest
movq %rdx, 8(%rdi) # store upper 8 bytes at dest+8
ret
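The two halves that mulq leaves in %rax and %rdx can be inspected from C by splitting the 128-bit product. This is a sketch that assumes a compiler supporting the unsigned __int128 extension (as GCC and Clang do on 64-bit targets); product_low and product_high are hypothetical names.

```c
#include <stdint.h>

typedef unsigned __int128 uint128_t;

/* Low 64 bits of x * y, the half that mulq leaves in %rax. */
uint64_t product_low(uint64_t x, uint64_t y)
{
    return (uint64_t)((uint128_t)x * y);
}

/* High 64 bits of x * y, the half that mulq leaves in %rdx. */
uint64_t product_high(uint64_t x, uint64_t y)
{
    return (uint64_t)(((uint128_t)x * y) >> 64);
}
```

Multiplying 0xFFFFFFFFFFFFFFFF by 2 gives 2^65 - 2, so the high half is 1 and the low half is 0xFFFFFFFFFFFFFFFE. For small factors such as 3 and 4 the high half is 0.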