ARM64 reverse engineering basics

Why learn ARM64?

Android 5.0 introduced the arm64-v8a ABI, which targets the AArch64 execution state; this is the ARM64 assembly we want to learn. Android has since reached version 11, and mainstream APKs now support the AArch64 architecture. Whether we use IDA (a disassembly tool) for static reverse analysis of .so files or for dynamic debugging of them, we have to deal with ARM64 assembly code, so learning to read ARM64 disassembly gets twice the result with half the effort.

What registers does ARM64 assembly use?

• There are 34 registers: 31 general-purpose registers, plus the SP, PC, and CPSR registers.
• The 31 general-purpose registers have two views:
X0-X30: the full 64-bit register.
W0-W30: the lower 32 bits of the same register.
• XZR/WZR (register number 31): the zero register. It reads as 0 and discards writes, so it is generally used for variable initialization. In some instructions the same register number means SP instead. It has two forms:
XZR: the 64-bit zero register; storing it writes 8 zero bytes to memory.
WZR: the 32-bit zero register; storing it writes 4 zero bytes to memory.
• SP: the stack pointer (top-of-stack pointer), accessed as SP or WSP. Local variables are addressed relative to it.
• PC: the program counter. It holds the address of the instruction being executed and cannot be used as an ordinary operand in AArch64.
• CPSR: the status register. AArch64 calls this state PSTATE and exposes the N, Z, C, V flags through the NZCV register.
• FP (X29): the frame pointer, which holds the stack frame address (stack bottom pointer).
• LR (X30): the link register, which holds the address of the next instruction to execute after the subroutine returns.

What is the purpose of general registers?


• When an instruction uses X0-X30, it operates on 64-bit data.
• When an instruction uses W0-W30, it operates on 32-bit data: a read sees only the lower 32 bits, and a write to Wn clears the upper 32 bits of Xn.
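The relationship between the X and W views can be sketched in Python. This is only a toy model, not an emulator; the class name Reg and its properties are hypothetical names chosen for illustration.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


class Reg:
    """One general-purpose register with an X (64-bit) and a W (32-bit) view."""

    def __init__(self):
        self.bits = 0

    @property
    def x(self):
        return self.bits

    @x.setter
    def x(self, value):
        self.bits = value & MASK64

    @property
    def w(self):
        # Reading Wn returns the lower 32 bits of Xn.
        return self.bits & MASK32

    @w.setter
    def w(self, value):
        # Writing Wn zero-extends: the upper 32 bits of Xn become 0.
        self.bits = value & MASK32


r0 = Reg()
r0.x = 0x1122334455667788
low_half = r0.w            # lower 32 bits of the 64-bit value
r0.w = 0xAABBCCDD
after_w_write = r0.x       # upper half has been cleared by the W write
```

When reading disassembly, this is why a function that works on int values uses W registers throughout, while pointers and long values appear in X registers.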


What are the stack registers and how are they used?

1. Stack structure and function

The stack is a storage area with last-in, first-out access. It grows from high addresses toward low addresses: the bottom of the stack is at the high address, and the top of the stack is at the low address.
Its main function is to store local (temporary) variables, saved registers, and parameters that do not fit in registers.

2. Which registers describe the stack?

SP: stack top register (stack pointer)
FP: stack bottom register (frame pointer, X29)

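3. Which instructions operate on the stack?

STP instruction: stores a pair of registers to memory; with SP as the base it acts as the push instruction.
LDP instruction: loads a pair of registers from memory; with SP as the base it acts as the pop instruction.

Push operation: a function prologue typically begins with an instruction such as STP X29, X30, [SP, #-16]!, which lowers SP by 16 bytes and then saves FP and LR at the new top of the stack.

Pop operation: the matching epilogue typically ends with LDP X29, X30, [SP], #16, which reloads FP and LR and then raises SP by 16 bytes.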
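The push and pop behaviour of STP and LDP can be modelled with a small byte array. This is a minimal sketch: the Stack class, its 64-byte size, and the values 0x1000 and 0x2000 standing in for FP and LR are all assumptions made for illustration.

```python
import struct


class Stack:
    """Toy descending stack: SP moves down on push and up on pop."""

    def __init__(self, size=64):
        self.mem = bytearray(size)
        self.sp = size               # empty stack: SP sits at the highest address

    def stp_pre(self, a, b, offset):
        # Models STP a, b, [SP, #offset]! : adjust SP first, then store the pair.
        self.sp += offset
        struct.pack_into("<QQ", self.mem, self.sp, a, b)

    def ldp_post(self, offset):
        # Models LDP a, b, [SP], #offset : load the pair, then adjust SP.
        a, b = struct.unpack_from("<QQ", self.mem, self.sp)
        self.sp += offset
        return a, b


stack = Stack()
stack.stp_pre(0x1000, 0x2000, -16)   # push stand-ins for FP and LR
sp_after_push = stack.sp
restored = stack.ldp_post(16)        # pop them again
sp_after_pop = stack.sp
```

The pre-indexed store and the post-indexed load are mirror images, which is why SP returns to its starting value once every push has a matching pop.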

What flags does the status register hold?

In the ARM64 instruction set, some instructions update the status register when they execute. Most of them are arithmetic or logical instructions, and in AArch64 only the flag-setting forms do so: ADDS, SUBS, ANDS, and the comparison aliases CMP, CMN, and TST. Plain ADD and SUB leave the flags unchanged.

The lower 8 bits of the CPSR (I, F, T, and M[4:0]) are called the control bits. A program cannot modify them unless the CPU is running in a privileged mode. This layout is the AArch32 CPSR; AArch64 keeps the same four flags in bits 31-28 of the NZCV register. N, Z, C, and V are the condition flags. Arithmetic and logical operations change them, and conditional instructions such as B.EQ or CSEL test them to decide what to do.

1. N (Negative) flag

Bit 31 of the CPSR is N, the sign flag. It records whether the result of the instruction is negative: N=1 if the result is negative, N=0 if it is non-negative.

2. Z (Zero) flag

Bit 30 of the CPSR is Z, the zero flag. It records whether the result of the instruction is 0: Z=1 if the result is 0, Z=0 if it is not.

3. C (Carry) flag

Bit 29 of the CPSR is C, the carry flag. It is mainly meaningful for unsigned arithmetic.
Addition (ADDS instruction): C=1 when the result produces a carry (unsigned overflow), otherwise C=0.
Subtraction (SUBS instruction): C=0 when a borrow occurs (the unsigned result would be below zero), otherwise C=1.

4. V (Overflow) flag

Bit 28 of the CPSR is V, the overflow flag. In signed arithmetic, a result outside the range the register can represent is called an overflow.
Overflow occurs in the following situations:
Positive + positive gives a negative result: overflow.
Negative + negative gives a non-negative result: overflow.
Positive + negative can never overflow.
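The four flag rules above can be checked with a short Python sketch of 32-bit flag-setting addition and subtraction. The function names adds32 and subs32 are hypothetical; they model only the result and the N, Z, C, V flags, not a real CPU.

```python
MASK32 = 0xFFFFFFFF


def _flags(a, b_eff, carry_in):
    """Add a + b_eff + carry_in in 32 bits and derive (result, N, Z, C, V)."""
    full = a + b_eff + carry_in
    result = full & MASK32
    n = result >> 31                      # sign bit of the result
    z = int(result == 0)
    c = int(full > MASK32)                # carry out of bit 31
    # Signed overflow: both inputs share a sign that the result does not have.
    v = (((a ^ result) & (b_eff ^ result)) >> 31) & 1
    return result, n, z, c, v


def adds32(a, b):
    """Model of a 32-bit ADDS: a + b."""
    return _flags(a & MASK32, b & MASK32, 0)


def subs32(a, b):
    """Model of a 32-bit SUBS: a - b, computed as a + NOT(b) + 1."""
    return _flags(a & MASK32, ~b & MASK32, 1)


signed_overflow = adds32(0x7FFFFFFF, 1)   # positive + positive -> negative
unsigned_carry = adds32(0xFFFFFFFF, 1)    # wraps to 0 with a carry out
equal_compare = subs32(5, 5)              # what CMP does for equal values
borrow = subs32(3, 5)                     # borrow occurs, so C is cleared
```

Computing subtraction as a + NOT(b) + 1 is what makes the carry flag look inverted: no borrow produces a carry out (C=1), while a borrow produces none (C=0).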

ARM64-specific assembly instructions

• ADRP instruction (address of page)

ADRP forms a PC-relative address at 4KB page granularity: it clears the low 12 bits of the current PC, adds a signed page offset encoded in the instruction, and writes the resulting page base address to the destination register.
It is normally paired with an ADD that supplies the low 12 bits, for example ADRP X6, _g@PAGE followed by ADD X6, X6, _g@PAGEOFF.

ADRP obtains the base address of the 4KB page that contains the global variable g and stores it in register X6.
The ADD instruction then computes the address of g as X6 + _g@PAGEOFF, where _g@PAGEOFF is the offset of g within its page. After the ADD, X6 holds the address of g.
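The ADRP and ADD arithmetic can be worked through in Python. This is a sketch under assumptions: the addresses 0x401234 (where the ADRP sits) and 0x415678 (where g lives) are invented for the example, and the helper names are hypothetical.

```python
PAGE_MASK = 0xFFF   # low 12 bits select a byte inside a 4KB page


def adrp(pc, target):
    """Return (page_base, page_delta) as ADRP would compute them.

    page_delta is the signed page count the assembler encodes in the
    instruction; page_base is what ends up in the destination register.
    """
    pc_page = pc & ~PAGE_MASK
    page_delta = ((target & ~PAGE_MASK) - pc_page) >> 12
    return pc_page + (page_delta << 12), page_delta


def page_offset(target):
    """The part of the address supplied by the following ADD."""
    return target & PAGE_MASK


pc = 0x401234                        # assumed address of the ADRP instruction
g_addr = 0x415678                    # assumed address of the global variable g
x6, delta = adrp(pc, g_addr)         # ADRP X6, page of g
page_base = x6
x6 = x6 + page_offset(g_addr)        # ADD X6, X6, offset of g within the page
```

In a disassembler the pair usually appears back to back, so spotting an ADRP followed by an ADD or a load with a small offset is a reliable sign of a global variable or string reference.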

• Memory read and write instructions (LDR, LDUR, LDP, STR, STUR, STP)

STR, STP, and STUR are instructions for storing data (note: those beginning with ST are store instructions).
LDR, LDP, and LDUR are instructions for loading data (note: those beginning with LD are load instructions).
The individual instructions are:
STR instruction: reads the data from a register and stores it to memory.
STUR instruction: the same as STR but with an unscaled signed byte offset (-256 to 255), so it is the form you see with negative offsets such as [X29, #-4].
STP instruction: stores a pair of registers to memory; commonly used to push onto the stack.

LDR instruction: reads the data from memory and loads it into a register.
LDUR instruction: the same as LDR but with an unscaled signed byte offset, again typically seen with negative offsets.
LDP instruction: loads a pair of registers from memory; commonly used to pop from the stack.
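Why a disassembler shows LDR for some offsets and LDUR for others comes down to which immediate can encode the offset. The sketch below covers only the plain base-plus-immediate forms; the function name pick_form is hypothetical, and pre-indexed and post-indexed forms are left out.

```python
def pick_form(offset, size=8):
    """Say which base-plus-immediate form can encode [base, #offset].

    size is the access width in bytes (8 for an X register, 4 for a W register).
    """
    # LDR/STR: unsigned 12-bit immediate, scaled by the access size.
    if offset >= 0 and offset % size == 0 and offset // size <= 4095:
        return "LDR/STR"
    # LDUR/STUR: signed 9-bit byte offset, not scaled.
    if -256 <= offset <= 255:
        return "LDUR/STUR"
    return "needs a separate address calculation"


positive_aligned = pick_form(16)          # typical [SP, #16]
negative_local = pick_form(-4, size=4)    # typical [X29, #-4]
unaligned = pick_form(4, size=8)          # not a multiple of 8
too_far = pick_form(-512)                 # outside both ranges
```

This matches what shows up in practice: locals addressed below the frame pointer appear as LDUR and STUR, while locals addressed above SP appear as LDR and STR.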

What should you pay attention to in ARM64 functions?

1. What is the function calling convention?
ARM64 uses AAPCS64 (Procedure Call Standard for the Arm 64-bit Architecture). ATPCS (ARM-Thumb Procedure Call Standard) is the older convention for 32-bit ARM.

2. Which instructions do functions use?
B: unconditional branch. Together with its conditional form B.cond, it implements the jumps for if and switch statements inside a function.
BL: branch with link. It stores the address of the following instruction in X30 and then jumps, so it is generally used to call other functions.
RET: subroutine return instruction. It branches to the return address, which by default is held in the X30 register (LR link register).
LR: holds the address of the next instruction to execute after the subroutine ends.
PC: holds the address of the instruction currently being executed.

3. How are function parameters stored and passed?
3.1. Normally, the first 8 parameters of a function are passed in the registers X0-X7 (W0-W7 for 32-bit values). If a function has more than 8 parameters, the rest are passed on the stack.
3.2. With 8 or fewer parameters, the parameters are assigned to the registers from left to right. With more than 8, the extra parameters are laid out on the stack as if pushed from right to left, so the ninth parameter ends up at the lowest address.
3.3. Eight parameters versus nine parameters (functions with more than 8 parameters are uncommon in development, so they are also uncommon when reversing).
3.3.1 With eight int parameters, all of them are passed in registers W0-W7 and no stack slot is needed for the arguments.
3.3.2 With nine int parameters, the first eight still use W0-W7, and the caller stores the ninth in its stack area (at [SP] at the moment of the call), where the callee reads it.
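The register and stack assignment for integer parameters can be sketched as follows. This assumes the standard AAPCS64 layout used on Android and Linux, where each stack argument takes an 8-byte slot, and it ignores floating-point and structure arguments; the function name place_int_args is hypothetical.

```python
def place_int_args(args):
    """Map integer arguments to X0-X7, then to 8-byte stack slots above SP."""
    placement = []
    for index, value in enumerate(args):
        if index < 8:
            # The first eight arguments go in registers, left to right.
            placement.append((f"X{index}", value))
        else:
            # Later arguments go to the caller's stack area, lowest address first.
            placement.append((f"[SP, #{(index - 8) * 8}]", value))
    return placement


eight = place_int_args([1, 2, 3, 4, 5, 6, 7, 8])
nine = place_int_args([1, 2, 3, 4, 5, 6, 7, 8, 9])
ten = place_int_args(list(range(1, 11)))
```

When reversing, a store to [SP] or [SP, #8] immediately before a BL is therefore a strong hint that the called function takes more than eight parameters.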

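4. How is the stack balanced in a function?
The callee restores the balance itself: whatever its prologue subtracts from SP (for example SUB SP, SP, #0x20), its epilogue adds back (ADD SP, SP, #0x20) before RET, so SP on return equals SP on entry.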


5. How is the return value of a function received?
The return value of a function is normally placed in the X0 register (W0 for a 32-bit value). RET does not touch X0; it branches to the address held in the X30 register (the LR link register), and the caller then reads the result from X0.
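The interplay of BL, RET, X30, and X0 can be traced with a toy model. The Cpu class and the addresses 0x1000 and 0x2000 are assumptions made for illustration; only the three registers involved are modelled.

```python
class Cpu:
    """Minimal model of PC and X0-X30 for following a call and return."""

    def __init__(self):
        self.pc = 0
        self.x = [0] * 31            # X0 .. X30

    def bl(self, target):
        # BL saves the address of the next instruction in X30 (LR), then jumps.
        self.x[30] = self.pc + 4
        self.pc = target

    def ret(self):
        # RET branches to the address in X30 and leaves X0 alone.
        self.pc = self.x[30]


cpu = Cpu()
cpu.pc = 0x1000                      # assumed address of the BL in the caller
cpu.bl(0x2000)                       # call a function assumed to live at 0x2000
pc_in_callee = cpu.pc
cpu.x[0] = 42                        # callee leaves its return value in X0
cpu.ret()
pc_after_return = cpu.pc
return_value = cpu.x[0]
```

Because a nested BL overwrites X30, any function that calls another function has to save LR first, which is the reason for the STP X29, X30 pair at the start of most non-leaf functions.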

 


Origin blog.csdn.net/u011426115/article/details/112352721