binutils toolset - usage of objdump

The following content comes from the study and arrangement of network resources. If there is any infringement, please inform and delete it.

1. Tool Introduction

objdump is mainly used to display the contents of object files, including linked ELF executables.

In this article, "display" mostly means disassembling the binary contents into assembly code, so here "display" is roughly equivalent to "disassemble".

Reasons to disassemble with objdump:

(1) Reverse engineering. Disassemble an executable to obtain its assembly code, then infer the logic of the whole program from it. This is hard to do: reading a large amount of assembly is difficult enough, let alone reconstructing the logic of someone else's code.

(2) Debugging. Disassembly helps us check whether the generated executable is what we expect, especially when working with concepts such as linker scripts and link addresses.

(3) Learning. Disassembling an executable built from C source code shows how C statements map to assembly instructions, which helps us understand C in depth.
 

2. The meaning of the options

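According to the man page, the command format is as follows. Note that objdump requires at least one of its main options (such as -d, -h or -t) to be given.

objdump [options] obj_file    # [] means optional; obj_file is the object file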

The objdump tool supports many options, and only some commonly used options are listed here. 

Option    Description

-d, --disassemble
Disassemble only the sections that are expected to contain instructions (that is, the machine code that will be executed).

-D, --disassemble-all
Disassemble all sections, not only those expected to contain instructions.

-m machine, --architecture=machine
Specify the architecture to use when disassembling object files. This is useful when the file to be disassembled does not describe its own architecture. The architectures that can be specified can be listed with the -i option.

-S, --source
Display source code intermixed with the disassembly where possible (implies -d). This needs debugging information, so the program must be compiled with -g for the source lines to appear.

-s, --full-contents
Display the full contents of the requested sections as a hexadecimal dump.

-i, --info
Display a list of the architectures and object formats available for the -b or -m option.

-l, --line-numbers (lowercase L)
Label the disassembly with the source file name and line numbers. Like -S, this needs debugging information.

-j section, --section=section
Display information only for the specified section.

-a, --archive-headers
For archive files, display the archive header information, including the object file format of each member.

-f, --file-headers
Display summary information from the overall file header.

-h, --section-headers
Display summary information from the header of each section (a section overview).

-x, --all-headers
Display all available header information, including the symbol table and relocation entries.

-C, --demangle
Decode (demangle) low-level symbol names, such as C++ symbol names, into readable names.

-t, --syms
Print the symbol table of the file (variable names and function names are symbols).
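As a rough illustration of how these options combine in practice, the sketch below builds a few typical command lines in Python so they can be scripted. The helper names `objdump_cmd` and `run_if_available` are hypothetical, the tool name `arm-linux-objdump` is taken from the Makefile later in this article, and `led.elf` is assumed to be the file being inspected.

```python
import os
import shutil
import subprocess

# Assumed tool name, taken from the article's Makefile.
OBJDUMP = "arm-linux-objdump"


def objdump_cmd(path, *options):
    """Build the argument list for one objdump run (hypothetical helper)."""
    return [OBJDUMP, *options, path]


# Typical combinations of the options listed above.
COMMANDS = {
    "code sections only": objdump_cmd("led.elf", "-d"),
    "all sections": objdump_cmd("led.elf", "-D"),
    "section headers": objdump_cmd("led.elf", "-h"),
    "symbol table": objdump_cmd("led.elf", "-t"),
    "only .text, with source": objdump_cmd("led.elf", "-S", "-j", ".text"),
}


def run_if_available(cmd):
    """Run one command only when the tool and the input file both exist."""
    if shutil.which(cmd[0]) is None or not os.path.exists(cmd[-1]):
        return None
    return subprocess.run(cmd, capture_output=True, text=True).stdout


for label, cmd in COMMANDS.items():
    print(f"{label}: {' '.join(cmd)}")
```

Keeping the command construction separate from the execution makes it easy to print or log the exact command line before running it.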

3. Interpreting the disassembly of an application

For example, the LED lighting project on the S5PV210 development board consists of a Makefile and a start.S file.

1. Makefile

The content of the Makefile is as follows. It first compiles the .S and .c source files into .o files, and then links the .o files into an executable in ELF format. The line "arm-linux-objdump -D led.elf > led_elf.dis" disassembles led.elf and writes the result to led_elf.dis.

led.bin: start.o 
	arm-linux-ld -Ttext 0x0 -o led.elf $^ # Link: link the .o files into led.elf
	arm-linux-objcopy -O binary led.elf led.bin # Copy and convert: convert the ELF file to raw .bin format
	arm-linux-objdump -D led.elf > led_elf.dis # Disassemble: disassemble led.elf into led_elf.dis
	gcc mkv210_image.c -o mkx210
	./mkx210 led.bin 210.bin
	
%.o : %.S
	arm-linux-gcc -o $@ $< -c

%.o : %.c
	arm-linux-gcc -o $@ $< -c 

clean:
	rm *.o *.elf *.bin *.dis mkx210 -f

2. Source code start.S file 

The content of the start.S file is as follows. It consists of a start section, the LED blinking code, a delay routine and an infinite loop. We do not care about the specific functionality here; the point is to compare the source with the file produced by disassembly.

// .globl gives the following symbol global visibility, similar to a global name in C
.globl _start

_start:
	// Set GPJ0CON bits [0:15] to configure pins GPJ0_0/1/2/3 as outputs
	// Set GPJ0CON bits [12:23] to configure pins GPJ0_3/4/5 as outputs
	ldr r1, =0xE0200240 					
	ldr r0, =0x00111000
	str r0, [r1]

	mov r2, #0x1000

	// Set GPD0_1 to output mode
	ldr r1, =0xE02000A0 					
	ldr r0, =0x00000010
	str r0, [r1]	
	
led_blink:
	// Set GPJ2DAT bits [0:3] so that pins GPJ2_0/1/2/3 output a low level, LED on
	// Set GPJ0DAT bits [3:5] so that pins GPJ0_3/4/5 output a low level, LED on
	ldr r1, =0xE0200244 					
	mov r0, #0
	str r0, [r1]
	
	ldr r1, =0xE02000A4					
	mov r0, #0
	str r0, [r1]

	// Delay
	bl delay							

	// Set GPJ2DAT bits [0:3] so that pins GPJ2_0/1/2/3 output a high level, LED off
	// Set GPJ0DAT bits [3:5] so that pins GPJ0_3/4/5 output a high level, LED off
	ldr r1, =0xE0200244 					
	mov r0, #0x38
	str r0, [r1]
	
	ldr r1, =0xE02000A4					
	mov r0, #0x2
	str r0, [r1]

	// Delay
	bl delay	

	sub r2, r2, #1
	cmp r2,#0
	bne led_blink

halt:   
	b halt

delay:
	mov r0, #0x900000
delay_loop:
	cmp r0, #0
	sub r0, r0, #1
	bne delay_loop
	mov pc, lr

3. Contents of the disassembly file led_elf.dis

After running make, the resulting disassembly file led_elf.dis has the content shown below. Three points are worth noting.

(1) The first line says that this listing was produced by disassembling led.elf, and that the file format is 32-bit little-endian ARM ELF (elf32-littlearm).

(2) In 00000000 <_start>, <_start> is a label corresponding to the _start label in start.S, and 00000000 is the address of that label. A label plays the same role as a function name in C: in C, a function name also stands for the start address of the function, which can be confirmed here. The labels in the disassembly come from the assembly source, so they make it easy to match parts of the disassembly with the corresponding parts of the source file.

(3) Each line of the disassembly has three columns: the instruction address, the machine code of the instruction, and the instruction obtained by disassembling that machine code.

led.elf:     file format elf32-littlearm  // first line

Disassembly of section .text:
// column 1  column 2   column 3
00000000 <_start>:
   0:	e59f1070 	ldr	r1, [pc, #112]	; 78 <delay_loop+0x10>
   4:	e59f0070 	ldr	r0, [pc, #112]	; 7c <delay_loop+0x14>
   8:	e5810000 	str	r0, [r1]
   c:	e3a02a01 	mov	r2, #4096	; 0x1000
  10:	e59f1068 	ldr	r1, [pc, #104]	; 80 <delay_loop+0x18>
  14:	e3a00010 	mov	r0, #16
  18:	e5810000 	str	r0, [r1]

0000001c <led_blink>:
  1c:	e59f1060 	ldr	r1, [pc, #96]	; 84 <delay_loop+0x1c>
  20:	e3a00000 	mov	r0, #0
  24:	e5810000 	str	r0, [r1]
  28:	e59f1058 	ldr	r1, [pc, #88]	; 88 <delay_loop+0x20>
  2c:	e3a00000 	mov	r0, #0
  30:	e5810000 	str	r0, [r1]
  34:	eb00000a 	bl	64 <delay>
  38:	e59f1044 	ldr	r1, [pc, #68]	; 84 <delay_loop+0x1c>
  3c:	e3a00038 	mov	r0, #56	; 0x38
  40:	e5810000 	str	r0, [r1]
  44:	e59f103c 	ldr	r1, [pc, #60]	; 88 <delay_loop+0x20>
  48:	e3a00002 	mov	r0, #2
  4c:	e5810000 	str	r0, [r1]
  50:	eb000003 	bl	64 <delay>
  54:	e2422001 	sub	r2, r2, #1
  58:	e3520000 	cmp	r2, #0
  5c:	1affffee 	bne	1c <led_blink>

00000060 <halt>:
  60:	eafffffe 	b	60 <halt>

00000064 <delay>:
  64:	e3a00609 	mov	r0, #9437184	; 0x900000

00000068 <delay_loop>:
  68:	e3500000 	cmp	r0, #0
  6c:	e2400001 	sub	r0, r0, #1
  70:	1afffffc 	bne	68 <delay_loop>
  74:	e1a0f00e 	mov	pc, lr
  78:	e0200240 	eor	r0, r0, r0, asr #4
  7c:	00111000 	andseq	r1, r1, r0
  80:	e02000a0 	eor	r0, r0, r0, lsr #1
  84:	e0200244 	eor	r0, r0, r4, asr #4
  88:	e02000a4 	eor	r0, r0, r4, lsr #1

Disassembly of section .ARM.attributes:

00000000 <.ARM.attributes>:
   0:	00001a41 	andeq	r1, r0, r1, asr #20
   4:	61656100 	cmnvs	r5, r0, lsl #2
   8:	01006962 	tsteq	r0, r2, ror #18
   c:	00000010 	andeq	r0, r0, r0, lsl r0
  10:	45543505 	ldrbmi	r3, [r4, #-1285]	; 0x505
  14:	08040600 	stmdaeq	r4, {r9, sl}
  18:	Address 0x00000018 is out of bounds.
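The three-column layout of the listing can be made concrete with a small parser. This is only a sketch under assumptions: the function name `parse_line` and the regular expression are illustrative, and they target the ARM-state format shown in this listing (one 8-digit machine word per line), not every format objdump can emit.

```python
import re

# One disassembly line: "<address>:<TAB><8 hex digits> <TAB><instruction text>"
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]{8})\s+(.*)$")


def parse_line(line):
    """Split one listing line into (address, machine_code, instruction).

    Returns None for lines that are not instruction lines, such as
    labels ("00000000 <_start>:"), section banners and blank lines.
    """
    match = LINE_RE.match(line)
    if match is None:
        return None
    address = int(match.group(1), 16)        # column 1
    machine_code = int(match.group(2), 16)   # column 2
    instruction = " ".join(match.group(3).split())  # column 3, tabs collapsed
    return address, machine_code, instruction


sample = "  34:\teb00000a \tbl\t64 <delay>"
address, machine_code, instruction = parse_line(sample)
print(hex(address), hex(machine_code), instruction)
```

Label lines and the "Disassembly of section ..." banners do not match the pattern, so a script can simply skip every line for which the function returns None.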

4. Interpretation of the disassembled led_elf.dis file

// Assembly source file
_start:
	// Set GPJ0CON bits [0:15] to configure pins GPJ0_0/1/2/3 as outputs
	// Set GPJ0CON bits [12:23] to configure pins GPJ0_3/4/5 as outputs
	ldr r1, =0xE0200240 					
	ldr r0, =0x00111000
	str r0, [r1]

	mov r2, #0x1000

// Corresponding part of the disassembly file
00000000 <_start>:
   0:	e59f1070 	ldr	r1, [pc, #112]	; 78 <delay_loop+0x10>
   4:	e59f0070 	ldr	r0, [pc, #112]	; 7c <delay_loop+0x14>
   8:	e5810000 	str	r0, [r1]
   c:	e3a02a01 	mov	r2, #4096	; 0x1000
   ......
  70:	1afffffc 	bne	68 <delay_loop>
  74:	e1a0f00e 	mov	pc, lr
  78:	e0200240 	eor	r0, r0, r0, asr #4
  7c:	00111000 	andseq	r1, r1, r0
  80:	e02000a0 	eor	r0, r0, r0, lsr #1
  84:	e0200244 	eor	r0, r0, r4, asr #4
  88:	e02000a4 	eor	r0, r0, r4, lsr #1

(1)ldr r1, [pc, #112]

This instruction corresponds to ldr r1, =0xE0200240 in the assembly file; its job is to load 0xE0200240 into register r1.

[pc, #112] refers to the data at address pc + 0x70 (#112 is decimal; 0x70 is the same value in hexadecimal). When the instruction executes, pc reads as the address two instructions ahead of the current one, that is, pc = 0 + 8, so pc + 0x70 = 0x78. The word stored at address 78 is e0200240, which is exactly the value 0xE0200240 that the assembly statement loads. (The eor printed next to it is only objdump -D decoding this data word as if it were an instruction.) So ldr r1, [pc, #112] and ldr r1, =0xE0200240 do the same thing.

pc is two instructions ahead because of the pipeline: in the classic 3-stage ARM pipeline (fetch, decode, execute), the instruction currently executing was fetched two instructions earlier. Different ARM cores have pipelines of different depths, but for uniformity reading pc in ARM state always yields the address of the current instruction plus 8, so the disassembly is computed as if the pipeline had 3 stages.

(2)ldr r0, [pc, #112]

This corresponds to ldr r0, =0x00111000 in the assembly file and is interpreted the same way. For the first instruction pc = 0 + 8; this is the second instruction, so pc = 4 + 8 = 0xC, and 0xC + 0x70 = 0x7C, where the word 00111000 is stored.
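The address arithmetic in (1) and (2) can be checked mechanically. The sketch below assumes ARM-state encoding of a load with a 12-bit immediate offset and pc as the base register; the function name `literal_address` is hypothetical, and the machine-code values are copied from the listing above.

```python
def literal_address(instr_addr, machine_code):
    """Return the address read by an ARM-state `ldr rd, [pc, #imm]`.

    instr_addr   -- address of the instruction (column 1 of the listing)
    machine_code -- 32-bit instruction word (column 2 of the listing)
    """
    # Bits 27..25 == 0b010 marks a load/store with a 12-bit immediate offset.
    if (machine_code >> 25) & 0x7 != 0b010:
        raise ValueError("not a load/store with immediate offset")
    # Bit 20 (L) must be 1 for a load.
    if (machine_code >> 20) & 1 != 1:
        raise ValueError("not a load instruction")
    # Bits 19..16 hold the base register; 15 is pc.
    if (machine_code >> 16) & 0xF != 15:
        raise ValueError("base register is not pc")

    imm12 = machine_code & 0xFFF          # offset, e.g. 0x070 = 112
    add = (machine_code >> 23) & 1        # U bit: 1 = add offset, 0 = subtract
    pc = instr_addr + 8                   # pc reads as current instruction + 8
    return pc + imm12 if add else pc - imm12


print(hex(literal_address(0x0, 0xE59F1070)))  # first ldr in _start
print(hex(literal_address(0x4, 0xE59F0070)))  # second ldr in _start
```

For the two instructions at addresses 0 and 4 this gives 0x78 and 0x7c, matching the "; 78" and "; 7c" annotations that objdump prints after each ldr.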

(3)str r0, [r1]

It is consistent with the assembly statement.

(4)mov r2, #4096

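This corresponds to mov r2, #0x1000 in the assembly file. The two are identical: 4096 in decimal equals 0x1000 in hexadecimal.

Why are some values loaded into registers directly (mov r2, #4096) while others are loaded with PC-relative addressing (ldr r1, [pc, #112])? This comes down to legal and illegal immediate values. An ARM instruction is only 32 bits wide, so its immediate field cannot hold an arbitrary 32-bit constant; a mov immediate must be expressible as an 8-bit value rotated right by an even number of bits. When the constant does not fit, it is placed at some address in memory (here the words at addresses 78 to 88, right after the code) and fetched from that address when needed. The ldr rX, =value form used in start.S is a pseudo-instruction: the assembler turns it into a mov when the value is a legal immediate (ldr r0, =0x00000010 became mov r0, #16 at address 14) and into a PC-relative ldr otherwise.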
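Whether a constant is a legal immediate can be tested with a few lines of code. The sketch below assumes the classic ARM-state data-processing encoding, where an immediate is an 8-bit value rotated right by an even amount; the function names are hypothetical, and the constants are the ones used in start.S.

```python
def rol32(value, amount):
    """Rotate a 32-bit value left by `amount` bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def is_arm_immediate(value):
    """True if `value` is an 8-bit number rotated right by an even amount.

    Undoing the rotation means rotating left; if any even rotation leaves
    a result below 0x100, the constant fits the immediate field.
    """
    value &= 0xFFFFFFFF
    return any(rol32(value, rot) < 0x100 for rot in range(0, 32, 2))


for constant in (0x1000, 0x10, 0x900000, 0xE0200240, 0x00111000):
    kind = "legal (mov)" if is_arm_immediate(constant) else "illegal (literal pool)"
    print(f"{constant:#010x}: {kind}")
```

The result agrees with the listing: 0x1000, 0x10 and 0x900000 appear as mov instructions, while 0xE0200240 and 0x00111000 are stored as data words after the code and loaded with a PC-relative ldr.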

4. View the list of symbols in the application

See the article "What segments are ELF format files composed of?" (The muddled blog, CSDN).

Origin blog.csdn.net/oqqHuTu12345678/article/details/129469477