The following content was compiled from online resources during my own study. If anything here infringes on your rights, please let me know and I will delete it.
1. Tool Introduction
objdump is mainly used to display the contents of object files.
It can also show headers, symbols and raw section contents, but in this article "display" mostly means disassembling the binary contents into assembly code.
Reasons for disassembling with objdump:
(1) Reverse engineering. Disassemble an executable to obtain its assembly code, then infer the logic of the whole program from that code. This is hard: reading a large amount of assembly is difficult in itself, and reconstructing someone else's program logic from it is harder still.
(2) Debugging. Disassembling the generated executable lets us check that it was built as intended, which is especially useful when learning about linker scripts and link addresses.
(3) Learning. Disassembling an executable built by compiling and linking C source shows how C constructs map to assembly language and deepens our understanding of C.
2. The meaning of the options
According to the man page, the command has the following format.
objdump [options] obj_file # [] means optional; obj_file is the object file
The objdump tool supports many options, and only some commonly used options are listed here.
Option | Description |
-d, --disassemble | Disassemble only the sections that are expected to contain instructions (that is, executable code). |
-D, --disassemble-all | Disassemble the contents of all sections, not only those expected to contain instructions. |
-m machine, --architecture=machine | Specify the architecture to use when disassembling. Useful when the file does not describe its own architecture. The available architectures can be listed with -i. |
-S, --source | Display source code intermixed with the disassembly where possible; implies -d. This needs debugging information, so compile with -g. |
-s, --full-contents | Display the full contents of the sections in hexadecimal. |
-i, --info | Display the list of architectures and object formats available for the -b or -m option. |
-l, --line-numbers (lowercase L) | Label the disassembly with source file names and line numbers (requires debugging information). |
-j section, --section=section | Display information only for the specified section. |
-a, --archive-headers | If the file is an archive, display the archive header information, including the object file format of each member. |
-f, --file-headers | Display summary information from the overall header of the file. |
-h, --section-headers | Display summary information from the section headers (a section overview). |
-x, --all-headers | Display all available header information, including the symbol table and relocation entries. |
-C, --demangle | Demangle C++ symbol names into readable names. |
-t, --syms | Print the symbol table (variable names and function names are symbols). |
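As a small illustration of how the options in the table combine, here is a minimal Python sketch that assembles an objdump command line. The helper names objdump_argv and run_objdump are hypothetical (they are not part of binutils), and only options listed in the table are used. The tool name arm-linux-objdump is the cross toolchain name assumed from the Makefile later in this article.

```python
import shutil
import subprocess


def objdump_argv(obj_file, *, tool="objdump", all_sections=False,
                 source=False, line_numbers=False, section=None, machine=None):
    """Build an objdump argument list (hypothetical helper, for illustration)."""
    argv = [tool, "-D" if all_sections else "-d"]  # -D: all sections, -d: code only
    if source:
        argv.append("-S")            # intermix source (needs -g at compile time)
    if line_numbers:
        argv.append("-l")            # add file names and line numbers
    if section is not None:
        argv += ["-j", section]      # restrict output to one section
    if machine is not None:
        argv += ["-m", machine]      # force the architecture
    argv.append(obj_file)
    return argv


def run_objdump(argv):
    """Run the command if the tool is installed; otherwise return None."""
    if shutil.which(argv[0]) is None:
        return None
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout


# The command used by the Makefile in this article:
print(" ".join(objdump_argv("led.elf", tool="arm-linux-objdump", all_sections=True)))
```

The sketch only builds the argument list; run_objdump is optional and does nothing when the tool is not on the PATH.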
3. Interpreting the disassembly of an application
For example, the LED lighting project on the S5PV210 development board consists of a Makefile and a start.S file.
1. Makefile
The content of the Makefile is as follows. It first compiles the .S and .c source files into .o files, then links the .o files into an executable in ELF format. The line "arm-linux-objdump -D led.elf > led_elf.dis" disassembles led.elf and writes the result to led_elf.dis.
led.bin: start.o
	arm-linux-ld -Ttext 0x0 -o led.elf $^ # link: link the .o files into led.elf
	arm-linux-objcopy -O binary led.elf led.bin # copy and convert: convert the ELF file to raw .bin format
	arm-linux-objdump -D led.elf > led_elf.dis # disassemble led.elf into led_elf.dis
	gcc mkv210_image.c -o mkx210
	./mkx210 led.bin 210.bin
%.o : %.S
	arm-linux-gcc -o $@ $< -c
%.o : %.c
	arm-linux-gcc -o $@ $< -c
clean:
	rm *.o *.elf *.bin *.dis mkx210 -f
2. Source code start.S file
The content of the start.S file is as follows. It consists of four parts: startup, LED blinking, a delay routine and an infinite loop. We do not care about what the code does here; the point is to compare it with the file produced by disassembly.
// .globl gives the following symbol global visibility, similar to a global symbol in C
.globl _start
_start:
// Set bits [0:15] of GPJ0CON to configure pins GPJ0_0/1/2/3 as outputs
// Set bits [12:23] of GPJ0CON to configure pins GPJ0_3/4/5 as outputs
ldr r1, =0xE0200240
ldr r0, =0x00111000
str r0, [r1]
mov r2, #0x1000
// Set GPD0_1 to output mode
ldr r1, =0xE02000A0
ldr r0, =0x00000010
str r0, [r1]
led_blink:
// Set bits [0:3] of GPJ2DAT so that pins GPJ2_0/1/2/3 output a low level; LEDs on
// Set bits [3:5] of GPJ0DAT so that pins GPJ0_3/4/5 output a low level; LEDs on
ldr r1, =0xE0200244
mov r0, #0
str r0, [r1]
ldr r1, =0xE02000A4
mov r0, #0
str r0, [r1]
// Delay
bl delay
// Set bits [0:3] of GPJ2DAT so that pins GPJ2_0/1/2/3 output a high level; LEDs off
// Set bits [3:5] of GPJ0DAT so that pins GPJ0_3/4/5 output a high level; LEDs off
ldr r1, =0xE0200244
mov r0, #0x38
str r0, [r1]
ldr r1, =0xE02000A4
mov r0, #0x2
str r0, [r1]
// Delay
bl delay
sub r2, r2, #1
cmp r2,#0
bne led_blink
halt:
b halt
delay:
mov r0, #0x900000
delay_loop:
cmp r0, #0
sub r0, r0, #1
bne delay_loop
mov pc, lr
3. Disassemble the contents of the led_elf.dis file
After running make, the resulting disassembly file led_elf.dis looks as follows.
(1) The first line shows that this listing was produced by disassembling led.elf, and that the file is a 32-bit little-endian ARM ELF file (elf32-littlearm).
(2) In 00000000 <_start>, <_start> is a label corresponding to the _start label in start.S, and 00000000 is the address of that label. A label plays the same role as a function name in C, where the function name also stands for the starting address of the function; the listing confirms this. The labels in the disassembly come from the symbols defined in the assembly file, so it is easy to match each part of the disassembly with the corresponding part of the source.
(3) Each line of the disassembly has three columns: the instruction address, the instruction's machine code, and the instruction disassembled from that machine code.
led.elf: file format elf32-littlearm // first line
Disassembly of section .text:
// column 1  column 2  column 3
00000000 <_start>:
0: e59f1070 ldr r1, [pc, #112] ; 78 <delay_loop+0x10>
4: e59f0070 ldr r0, [pc, #112] ; 7c <delay_loop+0x14>
8: e5810000 str r0, [r1]
c: e3a02a01 mov r2, #4096 ; 0x1000
10: e59f1068 ldr r1, [pc, #104] ; 80 <delay_loop+0x18>
14: e3a00010 mov r0, #16
18: e5810000 str r0, [r1]
0000001c <led_blink>:
1c: e59f1060 ldr r1, [pc, #96] ; 84 <delay_loop+0x1c>
20: e3a00000 mov r0, #0
24: e5810000 str r0, [r1]
28: e59f1058 ldr r1, [pc, #88] ; 88 <delay_loop+0x20>
2c: e3a00000 mov r0, #0
30: e5810000 str r0, [r1]
34: eb00000a bl 64 <delay>
38: e59f1044 ldr r1, [pc, #68] ; 84 <delay_loop+0x1c>
3c: e3a00038 mov r0, #56 ; 0x38
40: e5810000 str r0, [r1]
44: e59f103c ldr r1, [pc, #60] ; 88 <delay_loop+0x20>
48: e3a00002 mov r0, #2
4c: e5810000 str r0, [r1]
50: eb000003 bl 64 <delay>
54: e2422001 sub r2, r2, #1
58: e3520000 cmp r2, #0
5c: 1affffee bne 1c <led_blink>
00000060 <halt>:
60: eafffffe b 60 <halt>
00000064 <delay>:
64: e3a00609 mov r0, #9437184 ; 0x900000
00000068 <delay_loop>:
68: e3500000 cmp r0, #0
6c: e2400001 sub r0, r0, #1
70: 1afffffc bne 68 <delay_loop>
74: e1a0f00e mov pc, lr
78: e0200240 eor r0, r0, r0, asr #4
7c: 00111000 andseq r1, r1, r0
80: e02000a0 eor r0, r0, r0, lsr #1
84: e0200244 eor r0, r0, r4, asr #4
88: e02000a4 eor r0, r0, r4, lsr #1
Disassembly of section .ARM.attributes:
00000000 <.ARM.attributes>:
0: 00001a41 andeq r1, r0, r1, asr #20
4: 61656100 cmnvs r5, r0, lsl #2
8: 01006962 tsteq r0, r2, ror #18
c: 00000010 andeq r0, r0, r0, lsl r0
10: 45543505 ldrbmi r3, [r4, #-1285] ; 0x505
14: 08040600 stmdaeq r4, {r9, sl}
18: Address 0x00000018 is out of bounds.
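The three-column layout described in point (3) can be split mechanically. The following Python sketch is only an illustration: the function name parse_line and the regular expression are my own assumptions about the line shape shown above, not something objdump provides.

```python
import re

# address ':' 8 hex digits of machine code, then the disassembled text
LINE_RE = re.compile(r"^\s*([0-9a-f]+):\s+([0-9a-f]{8})\s+(.*)$")


def parse_line(line):
    """Split one listing line into (address, machine_word, text), or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None  # label lines, section headers and messages do not match
    return int(m.group(1), 16), int(m.group(2), 16), m.group(3).strip()


sample = [
    "00000000 <_start>:",
    "0: e59f1070 ldr r1, [pc, #112] ; 78 <delay_loop+0x10>",
    "c: e3a02a01 mov r2, #4096 ; 0x1000",
    "18: Address 0x00000018 is out of bounds.",
]

for line in sample:
    parsed = parse_line(line)
    if parsed is not None:
        address, word, text = parsed
        print(f"addr=0x{address:x} code=0x{word:08x} insn={text}")
```

Only the two instruction lines of the sample are printed; the label line and the "out of bounds" message are skipped because they do not have the three-column shape.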
4. Interpretation of the disassembled led_elf.dis file
// assembly file
_start:
// Set bits [0:15] of GPJ0CON to configure pins GPJ0_0/1/2/3 as outputs
// Set bits [12:23] of GPJ0CON to configure pins GPJ0_3/4/5 as outputs
ldr r1, =0xE0200240
ldr r0, =0x00111000
str r0, [r1]
mov r2, #0x1000
// corresponding part of the disassembly file
00000000 <_start>:
0: e59f1070 ldr r1, [pc, #112] ; 78 <delay_loop+0x10>
4: e59f0070 ldr r0, [pc, #112] ; 7c <delay_loop+0x14>
8: e5810000 str r0, [r1]
c: e3a02a01 mov r2, #4096 ; 0x1000
......
70: 1afffffc bne 68 <delay_loop>
74: e1a0f00e mov pc, lr
78: e0200240 eor r0, r0, r0, asr #4
7c: 00111000 andseq r1, r1, r0
80: e02000a0 eor r0, r0, r0, lsr #1
84: e0200244 eor r0, r0, r4, asr #4
88: e02000a4 eor r0, r0, r4, lsr #1
(1)ldr r1, [pc, #112]
This instruction corresponds to ldr r1, =0xE0200240 in the assembly file; its effect is to load 0xE0200240 into register r1.
[pc, #112] refers to the data at address pc + 0x70 (#112 is decimal; 0x70 is the same value in hexadecimal). When the instruction executes, reading pc yields the address of the current instruction plus 8, that is, two instructions ahead, so pc = 0x0 + 0x8 = 0x8 and pc + 0x70 = 0x78. The word stored at address 0x78 is e0200240, exactly the value 0xE0200240 that the assembly statement loads. So ldr r1, [pc, #112] and ldr r1, =0xE0200240 do the same thing. (Because -D disassembles everything, objdump also decodes this data word as if it were an instruction, which is why the listing shows a meaningless eor at address 78.)
Note that pc reads as two instructions ahead because of the classic three-stage pipeline (fetch, decode, execute): while one instruction executes, the instruction two slots later is being fetched. Different ARM chips have pipelines of different depths, but in ARM state the architecture fixes the value read from pc at the current instruction address plus 8, so the disassembly always follows the three-stage convention.
(2)ldr r0, [pc, #112]
This corresponds to ldr r0, =0x00111000 in the assembly file and is read the same way. For the first statement pc = 0x0 + 0x8; this is the second statement, at address 0x4, so pc = 0x4 + 0x8 = 0xc, and 0xc + 0x70 = 0x7c, where the word 00111000 is stored.
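The address arithmetic in (1) and (2) can be checked with a short Python sketch. It assumes ARM-state encoding of ldr with a 12-bit immediate offset and pc as the base register; the function name literal_address and the literal_pool dictionary (copied by hand from the listing at addresses 78 to 88) are introduced here only for illustration.

```python
def literal_address(instr_addr, word):
    """Decode an ARM-state `ldr Rd, [pc, #imm]`; return (Rd, address read)."""
    is_ldr_imm = ((word >> 25) & 0b111) == 0b010 and ((word >> 20) & 1) == 1
    base_reg = (word >> 16) & 0xF
    if not (is_ldr_imm and base_reg == 15):
        raise ValueError("not a pc-relative ldr with an immediate offset")
    rd = (word >> 12) & 0xF
    offset = word & 0xFFF            # 12-bit immediate offset
    if not (word >> 23) & 1:         # U bit clear means subtract the offset
        offset = -offset
    # In ARM state, pc reads as the current instruction address + 8.
    return rd, instr_addr + 8 + offset


# Data words at the end of .text, copied from the listing above.
literal_pool = {
    0x78: 0xE0200240,
    0x7C: 0x00111000,
    0x80: 0xE02000A0,
    0x84: 0xE0200244,
    0x88: 0xE02000A4,
}

for addr, word in [(0x0, 0xE59F1070), (0x4, 0xE59F0070)]:
    rd, target = literal_address(addr, word)
    print(f"{addr:#x}: r{rd} <- [{target:#x}] = {literal_pool[target]:#010x}")
```

For the first two instructions this gives targets 0x78 and 0x7c, matching the "; 78" and "; 7c" comments that objdump prints after each ldr.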
(3)str r0, [r1]
It is consistent with the assembly statement.
(4)mov r2, #4096
This corresponds to mov r2, #0x1000 in the assembly file. The two are identical: 4096 in decimal equals 0x1000 in hexadecimal.
Why are some values loaded into registers directly (mov r2, #4096) while others are loaded with PC-relative addressing (ldr r1, [pc, #112])? The answer lies in valid and invalid immediates. An ARM data-processing instruction has only 12 bits for an immediate operand: an 8-bit value rotated right by an even number of bits. A value that cannot be expressed this way does not fit in the instruction, so the assembler stores it at a nearby address and loads it from there when needed. The ldr rX, =value form used in the source is a pseudo-instruction: the assembler chooses the real encoding, which is why ldr r0, =0x00000010 in start.S appears as mov r0, #16 in the disassembly.
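The valid/invalid immediate rule can be tested directly. Below is a minimal Python sketch, assuming the ARM-state rule of an 8-bit value rotated right by an even amount; the function names are my own. It ignores other tricks an assembler may use (for example mvn with the inverted value), so treat it as a sketch rather than a description of what the assembler does in every case.

```python
def ror32(value, amount):
    """Rotate a 32-bit value right by `amount` bits."""
    amount %= 32
    return ((value >> amount) | (value << (32 - amount))) & 0xFFFFFFFF


def is_arm_immediate(value):
    """True if value is an 8-bit number rotated right by an even amount."""
    value &= 0xFFFFFFFF
    # Undo each possible even rotation and see whether 8 bits are enough.
    return any(ror32(value, 32 - rot) <= 0xFF for rot in range(0, 32, 2))


def decode_mov_immediate(word):
    """Recover the immediate from the low 12 bits of a mov-immediate encoding."""
    imm8 = word & 0xFF
    rot4 = (word >> 8) & 0xF
    return ror32(imm8, 2 * rot4)


for value in (0x1000, 0x10, 0x900000, 0xE0200240, 0x00111000):
    kind = "valid (mov)" if is_arm_immediate(value) else "invalid (needs ldr from memory)"
    print(f"{value:#010x}: {kind}")

# Machine code e3a02a01 from the listing: imm8 = 0x01, rotation field = 0xa
print(hex(decode_mov_immediate(0xE3A02A01)))
```

With the values from this article, 0x1000, 0x10 and 0x900000 pass the check, while 0xE0200240 and 0x00111000 do not, which matches which loads became mov and which became ldr r, [pc, #offset] in the listing.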
4. View the list of symbols in the application
See the article "What segments are ELF format files composed of?" on The muddled blog (CSDN).