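Embedded Linux Programming, First Contact with ARM: Optimization Levels

Using the code from the previous article, I changed the Makefile options to test their effect. The originally generated disassembly and the Makefile are as follows: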

led_elf:     file format elf32-littlearm 
Disassembly of section .text: 
00000000 <_start>:
   0:	e3a00453 	mov	r0, #1392508928	; 0x53000000
   4:	e3a01000 	mov	r1, #0	; 0x0
   8:	e5801000 	str	r1, [r0]
   c:	e3a0da01 	mov	sp, #4096	; 0x1000
  10:	eb000000 	bl	18 <main> 
00000014 <halt_loop>:
  14:	eafffffe 	b	14 <halt_loop>
00000018 <main>:
  18:	e1a0c00d 	mov	ip, sp
  1c:	e92dd800 	stmdb	sp!, {fp, ip, lr, pc}
  20:	e24cb004 	sub	fp, ip, #4	; 0x4
  24:	e3a03456 	mov	r3, #1442840576	; 0x56000000
  28:	e2833050 	add	r3, r3, #80	; 0x50
  2c:	e3a02c01 	mov	r2, #256	; 0x100
  30:	e5832000 	str	r2, [r3]
  34:	e3a03456 	mov	r3, #1442840576	; 0x56000000
  38:	e2833054 	add	r3, r3, #84	; 0x54
  3c:	e3a02000 	mov	r2, #0	; 0x0
  40:	e5832000 	str	r2, [r3]
  44:	e3a03000 	mov	r3, #0	; 0x0
  48:	e1a00003 	mov	r0, r3
  4c:	e89da800 	ldmia	sp, {fp, sp, pc}

led.bin:startup.S led.c
	arm-linux-gcc -g -c -o startup.o startup.S
	arm-linux-gcc -g -c -o led.o led.c
	arm-linux-ld -Ttext 0x00000000 -g startup.o led.o -o led_elf
	arm-linux-objcopy -O binary -S led_elf led.bin
	arm-linux-objdump -D -m arm led_elf > led.dis
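
A note for reading the listing above: `mov r0, #0x53000000` fits in one instruction, while the address 0x56000050 is built with a `mov` followed by an `add`. The sketch below shows why, assuming the classic 32-bit ARM data-processing immediate format (an 8-bit value in bits 7..0, rotated right by twice the 4-bit field in bits 11..8). The helper names `ror32`, `arm_imm` and `arm_imm_encodable` are made up for illustration and do not come from any toolchain.

```c
#include <stdint.h>

/* Rotate a 32-bit value right by n bits (n is reduced modulo 32). */
uint32_t ror32(uint32_t v, unsigned n)
{
    n &= 31u;
    return n ? (v >> n) | (v << (32u - n)) : v;
}

/* Decode the immediate operand of an ARM data-processing instruction
 * word, e.g. 0xe3a00453 from the listing ("mov r0, #0x53000000"). */
uint32_t arm_imm(uint32_t insn)
{
    unsigned rot  = (insn >> 8) & 0xfu;  /* 4-bit rotate field */
    uint32_t imm8 = insn & 0xffu;        /* 8-bit payload      */
    return ror32(imm8, 2u * rot);
}

/* Return 1 if value can be written as an 8-bit constant rotated right
 * by an even amount, i.e. if a single mov can load it; 0 otherwise. */
int arm_imm_encodable(uint32_t value)
{
    unsigned rot;
    for (rot = 0; rot < 16u; rot++) {
        /* Rotating left by 2*rot undoes a right rotation by 2*rot. */
        uint32_t undone = ror32(value, (32u - 2u * rot) & 31u);
        if (undone <= 0xffu)
            return 1;
    }
    return 0;
}
```

Under this encoding, 0x56000000 and 0x100 are each loadable with one `mov`, but 0x56000050 is not, which is consistent with the `mov` plus `add` pair in the unoptimized `main`.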

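I decided to remove the -g option and set the optimization level to O1 (the Makefile uses -O, which GCC treats as -O1). The disassembly and the Makefile are as follows: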

led_elf:     file format elf32-littlearm
Disassembly of section .text:
00000000 <_start>:
   0:	e3a00453 	mov	r0, #1392508928	; 0x53000000
   4:	e3a01000 	mov	r1, #0	; 0x0
   8:	e5801000 	str	r1, [r0]
   c:	e3a0da01 	mov	sp, #4096	; 0x1000
  10:	eb000000 	bl	18 <main>
00000014 <halt_loop>:
  14:	eafffffe 	b	14 <halt_loop>
00000018 <main>:
  18:	e3a03456 	mov	r3, #1442840576	; 0x56000000
  1c:	e2833050 	add	r3, r3, #80	; 0x50
  20:	e3a02c01 	mov	r2, #256	; 0x100
  24:	e4032050 	str	r2, [r3], #-80
  28:	e2833054 	add	r3, r3, #84	; 0x54
  2c:	e3a00000 	mov	r0, #0	; 0x0
  30:	e5830000 	str	r0, [r3]
  34:	e1a0f00e 	mov	pc, lr

led.bin:startup.S led.c
	arm-linux-gcc -O -c -o startup.o startup.S
	arm-linux-gcc -O -c -o led.o led.c
	arm-linux-ld -Ttext 0x00000000 -g startup.o led.o -o led_elf
	arm-linux-objcopy -O binary -S led_elf led.bin
	arm-linux-objdump -D -m arm led_elf > led.dis

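It is clear that after optimization the address of the last instruction drops from 0x4c to 0x34, so the code shrinks from 0x50 to 0x38 bytes. The compiler sees that main has no local variables and no registers that need saving, so it omits the register push and stack-frame setup on entry and returns directly with mov pc, lr. It also loads the base 0x56000000 only once: the post-indexed store at 0x24 writes the first register and subtracts 80 from r3, and the following add forms the second address. Some unnecessary operations are gone. Next, let's choose the O2 option and see what the optimized code looks like: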
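
The size figures can be checked with a little arithmetic. The sketch below assumes, as in these listings, that the code starts at address 0 and that every instruction is 4 bytes (ARM mode, not Thumb). The names `image_size` and `insn_count` are only illustrative.

```c
#include <stdint.h>

#define ARM_INSN_BYTES 4u  /* every ARM-mode instruction is one 32-bit word */

/* Size in bytes of code that starts at address 0 and whose last
 * instruction sits at last_insn_addr (the final address in a listing). */
uint32_t image_size(uint32_t last_insn_addr)
{
    return last_insn_addr + ARM_INSN_BYTES;
}

/* Number of instructions in that same code. */
uint32_t insn_count(uint32_t last_insn_addr)
{
    return image_size(last_insn_addr) / ARM_INSN_BYTES;
}
```

With the addresses from the listings, `insn_count(0x4c)` is 20 and `insn_count(0x34)` is 14, so O1 removes six instructions, all of them from `main`.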

led_elf:     file format elf32-littlearm
Disassembly of section .text:
00000000 <_start>:
   0:	e3a00453 	mov	r0, #1392508928	; 0x53000000
   4:	e3a01000 	mov	r1, #0	; 0x0
   8:	e5801000 	str	r1, [r0]
   c:	e3a0da01 	mov	sp, #4096	; 0x1000
  10:	eb000000 	bl	18 <main>
00000014 <halt_loop>:
  14:	eafffffe 	b	14 <halt_loop>
00000018 <main>:
  18:	e3a02000 	mov	r2, #0	; 0x0
  1c:	e3a01456 	mov	r1, #1442840576	; 0x56000000
  20:	e3a03c01 	mov	r3, #256	; 0x100
  24:	e1a00002 	mov	r0, r2
  28:	e5813050 	str	r3, [r1, #80]
  2c:	e5812054 	str	r2, [r1, #84]
  30:	e1a0f00e 	mov	pc, lr

led.bin:startup.S led.c
	arm-linux-gcc -O2 -c -o startup.o startup.S
	arm-linux-gcc -O2 -c -o led.o led.c
	arm-linux-ld -Ttext 0x00000000 -g startup.o led.o -o led_elf
	arm-linux-objcopy -O binary -S led_elf led.bin
	arm-linux-objdump -D -m arm led_elf > led.dis

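With O2 the last instruction address drops from 0x34 (at O1) to 0x30, saving one more instruction: the base 0x56000000 stays in r1 and both stores use an offset from it. Next, let's see what the O3 option does: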
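
The base-plus-offset pattern in the O2 listing (`str r3, [r1, #80]` and `str r2, [r1, #84]`) can be sketched in C. The source of led.c is not shown in this article, so the names `GPIO_BASE`, `CON_OFFSET`, `DAT_OFFSET` and `led_on` are assumptions; only the addresses and the stored values are taken from the disassembly. The base is passed as a parameter so that the function can also be tried on a host against an ordinary array.

```c
#include <stdint.h>

/* Values read off the disassembly; the names are hypothetical. */
#define GPIO_BASE   0x56000000u  /* "mov r1, #0x56000000" */
#define CON_OFFSET  0x50u        /* "str r3, [r1, #80]"   */
#define DAT_OFFSET  0x54u        /* "str r2, [r1, #84]"   */

/* Assumed equivalent of main's body: two stores relative to one base.
 * The pointer is volatile so the compiler must keep both stores,
 * whatever the optimization level. */
int led_on(volatile uint32_t *base)
{
    base[CON_OFFSET / 4] = 0x100;  /* first store, value 0x100 */
    base[DAT_OFFSET / 4] = 0;      /* second store, value 0    */
    return 0;
}
```

On the board the call would presumably be `led_on((volatile uint32_t *)GPIO_BASE)`; on a host, any `uint32_t` array with at least 0x58 / 4 elements can stand in for the register block.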

led_elf:     file format elf32-littlearm
Disassembly of section .text:
00000000 <_start>:
   0:	e3a00453 	mov	r0, #1392508928	; 0x53000000
   4:	e3a01000 	mov	r1, #0	; 0x0
   8:	e5801000 	str	r1, [r0]
   c:	e3a0da01 	mov	sp, #4096	; 0x1000
  10:	eb000000 	bl	18 <main>
00000014 <halt_loop>:
  14:	eafffffe 	b	14 <halt_loop>
00000018 <main>:
  18:	e3a02000 	mov	r2, #0	; 0x0
  1c:	e3a01456 	mov	r1, #1442840576	; 0x56000000
  20:	e3a03c01 	mov	r3, #256	; 0x100
  24:	e1a00002 	mov	r0, r2
  28:	e5813050 	str	r3, [r1, #80]
  2c:	e5812054 	str	r2, [r1, #84]
  30:	e1a0f00e 	mov	pc, lr

led.bin:startup.S led.c
	arm-linux-gcc -O3 -c -o startup.o startup.S
	arm-linux-gcc -O3 -c -o led.o led.c
	arm-linux-ld -Ttext 0x00000000 -g startup.o led.o -o led_elf
	arm-linux-objcopy -O binary -S led_elf led.bin
	arm-linux-objdump -D -m arm led_elf > led.dis

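Looking at the code, the result is the same as with O2, which suggests this code has essentially no room left for optimization. Finally, let's try the last option, O0 (no optimization), and see what the code looks like: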

led_elf:     file format elf32-littlearm
Disassembly of section .text:
00000000 <_start>:
   0:	e3a00453 	mov	r0, #1392508928	; 0x53000000
   4:	e3a01000 	mov	r1, #0	; 0x0
   8:	e5801000 	str	r1, [r0]
   c:	e3a0da01 	mov	sp, #4096	; 0x1000
  10:	eb000000 	bl	18 <main>
00000014 <halt_loop>:
  14:	eafffffe 	b	14 <halt_loop>
00000018 <main>:
  18:	e1a0c00d 	mov	ip, sp
  1c:	e92dd800 	stmdb	sp!, {fp, ip, lr, pc}
  20:	e24cb004 	sub	fp, ip, #4	; 0x4
  24:	e3a03456 	mov	r3, #1442840576	; 0x56000000
  28:	e2833050 	add	r3, r3, #80	; 0x50
  2c:	e3a02c01 	mov	r2, #256	; 0x100
  30:	e5832000 	str	r2, [r3]
  34:	e3a03456 	mov	r3, #1442840576	; 0x56000000
  38:	e2833054 	add	r3, r3, #84	; 0x54
  3c:	e3a02000 	mov	r2, #0	; 0x0
  40:	e5832000 	str	r2, [r3]
  44:	e3a03000 	mov	r3, #0	; 0x0
  48:	e1a00003 	mov	r0, r3
  4c:	e89da800 	ldmia	sp, {fp, sp, pc}

led.bin:startup.S led.c
	arm-linux-gcc -O0 -c -o startup.o startup.S
	arm-linux-gcc -O0 -c -o led.o led.c
	arm-linux-ld -Ttext 0x00000000 -g startup.o led.o -o led_elf
	arm-linux-objcopy -O binary -S led_elf led.bin
	arm-linux-objdump -D -m arm led_elf > led.dis

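We find that the code built with -O0 is identical to the code built with no optimization option at all. From the disassembly analysis above, I have roughly drawn the following conclusions:
1. With no optimization option, the default is no optimization (the same as O0).
2. O1 mainly optimizes the function call overhead (the stack-frame setup and the return); the body of the code changes only slightly. The result is still easy to follow in a debugger (a rule of thumb, not necessarily accurate; I will assume so for now).
3. O2 additionally optimizes the instruction sequence. The result is hard to debug, because the optimized code no longer corresponds line by line to the source and is difficult for a person to match up. Note that debug information itself comes from -g, which is independent of the optimization level.
4. O3 applies the highest level of optimization to the instruction sequence and is just as hard to debug; for this program it produces the same code as O2.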

Reposted from blog.csdn.net/u010422438/article/details/81698333