What is the principle behind GCC's -O1, -O2, and -O3 optimization levels?

https://www.zhihu.com/question/27090458

Author: Zhihu user
Link: https://www.zhihu.com/question/27090458/answer/137944410
Source: Zhihu
Copyright belongs to the author. For commercial reprints, contact the author for authorization; for non-commercial reprints, please credit the source.
 

Excerpted from Using the GNU Compiler Collection (GCC)
When no optimization flag is specified, gcc generates code meant for debugging. Statements are independent of one another: you can set a breakpoint between any two statements, inspect a variable with the p command in gdb, or change a variable's value. In this mode the compiler's goal is the fastest possible compilation.

Once an optimization flag is enabled, gcc tries to restructure the program (on the condition that the transformed program stays semantically equivalent to the source) to meet a particular goal, such as minimal code size or faster execution. Generally speaking, these two goals conflict, and you cannot have both.

The optimizations that a given flag enables differ with the gcc configuration and the target platform. Use -Q --help=optimizers to see exactly which optimizations each flag enables.
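A program can also tell from inside whether it was built with optimization. The following is a minimal sketch that relies on two macros GCC predefines: __OPTIMIZE__ in every optimizing compilation and __OPTIMIZE_SIZE__ when optimizing for size. The function names are made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: reports how this file was compiled, based on
   macros that GCC predefines. __OPTIMIZE_SIZE__ is checked first
   because __OPTIMIZE__ is defined as well when optimizing for size. */
const char *optimization_mode(void)
{
#if defined(__OPTIMIZE_SIZE__)
    return "optimized for size";
#elif defined(__OPTIMIZE__)
    return "optimized";
#else
    return "not optimized";
#endif
}

/* Hypothetical helper: prints the mode, e.g. at program start-up. */
void print_optimization_mode(void)
{
    printf("this file was compiled: %s\n", optimization_mode());
}
```

Compiling the same file once without any -O flag and once with -O2 should change the reported mode.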

Each of the -f optimization flags listed below is explained in the document linked above.


1. -O, -O1:
These two flags have the same effect. The goal is to reduce code size and speed up the executable as far as possible without noticeably slowing compilation. They enable the following optimization options:

-fauto-inc-dec 
-fbranch-count-reg 
-fcombine-stack-adjustments 
-fcompare-elim 
-fcprop-registers 
-fdce 
-fdefer-pop 
-fdelayed-branch 
-fdse 
-fforward-propagate 
-fguess-branch-probability 
-fif-conversion2 
-fif-conversion 
-finline-functions-called-once 
-fipa-pure-const 
-fipa-profile 
-fipa-reference 
-fmerge-constants 
-fmove-loop-invariants 
-freorder-blocks 
-fshrink-wrap 
-fshrink-wrap-separate 
-fsplit-wide-types 
-fssa-backprop 
-fssa-phiopt 
-fstore-merging 
-ftree-bit-ccp 
-ftree-ccp 
-ftree-ch 
-ftree-coalesce-vars 
-ftree-copy-prop 
-ftree-dce 
-ftree-dominator-opts 
-ftree-dse 
-ftree-forwprop 
-ftree-fre 
-ftree-phiprop 
-ftree-sink 
-ftree-slsr 
-ftree-sra 
-ftree-pta 
-ftree-ter 
-funit-at-a-time
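To make one entry in this list concrete, here is a hand-written sketch of the kind of rewrite that -fmove-loop-invariants is meant to perform: moving a computation that does not change inside a loop to a point before the loop. Both functions and their names are illustrative assumptions; whether gcc performs exactly this transformation depends on the version and the target.

```c
#include <stddef.h>

/* As written: scale * offset never changes inside the loop,
   yet the source recomputes it on every iteration. */
long sum_scaled(const int *values, size_t count, int scale, int offset)
{
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i] + scale * offset;
    }
    return total;
}

/* Hand-hoisted equivalent: the invariant product is computed once,
   before the loop, which is what loop-invariant motion aims for. */
long sum_scaled_hoisted(const int *values, size_t count, int scale, int offset)
{
    const int bias = scale * offset;
    long total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i] + bias;
    }
    return total;
}
```

Both versions return the same result, which is the semantic-equivalence requirement every optimization has to respect.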

2. -O2
This level sacrifices some compilation speed. Besides performing every optimization that -O1 performs, it applies almost all the optimizations the target configuration supports in order to make the generated code run faster. It additionally enables:

-fthread-jumps 
-falign-functions  -falign-jumps 
-falign-loops  -falign-labels 
-fcaller-saves 
-fcrossjumping 
-fcse-follow-jumps  -fcse-skip-blocks 
-fdelete-null-pointer-checks 
-fdevirtualize -fdevirtualize-speculatively 
-fexpensive-optimizations 
-fgcse  -fgcse-lm  
-fhoist-adjacent-loads 
-finline-small-functions 
-findirect-inlining 
-fipa-cp 
-fipa-cp-alignment 
-fipa-bit-cp 
-fipa-sra 
-fipa-icf 
-fisolate-erroneous-paths-dereference 
-flra-remat 
-foptimize-sibling-calls 
-foptimize-strlen 
-fpartial-inlining 
-fpeephole2 
-freorder-blocks-algorithm=stc 
-freorder-blocks-and-partition -freorder-functions 
-frerun-cse-after-loop  
-fsched-interblock  -fsched-spec 
-fschedule-insns  -fschedule-insns2 
-fstrict-aliasing -fstrict-overflow 
-ftree-builtin-call-dce 
-ftree-switch-conversion -ftree-tail-merge 
-fcode-hoisting 
-ftree-pre 
-ftree-vrp 
-fipa-ra
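One flag in this list that often affects source code directly is -fstrict-aliasing: with it enabled, the compiler may assume that pointers to unrelated types never refer to the same object, so reading a float through a uint32_t pointer is undefined behavior. The sketch below shows a commonly used safe alternative based on memcpy. The function name is hypothetical, and the code assumes a 32-bit float.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Returns the raw bit pattern of a float.
   The tempting shortcut
       uint32_t bits = *(uint32_t *)&value;
   violates the aliasing rules and may misbehave once
   -fstrict-aliasing is active. Copying the bytes is well defined,
   and gcc typically compiles the memcpy to a plain register move. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Code that cannot be cleaned up this way can be compiled with -fno-strict-aliasing instead.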


3. -O3
Besides performing all -O2 optimizations, this level typically applies many vectorization algorithms to increase the parallelism of the code and make better use of the pipeline, cache, and other features of modern CPUs.

-finline-functions      // inline functions using heuristics
-funswitch-loops        // perform loop unswitching
-fpredictive-commoning
-fgcse-after-reload     // perform global common subexpression elimination
-ftree-loop-vectorize
-ftree-loop-distribute-patterns
-fsplit-paths
-ftree-slp-vectorize
-fvect-cost-model
-ftree-partial-pre
-fpeel-loops
-fipa-cp-clone

This level increases the size of the executable in exchange for a shorter execution time.
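As an illustration of the kind of loop that -ftree-loop-vectorize targets, here is a minimal sketch. The function name and signature are assumptions made for this example. The restrict qualifiers promise the compiler that x and y do not overlap, which removes one common obstacle to vectorization; whether the loop is actually vectorized depends on the gcc version and the target CPU.

```c
#include <stddef.h>

/* Computes y[i] = a * x[i] + y[i] for every element.
   Each iteration is independent of the others, so several elements
   can be processed by a single SIMD instruction. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Adding -fopt-info-vec to the command line makes gcc report which loops it vectorized.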

4. -Os
Like -O3, this flag sets an optimization target, but the two targets differ. -O3 accepts larger object code in order to push speed as far as possible, whereas -Os starts from -O2 and reduces object code size as much as it can, which matters a great deal on devices with very little storage.
To reduce object code size it disables the following optimization options. What it mainly squeezes out is alignment padding.

-falign-functions  
-falign-jumps  
-falign-loops 
-falign-labels
-freorder-blocks  
-freorder-blocks-algorithm=stc 
-freorder-blocks-and-partition  
-fprefetch-loop-arrays
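Source code can follow the same size-versus-speed trade-off by checking __OPTIMIZE_SIZE__, a macro that GCC predefines when optimizing for size. The checksum function below is a hypothetical example written for this sketch: it keeps a manually unrolled loop for speed builds and drops it in size builds.

```c
#include <stddef.h>
#include <stdint.h>

/* Sums all bytes of a buffer. */
uint32_t checksum(const uint8_t *data, size_t length)
{
    uint32_t sum = 0;
    size_t i = 0;
#ifndef __OPTIMIZE_SIZE__
    /* Speed build: manually unrolled, four bytes per iteration. */
    for (; i + 4 <= length; i += 4) {
        sum += data[i] + data[i + 1] + data[i + 2] + data[i + 3];
    }
#endif
    /* Size build: this compact loop does all the work.
       Speed build: it only handles the remaining tail bytes. */
    for (; i < length; i++) {
        sum += data[i];
    }
    return sum;
}
```

The result is the same in both builds; only the amount of generated code differs.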

5. -Ofast:
This flag does not strictly follow the language standards. Besides enabling all -O3 optimization options, it enables optimizations that are not valid for every standard-compliant program, such as -ffast-math. For Fortran it also enables the following options:

-fno-protect-parens 
-fstack-arrays
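A small sketch of why -ffast-math departs from the standard: floating-point addition is not associative, and -ffast-math allows the compiler to regroup operations as if it were. The two functions below, whose names are made up for illustration, differ only in grouping. The stated results assume IEEE 754 double arithmetic and compilation without -ffast-math.

```c
/* Adds a and b first, then c. */
double sum_left_first(double a, double b, double c)
{
    return (a + b) + c;
}

/* Adds b and c first, then a. */
double sum_right_first(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e16, b = -1e16 and c = 1.0, the first function returns 1.0, while the second returns 0.0 because -1e16 + 1.0 rounds back to -1e16. Under -ffast-math the compiler is free to treat the two groupings as interchangeable, so code that depends on a particular evaluation order may change its results.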

6. -Og:
This flag carefully selects the optimization options that do not conflict with -g. It still provides a reasonable level of optimization while producing better debugging information, and it is suited to the standard edit-compile-debug cycle.

Origin blog.csdn.net/chengde6896383/article/details/93737842