Using the gcc compiler and its common commands

Table of contents

1. Introduction to gcc

2. The tools behind gcc

        1. Binutils

        2. C runtime library

1. Preparations

2. Compilation process

3. Analyze ELF files

3. gcc common commands

4. Summary

5. References


This article introduces the tools behind the gcc compiler and some common gcc commands.

1. Introduction to gcc

GCC originally stood for GNU C Compiler. After many years of development it supports far more than the C language: front ends have been developed for Ada, C++, Java, Objective-C, Pascal, COBOL, and Mercury (which supports functional and logic programming), among others. GCC therefore no longer means just the GNU C compiler; the name now stands for GNU Compiler Collection, the GNU family of compilers. As for GCC's support for operating systems and hardware platforms, it can be summed up in one word: ubiquitous.

2. The tools behind gcc

        1. Binutils

Binutils is a set of tools for processing binary programs, including addr2line, ar, objcopy, objdump, as, ld, readelf, and size. The ldd command is usually used alongside them, although it ships with the C library (glibc) rather than with Binutils. These tools are indispensable for development and debugging; brief introductions follow:

(1) addr2line: Converts a program address into the corresponding source file and line number, and can also report the corresponding function. It helps locate the relevant source code during debugging.

(2) as: The assembler. See the assembly step below for details.

(3) ld: The linker. See the linking step below for details.

(4) ar: Mainly used to create static libraries. To help beginners, the concepts of static and dynamic libraries are introduced here:

  • Multiple .o object files can be combined into a library file. There are two types of libraries: static libraries and dynamic (shared) libraries.
  • In Windows, a static library is a file with the suffix .lib, and a shared library is a file with the suffix .dll. In Linux, a static library is a file with the suffix .a, and a shared library is a file with the suffix .so.
  • The difference between a static library and a dynamic library is when the code is loaded. The code of a static library is copied into the executable at link time, so the executable is relatively large. The code of a shared library is loaded into memory when the executable runs and is only referenced at link time, so the executable is smaller. In Linux, you can use the ldd command to view the shared libraries that an executable depends on.
  • If several programs run at the same time in a system and they use the same libraries, using dynamic libraries saves memory.

(5) ldd: Lists the shared libraries that an executable depends on. Strictly speaking, it is distributed with glibc rather than with Binutils.

(6) objcopy: Converts an object file from one format to another, such as converting .bin to .elf, or .elf to .bin.

(7) objdump: Mainly used for disassembly. See the disassembly section below for details.

(8) readelf: Displays information about ELF files. See below for details.

(9) size: Lists the size of each part of an executable file (code segment, data segment, and so on) and the total size. A usage example appears below.

        2. C runtime library

The C language standard is mainly composed of two parts: one part describes the grammar, and the other describes the C standard library. The C standard library defines a set of standard header files, and each header file contains related function, variable, and type declarations and macro definitions. For example, the common printf function is a C standard library function, and its prototype is declared in the stdio.h header file.

The C language standard only defines the prototypes of the C standard library functions and does not provide an implementation. Therefore, a C compiler usually needs the support of a C runtime library (C Run Time Library, CRT), often referred to simply as the C runtime. Similarly, C++ defines its own standard and provides related support libraries, called the C++ runtime libraries.

1. Preparations

Since the GCC toolchain is mainly used in Linux environments, this article also uses Linux as the working environment. To demonstrate the whole compilation process, first create a working directory test0, and then use a text editor to write a simple C program, Hello.c, as the example.

2. Compilation process 

        1. Preprocessing

Preprocessing mainly involves the following steps:

(1) Remove all #define directives and expand all macros, and process all conditional compilation directives, such as #if, #ifdef, #elif, #else, and #endif.

(2) Process the #include directives by inserting each included file at the position of the directive.

(3) Delete all comments, both "//" and "/* */".

(4) Add line numbers and file identifiers, so that the compiler can produce line numbers for debugging and for error and warning messages.

(5) Keep all #pragma compiler directives, because the later compilation stages need them.

The command for preprocessing with gcc is as follows:

gcc -E Hello.c -o Hello.i

Preprocess the source file Hello.c to generate Hello.i

GCC's option -E causes GCC to stop after preprocessing

The Hello.i file can be opened and viewed as a normal text file.

[Figure: Hello.i code snippet]

         2. Compilation

The compilation step performs lexical analysis, syntax analysis, semantic analysis, and optimization on the preprocessed file and generates the corresponding assembly code.

The command to compile with gcc is as follows:

gcc -S Hello.i -o Hello.s

Compile the Hello.i file generated by preprocessing to produce the assembly file Hello.s

The GCC option -S causes GCC to stop after compiling and generate an assembly file

The Hello.s file generated by the above command consists entirely of assembly code.

[Figure: Hello.s code snippet]

         3. Assembly

The assembly step processes the assembly code, generates instructions that the processor can recognize, and saves them in an object file with the suffix .o. Since each assembly statement corresponds almost one-to-one to a processor instruction, assembly is relatively simple compared with compilation: the Binutils assembler as translates the statements one by one using a table that maps assembly instructions to processor instructions.

When a program is composed of multiple source files, each file must first go through assembly, and only after its .o object file is generated can it move on to linking. Note: object files are already part of the final program, but they cannot be executed until they are linked.

The command to assemble with gcc is as follows:

gcc -c Hello.s -o Hello.o

Assemble the generated Hello.s file to produce the object file Hello.o

GCC's option -c causes GCC to stop after assembling and generate the object file

Or call as directly for assembly

as Hello.s -o Hello.o

Use as from Binutils to assemble the Hello.s file into an object file.

Note: The Hello.o object file is a relocatable file in ELF (Executable and Linkable Format) format.

       4. Linking

Linking is divided into static linking and dynamic linking. The main points are as follows:

(1) Static linking adds the static library directly to the executable at link time, so the executable is relatively large. The linker copies a function's code from its location (either a different object file or a static library) into the final executable. To create an executable, the linker's main tasks are symbol resolution (associating each symbol reference in the object files with its definition) and relocation (assigning memory addresses to symbol definitions and then modifying all references to those symbols).

(2) Dynamic linking adds only some descriptive information at link time; the corresponding dynamic library is loaded into memory from the system when the program is executed.

  • In Linux, the library search order when gcc compiles and links is usually: first the paths specified by the -L option of the gcc command; then the paths specified by the environment variable LIBRARY_PATH; then the default paths /lib, /usr/lib, and /usr/local/lib.
  • In Linux, the dynamic library search order when a binary is executed is usually: first the dynamic library search path recorded in the binary when it was built; then the paths specified by the environment variable LD_LIBRARY_PATH; then the paths specified in the configuration file /etc/ld.so.conf; then the default paths /lib and /usr/lib.
  • In Linux, you can use the ldd command to view the shared libraries that an executable depends on.

Since the search paths for dynamic libraries and static libraries may overlap, a static library and a dynamic library with the same name, such as libtest.a and libtest.so, may both be found. In that case gcc links the dynamic library by default. If you want gcc to link libtest.a, you can specify the gcc option -static, which forces static linking. Take Hello World as an example:

        If you use the command "gcc Hello.c -o Hello", dynamic libraries are used for linking. The size of the generated ELF executable (viewed with the Binutils size command) and the dynamic libraries it links against (viewed with the ldd command) are shown as follows:

gcc Hello.c -o Hello

size Hello    # use size to check the size

ldd Hello    # the executable is linked against several dynamic libraries, mainly the glibc dynamic library of Linux

If you use the command "gcc -static Hello.c -o Hello", static libraries are used for linking. The size of the generated ELF executable (viewed with the Binutils size command) and the linked dynamic libraries (viewed with the ldd command) are as follows:

gcc -static Hello.c -o Hello

size Hello    # use size to check the size

The output shows that the size of the text (code) segment becomes much larger.

ldd Hello

The output indicates that no dynamic library is linked.

The final file generated by the linker is an executable file in ELF format. An ELF executable is usually organized into different sections, such as .text, .data, .rodata, and .bss.

3. Analyze ELF files

        1. Sections of the ELF file

The ELF file format is shown in the figure below. The parts between the ELF Header and the Section Header Table are the sections. A typical ELF file contains the following sections:

.text: The instruction code of the compiled program.

.rodata: "ro" stands for read only; this section holds read-only data (such as const constants).

.data: Initialized global variables and static local variables of the C program.

.bss: Uninitialized global variables and static local variables of the C program.

.debug: Debug symbol table; the debugger uses the information in this section to help with debugging.

You can use readelf -S to view the information of each section as follows:

readelf -S Hello

        2. Disassemble ELF

Since ELF files cannot be opened as ordinary text files, you need to disassemble an ELF file if you want to view the instructions and data it contains.

Disassemble it using objdump -D as follows:

objdump -D Hello

Use objdump -S to disassemble it and display the disassembly interleaved with its C source code:

gcc -g Hello.c -o Hello    # the -g option is required

objdump -S Hello

3. gcc common commands

        Many common gcc commands have already appeared in the sections above, so they are not explained again here. For a more detailed list of common gcc commands, refer to the following link:

https://mooc1.chaoxing.com/ueditorupload/read?objectId=94fdef0ff9306a1d78c5d95704d1e248&fileOriName=Linux%2520GCC%25E5%25B8%25B8%25E7%2594%25A8%25E5%2591%25BD%25E4%25BB%25A4.pdf

4. Summary

After studying gcc further, I can't help but admire how powerful a compiler it is. gcc does not work alone: there are many tools behind it, and they contribute a great deal to making gcc stronger. Along the way I also became familiar with some commonly used gcc commands and consolidated my understanding of how a program is compiled. The result is a deeper and clearer picture of the whole process, which made this study very fruitful.

5. References

https://mooc1.chaoxing.com/ueditorupload/read?objectId=b9616c5b28b9b0b12e4df8411148087e&fileOriName=GCC%25E7%25BC%2596%25E8%25AF%2591%25E5%2599%25A8%25E8%2583%258C%25E5%2590%258E%25E7%259A%2584%25E6%2595%2585%25E4%25BA%258B.pdf

Origin blog.csdn.net/qq_55894922/article/details/126878864