Environment and Preprocessing of C Language Programs

  Let's take a brief look at how a C program is compiled and preprocessed. The C code we write goes through a fairly involved process between being built and producing its results on the screen. Below is a short overview of that process.

  Going from source files to an executable program involves two major steps: compilation and linking. The compilation step is itself made up of three stages: preprocessing (pre-compilation), compilation proper, and assembly. Together, these steps make up the translation environment.

So, what exactly happens at each of these stages?

1. The translation environment and the execution environment of a program

1. The translation environment

  Each source file that makes up a program is compiled separately into its own object file (.obj on Windows). The linker then combines these object files with the required libraries to produce the executable (.exe on Windows). Along the way, the linker pulls in any C standard library functions that the program uses, and it can also search the programmer's own libraries and link in the functions the program needs.

  In the preprocessing (pre-compilation) stage, the preprocessor handles the preprocessing directives: it expands included header files, performs #define replacements, removes the comments written by the programmer, and so on. The preprocessor acts on the directives, which are the lines starting with #, and assembles a new C/C++ source text according to them. The output of preprocessing is a file with no #include directives (all headers have been expanded), no macro definitions (all macros have been replaced), no conditional compilation directives (the excluded branches have been dropped), and no special symbols. This file means the same thing as the original source, but its content is different.

  In the compilation stage, the compiler translates the preprocessed C code into assembly code. Each preprocessed file goes through lexical analysis, syntax analysis, semantic analysis, and optimization, and a corresponding assembly file is generated. Compilation works on one file at a time: it only checks whether this file is valid on its own, and it does not look for the definitions of symbols that live in other files.

In the assembly stage, the assembly code is converted into binary machine code (the object file), and the symbol table is built at the same time.

  In the linking stage, the linker combines the object files (and possibly library files) into a complete executable program. The linker's main job is to connect related object files to each other, that is, to connect the symbols referenced in one file with their definitions in another, so that all the object files become a single unit that the operating system can load and execute. During this step, the linker checks whether every called function is actually defined. Note that only the functions and global variables that are actually used need to be resolved at link time. If something is declared (a function declaration, or an extern declaration of a global variable) but never defined, and it is never used, the program still builds and runs normally.

2. The execution environment

1. The program must be loaded into memory. In a hosted environment, this is generally done by the operating system. In a freestanding environment (one without an operating system), loading may have to be arranged manually, or the executable code may be placed in read-only memory.

2. Execution of the program begins, and the main function is then called.

3. The program code starts executing. At this point the program uses a runtime stack to store each function's local variables and return address. Programs can also use static memory; variables stored in static memory keep their values throughout the execution of the program.

4. The program terminates. It may terminate normally, with main returning 0, or it may terminate abnormally and return some other, unpredictable value.

Next, let's look at preprocessing.

2. Preprocessing in detail

1. Predefined symbols

__FILE__ //The name of the source file being compiled. Note the two underscores on each side.

__LINE__ //The current line number of the file

__DATE__ //The date the file was compiled

__TIME__ //The time the file was compiled

__STDC__ //Defined as 1 if the compiler conforms to ANSI C; otherwise undefined

These symbols are built into the C language.

2. #define

(1) Simple usage

Syntax: #define name stuff

 Note that when you use #define to define an identifier, do not add a ; at the end, because the preprocessor performs a plain text replacement.

 For example, if MM were defined as 100; (with a trailing semicolon), a printf call that prints MM would report an error, because after the plain replacement it becomes:

    printf("the value of MM::%d\n", 100;);

(2) Defining macros with #define

The #define mechanism also allows parameters to be substituted into the replacement text; a definition of this kind is commonly called a macro.

Macro declaration:

#define name( parameter-list ) stuff

The parameter-list is a comma-separated list of symbols that may appear in stuff.

 For example, a macro that finds the larger of two values:

#include <stdio.h>

//Unparenthesized on purpose: the text below discusses the consequences
#define MAX(x, y) x>y?x:y

int main()
{
	int a = 10;
	int b = 20;
	int ret = MAX(a, b);
	printf("%d\n", ret);

	return 0;
}

Note that the left parenthesis of the parameter list must immediately follow the name; otherwise it is treated as part of stuff. The macro performs a plain substitution: when we pass a and b, MAX(a, b) is replaced by a>b?a:b, which is then evaluated. Because macros are plain substitutions, be careful about operator precedence.

#include <stdio.h>

#define NUM(n) n + 1

int main()
{
	int a = 10;
	int ret = 2 * NUM(a);
	printf("%d\n", ret);//What is the result?

	return 0;
}

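The expression that actually gets evaluated is 2 * a + 1, not 2 * (a + 1), so the result is 21 rather than 22. To avoid this problem, add parentheses around each parameter and around the whole body, for example #define NUM(n) ((n) + 1).

So macro definitions that evaluate numeric expressions should be parenthesized in this way, to avoid unexpected interactions between operators in the arguments or adjacent operators at the place where the macro is used.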

(3) #define replacement rules

1. When a macro is invoked, its arguments are first checked to see whether they contain any symbols defined by #define. If so, those are replaced first.

2. The replacement text is then inserted into the program in place of the original text. For macros, parameter names are replaced by their values.

3. Finally, the result is scanned again to see whether it contains any symbols defined by #define. If so, the process above is repeated.

Things to note:

1. Macro arguments and #define definitions may contain other symbols defined by #define. Macros, however, cannot be recursive.

2. When the preprocessor searches for symbols defined by #define, it does not look inside string constants. For example, the MM inside the string literal in the printf call above is not replaced.

(4) # and ##

So how can we insert a macro argument into a string?

Since adjacent string literals are concatenated automatically, we can write the macro like this:

#include <stdio.h>

#define PRINT(name, val) printf("the "name" is %d\n", val)

int main()
{
	PRINT("a", 18);

	return 0;
}

In this version, the name has to be passed in double quotes, because it must itself be a string literal in order to be concatenated with its neighbors. With single quotes, an error is reported.

The other approach is the # operator, which turns a macro argument into the corresponding string literal. For example, with the parameter n and the argument a, #n is processed into "a".

#include <stdio.h>

#define PRINT(n) printf("the val of "#n" is %d\n", n)

int main()
{
	int a = 10;
	printf("the val of a is %d\n", a);
	PRINT(a);
	int b = 10;
	PRINT(b);

	return 0;
}

The ## operator combines the tokens on either side of it into a single token. It allows macro definitions to create identifiers from separate pieces of text. Note that the resulting token must be valid, for example in##t, which produces int, or the name of a variable that has been defined; otherwise the behavior is undefined.

Note that it is best to pass macro arguments that have no side effects. For example:

x++; //has a side effect

x+1; //no side effect

An argument such as x++ causes problems after the replacement, because it may be evaluated more than once.

#include <stdio.h>

#define max(x, y) ((x)>(y)?(x):(y))

int main()
{
	int a = 3;
	int b = 5;
	int m = max(a++, b++);//post-increment: the value is used first, then incremented
	printf("%d %d %d\n", m, a, b);//What is the result?

	return 0;
}

Analysis: after replacement, the statement becomes int m = ((a++)>(b++)?(a++):(b++)). Post-increment yields the current value first and increments afterwards, so the comparison is 3>5, which is false; after the comparison, a is 4 and b is 6. Because the condition is false, the expression to the left of the colon is not evaluated, and the one to its right, (b++), is. It yields 6, so m is 6, and b is then incremented again to 7. The program therefore prints 6 4 7: a was incremented once and b twice.

(5) Macros compared with functions

Macros are often used to perform simple operations.

 By convention, macro names are written in all capitals, and function names are not.

3. #undef

Removes a macro definition.

#undef NAME //If an existing name needs to be redefined, its old definition must be removed first.

4. Conditional compilation

When writing code, we sometimes have code that we temporarily do not need: deleting it would be a waste, but leaving it active gets in the way. We can either comment it out or use conditional compilation to exclude it.

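#include <stdio.h>

#define __DEBUG__

int main()
{
	int i = 0;
	int arr[10] = { 0 };
	for (i = 0; i < 10; i++)
	{
		arr[i] = i;
#ifdef __DEBUG__  //Checks whether the macro is defined; if it is, the following statements are compiled
		printf("%d\n", arr[i]);//To check whether the array elements were assigned correctly.
#endif //__DEBUG__  #endif closes an #if, #ifdef, or #ifndef block.
	}
	return 0;
}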

The other forms of conditional compilation are not discussed in detail here.

5. File inclusion

When we write C code, we often use the #include preprocessing directive to have another file compiled as part of ours, such as the header file <stdio.h>.

There are two ways to include a header file. The first is the <> form: the preprocessor searches the standard include paths directly and does not look for headers written by the programmer; if the file is not found, an error is reported. The second is the "" form: the preprocessor first looks in the directory of the source file; if the file is not found there, it searches the standard paths; if it is still not found, an error is reported.

Can library headers also be included with the "" form? Yes, but the search is less efficient, and it becomes harder to tell at a glance whether a file is a library header or a local one.

Sometimes the same header file ends up being included several times, for example through several other headers. If a header is included repeatedly, the compiler processes its contents multiple times. To prevent this, one of the following methods can be used:

Wrap the contents of each header file like this:

#ifndef __TEST_H__

#define __TEST_H__

//Content of header file

#endif   //__TEST_H__

or

#pragma once

There are many more preprocessing directives, such as #error, #pragma, and #line, which are not covered here.

