[C] Program environment and preprocessing

The translation environment and execution environment of the program

In any implementation of ANSI C, there are two distinct environments.

The first is the translation environment, where source code is converted into executable machine instructions. The second is the execution environment, which is used to actually execute the code.

translation environment

This part mainly covers the compilation and linking process.

Each source file (.c) is passed through the compiler to produce an object file (.obj on Windows, .o on Linux). A project usually has several .c files, so there are several object files. The linker combines them, together with the link libraries (static libraries), to produce the executable program (a .exe file on Windows).

In summary:

Each source file that makes up a program is individually converted into object code by the compilation process.
The object files are then bundled together by the linker to form a single, complete executable program.
The linker also imports any functions in the standard C library used by the program, and it can search the programmer's personal library and link the functions it needs into the program.

compile

Compilation is divided into three stages: preprocessing (precompilation), compilation proper, and assembly.

preprocessing (precompilation)

What does the preprocessor do?

Removes all comments (each one is replaced by a single space).
Inserts the contents of the header files named by #include.
Replaces the symbols defined by #define.
Processes all other preprocessing directives.

compilation

What happens to our program at this stage?

The code is translated into assembly instructions. This involves:
Lexical analysis
Syntax analysis
Semantic analysis
Symbol summary

assembly

What happens to our program at this stage?

The assembly instructions are translated into binary instructions that the machine can execute,
a symbol table is formed,
and the object file is generated.

Link

Each object file has its own symbol table and its own segments, so linking mainly does two things:

Merges the segment tables.
Merges the symbol tables and relocates the symbols (gives them their final, meaningful addresses).

execution environment

1. The program must be loaded into memory. In an environment with an operating system, this is generally done by the operating system. In a freestanding environment, program loading must be arranged manually, possibly by placing the executable code into read-only memory.
2. Execution of the program begins, and the main function is called.
3. The program code starts executing. The program uses a runtime stack to store the local variables and return addresses of functions. Programs can also use static memory; variables stored in static memory retain their values throughout the execution of the program.
4. The program terminates. Termination is normally a return from the main function, but abnormal termination is also possible.

preprocessing

predefined symbols

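These predefined symbols are built into the language:

__FILE__  the name of the source file being compiled
__LINE__  the current line number in the file
__DATE__  the date on which the file was compiled
__TIME__  the time at which the file was compiled
__STDC__  1 if the compiler conforms to ANSI C

Code demo: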

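#include <stdio.h>

int main(void)
{
	printf("%s\n", __FILE__);  // name of the source file
	printf("%d\n", __LINE__);  // line number of this statement
	printf("%s\n", __DATE__);  // compilation date
	printf("%s\n", __TIME__);  // compilation time
	return 0;
}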

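Running result: the program prints the name of the source file, the line number of the statement that uses __LINE__, and the date and time of compilation, one per line.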

#define

define identifier

Syntax:

#define name stuff

Let me give you a few examples:

#define M 100
#define FOR for(;;)
#define uint unsigned int
#define CASE break;case

All of these definitions are valid. The replacement text can be a number, a type name, or even a fragment of code.

Should a ; be added at the end when defining an identifier with #define?

It is not recommended. The ; becomes part of the replacement text, which can easily lead to problems.

define macro

The #define mechanism includes a provision for substituting arguments into the replacement text. Such a definition is usually called a macro, or a defined macro.

Syntax:

#define name( parameter-list ) stuff

where parameter-list is a comma-separated list of symbols that may appear in stuff.

Note:

The opening parenthesis of the parameter list must be immediately adjacent to name. If any white space exists between the two, the parameter list is interpreted as part of stuff.
Example: #define SQUARE( x ) x * x

When we define a macro, it is best to put parentheses around each parameter in the replacement text, and then around the whole expression. Why do we do this?
Let's look at a piece of code:

#include <stdio.h>

// despite the name, both macros compute a square
#define DOUBLE(x)  x*x          // no parentheses
#define DOUBLE1(x)  ((x)*(x))   // fully parenthesized

int main(void)
{
	printf("%d\n", DOUBLE(3));     // we expect 9
	printf("%d\n", DOUBLE(3+2));   // we expect 25
	printf("%d\n", DOUBLE1(3));    // we expect 9
	printf("%d\n", DOUBLE1(3+2));  // we expect 25
	return 0;
}

Running results:

9
11
9
25

The second result differs from what we expected. Why?
A #define performs pure text replacement, so DOUBLE(3+2) actually expands to:

3+2*3+2

Multiplication binds tighter than addition, so the result is 3+6+2 = 11. This is why each parameter is parenthesized in the macro definition, with another pair of parentheses around the whole expression; otherwise unexpected results can appear.

Replacement rules for #define

1. When a macro is invoked, its arguments are first checked for any symbols defined by #define. If there are any, they are replaced first.
2. The replacement text is then inserted into the program in place of the original text. For macros, parameter names are replaced by the values of their arguments.
3. Finally, the resulting text is scanned again to see whether it contains any symbols defined by #define. If it does, the process is repeated.

Note:

Symbols defined by other #define directives can appear in macro arguments and in #define definitions. A macro cannot be recursive, however.
When the preprocessor searches for symbols defined by #define, the contents of string literals are not searched.

# and ##

First, a useful fact. When we want to print Hello world on the screen, we normally write:

printf("Hello world\n");

The following also works, because C joins adjacent string literals into one:

printf("Hello " "world\n");

1. The role of #

Use # to turn a macro parameter into a corresponding string.

Code demo:

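#include <stdio.h>

// #s becomes the string "value"; the adjacent string literals are then joined
#define PRINT(s) printf("the " #s " is %d\n", s)

int main(void)
{
	int value = 10;
	PRINT(value);   // expands to printf("the " "value" " is %d\n", value)
	return 0;
}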

Running result:

the value is 10

2. The role of ##

The ## operator joins the symbols on either side of it into a single symbol. It allows a macro definition to create identifiers from separate pieces of text.

Code demo:

#include <stdio.h>

#define s(a,b) a##b   // pastes a and b into one token

int main(void)
{
	int num123 = 2019;
	printf("%d\n", s(num, 123));   // s(num, 123) becomes num123
	return 0;
}

Running result:

2019

NOTE: The result of such a concatenation must be a valid identifier (more generally, a valid token). Otherwise the behavior is undefined.

Macro arguments with side effects

When a macro parameter appears more than once in the macro's definition and the argument has side effects, using the macro can be dangerous and lead to unpredictable results. A side effect is a lasting change that occurs when an expression is evaluated.
x+1;//without side effects
x++;//with side effects

Because a macro is expanded by text replacement, an argument such as x++ may be evaluated, and so change x, more than once.

Comparing Macros and Functions

Macros are usually used to perform simple operations, such as finding the larger of two values. Compared with a function, a macro has two advantages for such small tasks:

The code needed to call and return from a function may take more time than the small computation itself, so a macro can beat a function in both program size and speed.
More importantly, the parameters of a function must be declared with specific types, so a function can only be used on expressions of the appropriate type. A macro that compares its arguments with >, by contrast, works for ints, longs, floating-point types, and any other type that > can compare. Macros are type independent.

Macros also have disadvantages:

Every time a macro is used, a copy of its replacement text is inserted into the program. Unless the macro is quite short, this can increase the size of the program considerably.
Macros cannot be stepped through in a debugger.
Because macros are type independent, their arguments are not type checked, which makes them less rigorous.
Macros may introduce operator precedence problems, making programs error-prone.

Macros can sometimes do things that functions cannot. For example, a macro argument can be a type name, but a function argument cannot.

naming convention

Write macro names in all capitals.
Do not write function names in all capitals.

#undef

This directive removes an existing macro definition.
#undef NAME
//If an existing name needs to be redefined, its old definition must be removed first.

command line definition

Many C compilers let you define symbols on the command line when starting the compilation. This is useful when we want to build different versions of a program from the same source file. For example, suppose a program declares an array of a certain length: on a machine with limited memory we want a small array, while on a machine with more memory the array can be larger.

conditional compilation

Conditional compilation directives make it easy to choose whether a statement (or a group of statements) is compiled or discarded.
For example, debugging code is a pity to delete but a hindrance to keep, so we can compile it selectively.

Common conditional compilation directives:

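1. Single-branch conditional compilation
#if constant-expression
 //...
#endif
//The constant expression is evaluated by the preprocessor.

//For example:
#define __DEBUG__ 1
#if __DEBUG__
 //..
#endif

2. Multi-branch conditional compilation
#if constant-expression
 //...
#elif constant-expression
 //...
#else
 //...
#endif

3. Testing whether a symbol is defined
#if defined(symbol)
#ifdef symbol

#if !defined(symbol)
#ifndef symbol

4. Nested directives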
#if defined(OS_UNIX) 
	 #ifdef OPTION1 
		 unix_version_option1(); 
	 #endif 
	 #ifdef OPTION2 
		 unix_version_option2(); 
	 #endif 
#elif defined(OS_MSDOS) 
	 #ifdef OPTION2 
		 msdos_version_option2(); 
	 #endif 
#endif 

file inclusion

We already know that the #include directive causes another file to be compiled, just as if its contents appeared at the position of the #include directive.
The substitution works simply: the preprocessor removes the directive and replaces it with the contents of the included file. If such a file is included 10 times, it is actually compiled 10 times.

Ways to include a header file:
1. Including local files

#include "filename"
Search strategy: the directory containing the source file is searched first. If the header is not found there, the standard locations are searched, just as for library headers. If the header is still not found, a compilation error is reported.

2. Including library files

#include <filename.h>
The header is searched for directly in the standard locations. If it is not found, a compilation error is reported.

Does this mean that the "" form can also be used for library headers?

Yes, it can. But the search is less efficient this way, and it also becomes harder to tell whether a header is a library file or a local file.

Nested header file inclusion

Nested inclusion can cause the contents of a header file to be included more than once. How do we solve this problem?
Wrap the contents of each header file like this:

#ifndef TEST_H
#define TEST_H
//Content of header file
#endif // TEST_H

or

#pragma once

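Either approach prevents a header file from being included repeatedly. Note that #pragma once is not part of standard C, although most compilers support it.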

That is all for today. Thank you for your attention and support.

Origin blog.csdn.net/bushibrnxiaohaij/article/details/131826340