The strongest C language tutorial in history ---- program compilation and preprocessing (2)

Contents

3. Detailed preprocessing

3.1 Predefined symbols

3.2 #define

3.2.1 #define define identifier

3.2.2 #define define macro

3.2.3 #define substitution rule

3.2.4 # and ##

3.2.5 Macro parameters with side effects

3.2.6 Macro and function comparison

3.2.7 Naming Conventions

3.3 #undef

3.4 Command Line Definition

3.5 Conditional compilation

3.6 File Inclusion

3.6.1 How header files are included

3.6.2 Nested file inclusion


3. Detailed preprocessing

3.1 Predefined symbols

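__FILE__      // the source file being compiled
__LINE__     // the current line number in the file
__DATE__    // the date on which the file was compiled
__TIME__    // the time at which the file was compiled
__STDC__    // 1 if the compiler conforms to ANSI C, otherwise undefined; VS leaves it undefined by default, while GCC defines it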

These predefined symbols are built into the language.

Here is an example:

#include<stdio.h>
int main()
{
	printf("%s\n", __FILE__);
	printf("%d\n", __LINE__);
	printf("%s\n", __DATE__);
	printf("%s\n", __TIME__);
	//printf("%d", __STDC__); // VS does not define __STDC__ by default, so this line would not compile there
	return 0;
}


Application: Logging

#include<stdio.h>
int main()
{
	int i = 0;
	FILE* pf = fopen("log.txt", "w");
	if (pf == NULL)
	{
		return 1;
	}
	for (i = 0; i < 10; i++)
	{
		fprintf(pf,"%s %s %s %d\n", __DATE__, __TIME__, __FILE__, __LINE__);
	}
	fclose(pf);
	pf = NULL;
	return 0;
}


3.2 #define

3.2.1 #define define identifier

Syntax:
    #define name stuff

For example:

#define MAX 1000
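#define reg register          // create a shorter name for the keyword register
#define do_forever for(;;)     // replace an implementation with a more descriptive symbol
#define CASE break;case        // automatically adds the break when writing case labels
// If stuff is too long, it can be split across several lines; every line except the last ends with a backslash (line continuation).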
#define DEBUG_PRINT printf("file:%s\tline:%d\t \
                          date:%s\ttime:%s\n" ,\
__FILE__,__LINE__ ,       \
__DATE__,__TIME__ ) 

Question: When defining an identifier, should you add ; at the end?

For example:

#define MAX 1000;
#define MAX 1000

It is recommended not to add ; , which can easily lead to problems.

For example the following scenario:

if(condition) 
    max = MAX;
else 
    max = 0;

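With the first definition (#define MAX 1000;), this is a syntax error: max = MAX; expands to max = 1000;;, and the extra empty statement separates the if from the else, so the else has no matching if. This shows the essence of a macro definition: it is plain text replacement.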

3.2.2 #define define macro

The #define mechanism also allows parameters to be substituted into the replacement text. Such a definition is commonly called a macro (or a defined macro).

Here is how the macro is declared:

#define name( parameter-list ) stuff

The parameter-list is a comma-separated list of symbols that may appear in stuff.

Note: The left parenthesis of the parameter list must be immediately adjacent to name. If there is any white space in between, the argument list is interpreted as part of stuff.

Such as:

#define SQUARE( x ) x * x

This macro takes one parameter x .

If, after the definition above, you write

SQUARE( 5 );

in a program, the preprocessor replaces the macro invocation with:

5 * 5

Warning: There is a problem with this macro:

Observe the code snippet below:

int a = 5;
printf("%d\n" ,SQUARE( a + 1) );

At first glance, you might think that this code will print the value 36.

In fact, it will print 11.

Why?

When the text is substituted, the parameter x is replaced with a + 1, so the statement actually becomes:
printf ("%d\n",a + 1 * a + 1 );

This makes it clear that the expressions resulting from the substitution are not evaluated in the expected order.

This problem is easily solved by adding two pairs of parentheses to the macro definition:

#define SQUARE(x) (x) * (x)

This preprocessing produces the expected effect:

printf ("%d\n",(a + 1) * (a + 1) );

Here's another macro definition:

#define DOUBLE(x) (x) + (x)

We used parentheses in the definition to avoid the previous problem, but this macro may introduce new errors.

int a = 5;
printf("%d\n" ,10 * DOUBLE(a));

What value will this print?

It looks like it should print 100, but in fact it prints 55. After replacement, the statement becomes:

printf ("%d\n",10 * (a) + (a));

Multiplication binds more tightly than the addition inside the macro, so 10 * 5 is computed first and the result is 50 + 5 = 55.

The solution to this problem is to add a pair of parentheses around the macro definition expression.

#define DOUBLE( x)   ( ( x ) + ( x ) )

Tip:

Macro definitions that evaluate numeric expressions should be parenthesized in this way, both around each parameter and around the whole expression, to avoid unexpected interactions with operators in the arguments or operators adjacent to the macro invocation.

3.2.3 #define substitution rule

Expanding #define symbols and macros in a program involves several steps.

  1. When calling the macro, the arguments are first checked to see if they contain any symbols defined by #define. If so, they are replaced first.
  2. The replacement text is then inserted into the program in place of the original text. For macros, parameter names are replaced by their values.
  3. Finally, the resulting file is scanned again to see if it contains any symbols defined by #define. If so, repeat the above process.

Notes:

  1. Macro arguments and #define definitions may contain other symbols defined by #define. A macro cannot be recursive, however.
  2. When the preprocessor searches for symbols defined by #define, it does not look inside string constants. (A macro name inside a string is not replaced.)

3.2.4 # and ##

How to insert parameters into strings?

First let's look at this code:

char* p = "hello ""bit\n";
printf("hello"" bit\n");
printf("%s", p);

Does this print hello bit?

Yes, it prints hello bit twice.

Adjacent string literals are automatically concatenated into one string.

1. Can we write code like this?

#define PRINT(FORMAT, VALUE)\
 printf("the value is "FORMAT"\n", VALUE);
...
PRINT("%d", 10);
// After macro replacement, the line above becomes:
printf("the value is " "%d" "\n",10);

Output:

the value is 10

This trick only works when the macro argument is itself a string literal, because only adjacent string literals are concatenated.

2. Another trick is to use # to turn a macro argument into the corresponding string.

for example:

int i = 10;
#define PRINT(VALUE)\
 printf("the value of " #VALUE " is %d\n", VALUE);
...
PRINT( i + 3);// What does this do?
// The preprocessor replaces the line above with:
printf("the value of " "i + 3" " is %d\n", i + 3);

The #VALUE in the macro body is turned by the preprocessor into the text of the argument as a string:

"i + 3"

The final output is:

the value of i + 3 is 13

The role of ##

## can combine the symbols on both sides of it into one symbol.

It allows macro definitions to create identifiers from separate pieces of text.

#define ADD_TO_SUM(num, value) \
 sum##num += value;
...
ADD_TO_SUM(5, 10);// Effect: adds 10 to sum5

NOTE: The pasted result must form a valid identifier (more generally, a valid token). Otherwise the behavior is undefined.

3.2.5 Macro parameters with side effects

When a macro parameter appears more than once in the macro definition and the argument has side effects, using the macro can be dangerous and lead to unpredictable results. A side effect is a lasting effect that occurs when an expression is evaluated.

For example:

x+1;// no side effect
x++;// has a side effect

The MAX macro demonstrates the problems caused by arguments with side effects.

#define MAX(a, b) ( (a) > (b) ? (a) : (b) )
...
x = 5;
y = 8;
z = MAX(x++, y++);
printf("x=%d y=%d z=%d\n", x, y, z);// What is the output?

To answer this, we need to know what the preprocessor produces:

z = ( (x++) > (y++) ? (x++) : (y++));

So the output is:

x=6 y=10 z=9

3.2.6 Macro and function comparison

Macros are usually used to perform simple operations.

For example, find the larger of two numbers.

#define MAX(a, b) ((a)>(b)?(a):(b))

So why not use functions to accomplish this task?

There are two reasons:

  1. The code needed to call a function and return from it may take more time than the small computation itself. (A function involves three time-consuming parts: the call, the computation, and the return; a macro only needs the computation.) So for small operations like this, macros beat functions in both program size and speed.
  2. More importantly, function parameters must be declared with specific types, so a function can only be used on expressions of the appropriate type. This macro, in contrast, works with integers, long integers, floating-point types, and any other type that can be compared with >. Macros are type-independent.

Of course, macros also have disadvantages compared to functions:

  1. Every time a macro is used, a copy of the macro definition code is inserted into the program. Unless the macro is relatively short, it can significantly increase the length of the program.
  2. Macros cannot be debugged. (Macros are replaced during preprocessing, so the code you step through in the debugger is the expanded code, which is not the code you see in the source file.)
  3. Macros are not rigorous enough because they are type-independent.
  4. Macros can cause problems with operator precedence, making procedures prone to errors.

Macros can sometimes do things that functions cannot. For example, a macro argument can be a type name, whereas a function argument cannot.

#define MALLOC(num, type) (type *)malloc((num) * sizeof(type))
...
// Usage
MALLOC(10, int);// a type is passed as an argument
// After preprocessor replacement:
(int*)malloc((10) * sizeof(int));

A comparison of macros and functions:

Attribute: Code length
  #define macro: Macro code is inserted into the program each time it is used. Except for very small macros, the length of the program can grow substantially.
  Function: Function code appears in only one place; every time the function is used, the same code in that place is called.

Attribute: Execution speed
  #define macro: Faster.
  Function: There is additional overhead for the function call and return, so it is relatively slow.

Attribute: Operator precedence
  #define macro: Macro arguments are evaluated in the context of all surrounding expressions. Unless parentheses are added, the precedence of adjacent operators may have unpredictable consequences, so macros should be written with plenty of parentheses.
  Function: A function argument is evaluated only once, when the function is called, and its value is passed to the function. The result of evaluating the expression is more predictable.

Attribute: Parameters with side effects
  #define macro: Parameters may be substituted in multiple places in the macro body, so evaluating arguments with side effects may produce unpredictable results.
  Function: Function arguments are evaluated only once, when they are passed, so the result is easier to control.

Attribute: Parameter type
  #define macro: Macro parameters are type-independent and can be used with any argument type as long as the operation on the argument is legal.
  Function: Function parameters are tied to a type. If the argument types differ, different functions are needed, even if the tasks they perform are the same.

Attribute: Debugging
  #define macro: Macros are inconvenient to debug.
  Function: Functions can be debugged statement by statement.

Attribute: Recursion
  #define macro: Macros cannot be recursive.
  Function: Functions can be recursive.

3.2.7 Naming Conventions

In general, macros and functions are invoked with very similar syntax, so the language itself cannot help us tell them apart.

The usual convention is therefore:

Write macro names in all capitals.

Do not write function names in all capitals.

3.3 #undef

This directive is used to remove a macro definition.

#undef NAME
// If an existing name needs to be redefined, its old definition must first be removed.

3.4 Command Line Definition

Many C compilers let you define symbols on the command line, so that they are in effect when compilation starts.

This is useful, for example, when we want to build different versions of a program from the same source file. (Suppose a program declares an array of some length. On a machine with limited memory we need a small array, but on a machine with more memory we can use a larger one.)

#include <stdio.h>
int main()
{
    int array[ARRAY_SIZE];
    int i = 0;
    for (i = 0; i < ARRAY_SIZE; i++)
    {
        array[i] = i;
    }
    for (i = 0; i < ARRAY_SIZE; i++)
    {
        printf("%d ", array[i]);
    }
    printf("\n");
    return 0;
}

Compilation command (assuming the file is named test.c):

gcc -D ARRAY_SIZE=10 test.c

3.5 Conditional compilation

Conditional compilation directives make it easy to compile or discard a statement (or a group of statements) when building a program.

For example:

Debugging code is a pity to delete but gets in the way if it stays, so we can compile it selectively.

#include<stdio.h>
int main()
{
#if 0
	for (int i = 0; i < 10; i++)
	{
		printf("hello world\n");
	}
#endif
	return 0;
}

Output: nothing is printed, because the loop is excluded from compilation.

Suppose the code is modified as follows:

#include<stdio.h>
int main()
{
#if 1
	for (int i = 0; i < 10; i++)
	{
		printf("hello world\n");
	}
#endif
	return 0;
}

Output: hello world is printed 10 times.

Note: If #if is followed by a variable name, the result is the same as #if 0. The preprocessor runs before compilation, when variables do not exist yet; a variable only gets a value at run time, by which point all preprocessing directives are long gone. In a #if expression, any identifier that is not a defined macro is treated as 0.

Common conditional compilation directives:

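1. Constant expression
#if constant-expression
//...
#endif
// The constant expression is evaluated by the preprocessor. It may also be a logical expression; for example, 0 > 2 is false and evaluates to 0, so it behaves like #if 0.
For example:
#define __DEBUG__ 1
#if __DEBUG__ // has exactly the same effect as #if 1
//..
#endif
2. Conditional compilation with multiple branches
#if constant-expression
//...
#elif constant-expression
//...
#else
//...
#endif
3. Testing whether a symbol is defined
#if defined(symbol)
#ifdef symbol
// These two forms test whether symbol is defined
#if !defined(symbol)
#ifndef symbol
// These two forms test whether symbol is not defined
4. Nested directives
#if defined(OS_UNIX)
    #ifdef OPTION1
        unix_version_option1();
    #endif
    #ifdef OPTION2
        unix_version_option2();
    #endif
#elif defined(OS_MSDOS)
    #ifdef OPTION2
        msdos_version_option2();
    #endif
#endif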

Note: In a #if condition, a name that has not been defined with #define, or that is defined only later in the file, evaluates to 0. In practice the name is normally defined before the conditional directive that tests it. This default applies only inside conditional directives. If an undefined name is used in ordinary code, for example passed to printf, compilation fails, because the preprocessor has no definition to replace it with.

3.6 File Inclusion

We already know that the #include directive causes another file to be compiled as if its contents actually appeared in place of the #include directive.

The replacement works simply: the preprocessor removes the directive and inserts the contents of the included file in its place. If such a file is included 10 times, it is actually compiled 10 times.

3.6.1 How header files are included

  • Including local files
#include "filename"

Search strategy: the preprocessor first searches the directory containing the source file. If the header is not found there, it searches the standard locations, just as it does for library headers.

If it is still not found, a compilation error is reported.

Path to standard header files for Linux environment:

/usr/include

Path to standard header files for VS environment: (VS2013)

C:\Program Files (x86)\Microsoft Visual Studio 12.0\VC\include

Pay attention to find it according to your own installation path.

  • Including library files
#include <filename.h>

The header file is looked up directly in the standard paths. If it is not found, a compilation error is reported.

Can library headers also be included with ""?

Yes. However, the lookup is less efficient that way, and it also becomes harder to tell whether a header is a library file or a local file.

3.6.2 Nested file inclusion

If such a scenario occurs:

comm.h and comm.c are common modules.

test1.h and test1.c use common modules.

test2.h and test2.c use common modules.

test.h and test.c use the test1 module and test2 module.

As a result, the contents of comm.h appear twice in test.c after preprocessing, so the file content is duplicated.

How can this problem be solved?

Answer: conditional compilation.

At the beginning of each header file write:

#ifndef __TEST_H__
#define __TEST_H__
// contents of the header file
#endif   //__TEST_H__

Reason: The first time the file is included, __TEST_H__ is not yet defined, so the contents are processed and __TEST_H__ gets defined. When the file is included a second time, __TEST_H__ is already defined, so the #ifndef test fails and everything up to the #endif is skipped. The contents of test.h are therefore never included twice.

Note: The macro does not have to be called __TEST_H__. It is usually derived from the header's file name so that each header has a distinct, recognizable guard.

or:

#pragma once

This also prevents a header from being included more than once. #pragma once is not part of the C standard, but most modern compilers support it.


Origin blog.csdn.net/m0_57304511/article/details/123211919