[C++] 4. The preprocessor: conditional compilation, source file inclusion, macro replacement, line number redefinition, error messages, and implementation-reserved directives

A C++ build can be divided into four steps: preprocessing, compilation, assembly, and linking.

  • Preprocessing performs macro replacement, header file inclusion, and so on; it is the subject of this article.
  • Compilation performs syntactic and semantic analysis of the preprocessed code and produces assembly code or another intermediate representation close to assembly.
  • Assembly converts the assembly or intermediate code from the previous step into binary instructions for the target machine. Generally, each source file produces one object file (.obj with VS, .o with GCC).
  • Linking combines the object files from the previous step into an executable file, a library file, and so on.

1. Overview

Preprocessing consists of 4 stages:

  • Character mapping (including trigraph replacement): system-dependent source characters are mapped to the characters defined by the C++ standard without changing their meaning; for example, the different line endings used by different operating systems are uniformly replaced with the newline character. (Trigraphs were removed in C++17.)
  • Line splicing (line continuation): each "\" immediately followed by a newline is deleted together with that newline. The process is performed only once (if "\" is followed by two newlines, only the "\" and the first newline are deleted);
  • Tokenization: the source text is decomposed into comments, whitespace, and preprocessing tokens (identifiers and the like are all just preprocessing tokens at this point, because their roles are not yet known; they are only interpreted in the next step);
  • Directive execution: preprocessing directives are executed and macros are expanded; for each #include directive, steps 1-4 are applied recursively to the included file. After this step, the source code no longer contains any preprocessing directives (the lines beginning with #).

The result of preprocessing is that the branches whose conditional compilation tests failed are deleted, macros are replaced, header files are inserted, and so on.
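The line splicing and tokenization stages can be observed directly. The following is a minimal sketch; SUM3, spliced_text, and comment_as_space are hypothetical names chosen for illustration, not part of any library.

```cpp
#include <cassert>
#include <string>

// Line splicing removes each backslash-newline pair, so this directive
// is seen by the preprocessor as one logical line.
#define SUM3(a, b, c) \
    ((a) +            \
     (b) +            \
     (c))

// Splicing happens before tokenization, so it also applies inside
// a string literal: the two physical lines form one literal.
inline std::string spliced_text() {
    return "spliced \
line";
}

// During tokenization each comment is replaced by a single space, so the
// comment below separates "int" and "value" instead of gluing them together.
inline int comment_as_space() {
    int/* comment */value = SUM3(1, 2, 3);
    assert(value == 6);
    return value;
}
```

Calling spliced_text() yields the text "spliced line", because the second physical line starts in column 1 and therefore contributes no extra whitespace.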

Preprocessing operates on translation units: a translation unit is a source file together with all the text files it includes, directly or indirectly, through #include. Generally, the compiler produces one object file per translation unit (.obj with VS, .o with GCC).

2. Format

The general format of a preprocessing directive is as follows:

# preprocessing_instruction [arguments] newline

The directives are listed below. (Directives other than those listed below are not part of the C++ standard, although some compilers implement their own. Following the principle that "portability is more important than efficiency", you should try to use only the directives defined by the C++ standard.)

  • Null directive: a # followed by a newline; it has no effect, similar to an empty statement;
  • Conditional compilation: #if, #ifdef, #ifndef, #else, #elif, #endif;
  • Source file inclusion: #include;
  • Macro replacement: #define and #undef, together with the # and ## operators;
  • Line number and file name redefinition: #line;
  • Error messages: #error;
  • Implementation-reserved directives: #pragma.
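The directive format above allows whitespace around the #, and a lone # is a valid null directive. A minimal sketch, assuming the hypothetical macro names SPACED and INDENTED:

```cpp
#include <cassert>

// A lone # on a line is the null directive and has no effect.
#

// Whitespace may follow the # ...
#   define SPACED 1

// ... and may also precede it.
    #define INDENTED 2

inline int spaced_plus_indented() {
    assert(SPACED == 1 && INDENTED == 2);
    return SPACED + INDENTED;
}
```

All three directive lines are accepted, and spaced_plus_indented() simply adds the two macro values.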

2.1 Conditional compilation

A conditional compilation block starts with #if, #ifdef, or #ifndef, followed by zero or more #elif directives, then at most one #else, and ends with #endif.

#include <iostream>

#define ABCD 2

int main() {
#ifdef ABCD
    std::cout << "1: yes\n";
#else
    std::cout << "1: no\n";
#endif

#ifndef ABCD
    std::cout << "2: no1\n";
#elif ABCD == 2
    std::cout << "2: yes\n";
#else
    std::cout << "2: no2\n";
#endif

#if !defined(DCBA) && (ABCD < 2 * 4 - 3) // ABCD is 2 and 2 * 4 - 3 is 5, so 2 < 5 holds
    std::cout << "3: yes\n";
# endif
    std::cin.get();
    return 0;
}

// code result:
1: yes
2: yes
3: yes
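The third test in the example above succeeds because #if evaluates an ordinary integer constant expression after macro expansion. The sketch below spells this out; BRANCH_TAKEN, UNDEFINED_IS_ZERO, FLAG_SEEN, NOT_A_MACRO, and EMPTY_FLAG are hypothetical names used only for illustration.

```cpp
#include <cassert>

#define ABCD 2

// ABCD expands to 2, and 2 * 4 - 3 is 5, so the condition is 2 < 5.
#if ABCD < 2 * 4 - 3
#define BRANCH_TAKEN 1
#else
#define BRANCH_TAKEN 0
#endif

// An identifier that is not a macro (NOT_A_MACRO here) is replaced
// by 0 when a #if expression is evaluated.
#if NOT_A_MACRO == 0
#define UNDEFINED_IS_ZERO 1
#else
#define UNDEFINED_IS_ZERO 0
#endif

// defined(X) tests only whether X is a macro; its value is irrelevant,
// so it also works for macros with an empty replacement list.
#define EMPTY_FLAG
#if defined(EMPTY_FLAG) && !defined(DCBA)
#define FLAG_SEEN 1
#else
#define FLAG_SEEN 0
#endif

inline int branch_summary() {
    assert(BRANCH_TAKEN == 1);
    return BRANCH_TAKEN * 100 + UNDEFINED_IS_ZERO * 10 + FLAG_SEEN;
}
```

Each digit of the value returned by branch_summary() records whether the corresponding test succeeded.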

Conditional compilation is widely used for system-dependent code that must work across platforms. Such code typically identifies the operating system, processor architecture, and compiler by testing for certain predefined macros, and then conditionally compiles different code for each system.
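As a small illustration of this kind of detection, the sketch below picks a compiler name at preprocessing time. The macros __clang__, __GNUC__, and _MSC_VER are the ones these compilers are known to predefine; the function names compiler_name and describe_build are hypothetical.

```cpp
#include <cassert>
#include <string>

inline std::string compiler_name() {
// Clang also defines __GNUC__, so it has to be tested first.
#if defined(__clang__)
    return "clang";
#elif defined(__GNUC__)
    return "gcc";
#elif defined(_MSC_VER)
    return "msvc";
#else
    return "unknown";
#endif
}

inline std::string describe_build() {
    const std::string name = compiler_name();
    assert(!name.empty());  // exactly one branch above is always compiled
    return "built with " + name;
}
```

Only one of the return statements survives preprocessing, so the other branches cost nothing at run time.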

PS: That said, the greatest value of the C++ standard is that it makes all C++ implementations behave consistently, so you should not make any assumptions about the system unless you are calling system functions.

2.2 Source file inclusion

The #include directive inserts the contents of a file at the position of the directive; the included file is itself preprocessed recursively (steps 1-4, see Section 1). #include has three forms:

  • #include <filename>: searches for filename in the standard include directories (the C++ standard library headers are generally located there);
  • #include "filename": searches first in the directory of the source file being processed and, if the file is not found there, in the standard include directories;
  • #include pp-tokens: pp-tokens must be a macro that expands to <filename> or "filename"; otherwise the result is undefined.

Note that filename can be any text file, not necessarily one with a .h or .hpp suffix; for example, it can be a .c or .cpp file (which is why this section is titled "source file inclusion" rather than "header file inclusion").

// file: b.cpp
#ifndef B_CPP
#define B_CPP
int b = 999;
#endif // B_CPP
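
// file: a.cpp
#include<iostream> // searched in the standard include directories
#include"b.cpp" // searched first in the source file's directory, then in the standard include directories

#define CMATH <cmath> // together with the next line, equivalent to #include <cmath>, a standard library header
#include CMATH // same as above

int main() {
    std::cout << b << '\n'; // b is defined in b.cpp
    std::cout << std::log10(10.0) << '\n';
    std::cin.get();
    return 0;
}

// code result: put a.cpp and b.cpp in the same folder and compile only a.cpp (command: g++ a.cpp && ./a.out)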
999
1

2.3 Macro replacement

2.3.1 Syntax

#define defines a macro: every occurrence of the macro name after the #define is replaced with the macro's replacement list, until #undef removes the definition.

Macros are divided into object-like macros (constant macros without parameters) and function-like macros (macros with parameters). The formats are as follows:

#define identifier replacement-list
#define identifier( parameters ) replacement-list
#define identifier( parameters, ... ) replacement-list
#define identifier( ... ) replacement-list
#undef identifier
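The forms above can be sketched as follows. BUFFER_SIZE, MAX_OF, and STILL_DEFINED are hypothetical names; the point is that a macro is expanded where it is used, and that #undef ends its lifetime.

```cpp
#include <cassert>

#define BUFFER_SIZE 16                        // object-like macro
#define MAX_OF(a, b) ((a) > (b) ? (a) : (b))  // function-like macro

inline int at_least_buffer_size(int n) {
    // Expands to ((n) > (16) ? (n) : (16)) before compilation.
    const int result = MAX_OF(n, BUFFER_SIZE);
    assert(result >= n);
    return result;
}

#undef MAX_OF
#undef BUFFER_SIZE

// After #undef the names are no longer macros.
#ifdef BUFFER_SIZE
#define STILL_DEFINED 1
#else
#define STILL_DEFINED 0
#endif

inline bool buffer_size_still_defined() {
    return STILL_DEFINED != 0;
}
```

The function defined before the #undef keeps working, because its body was already expanded when the preprocessor passed over it. The parentheses around each parameter in MAX_OF guard against precedence surprises when an argument is itself an expression.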

In the replacement-list of a function-like macro, "#" placed in front of a parameter turns the corresponding argument into a string literal, and "##" concatenates the tokens on either side of it into a single token. The following example is from cppreference.com:

#include<iostream>

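// make function factory
// "##" pastes tokens together; "#" in front of a parameter turns the argument into a string literal
#define FUNCTION(name, a) int fun_##name() { return a; }
FUNCTION(abcd, 12) // defines fun_abcd(), which takes no arguments and returns 12
FUNCTION(fff, 2)   // defines fun_fff(), which takes no arguments and returns 2
FUNCTION(kkk, 23)  // defines fun_kkk(), which takes no arguments and returns 23
#undef FUNCTION

#define FUNCTION 34 // fun_abcd(), fun_fff(), and fun_kkk() are already defined, so FUNCTION can now be redefined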

#define OUTPUT(a) std::cout << #a << '\n'

int main() {
    std::cout << "abcd: " << fun_abcd() << std::endl; // use function factory
    std::cout << "fff: " << fun_fff() << std::endl;
    std::cout << "kkk: " << fun_kkk() << std::endl;
    std::cout << FUNCTION << std::endl; // the new macro expands to 34
    OUTPUT(million);
    std::cin.get();
    return 0;
}

// code result:
abcd: 12
fff: 2
kkk: 23
34
million
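One detail of "#" is easy to miss: it stringifies the argument exactly as written, without expanding it first. A common workaround is a second macro layer, sketched below with the hypothetical names STR, XSTR, CONCAT, and VERSION_MAJOR.

```cpp
#include <cassert>
#include <string>

#define VERSION_MAJOR 3

#define STR(x) #x          // stringifies the argument exactly as written
#define XSTR(x) STR(x)     // expands the argument first, then stringifies

#define CONCAT(a, b) a##b  // pastes two tokens into one

// CONCAT(get_, answer) becomes the single identifier get_answer.
inline int CONCAT(get_, answer)() {
    return 42;
}

inline std::string version_texts() {
    // STR sees the macro name itself; XSTR sees its expansion.
    const std::string expanded = XSTR(VERSION_MAJOR);
    assert(expanded == "3");
    return std::string(STR(VERSION_MAJOR)) + "/" + expanded;
}
```

Here STR(VERSION_MAJOR) gives "VERSION_MAJOR", while XSTR(VERSION_MAJOR) gives "3", because the argument of XSTR is fully expanded before it reaches the # operator inside STR.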

Variadic macros (macros with a variable number of arguments) were added in C++11 (taken from C99). Inside the replacement list, __VA_ARGS__ refers to the arguments matched by "...". The following example is adapted from the C++11 standard (the example given in the standard itself is slightly different):

#include <cstdio> // for fprintf, puts, and printf
#define debug(...) fprintf(stderr, __VA_ARGS__) // __VA_ARGS__ stands for the arguments matched by "..."
#define showlist(...) puts(#__VA_ARGS__)
#define report(test, ...) ((test) ? puts(#test) : printf(__VA_ARGS__))
int main() {
    int x = 1;
    int y = 2;
    debug("Flag");
    debug("X = %d\n", x);
    showlist(The first, second, and third items.);
    report(x>y, "x is %d but y is %d", x, y);
}

// After preprocessing, the macro invocations above become:
fprintf(stderr, "Flag");
fprintf(stderr, "X = %d\n", x);
puts("The first, second, and third items.");
((x>y) ? puts("x>y") : printf("x is %d but y is %d", x, y));

// code result:
FlagX = 1
The first, second, and third items.
x is 1 but y is 2
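The same two techniques, forwarding __VA_ARGS__ to a printf-style function and stringifying it with "#", can be sketched in a form that returns strings instead of printing. FORMAT_INTO and LIST_TEXT are hypothetical helper names; std::snprintf is the standard C++11 function from <cstdio>.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Forwards all variadic arguments to std::snprintf. The buffer argument
// must be a real array so that sizeof(buffer) is its capacity.
#define FORMAT_INTO(buffer, ...) std::snprintf(buffer, sizeof(buffer), __VA_ARGS__)

// Turns the whole argument list, commas included, into one string literal.
#define LIST_TEXT(...) #__VA_ARGS__

inline std::string compare_text(int x, int y) {
    char buffer[64];
    const int written = FORMAT_INTO(buffer, "x is %d but y is %d", x, y);
    assert(written > 0 && written < static_cast<int>(sizeof(buffer)));
    return buffer;
}

inline std::string list_text() {
    return LIST_TEXT(The first, second, and third items.);
}
```

Returning a std::string keeps the sketch easy to check; in real logging code the macro would more likely write to a stream directly, as debug does in the example above.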

2.3.2 Predefined macros built into the C++ standard

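// The first six macros below are always defined; the macros from __STDC__ onward may or may not be defined. None of these predefined macros may be #undef'd
// These macros are often used to print debugging information. Predefined macros generally begin with "__", so user-defined macros should avoid names that begin with "__"
// Modern C++ design guidelines discourage using macros to define constants or functions; use #define as little as possible and, where possible, use const variables or inline functions instead
__cplusplus: defined as 199711L in C++98 and 201103L in C++11
__LINE__: the current source line number (starting from 1), a decimal constant
__FILE__: the name of the current source file, a string literal
__DATE__: the date of translation, a string literal of the form "Mmm dd yyyy"
__TIME__: the time of translation, a string literal of the form "hh:mm:ss"
__STDC_HOSTED__: defined as 1 for a hosted implementation, 0 otherwise

__STDC__: indicates conformance to Standard C; may not be defined
__STDC_MB_MIGHT_NEQ_WC__: see ISO/IEC 14882:2011
__STDC_VERSION__: see ISO/IEC 14882:2011
__STDC_ISO_10646__: see ISO/IEC 14882:2011
__STDCPP_STRICT_POINTER_SAFETY__: see ISO/IEC 14882:2011
__STDCPP_THREADS__: see ISO/IEC 14882:2011

// Example: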
#include <iostream>
int main() {
#define PRINT(arg) std::cout << #arg": " << arg << '\n'
    PRINT(__cplusplus);
    PRINT(__LINE__);
    PRINT(__FILE__);
    PRINT(__DATE__);
    PRINT(__TIME__);
#ifdef __STDC__
    PRINT(__STDC__);
#endif
    std::cin.get();
    return 0;
}
// code result:
__cplusplus: 201703
__LINE__: 6
__FILE__: /cppcodes/run/a.cpp
__DATE__: Aug 26 2023
__TIME__: 21:11:36
__STDC__: 1
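Since these macros are mostly used for debugging output, a typical pattern is a small macro that captures the location of the line on which it is written. The sketch below assumes the hypothetical names format_location, WHERE, and language_level.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Builds "file:line" from the values passed in.
inline std::string format_location(const char* file, int line) {
    assert(line > 0);  // __LINE__ starts counting from 1
    std::ostringstream out;
    out << file << ':' << line;
    return out.str();
}

// __FILE__ and __LINE__ are expanded where WHERE() is used, so the
// result names the caller's line, not the line of this definition.
// A function could not do this, which is one of the few remaining
// reasons to reach for a macro.
#define WHERE() format_location(__FILE__, __LINE__)

inline std::string language_level() {
#if __cplusplus >= 201103L
    return "C++11 or later";
#else
    return "pre-C++11";
#endif
}
```

A statement such as std::cerr << WHERE() << ": something went wrong\n"; would then prefix the message with the file and line of that statement.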

2.4 Redefine line numbers and file names

Starting from the source line that follows #line number ["filename"], __LINE__ is renumbered to start from number, and __FILE__ is redefined as "filename" (the file name is optional). An example:

#include <iostream>
int main() {
#define PRINT(arg) std::cout << #arg": " << arg << '\n'
#line 999 "WO"
    PRINT(__LINE__);
    PRINT(__FILE__);
    std::cin.get();
    return 0;
}

// code result:
__LINE__: 999
__FILE__: WO
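The renumbering applies to every following line, not just the next one. A minimal sketch, in which the file name "generated.txt" and the function names are hypothetical:

```cpp
#include <cassert>
#include <string>

// The line directly after the directive becomes line 100 of "generated.txt".
#line 100 "generated.txt"
inline int line_after_directive() { return __LINE__; }
inline std::string file_after_directive() { return __FILE__; }
inline int second_line_after_directive() { return __LINE__; }

inline void check_renumbering() {
    // Counting continues normally from the new starting point.
    assert(line_after_directive() == 100);
    assert(second_line_after_directive() == 102);
}
```

Code generators use this directive so that compiler diagnostics point back at the original input file rather than at the generated C++ source.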

2.5 Error messages

#error [message]

#error instructs the compiler to report an error. It is generally used in system-dependent code, for example to detect the type of operating system and report an error from within a conditional compilation block. An example:

int main(){
#error "w"
    return 0;
#error
}

// code result:
/cppcodes/run/a.cpp:2:2: error: "w"
#error "w"
 ^
/cppcodes/run/a.cpp:4:2: error:
#error
 ^
2 errors generated.
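In practice #error is almost always placed inside a conditional block, so that it fires only when an assumption is violated. The sketch below compiles cleanly on common platforms; CHAR_BIT is the standard macro from <climits>, and bits_in_int is a hypothetical function name.

```cpp
#include <cassert>
#include <climits>

#ifndef __cplusplus
#error "This file must be compiled as C++"
#endif

// Stop the build early if the platform does not match the assumption
// that the rest of the code relies on.
#if CHAR_BIT != 8
#error "This code assumes 8-bit bytes"
#endif

inline int bits_in_int() {
    assert(CHAR_BIT == 8);  // guaranteed by the #error check above
    return CHAR_BIT * static_cast<int>(sizeof(int));
}
```

Compared with a run-time check, the failure happens during the build and the message explains what is wrong.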

2.6 Compiler reserved instructions

The #pragma directive is reserved by the C++ standard for use by specific C++ implementations, so the arguments and meaning of #pragma may differ from compiler to compiler.
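As one illustration of an implementation-specific pragma, GCC and Clang accept #pragma GCC diagnostic to change warning settings for a region of code. The sketch below guards the pragmas with compiler detection so that other compilers never see them; sum_to is a hypothetical function name.

```cpp
#include <cassert>

#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wunused-variable"
#endif

inline int sum_to(int n) {
    int unused_local;  // would normally trigger -Wunused-variable
    assert(n >= 0);
    return n * (n + 1) / 2;
}

#if defined(__GNUC__) || defined(__clang__)
#pragma GCC diagnostic pop  // restore the previous warning settings
#endif
```

Wrapping the pragmas in a compiler test keeps the file portable, in line with the advice given earlier to treat non-standard directives with care.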

3. Application scenarios

Common uses of preprocessing are:

  • Include guards (see the Wikipedia entry): this technique ensures that a header file is included only once by the same file (more precisely, that the content of the header appears only once in a translation unit), which prevents violations of C++'s "one definition" rule;
  • Using #ifdef with special macros to identify the operating system, processor architecture, and compiler, and then conditionally compiling platform-specific functionality, mostly in code written for portability;
  • Defining function-like macros to simplify code or to adjust configuration;
  • Using #pragma to set implementation-specific configuration (see the previous section).

There is a project on sourceforge.net that documents the macros used to detect operating systems, processor architectures, and compilers (see references). A code example follows:

#ifdef _WIN64
   //define something for Windows (64-bit)
#elif _WIN32
   //define something for Windows (32-bit)
#elif __APPLE__
    #include "TargetConditionals.h"
    #if TARGET_OS_IPHONE && TARGET_IPHONE_SIMULATOR
        // define something for simulator   
    #elif TARGET_OS_IPHONE
        // define something for iphone  
    #else
        #define TARGET_OS_OSX 1
        // define something for OSX
    #endif
#elif __linux
    // linux
#elif __unix // all unices not caught above
    // Unix
#elif __posix
    // POSIX
#endif
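The include guard from the first item in the list above can be demonstrated in a single file by writing out the text of a guarded header twice, which is what the preprocessor effectively sees when the header is included twice. The header name point.h, the guard macro POINT_H, and the function manhattan_length are hypothetical.

```cpp
#include <cassert>

// ---- first inclusion of the (hypothetical) header point.h ----
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif // POINT_H

// ---- second inclusion of the same text: POINT_H is already defined,
// ---- so everything up to #endif is skipped and Point is not redefined
#ifndef POINT_H
#define POINT_H
struct Point {
    int x;
    int y;
};
#endif // POINT_H

inline int manhattan_length(const Point& p) {
    const int ax = p.x < 0 ? -p.x : p.x;
    const int ay = p.y < 0 ? -p.y : p.y;
    assert(ax >= 0 && ay >= 0);
    return ax + ay;
}
```

Without the guard, the second copy of struct Point would be a redefinition error within the same translation unit.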

Origin blog.csdn.net/jiaoyangwm/article/details/132514208