C++ preprocessing and macro-related programming (#, ## etc.)

1. Introduction

C++ template metaprogramming is powerful, but it has limitations:

  • Expanding a template cannot generate a new identifier, for example a new function name, class name, or namespace name.
  • Template code can only use identifiers through predefined template parameters; it cannot obtain the literal text of a token.
  • For example, reflection needs the literal name of an actual parameter, and an assertion needs the literal text of the asserted expression.

Therefore, when you need to manipulate identifiers directly, you also need macros, which perform metaprogramming in the preprocessing stage:

  • Unlike templates, which are expanded at compile time, macros are fully expanded in the preprocessing stage, before compilation; in a narrow sense, the compiler never sees the macro code.
  • With #define, TOKEN1##TOKEN2, and #TOKEN you can define object-like macros and function-like macros, which provide text replacement, identifier splicing, and access to the literal text of an argument.
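
A minimal sketch of these three facilities follows. The names BUFFER_SIZE, PP_STRINGIFY, PP_CONCAT, make_counter, and same_text are made up for illustration and do not come from any library:

```cpp
// Object-like macro: plain text replacement.
#define BUFFER_SIZE 64
// Function-like macro: # yields the literal text of its argument.
#define PP_STRINGIFY(x) #x
// Function-like macro: ## splices two tokens into one identifier.
#define PP_CONCAT(a, b) a##b

// Expands to `constexpr int make_counter() { return 64; }`:
// the function name is assembled by the preprocessor.
constexpr int PP_CONCAT(make_, counter)() { return BUFFER_SIZE; }

// Compile-time string comparison helper (illustrative only).
constexpr bool same_text(const char* a, const char* b) {
    return *a == *b && (*a == '\0' || same_text(a + 1, b + 1));
}

static_assert(make_counter() == 64, "identifier was spliced from two tokens");
static_assert(same_text(PP_STRINGIFY(a + b == c), "a + b == c"),
              "# captured the literal spelling of the expression");
```

The function name make_counter exists only after preprocessing, which is something a template alone cannot produce.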

 

1.1 About C++ Macro Programming and Debugging

Many people find that macro code cannot be debugged and go straight "from getting started to giving up": a casually misspelled identifier or a wrong number of arguments keeps the text from being replaced correctly, which produces a screenful of compile errors, and in the end the real problem is hard to locate.

  • In the worst case, the compiler only tells you that there was a syntax error when compiling the .cpp file
  • In the best case, the compiler may tell you that the expansion result of macro XXX contains a syntax error
  • But it does not tell you what macro XXX expanded to, or that this is what caused the expansion of macro YYY to fail
  • In the end, you only see an error in the expansion of macro ZZZ

Since the macro code will be fully expanded before compilation, we can:

  • Let the compiler output only the preprocessed result
  • gcc -E makes the compiler stop after preprocessing, without compiling or linking
  • gcc -P removes the linemarkers from the preprocessed output to reduce noise; combine it with the __LINE__ macro to locate the line and position in the code
  • In addition, since the output is not formatted, it is recommended to pass it through clang-format before reading it
  • Exclude irrelevant header files
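
As a rough illustration of this workflow, the sketch below keeps a file free of #include lines so that its preprocessed output stays short. The file name macro_debug.cpp and the macros SQUARE and WHERE are hypothetical, and the command in the comment assumes that g++ and clang-format are installed:

```cpp
// macro_debug.cpp (hypothetical file name)
//
// Preprocess only, without linemarkers, then format the result:
//   g++ -E -P macro_debug.cpp | clang-format
//
// No headers are included, so the output contains only the code below
// with every macro already expanded.

#define SQUARE(x) ((x) * (x))
#define WHERE() __LINE__

// After preprocessing: constexpr int squared(int v) { return ((v) * (v)); }
constexpr int squared(int v) { return SQUARE(v); }

// __LINE__ expands to the number of the source line it appears on.
constexpr int first_line = WHERE();
constexpr int next_line = WHERE();

static_assert(squared(7) == 49, "SQUARE(v) expanded to ((v) * (v))");
static_assert(next_line == first_line + 1, "consecutive lines differ by one");
```

Reading the preprocessed text shows exactly what the compiler receives, which is usually easier than guessing the expansion from an error message.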

 

2. Common usage patterns in macro programming

Macros in C and C++ belong to compiler preprocessing; they are a compile-time concept, not a runtime concept. The following is a brief summary of frequently encountered macro usage problems.

2.1 About # and ## (stringification and token splicing)

In a C macro, the # operator stringifies the macro parameter that follows it. Simply put, after the parameter is replaced by the argument, a double quotation mark is added on each side of it. For example, the macro in the following code:

#define WARN_IF(EXP) \
do { \
    if (EXP) \
        fprintf(stderr, "Warning: " #EXP "\n"); \
} while (0)

Then the following replacement process will appear in actual use:

WARN_IF (divider == 0);

Is replaced by:

do {
    if (divider == 0)
        fprintf(stderr, "Warning: " "divider == 0" "\n");
} while (0);

In this way, whenever divider (the divisor) is 0, a warning message is written to the standard error stream.

The ## operator is called the concatenation operator; it joins two tokens into one token. Note that the tokens being joined can be any tokens, not necessarily macro parameters. For example, suppose you want an array of structures, each made up of a menu command name and a function pointer, with an intuitive naming relationship between the function name and the command name. Then the following code is very practical:

struct command
{
    const char *name;
    void (*function) (void);
};

#define COMMAND(NAME) { #NAME, NAME ## _command }

Then you can conveniently initialize an array of command structures from some predefined commands:

struct command commands[] = {
    COMMAND(quit),
    COMMAND(help),
    ...
};

Here the COMMAND macro acts as a code generator, which reduces the amount of repeated code to a certain extent and indirectly reduces mistakes caused by carelessness. We can also use n ## operators to join n+1 tokens, which is something the # operator cannot do. For example:

#define LINK_MULTIPLE(a,b,c,d) a##_##b##_##c##_##d 
typedef struct _record_type LINK_MULTIPLE(name,company,position,salary); 

Here this statement will expand to:

 typedef struct _record_type name_company_position_salary; 
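
As a sketch of how the COMMAND table above could be made to compile and be checked, the block below assumes two hypothetical handlers, quit_command and help_command, and changes the handler type to return int so that the result can be verified at compile time:

```cpp
struct Command {
    const char* name;
    int (*function)();
};

// Hypothetical handlers; real ones would do actual work.
constexpr int quit_command() { return 0; }
constexpr int help_command() { return 1; }

// #NAME produces the string "quit"; NAME##_command produces quit_command.
#define COMMAND(NAME) { #NAME, NAME##_command }

// Expands to { "quit", quit_command }, { "help", help_command }.
constexpr Command commands[] = {
    COMMAND(quit),
    COMMAND(help),
};

static_assert(commands[0].name[0] == 'q', "name comes from stringification");
static_assert(commands[1].function() == 1, "handler comes from token pasting");
```

Adding a command then takes one COMMAND(...) line plus the handler, and the name and the handler cannot drift apart.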

2.2 About ... and ## (variable-length parameters)

In a C macro, ... declares a variadic macro, that is, a macro that accepts a variable number of arguments.

Starting with C99 (and in GNU C, and in C++ since C++11), macros can accept a variable number of arguments, just like variadic functions. As with functions, three dots (...) represent the variable arguments.

__VA_ARGS__ macro

__VA_ARGS__ stands for the variable arguments. Simply put, whatever is passed for ... on the left side of the macro definition is copied as-is to the position of __VA_ARGS__ on the right side. For example:

#include <stdio.h>
#define debug(...) printf(__VA_ARGS__)
int main(void)
{
    int year = 2018;
    debug("this year is %d\n", year);  // same effect as printf("this year is %d\n", year);
}

Naming the variable parameter

In addition, GNU C lets you give the variable parameter a name instead of using __VA_ARGS__ (this is a GNU extension, not standard C or C++), such as args in the following example:

#include <stdio.h>
#define debug(format, args...) printf(format, args)
int main(void)
{
    int year = 2018;
    debug("this year is %d\n", year);  // same effect as printf("this year is %d\n", year);
}

Passing no variable arguments

Unlike a variadic function, the variadic macro above must be given at least one variable argument; otherwise the expansion leaves a dangling comma and an error is reported. To solve this problem, GNU C gives the ## operator a special meaning: when the variable arguments are omitted or empty, a ## placed between the comma and the variable parameter makes the preprocessor remove the comma in front of it, as shown in the following example:

#include <stdio.h>
#define debug(format, args...) printf(format, ##args)
int main(void)
{
    int year = 2018;
    debug("hello, world");  // only the format argument, no variable arguments
}
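
The comma-removing ## shown above is a GNU extension. One portable alternative, sketched here under the assumption that the format string can be folded into the variable arguments, is to declare the macro with ... only. The helper extra_args and the macro debug_extra_args are hypothetical stand-ins for printf so that the behaviour can be checked at compile time:

```cpp
#include <cstdio>

// Portable variant: the format string is simply the first of the variable
// arguments, so there is never a dangling comma to remove.
#define debug(...) std::printf(__VA_ARGS__)

// Stand-in for printf that only counts the arguments after the format.
template <typename... Ts>
constexpr int extra_args(const char*, Ts...) {
    return static_cast<int>(sizeof...(Ts));
}
#define debug_extra_args(...) extra_args(__VA_ARGS__)

static_assert(debug_extra_args("hello, world\n") == 0, "format only");
static_assert(debug_extra_args("this year is %d\n", 2018) == 1, "one value");

void print_examples() {
    debug("hello, world\n");           // printf("hello, world\n")
    debug("this year is %d\n", 2018);  // printf("this year is %d\n", 2018)
}
```

The trade-off is that the macro can no longer refer to the format string separately from the remaining arguments.
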

The ## concatenation operator

For example, given the definition #define XNAME(n) x##n and the code XNAME(4), the preprocessor finds that XNAME(4) matches XNAME(n) and binds n to 4. It then substitutes 4 for n in the replacement text x##n and joins the tokens, so XNAME(4) finally becomes x4. As shown in the following example:

#include <stdio.h>
#define XNAME(n) x##n
#define PRINT_XN(n) printf("x" #n " = %d\n", x##n)
int main(void)
{
    int XNAME(1) = 14; // becomes int x1 = 14;
    int XNAME(2) = 20; // becomes int x2 = 20;
    PRINT_XN(1);       // becomes printf("x1 = %d\n", x1);
    PRINT_XN(2);       // becomes printf("x2 = %d\n", x2);
    return 0;
}

2.3 Special symbols

Unlike template metaprogramming, macro programming has no concept of type. Both input and output are tokens; no C++ syntax is involved, only text replacement performed before compilation:

  • A macro argument is an arbitrary token sequence; different arguments are separated by commas
  • Each argument can be an empty sequence, and whitespace is ignored (for example, a + 1 is the same as a+1)
  • An argument cannot contain a comma that is not enclosed in parentheses, or an unmatched parenthesis (for example, FOO(bool, std::pair<int, int>) is treated as a call to FOO() with three arguments: bool, std::pair<int, and int>)

If you need std::pair<int, int> as an argument, one method is to use a C++ type alias (for example, using IntPair = std::pair<int, int>;) to avoid the comma (FOO(bool, IntPair) has only two arguments).

A more general method is to wrap each argument in a pair of parentheses (hereinafter called a tuple) and remove the parentheses during the final expansion (tuple unpacking):

#define PP_REMOVE_PARENS(T) PP_REMOVE_PARENS_IMPL T
#define PP_REMOVE_PARENS_IMPL(...) __VA_ARGS__

#define FOO(A, B) int foo(A x, B y)
#define BAR(A, B) FOO(PP_REMOVE_PARENS(A), PP_REMOVE_PARENS(B))

FOO(bool, IntPair)                  // -> int foo(bool x, IntPair y)
BAR((bool), (std::pair<int, int>))  // -> int foo(bool x, std::pair<int, int> y)
  • PP_REMOVE_PARENS(T) expands to the form PP_REMOVE_PARENS_IMPL T
  • If the argument T is enclosed in a pair of parentheses, the expansion result takes the form of a call to the function-like macro PP_REMOVE_PARENS_IMPL (...)
  • Then PP_REMOVE_PARENS_IMPL(...) expands to its arguments __VA_ARGS__ (the variable-length parameter described in section 2.2), which is the content of the tuple T

In addition, function-like macros are commonly used in place of special symbols, which is useful for lazy evaluation:
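
The expansion described in these steps can be checked with a small sketch. It reuses the FOO and BAR macros from the example and only declares foo, which is enough to inspect its type:

```cpp
#include <type_traits>
#include <utility>

#define PP_REMOVE_PARENS(T) PP_REMOVE_PARENS_IMPL T
#define PP_REMOVE_PARENS_IMPL(...) __VA_ARGS__

#define FOO(A, B) int foo(A x, B y)
#define BAR(A, B) FOO(PP_REMOVE_PARENS(A), PP_REMOVE_PARENS(B))

// Each tuple keeps its inner comma hidden while BAR splits its arguments.
// Declares: int foo(bool x, std::pair<int, int> y);
BAR((bool), (std::pair<int, int>));

static_assert(
    std::is_same<decltype(&foo), int (*)(bool, std::pair<int, int>)>::value,
    "the parentheses were removed and the comma survived");
```

The parentheses protect the comma only until the tuple is unpacked, so the unpacking should happen at the last step of the expansion.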

#define PP_COMMA() ,
#define PP_LPAREN() (
#define PP_RPAREN() )
#define PP_EMPTY()
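
One way such a macro might be used is sketched below. DECLARE_ALIAS is a hypothetical macro for illustration. Because a macro's arguments are separated before they are expanded, PP_COMMA() can carry a comma inside a single argument:

```cpp
#include <type_traits>
#include <utility>

#define PP_COMMA() ,

// Hypothetical two-parameter macro that declares a type alias.
#define DECLARE_ALIAS(NAME, TYPE) using NAME = TYPE;

// The call has exactly two arguments, because PP_COMMA() is still an
// unexpanded macro call when the argument list is split.
// Expands to: using IntPair = std::pair<int , int>;
DECLARE_ALIAS(IntPair, std::pair<int PP_COMMA() int>)

static_assert(std::is_same<IntPair, std::pair<int, int>>::value,
              "the comma appeared only after the arguments were separated");
```

This works only while the argument is not forwarded to another macro: once PP_COMMA() has been expanded, a further macro call would see an ordinary comma and split the argument there.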

2.4 Other

Macro programming also has considerable computing power; the author is not yet sure whether it is Turing complete. However, popular preprocessor libraries such as BOOST_PP use macros to implement many basic computing constructs, such as increment and decrement, logical operations, Boolean conversion, conditional selection, lazy evaluation, subscript access, testing whether the variable-length argument list is empty, length calculation, traversal, and symbol matching. They also provide common data structures (such as tuples, sequences, lists, and arrays), as well as numerical operations and numerical comparisons. See the Zhihu article "The Art of C/C++ Macro Programming".
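
The idea behind such computations can be sketched with small lookup tables built from token pasting. The macros below are simplified illustrations, not the actual BOOST_PP definitions, and the increment table is truncated to a few entries:

```cpp
// Two-level concatenation so that arguments are expanded before pasting.
#define PP_CONCAT(A, B) PP_CONCAT_IMPL(A, B)
#define PP_CONCAT_IMPL(A, B) A##B

// Logical NOT by table lookup: PP_NOT(0) -> PP_NOT_0 -> 1.
#define PP_NOT(N) PP_CONCAT(PP_NOT_, N)
#define PP_NOT_0 1
#define PP_NOT_1 0

// Increment by table lookup (table truncated to 0..3 for brevity).
#define PP_INC(N) PP_CONCAT(PP_INC_, N)
#define PP_INC_0 1
#define PP_INC_1 2
#define PP_INC_2 3
#define PP_INC_3 4

// Conditional selection: PP_IF(1, a, b) -> a, PP_IF(0, a, b) -> b.
#define PP_IF(PRED, THEN, ELSE) PP_CONCAT(PP_IF_, PRED)(THEN, ELSE)
#define PP_IF_1(THEN, ELSE) THEN
#define PP_IF_0(THEN, ELSE) ELSE

static_assert(PP_NOT(0) == 1, "logical not");
static_assert(PP_INC(PP_INC(1)) == 3, "increment applied twice");
static_assert(PP_IF(PP_NOT(0), 10, 20) == 10, "conditional selection");
```

Every "operation" here is just the selection of another macro name, so the range of supported values is limited by the size of the tables.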

 

3. Points to note when using macros

3.1 Wrong nesting (Misnesting)

A macro definition does not need to contain complete, balanced parentheses, but to avoid errors and improve readability, it is best to avoid such definitions.
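
A small sketch of what an unbalanced definition looks like (ADD_TO_TEN and add are made-up names). It is legal, but the call site is hard to read, which is the reason to avoid it:

```cpp
constexpr int add(int a, int b) { return a + b; }

// Unbalanced on purpose: the macro supplies the opening parenthesis and the
// first argument; the caller must supply the rest and the closing parenthesis.
#define ADD_TO_TEN add(10,

// Expands to: constexpr int result = add(10, 5);
constexpr int result = ADD_TO_TEN 5);

static_assert(result == 15, "the unbalanced macro still forms a valid call");
```

Editors, formatters, and readers all tend to misjudge where such a call begins and ends.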

3.2 Problems caused by operator precedence (Operator Precedence Problem)

Since a macro is only simple text replacement, if a macro argument is a compound expression, then after replacement the operators that join the arguments in the macro body may bind more tightly than the operators inside a single argument. If we do not protect each macro parameter with parentheses, unexpected results may occur. For example:

#define ceil_div(x, y) (x + y - 1) / y 

Then

a = ceil_div( b & c, sizeof(int) ); 

Will be transformed into:

a = ( b & c + sizeof(int) - 1) / sizeof(int); 

Since + and - have higher precedence than &, the expression above is equivalent to:

a = ( b & (c + sizeof(int) - 1)) / sizeof(int); 

This is obviously not the original intention of the caller. In order to prevent this from happening, you should write a few more parentheses:

#define ceil_div(x, y) (((x) + (y) - 1) / (y)) 
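
The difference can be verified with concrete numbers. In this sketch the two variants are renamed CEIL_DIV_BAD and CEIL_DIV so that they can coexist, and the divisor is the literal 4 instead of sizeof(int) so that the result does not depend on the platform:

```cpp
#define CEIL_DIV_BAD(x, y) (x + y - 1) / y
#define CEIL_DIV(x, y) (((x) + (y) - 1) / (y))

constexpr int b = 8;
constexpr int c = 13;

// Unprotected: (b & c + 4 - 1) / 4  ==  (8 & 16) / 4  ==  0
static_assert(CEIL_DIV_BAD(b & c, 4) == 0, "+ binds tighter than &");

// Protected: (((b & c) + (4) - 1) / (4))  ==  11 / 4  ==  2
static_assert(CEIL_DIV(b & c, 4) == 2, "each parameter is parenthesized");
```

The outer pair of parentheses in CEIL_DIV matters as well: it keeps the division from combining with operators that surround the macro call.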

3.3 Eliminate unnecessary semicolons (Semicolon Swallowing)

Generally, to make a function-like macro look like an ordinary C function call, we add a semicolon after the macro invocation, as with the following macro with a parameter:

MY_MACRO(x); 

But if it is the following situation:

#define MY_MACRO(x) { \
/* line 1 */ \
/* line 2 */ \
/* line 3 */ }

//... 

if (condition()) 
    MY_MACRO(a); 
else {
  ...
} 

This causes a compilation error because of the extra semicolon: the ; after the closing brace ends the if statement, so the else has no matching if. To avoid this while still writing MY_MACRO(x);, we need to define the macro in this form:

#define MY_MACRO(x) do { \
/* line 1 */ \
/* line 2 */ \
/* line 3 */ } while(0)

So as long as you always use semicolons, you won't have any problems.
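
A compilable sketch of the do { ... } while (0) form inside an if/else follows. RESET_TO_ZERO and reset_or_double are hypothetical names, and the constexpr function assumes C++14 or later:

```cpp
// The do-while wrapper turns the body into a single statement that
// needs the trailing semicolon supplied by the caller.
#define RESET_TO_ZERO(v) do { (v) = 0; } while (0)

constexpr int reset_or_double(int value) {
    if (value < 0)
        RESET_TO_ZERO(value);  // the ; completes the do-while statement
    else
        value *= 2;            // the else still pairs with the if above
    return value;
}

// With `#define RESET_TO_ZERO(v) { (v) = 0; }` the same if/else would not
// compile, because `};` would end the if statement before the else.
static_assert(reset_or_double(-5) == 0, "if branch ran the macro body");
static_assert(reset_or_double(5) == 10, "else branch is still reachable");
```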

3.4 Duplication of Side Effects

The side effect here means that when the macro is expanded, its arguments may be evaluated more than once. If a macro argument is a function call, the function may be called multiple times, producing inconsistent results or even more serious errors. For example:

#define min(X,Y) ((X) > (Y) ? (Y) : (X)) 

//... 

c = min(a,foo(b)); 

Here foo() is called twice whenever a > foo(b). To solve this potential problem, we should write the min(X,Y) macro like this:

#define min(X,Y) ({ \
typeof (X) x_ = (X); \
typeof (Y) y_ = (Y); \
(x_ < y_) ? x_ : y_; })

({ ... }) is a statement expression: it yields the value of the last statement inside it, and it also allows variables to be declared inside (because the braces form a scope). Statement expressions are a GNU extension, and typeof is a GNU extension that was only standardized in C23; neither is standard C++.
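
In C++ code that cannot rely on compiler extensions, one portable alternative is to replace the macro with a function template, which evaluates each argument exactly once. The sketch below counts the evaluations; MIN_MACRO, min_value, and the two counting functions are illustrative names, and the constexpr functions assume C++14 or later:

```cpp
#define MIN_MACRO(X, Y) ((X) > (Y) ? (Y) : (X))

template <typename T>
constexpr T min_value(T x, T y) { return x > y ? y : x; }

// (++calls, 3) stands in for a function call with a side effect.
constexpr int evaluations_with_macro() {
    int calls = 0;
    int smallest = MIN_MACRO(5, (++calls, 3));  // argument text appears twice
    return smallest == 3 ? calls : -1;
}

constexpr int evaluations_with_function() {
    int calls = 0;
    int smallest = min_value(5, (++calls, 3));  // argument evaluated once
    return smallest == 3 ? calls : -1;
}

static_assert(evaluations_with_macro() == 2, "macro duplicated the side effect");
static_assert(evaluations_with_function() == 1, "function evaluated it once");
```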

Some interesting questions

  • The following code:
#include <stdio.h>
#define display(name) printf(""#name"")
int main(void) {
    display(name);
}

The program prints name. Why not "#name"?

Here # means stringification, so printf(""#name"") is equivalent to printf("" "name" ""):

printf("" #name "") <1> is equivalent to
printf("" "name" "") <2>,
and the three adjacent string literals in <2> are concatenated as (empty + name + empty).

The ## concatenation operator consists of two pound signs. In the definition of a macro with parameters, it joins two tokens into a new token, but it cannot appear at the very beginning or the very end of the replacement text. A token here is the smallest syntactic unit that the compiler can recognize; the precise definition is covered in compiler theory, but it does not matter if you are not familiar with it. Also note that the # operator turns the passed argument into a string. Let's see how they work with an example from MSDN.

Assuming that such a macro with parameters has been defined in the program

#define paster( n ) printf( "token" #n " = %d", token##n ) 

At the same time, an integer variable is defined:

int token9 = 9; 

Now call this macro in the main program in the following way:

paster(9); 

Then during preprocessing, the statement above is expanded to:

printf( "token" "9" " = %d", token9 );

Note that in this example, the 9 in paster(9); is treated as text as-is and is joined with token to form token9. Likewise, #n is replaced by "9".

As you would expect, running the program prints token9 = 9 on the screen.

Returning to the display(name) example: what is special is that display is a macro, and a # inside a macro's replacement text is processed as the stringification operator, as explained above. After processing, the argument becomes one more string literal.

Writing printf(""#name""); directly, outside a macro definition, does not work, because # has this meaning only inside a macro definition.

With #define display(name) printf(""#name""), the parameter name is stringified and the result is effectively printf("name") (the empty strings before and after it contribute nothing), so the output is naturally name.

From another perspective, # is an operator that the preprocessor consumes, so it never appears in the output.

 

 

Origin blog.csdn.net/smilejiasmile/article/details/113771439