Pitfalls and special uses of C macro definitions

This article mainly follows the GCC documentation, so some details (such as how whitespace in macro arguments is handled) may differ slightly in other compilers. Refer to your compiler's documentation where it matters.

Macro basics

Macros are just a text replacement tool applied in the C preprocessing stage; they leave no trace in the compiled binary. The basic usage is as follows:

  1. Identifier alias
#define BUFFER_SIZE 1024

In the preprocessing stage, foo = (char *) malloc (BUFFER_SIZE); is replaced with foo = (char *) malloc (1024);

A macro body that spans several lines needs a backslash at the end of every line except the last:

#define NUMBERS 1, \
                2, \
                3

In the preprocessing stage, int x[] = { NUMBERS }; is expanded to int x[] = { 1, 2, 3 };
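As a minimal sketch of the two object-like macros above in use (the function names numbers_count and make_buffer are hypothetical and only serve the illustration):

```c
#include <assert.h>
#include <stdlib.h>

#define BUFFER_SIZE 1024
#define NUMBERS 1, \
                2, \
                3

/* Hypothetical helper: counts the elements NUMBERS expands to. */
size_t numbers_count(void)
{
    int x[] = { NUMBERS };          /* becomes { 1, 2, 3 } */
    return sizeof x / sizeof x[0];
}

/* Hypothetical helper: allocates a buffer sized by BUFFER_SIZE. */
char *make_buffer(void)
{
    return malloc(BUFFER_SIZE);     /* becomes malloc(1024) */
}
```

The compiler proper never sees the names BUFFER_SIZE or NUMBERS, only the text they were replaced with.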

  2. Macro function
    A macro whose name is immediately followed by parentheses in its definition is a function-like macro. It is invoked like an ordinary function, except that it is expanded during the preprocessing stage. The advantage is that there is no call overhead for saving registers and passing parameters, and avoiding the jump can also help instruction prefetching and branch prediction, so the code is usually fast. The disadvantage is that every use repeats the macro body, so the executable can grow larger.
#define min(X, Y)  ((X) < (Y) ? (X) : (Y))

y = min(1, 2); is expanded to y = ((1) < (2) ? (1) : (2));
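A small sketch of this macro in use follows. Because the expansion is purely textual, the same definition works for different arithmetic types; the two wrapper functions are hypothetical names added for the illustration.

```c
#include <assert.h>

#define min(X, Y)  ((X) < (Y) ? (X) : (Y))

/* Expands to ((1) < (2) ? (1) : (2)). */
int min_int_demo(void)
{
    return min(1, 2);
}

/* The same macro text works for double operands. */
double min_double_demo(void)
{
    return min(2.5, 0.5);
}
```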

Macro special usage

  1. Stringification
    In the macro body, if a macro parameter is preceded by #, the corresponding argument is converted into a string literal when the macro is expanded. For example:
#define WARN_IF(EXP) \
     do {            \
         if (EXP) {  \
             fprintf (stderr, "Warning: " #EXP "\n"); \
         }           \
     } while (0)

WARN_IF (x == 0); is expanded to:

do {
    if (x == 0) {
        fprintf (stderr, "Warning: " "x == 0" "\n");
    }
} while (0);

This technique is useful for assert-style macros: if the assertion fails, the text of the failed expression can be included in the error message.
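The assert-style use could look roughly like the sketch below. CHECK, last_failure and check_demo are hypothetical names invented for this example, not part of any standard header; the standard assert macro is not necessarily implemented this way.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical: text of the most recent failed check, kept for inspection. */
static const char *last_failure = "";

/* Hypothetical assert-like macro: #EXP turns the condition into a string. */
#define CHECK(EXP)                                        \
    do {                                                  \
        if (!(EXP)) {                                     \
            last_failure = #EXP;                          \
            fprintf(stderr, "Check failed: %s\n", #EXP);  \
        }                                                 \
    } while (0)

/* Returns the text of the failed condition, or "" if the check passed. */
const char *check_demo(int x)
{
    last_failure = "";
    CHECK(x != 0);
    return last_failure;
}
```

Calling check_demo(0) reports the literal source text of the condition, which is exactly what a plain function could not do.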

  2. Concatenation
    In the macro body, if a macro parameter appears next to ##, then when the macro is expanded the argument is pasted directly onto the adjacent token to form a single identifier. For example:
#define COMMAND(NAME)  { #NAME, NAME ## _command }

struct command
{
    char *name;
    void (*function) (void);
};

During macro expansion, the array definition

struct command commands[] =
{
    COMMAND (quit),
    COMMAND (help),
    ...
};

is expanded to:

struct command commands[] =
{
    { "quit", quit_command },
    { "help", help_command },
    ...
};

This removes the need to write each command name twice, which saves typing and prevents a name string from drifting out of sync with its function.
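A self-contained sketch of the command table might look as follows. The handler bodies, the last_run variable and the run_command lookup function are assumptions added so that the example compiles; the original only shows the table itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define COMMAND(NAME)  { #NAME, NAME ## _command }

struct command {
    const char *name;
    void (*function)(void);
};

/* Hypothetical handlers; each one records that it ran. */
static int last_run = 0;
static void quit_command(void) { last_run = 1; }
static void help_command(void) { last_run = 2; }

static struct command commands[] = {
    COMMAND(quit),   /* { "quit", quit_command } */
    COMMAND(help),   /* { "help", help_command } */
};

/* Hypothetical lookup: runs the named command, returns 0 if it is unknown. */
int run_command(const char *name)
{
    for (size_t i = 0; i < sizeof commands / sizeof commands[0]; i++) {
        if (strcmp(commands[i].name, name) == 0) {
            commands[i].function();
            return last_run;
        }
    }
    return 0;
}
```

Adding a new command then takes one handler function and one COMMAND line.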

A few pitfalls

  1. Syntax problems
    Because macro expansion is plain text replacement, the C preprocessor performs no syntax checking on the macro body. It does not care about a missing parenthesis or a stray semicolon. Be extra careful here: such mistakes cause all kinds of strange errors at the point of use, and the root cause is hard to find.

  2. Operator precedence problem
    Not only the macro body but also the macro arguments are substituted as plain text. Consider the following simple macro for multiplication:

#define MULTIPLY(x, y) x * y

MULTIPLY(1, 2) is no problem; it expands to 1 * 2 as expected. The problem is an expression such as MULTIPLY(1 + 2, 3), which expands to 1 + 2 * 3, where the precedence is clearly wrong. Wrapping each parameter in parentheses wherever it appears in the macro body avoids this problem.

#define MULTIPLY(x, y) (x) * (y)

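MULTIPLY(1 + 2, 3) is now expanded into (1 + 2) * (3), and the precedence is correct. To be fully safe, the whole body should be parenthesized as well, ((x) * (y)); otherwise 12 / MULTIPLY(2, 3) expands to 12 / (2) * (3), which is 18 rather than 2. This problem, like some of those discussed below, is a case of plain text replacement breaking the intended semantics, so be extra careful.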
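The effect of each level of parenthesization can be checked with a short sketch. The three macro names below are hypothetical variants of MULTIPLY, renamed so that they can coexist in one file.

```c
#include <assert.h>

#define MULTIPLY_BAD(x, y)   x * y        /* no parentheses at all */
#define MULTIPLY_ARGS(x, y)  (x) * (y)    /* parameters parenthesized */
#define MULTIPLY_SAFE(x, y)  ((x) * (y))  /* parameters and whole body */

int bad_args(void)   { return MULTIPLY_BAD(1 + 2, 3); }    /* 1 + 2 * 3 */
int good_args(void)  { return MULTIPLY_ARGS(1 + 2, 3); }   /* (1 + 2) * (3) */
int bad_outer(void)  { return 12 / MULTIPLY_ARGS(2, 3); }  /* 12 / (2) * (3) */
int good_outer(void) { return 12 / MULTIPLY_SAFE(2, 3); }  /* 12 / ((2) * (3)) */
```

Here bad_args() yields 7 instead of 9, and bad_outer() yields 18 instead of 2, because division and multiplication associate left to right.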

  3. The semicolon swallowing problem

Consider the following macro definition:

#define SKIP_SPACES(p, limit)       \
     {                              \
        char *lim = (limit);        \
        while (p < lim) {           \
          if (*p++ != ' ') {        \
            p--;                    \
            break;                  \
          }                         \
        }                           \
     }

Suppose there is the following piece of code:

if (*p != 0)
   SKIP_SPACES (p, lim);
else ...

On compilation, GCC reports the error: ‘else’ without a previous ‘if’. The macro looks like a function call, but it expands into a code block enclosed in curly braces. The semicolon written after it is then a separate empty statement that ends the if statement, so the compiler finds no matching if for the else. This problem is generally solved by wrapping the body in do ... while (0):

#define SKIP_SPACES(p, limit)         \
     do {                             \
        char *lim = (limit);          \
        while (p < lim) {             \
            if (*p++ != ' ') {        \
              p--;                    \
              break;                  \
            }                         \
        }                             \
     } while (0)

After expansion it becomes:

if (*p != 0)
    do ... while(0);
else ...

The semicolon now completes the do ... while (0) statement, so the problem disappears. This technique is very common in the Linux kernel source code, for example in this bit-setting macro (located in arch/mips/include/asm/mach-pnx833x/gpio.h):

#define SET_REG_BIT(reg, bit)  do { (reg |= (1 << (bit))); } while (0)
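
The do ... while (0) form can be tried in an if/else with the sketch below. It is an adaptation rather than the exact macro from the text: the local variable is renamed to lim_ and the pointers are const-qualified (both assumptions made here) so that the argument passed at the call site cannot collide with the macro's own variable. The function first_non_space is hypothetical.

```c
#include <assert.h>

#define SKIP_SPACES(p, limit)             \
    do {                                  \
        const char *lim_ = (limit);       \
        while ((p) < lim_) {              \
            if (*(p)++ != ' ') {          \
                (p)--;                    \
                break;                    \
            }                             \
        }                                 \
    } while (0)

/* Hypothetical: returns the first non-space character in [s, end), or '\0'. */
char first_non_space(const char *s, const char *end)
{
    const char *p = s;
    if (*p != 0)
        SKIP_SPACES(p, end);   /* the ';' completes the do/while statement */
    else
        return '\0';
    return *p;
}
```

With the brace-only version of the macro, the same if/else would not compile.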

  4. Repeated evaluation of macro arguments

Consider the following macro definition:

#define min(X, Y)  ((X) < (Y) ? (X) : (Y))

When it is called as next = min (x + y, foo (z));, the macro expands to next = ((x + y) < (foo (z)) ? (x + y) : (foo (z)));. As you can see, foo (z) may be called twice, so the computation is repeated. More seriously, if foo has side effects (for example, it modifies global or static variables), the program will contain a logic error. Therefore, avoid passing function calls, or any other expression with side effects, as macro arguments.
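The double evaluation can be made visible with a call counter, as in the sketch below. The body of foo, the calls counter and the min_int function are assumptions made for the illustration; min_int shows one possible alternative, an ordinary inline function that evaluates each argument exactly once.

```c
#include <assert.h>

#define min(X, Y)  ((X) < (Y) ? (X) : (Y))

static int calls = 0;

/* Hypothetical function with a side effect: it counts its invocations. */
static int foo(int z) { calls++; return z; }

/* Function alternative: arguments are evaluated once, before the call. */
static inline int min_int(int a, int b) { return a < b ? a : b; }

int calls_with_macro(void)
{
    calls = 0;
    int next = min(10, foo(3));      /* foo(3) is compared, then evaluated again */
    (void)next;
    return calls;
}

int calls_with_function(void)
{
    calls = 0;
    int next = min_int(10, foo(3));  /* foo(3) is evaluated once */
    (void)next;
    return calls;
}
```

The second evaluation only happens when the foo operand is the one selected by the conditional, which makes such bugs intermittent and harder to spot.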

  5. Recursive self-reference

Consider the following macro definition:

#define foo (4 + foo)

By the reasoning so far, foo would expand into (4 + foo), then into (4 + (4 + foo)), and so on until memory is exhausted. However, the preprocessor expands a self-reference only once. In other words, foo is expanded only into (4 + foo), and the meaning of the remaining foo is determined by the context (for example, a variable named foo).

The same holds for mutual references such as the following: each macro is expanded only once along the chain.

#define x (4 + y)
#define y (2 * x)

x expands into (4 + y) -> (4 + (2 * x)), and y expands into (2 * x) -> (2 * (4 + y)). Note that this style is strongly discouraged, because it makes the program extremely hard to read.
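To see what "determined by the context" means, the sketch below declares ordinary variables with the same names before the macros are defined. The variables, their initial values and the demo functions are assumptions chosen for the illustration.

```c
#include <assert.h>

int foo = 10;            /* ordinary variables, declared before the macros */
int x = 1;
int y = 5;

#define foo (4 + foo)    /* the inner foo is left alone: it names the variable */
#define x (4 + y)
#define y (2 * x)

int self_reference_demo(void) { return foo; }  /* (4 + foo) with foo == 10 */
int cross_x_demo(void)        { return x; }    /* (4 + (2 * x)) with x == 1 */
int cross_y_demo(void)        { return y; }    /* (2 * (4 + y)) with y == 5 */

/* Remove the macros again so later code sees the plain variables. */
#undef foo
#undef x
#undef y
```

The three functions return 14, 6 and 18 respectively, which matches the single-pass expansions described above.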

  6. Macro argument prescan

If a macro argument contains another macro, the argument is fully expanded before being substituted into the macro body, unless the corresponding parameter is used with # or ## there. Consider the following macro definitions:

#define AFTERX(x) X_ ## x
#define XAFTERX(x) AFTERX(x)
#define TABLESIZE 1024
#define BUFSIZE TABLESIZE

AFTERX(BUFSIZE) is expanded into X_BUFSIZE: because the parameter is an operand of ## in the macro body, the argument is substituted as written, without being expanded first. XAFTERX(BUFSIZE) is expanded into X_1024: the body of XAFTERX(x) is AFTERX(x), which contains neither # nor ##, so BUFSIZE is fully expanded to 1024 before substitution, and AFTERX(1024) then yields X_1024.
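One way to observe the two expansions is to declare a variable for each resulting identifier, as in this sketch. The two variables and the two functions are hypothetical and exist only to make the result visible at run time.

```c
#include <assert.h>

#define AFTERX(x) X_ ## x
#define XAFTERX(x) AFTERX(x)
#define TABLESIZE 1024
#define BUFSIZE TABLESIZE

/* Hypothetical variables named after the two possible expansions. */
static int X_BUFSIZE = 1;
static int X_1024 = 2;

int direct_paste(void)   { return AFTERX(BUFSIZE); }   /* X_BUFSIZE */
int indirect_paste(void) { return XAFTERX(BUFSIZE); }  /* X_1024 */
```

The extra level of indirection in XAFTERX is the usual idiom for forcing an argument to be expanded before it is pasted or stringified.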

Reference materials: http://gcc.gnu.org/onlinedocs/cpp/Macros.html

Origin blog.csdn.net/u013318019/article/details/109308719