GNU C extensions used in Linux drivers

1 Structure assignment

Consider assigning values to the members of a structure such as:

struct st1 {
    int a;
    int b;
};

In general, you can initialize it directly with a brace-enclosed list, giving the values in declaration order:

struct st1 st1 = {1, 2};

The style used in the Linux kernel, however, is:

struct st1 st1 = {
    .a = 1,
    .b = 2,
};

Note: The advantage of this style (a dot "." before each member name) is that the values need not follow the declaration order of the members. For example:

struct st1 st1 = {  
    .b = 2,  
    .a = 1, 
};
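As a small illustration of the same style, the sketch below initializes a hypothetical struct uart_config (the structure and its member names are invented for this example, not taken from the kernel). It also shows a property of designated initializers that is easy to overlook: members that are not named are set to zero.

```c
/* Hypothetical configuration structure, used only for illustration. */
struct uart_config {
    int baud_rate;
    int data_bits;
    int stop_bits;
    int parity;        /* 0 means "no parity" in this sketch */
};

/* Members are named explicitly, so the order here does not have to match
 * the declaration order. parity is not mentioned and is therefore 0. */
static const struct uart_config default_config = {
    .stop_bits = 1,
    .baud_rate = 115200,
    .data_bits = 8,
};

/* Returns the parity setting of the default configuration. */
static int default_parity(void)
{
    return default_config.parity;
}
```

Adding a new member to the structure later does not break such initializers, which is one reason the style suits large kernel structures.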

2 Inline functions

Small functions that are called frequently cost stack space and call overhead on every call. C provides the inline specifier to address this; a function declared with it is called an inline function. The inline keyword must appear together with the function body (the definition), not only on a separate declaration; otherwise the compiler treats the function as a normal function. Inline functions are generally placed in header files.

inline int function(int x);   // declaration only: inline has no effect here
inline int function(int x)    // correct: inline on the definition
{
    return x;
}
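In header files, inline functions are usually written as static inline, so that every source file including the header gets its own copy and no separate out-of-line definition is required. A minimal sketch, with clamp_int as a hypothetical helper name:

```c
/* Would typically live in a header file. "static inline" gives each
 * translation unit its own copy of the function. */
static inline int clamp_int(int value, int low, int high)
{
    if (value < low)
        return low;
    if (value > high)
        return high;
    return value;
}
```

The compiler is still free to ignore the hint; inline is a request, not a guarantee.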

3 typeof, offsetof and container_of

3.1 The typeof keyword obtains the data type of an expression

@ 1 Basic usage:

char *chptr;
typeof (*chptr) ch;        // equivalent to: char ch
typeof (ch) *chptr1;       // equivalent to: char *chptr1
typeof (chptr1) array[5];  // equivalent to: char *array[5]; the type of chptr1 is char *

@ 2 typeof is commonly used in the Linux kernel, for example:

#define min(x, y) ({                    \
        typeof(x) __min1 = (x);         \
        typeof(y) __min2 = (y);         \
        (void) (&__min1 == &__min2);    \
        __min1 < __min2 ? __min1 : __min2; })

The macro uses typeof to obtain the types of x and y, defines two temporary variables of those types, assigns x and y to them, and finally compares the temporaries.
The statement (void) (&__min1 == &__min2) in the macro exists only to make the compiler warn when x and y have different types: comparing pointers to distinct types is diagnosed, and the (void) cast discards the result of the comparison.

@ 3 A usage example:

#include <stdio.h>
#define min(x, y) ({                    \
        typeof(x) __min1 = (x);         \
        typeof(y) __min2 = (y);         \
        (void) (&__min1 == &__min2);    \
        __min1 < __min2 ? __min1 : __min2; })
int main(void)
{
    int a = 4;
    int b = 6;
    int min_data;
    min_data = min(a, b);
    printf("the min=%d\n", min_data);
    return 0;
}

Output:

the min=4

If b is changed to a float:

#include <stdio.h>
#define min(x, y) ({                    \
        typeof(x) __min1 = (x);         \
        typeof(y) __min2 = (y);         \
        (void) (&__min1 == &__min2);    \
        __min1 < __min2 ? __min1 : __min2; })
int main(void)
{
    int a = 4;
    float b = 6;
    int min_data;
    min_data = min(a, b);
    printf("the min=%d\n", min_data);
    return 0;
}

the compiler issues the following warning:

main.c: In function ‘main’:
main.c:17:9: warning: comparison of distinct pointer types lacks a cast [enabled by default]

3.2 offsetof analysis

offsetof yields the offset of a member within a structure, relative to the start address of the structure. It is defined as follows:

#define offsetof(TYPE,MEMBER) ((size_t)& ((TYPE *)0)->MEMBER)

Note: The contents of address 0 must not be accessed, but address 0 itself can still be used to compute addresses. The cast (TYPE *)0 converts address 0 to a pointer to TYPE, ((TYPE *)0)->MEMBER designates the member MEMBER of a TYPE located at address 0, and the & operator takes the address of that member without reading it.

The principle of offsetof is explained below using the file_node structure, which is defined as follows:

struct file_node {
    char c;
    struct list_head node;
};

Substituting the actual arguments:

offsetof(struct file_node, node);

this eventually expands to:

((size_t)&((struct file_node *)0)->node);

That is, the expression takes the address of the member node through a pointer whose value is 0. Because the structure is taken to start at address 0, the address of node is numerically equal to the offset of node within struct file_node. In other words, the offsetof macro computes the offset of MEMBER within TYPE.
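The claim that the offset equals "address of the member minus address of the structure" can be checked in user space. The sketch below uses the standard offsetof from <stddef.h> rather than the kernel definition, and a minimal stand-in for struct list_head (an assumption made so the example compiles outside the kernel).

```c
#include <stddef.h>

/* Minimal stand-in for the kernel's struct list_head (assumption). */
struct list_head {
    struct list_head *next, *prev;
};

struct file_node {
    char c;
    struct list_head node;
};

/* Offset obtained from the standard macro. */
static size_t node_offset_macro(void)
{
    return offsetof(struct file_node, node);
}

/* The same offset measured on a real object: address of the member
 * minus address of the enclosing structure. */
static size_t node_offset_measured(void)
{
    struct file_node f;

    return (size_t)((char *)&f.node - (char *)&f);
}
```

Both functions return the same value. It is larger than sizeof(char) on common ABIs because the compiler pads after c so that node is suitably aligned.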

3.3 container_of analysis

container_of obtains the start address of a structure from the address of one of its members. It is defined as follows:

#define container_of(ptr, type, member) ({             \
         const typeof( ((type *)0)->member ) *__mptr = (ptr);     \
         (type *)( (char *)__mptr - offsetof(type,member) );})

That is, given a pointer to a member of a structure variable, container_of yields a pointer to the enclosing structure variable. For example, suppose a structure variable is defined as follows:

struct demo_struct {
    type1 member1;
    type2 member2;
    type3 member3;
    type4 member4;
};
struct demo_struct demo;

Suppose that somewhere else in the code we have obtained a pointer to one of the members of demo, for example:

type3 *memp = /* pointer to the member member3, obtained from somewhere */;

If a pointer to the entire structure variable is now needed, it can be obtained as follows:

struct demo_struct *demop = container_of(memp, struct demo_struct, member3);

Next we analyze this process. First, we expand container_of(memp, struct demo_struct, member3) according to the macro definition:

struct demo_struct *demop = ({
const typeof( ((struct demo_struct *)0)->member3 ) *__mptr = (memp);
(struct demo_struct *)( (char *)__mptr - offsetof(struct demo_struct, member3) );})

@ 1 Line 2 analysis

typeof is a GNU C extension to standard C that yields the type of its operand. Line 2 of the code above therefore does two things:

First, it uses typeof to obtain the type of the structure member member3, which is type3.
Second, it defines a temporary variable __mptr of type pointer to const type3 and assigns to it memp, the pointer to the member inside the actual structure variable.

After these two steps:

const typeof( ((struct demo_struct *)0)->member3 ) *__mptr = (memp);
// becomes
const type3 *__mptr = (memp);

@ 2 Line 3 analysis

Assume that the structure variable demo is laid out in memory as follows:

    struct demo
 +----------------+ 0xA000
 |    member1     |
 +----------------+ 0xA004
 |    member2     |
 |                |
 +----------------+ 0xA010
 |    member3     |
 |                |
 +----------------+ 0xA018
 |    member4     |
 +----------------+

After line 2 of the code above has executed, the value of __mptr is 0xA010. Now look at line 3:

(struct demo_struct *)( (char *)__mptr - offsetof(struct demo_struct, member3) );})

offsetof computes the address of the member as if the structure started at address 0, which is the offset of the member relative to the start address of the structure variable. The value returned by offsetof(struct demo_struct, member3) is therefore the offset of member3 relative to the start of demo. According to the memory layout above, offsetof(struct demo_struct, member3) returns 0x10.

@ 3 Putting it together, at this point:

__mptr==0xA010
offsetof(struct demo_struct, member3)==0x10

Therefore, (char *)__mptr - ((size_t)&((struct demo_struct *)0)->member3) == 0xA010 - 0x10 == 0xA000, which is the start address of the structure variable demo. In this way container_of obtains a pointer to the entire structure variable from a pointer to one of its members.
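The analysis can be tried out in user space. The sketch below copies the macro definition given earlier in this section (spelled with __typeof__, the alternative GCC keyword), and uses a minimal stand-in for struct list_head plus a hypothetical helper node_to_file; those two names are assumptions made for the example.

```c
#include <stddef.h>

#define container_of(ptr, type, member) ({                          \
        const __typeof__(((type *)0)->member) *__mptr = (ptr);      \
        (type *)((char *)__mptr - offsetof(type, member)); })

/* Minimal stand-in for the kernel's struct list_head (assumption). */
struct list_head {
    struct list_head *next, *prev;
};

struct file_node {
    char name[16];
    struct list_head node;   /* embedded member that list code points at */
};

/* Hypothetical helper: recover the enclosing file_node from a pointer
 * to its embedded node member. */
static struct file_node *node_to_file(struct list_head *entry)
{
    return container_of(entry, struct file_node, node);
}
```

Given struct file_node f, the call node_to_file(&f.node) returns &f. This is the pattern that lets generic list code store only struct list_head pointers while callers get back their own structures.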

4 Compound statements in GNU C expressions

4.1 Description of compound statements

In standard C, an expression is a combination of operators and operands, and a compound statement is a block of one or more statements enclosed in curly braces. Compound statements are not allowed inside expressions. GNU C, however, allows a compound statement enclosed in parentheses to appear as an expression (a "statement expression"). The type of this expression is the type of the last expression statement in the compound statement, and its value is the value of that last expression. Examples of use are as follows:

#include <stdio.h>
int main(void)
{
    int a = ({
               int b = 4;
               int c = 3;
               b + c;
               b + c - 2;
             });
    printf("a = %d\n", a);
    return 0;
}

The output is:

a = 5

Note: The value of a is the value of the last statement in the compound statement, and its data type matches the data type of the last statement.

4.2 Application of compound statements in the Linux kernel

@ 1 Statement expressions are often used in macro definitions, such as this implementation of min:

#define min(x, y) ({                    \
        typeof(x) __min1 = (x);         \
        typeof(y) __min2 = (y);         \
        __min1 < __min2 ? __min1 : __min2; })

This defines a safe macro for computing the minimum.

@ 2 In standard C, it is usually defined as:

#define min(x,y) ((x) < (y) ? (x) : (y))

@ 3 Comparing @ 1 and @ 2, the form in @ 1 avoids the side effects that @ 2 suffers in a call such as min(x++, ++y). In @ 2 each argument appears twice in the expansion, so the argument that is selected is evaluated twice; when an argument has side effects, the result is incorrect. With a GNU C statement expression each argument is evaluated exactly once, which avoids this error. For this reason statement expressions are common in kernel macro definitions.
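The double evaluation can be made visible by counting how often an argument with a side effect is evaluated. In the sketch below, MIN_UNSAFE and MIN_SAFE correspond to the two min variants discussed in this section; the counter and the helper names are assumptions made for the demonstration.

```c
/* Plain-C version: the selected argument is evaluated twice. */
#define MIN_UNSAFE(x, y) ((x) < (y) ? (x) : (y))

/* GNU C version: each argument is evaluated exactly once. */
#define MIN_SAFE(x, y) ({                       \
        __typeof__(x) __min1 = (x);             \
        __typeof__(y) __min2 = (y);             \
        __min1 < __min2 ? __min1 : __min2; })

static int calls;                      /* how many times next_value() ran */

static int next_value(void)            /* argument with a side effect */
{
    return ++calls;
}

/* Number of next_value() calls caused by one use of MIN_UNSAFE. */
static int calls_made_by_unsafe(void)
{
    calls = 0;
    (void)MIN_UNSAFE(next_value(), 100);
    return calls;
}

/* Number of next_value() calls caused by one use of MIN_SAFE. */
static int calls_made_by_safe(void)
{
    calls = 0;
    (void)MIN_SAFE(next_value(), 100);
    return calls;
}
```

calls_made_by_unsafe() returns 2, because next_value() runs once for the comparison and once more to produce the result, while calls_made_by_safe() returns 1.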

5 Designated initializers in GNU C

Standard C (C89) requires that the initial values of an array or structure variable appear in a fixed order. In GNU C, the initialization values may appear in any order if the array index or the structure member name is specified. (C99 later standardized the index and member forms; the range form remains a GNU extension.) To specify an array index, write "[INDEX] =" before the value. To specify a range, use the form "[FIRST ... LAST] =".

5.1 Array application 1, initialize the specified element

Use the form "[index] = value" in the initializer list of an array to initialize the element selected by index. Examples of use are as follows:

#include <stdio.h>  
int main(void)  
{  
    int i;  
    int arr[6] = {[3] =10,11,[0]=5,6};  
 
    for (i=0;i<6;i++)  
           printf("a[%d]=%d\n",i,arr[i]);  
 
    return 0;  
}

The execution result is:

a[0]=5
a[1]=6
a[2]=0
a[3]=10
a[4]=11
a[5]=0

If more than one value follows a designated item, as in [3] = 10,11, the extra values initialize the subsequent array elements; here the value 11 initializes arr[4]. For C arrays, once one or more elements are initialized, the remaining elements are automatically initialized to 0, or to NULL for pointer elements. By contrast, the elements of a local array declared without any initializer have indeterminate values.

5.2 Array application 2, several elements in the range are initialized to the same value

GNU C also supports the form of "[first ... last] = value", that is, several elements in a range are initialized to the same value. Examples of use are as follows:

#include <stdio.h>  
int main()  
{  
    int i;  
    int arr[]={ [0 ... 3] =1,[4 ... 5]=2,[6 ... 9] =3};  
    for(i=0; i<sizeof(arr)/sizeof(arr[0]);i++ )
        printf("arr[%d]:%d\n",i,arr[i]);  
    return 0;  
}

The execution result is:

arr[0]:1
arr[1]:1
arr[2]:1
arr[3]:1
arr[4]:2
arr[5]:2
arr[6]:3
arr[7]:3
arr[8]:3
arr[9]:3
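Range initializers are convenient for lookup tables. The sketch below builds a character classification table; the enum, the table and the classify function are hypothetical names, and the ranges assume an ASCII-compatible character set in which letters are contiguous.

```c
enum char_class { CC_OTHER = 0, CC_DIGIT, CC_UPPER, CC_LOWER };

/* Entries that are not listed stay 0, i.e. CC_OTHER. */
static const unsigned char char_class_table[256] = {
    ['0' ... '9'] = CC_DIGIT,
    ['A' ... 'Z'] = CC_UPPER,
    ['a' ... 'z'] = CC_LOWER,
};

/* Classify a character with a single table lookup. */
static enum char_class classify(unsigned char c)
{
    return (enum char_class)char_class_table[c];
}
```

Writing the same table with plain initializers would need 256 explicit values.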

6 Anonymous union or structure of GNU C

In GNU C, a union (or structure) can be declared inside a structure without a name, so that its members can be used directly, as if they were members of the enclosing structure. Examples of use are as follows:

#include <stdio.h>
struct test_struct {
    char * name;
    union {
        char gender;
        int id;
    }; 
    int num; 
};  
int main(void)  
{  
    struct test_struct test_struct={"jibo",'F',28};  
    printf("test_struct.gender=%c,test_struct.id=%d\n",test_struct.gender,test_struct.id);
    return 0;  
}

The execution result is:

test_struct.gender=F,test_struct.id=70

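Note: Because gender and id share the same storage, id reads back as 70, the ASCII code of 'F' (as seen on a typical little-endian machine). Anonymous unions (and structures) are commonly used in the Linux kernel.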
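A frequent use of an anonymous union is a tagged value, where one member records which alternative is currently stored. The following sketch uses hypothetical names (struct value, make_int, make_char, value_as_int) and is not taken from the kernel.

```c
enum value_kind { VALUE_INT, VALUE_CHAR };

struct value {
    enum value_kind kind;    /* says which union member is valid */
    union {                  /* anonymous: accessed as v.i and v.c */
        int i;
        char c;
    };
};

static struct value make_int(int i)
{
    struct value v = { .kind = VALUE_INT, .i = i };
    return v;
}

static struct value make_char(char c)
{
    struct value v = { .kind = VALUE_CHAR, .c = c };
    return v;
}

/* Read the stored value as an int, whichever alternative is active. */
static int value_as_int(const struct value *v)
{
    return v->kind == VALUE_INT ? v->i : (int)v->c;
}
```

Without the anonymous union, every access would need an extra member name, for example v.u.i instead of v.i.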

7 GNU C branch statement

For conditional branches, gcc provides a built-in function for optimization. When a condition is true very often or very rarely, the compiler can optimize the branch according to this hint. The kernel wraps the built-in in two macros, likely() and unlikely(). For example, given:

if (foo){  
    /**/
}

If you want to mark this choice as a branch that rarely happens:

if (unlikely(foo)){  
    /**/  
}

Conversely, if you want to mark a branch as a choice that is usually true:

if(likely(foo)) {  
    /**/  
}

Note: For details, see the definitions of likely() and unlikely() in the Linux kernel source code; section 12.3 describes the underlying built-in function.
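The two macros can also be defined in user-space code on top of __builtin_expect. The sketch below does so and uses them in a hypothetical function sum_positive. The !! normalizes any non-zero condition to 1, which is how current kernel headers write the macros; the hints only influence code layout and never change the result.

```c
#include <stddef.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Sum the positive entries of an array; returns -1 for a NULL array. */
static long sum_positive(const int *values, size_t count)
{
    long sum = 0;

    if (unlikely(values == NULL))      /* error path: expected to be rare */
        return -1;

    for (size_t i = 0; i < count; i++) {
        if (likely(values[i] > 0))     /* assumed to be the common case */
            sum += values[i];
    }
    return sum;
}
```

Whether such hints help depends on the workload; a wrong hint can make the common path slower.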

8 Zero-length array

8.1 Reasons for existence

In standard C, an array of length zero is not allowed; an array must have at least one element. GNU C, however, allows zero-length arrays. Why are they supported, what are they good for, and how are they used?

Purpose: to save space and to make access convenient for structures of variable length.
Usage: declare an array of length 0 at the end of a structure to make the structure variable-length.

To the compiler, an array of length 0 occupies no space: the array name itself takes no storage and merely denotes an offset, i.e. an unmodifiable address constant (note: an array name is never a pointer!). The size of the array, however, can be chosen dynamically at allocation time. A typical use case: the program needs to allocate a struct demo followed immediately by LEN bytes of data. This can be done as follows:

struct demo
{
    int a;
    char b[256];
    char follow[0];
};
struct demo *demo = (struct demo *)malloc(sizeof(struct demo) + LEN);

In this way, demo->follow can be used to access the data that follows the structure. Of course, a pointer can also be used for this purpose, as follows:

struct demo{  
    int a;  
    char b[256];  
    char *follow;     
};  
struct demo *demo=(struct demo *)malloc(sizeof(struct demo)+LEN);

This also achieves the effect of a zero-length array, but the structure then carries an extra char pointer (which must additionally be set to point at the extra space). When all that is needed is the trailing data area, the pointer is wasted space.

If a zero-length array such as char bytes[0] is defined at the end of a structure, the structure has variable length and can be extended through the array. Such a structure must carry length information; the structure itself acts like a header. It can only be allocated on the heap. Its advantage is that it is more efficient than declaring a pointer member in the structure and allocating the pointer's target separately.

8.2 Direct access

Accessing the contents of the array requires no indirection through a pointer, which avoids a second memory access. Examples of use are as follows:

#include <stdio.h>  
#include <stdlib.h>  
  
struct  test{  
    int count;  
    //reverse is array name;the array is no item;  
    //the array address follow test struct  
    int reverse[0];  
};  
   
int main()  
{  
    int i;  
    struct test *ptest = (struct test *)malloc(sizeof(struct test)+sizeof(int)*10);  
    for(i=0;i<10;i++){  
            ptest->reverse[i]=i+1;  
    }  
    for(i=0;i<10;i++){  
            printf("reverse[%d]=%d \n",i,ptest->reverse[i]);  
    }  
    printf("sizeof(struct test) =%zu\n",sizeof(struct test));  
    int a = *(&ptest->count +1 );  
    printf("a=%d\n",a);  
    free(ptest);  
    return 0;  
}

The execution result is:

reverse[0]=1
reverse[1]=2
reverse[2]=3
reverse[3]=4
reverse[4]=5
reverse[5]=6
reverse[6]=7
reverse[7]=8
reverse[8]=9
reverse[9]=10
sizeof(struct test) =4
a=1

It can be seen that the reverse array takes up no space in the test structure: sizeof(struct test) is 4. The output also shows that the contents of the zero-length array immediately follow the count member.
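The "header plus trailing data" pattern is usually wrapped in a small allocation helper, so that the length information and the payload are always set up together. A minimal sketch, with struct packet and packet_create as hypothetical names:

```c
#include <stdlib.h>
#include <string.h>

struct packet {
    size_t len;                /* length information kept in the header */
    unsigned char data[0];     /* GNU C zero-length array: payload follows */
};

/* Allocate header and payload in one block and copy the payload in.
 * Returns NULL if the allocation fails; the caller frees the packet
 * with a single free(). */
static struct packet *packet_create(const void *payload, size_t len)
{
    struct packet *p = malloc(sizeof(*p) + len);

    if (p == NULL)
        return NULL;
    p->len = len;
    if (len > 0)
        memcpy(p->data, payload, len);
    return p;
}
```

One malloc and one free cover both the header and the data, and the payload sits directly behind the header in memory.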

9 Case ranges

GCC also provides a range notation that represents a span of consecutive values. Its most common use is in switch/case statements. Examples of use are as follows:

static int sd_major(int major_idx)  
{  
    switch (major_idx) {  
    case 0:  
        return SCSI_DISK0_MAJOR;  
    case 1 ... 7:  
        return SCSI_DISK1_MAJOR + major_idx - 1;  
    case 8 ... 15:  
        return SCSI_DISK8_MAJOR + major_idx - 8;  
    default:  
        BUG();  
        return 0;   /* shut up gcc */  
    }  
}

Ranges can also be used to initialize consecutive elements of an array, as described in section 5.2 above.
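Case ranges work with character constants as well. The following sketch converts a hexadecimal digit to its value; hex_digit_value is a hypothetical helper, and the letter ranges assume an ASCII-compatible character set.

```c
/* Return the value of a hexadecimal digit, or -1 if c is not one. */
static int hex_digit_value(char c)
{
    switch (c) {
    case '0' ... '9':
        return c - '0';
    case 'a' ... 'f':
        return c - 'a' + 10;
    case 'A' ... 'F':
        return c - 'A' + 10;
    default:
        return -1;
    }
}
```

Keep the spaces around "..." when the bounds are integer literals: written as 1...7, the text may be parsed incorrectly.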

10 Assign attributes to functions, variables and data types

Attributes are a tool by which programmers pass information or instructions to the compiler, typically to request some special processing when the program is compiled. Attributes can be attached to different kinds of entities, including functions, variables and types. To specify attributes, use the keyword "__attribute__" followed by the attribute list enclosed in double parentheses, with the attributes separated by commas:

__attribute__((attr_1,attr_2,attr_3))

10.1 noreturn

The noreturn attribute applies to functions and indicates that the function never returns. This lets the compiler generate slightly better code and, more importantly, suppresses spurious warnings such as those about uninitialized variables.

10.2 format(ARCHETYPE,STRING-INDEX,FIRST-TO-CHECK)

The format attribute applies to functions and indicates that the function takes printf-, scanf- or strftime-style arguments.
The most common mistake with this kind of function is a format string that does not match the arguments. Specifying the format attribute lets the compiler check the argument types against the format string.
Parameter description:

        "archetype" specifies which style is used;
        "string-index" specifies which argument of the function is the format string;
        "first-to-check" specifies the argument from which checking against the format string starts.

An example from the file include/linux/kernel.h:

asmlinkage int printk(const char * fmt, ...)  __attribute__ ((format (printf, 1, 2)));
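The same attribute can be attached to your own printf-style wrappers. In the sketch below, format_log is a hypothetical helper that formats into a buffer; because the format string is its third parameter and the variadic arguments start at the fourth, the attribute is written as format(printf, 3, 4).

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: behaves like snprintf. The attribute tells gcc
 * that parameter 3 is a printf format string and that the arguments to
 * check against it start at parameter 4. */
static int format_log(char *buf, size_t size, const char *fmt, ...)
        __attribute__((format(printf, 3, 4)));

static int format_log(char *buf, size_t size, const char *fmt, ...)
{
    va_list args;
    int written;

    va_start(args, fmt);
    written = vsnprintf(buf, size, fmt, args);
    va_end(args);
    return written;
}
```

With the attribute in place and format warnings enabled (for example through -Wall), a mismatched call such as format_log(buf, sizeof(buf), "%d", "text") is diagnosed at compile time, just as it would be for printf itself.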

10.3 unused

The unused attribute applies to functions and variables and indicates that the function or variable may be unused. It prevents the compiler from issuing a warning about it.

10.4 deprecated

The deprecated attribute indicates that the function is obsolete and should no longer be used. Using a deprecated function produces a warning. The attribute can also be applied to types and variables to discourage developers from using them.

10.5 section ("section-name")

The section attribute places the function or data in the section named "section-name" instead of the default output section (such as .text or .data).
Note: In Linux driver code, the module loading function is preceded by the __init macro, which is itself built on the section attribute:

#define __init __attribute__ ((__section__(".init.text")))

In the Linux kernel, all functions marked __init are placed in the .init.text section at link time. In addition, a function pointer to each initialization function registered as an initcall is stored in the .initcall.init section. During initialization the kernel calls the functions through these pointers, and it releases the init sections once initialization has finished. Important section-related macros in the kernel source include __init, __initdata, __exit, __exitdata and similar macros.

10.6 aligned(ALIGNMENT)

When defining a variable or a type, add __attribute__((aligned(ALIGNMENT))) to specify the memory alignment, i.e. the number of bytes to align to. Examples of use are as follows:

struct i387_fxsave_struct {  
    unsigned short cwd;  
    unsigned short swd;  
    unsigned short twd;  
    unsigned short fop;  
} __attribute__ ((aligned (16)));

Variables of this structure type are aligned on a 16-byte boundary.

10.7 packed

The packed attribute tells the compiler to drop the padding it would normally insert to align structure members, so that the members are laid out according to the number of bytes they actually occupy.
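The effect of aligned and packed on the layout can be checked at compile time. The sketch below uses three hypothetical structures and C11 _Static_assert; the expected values follow from the definitions of the attributes, while the exact size of the unpacked structure depends on the ABI.

```c
#include <stddef.h>

/* Default layout: the compiler pads after tag so that value is aligned. */
struct natural_record {
    char tag;
    int value;
};

/* packed: no padding, value starts right after tag. */
struct packed_record {
    char tag;
    int value;
} __attribute__((packed));

/* aligned(16): objects start on a 16-byte boundary, and the size is
 * rounded up to a multiple of the alignment. */
struct aligned_block {
    unsigned short cwd;
    unsigned short swd;
    unsigned short twd;
    unsigned short fop;
} __attribute__((aligned(16)));

_Static_assert(sizeof(struct packed_record) == sizeof(char) + sizeof(int),
               "packed removes the padding");
_Static_assert(offsetof(struct packed_record, value) == 1,
               "value directly follows tag");
_Static_assert(_Alignof(struct aligned_block) == 16,
               "aligned raises the alignment");
_Static_assert(sizeof(struct aligned_block) == 16,
               "size is rounded up to the alignment");
```

Access to misaligned members of a packed structure can be slower, or need special handling on some architectures, so packed is best reserved for data whose layout is dictated externally, such as hardware registers or wire formats.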

10.8 interrupt("")

On the ARM platform, "__attribute__((interrupt("IRQ")))" indicates that the function it is applied to is an interrupt handler.

11 Variadic macros

In GNU C, macros can accept a variable number of arguments, just like functions. An example from include/linux/kernel.h:

#define pr_debug(fmt,arg...) \
     printk(KERN_DEBUG fmt,##arg)

Note: arg stands for the remaining arguments, of which there may be zero or more. These arguments, together with the commas between them, form the value of arg, which is substituted when the macro is expanded. For example:

pr_debug("%s:%d",filename,line)

Expands to

printk("<7>" "%s:%d", filename, line)

The "##" handles the case where arg matches no arguments, so that its value is empty. In this special case the GNU C preprocessor discards the comma that precedes "##":

pr_debug("success!\n")

Expands to

printk("<7>" "success!\n")

Note: The expansion contains no trailing comma after the format string.
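The comma handling can be tried in user space by replacing printk with snprintf. In the sketch below, fmt_debug is a hypothetical macro and KERN_DEBUG is defined locally as a stand-in for the kernel's log-level prefix, using the "<7>" value shown in the expansions above.

```c
#include <stdio.h>

#define KERN_DEBUG "<7>"   /* local stand-in for the kernel prefix */

/* Formats into buf. "##arg" removes the preceding comma when no
 * variadic arguments are passed. */
#define fmt_debug(buf, size, fmt, arg...) \
        snprintf(buf, size, KERN_DEBUG fmt, ##arg)
```

fmt_debug(buf, sizeof(buf), "%s:%d", "f", 3) expands to a call with two extra arguments, while fmt_debug(buf, sizeof(buf), "success!") expands to snprintf(buf, sizeof(buf), "<7>" "success!") with no dangling comma.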

12 Built-in functions

GNU C provides a large number of built-in functions, many of which are built-in versions of standard C library functions, such as memcpy, which have the same functions as the corresponding C library functions. There are other built-in functions whose names usually start with __builtin.

12.1 __builtin_return_address(LEVEL)

The built-in function __builtin_return_address returns the return address of the current function or of one of its callers. The parameter LEVEL specifies how many frames to walk up the call stack: 0 yields the return address of the current function, and 1 yields the return address of the caller of the current function. For example, the file kernel/sched.c contains:

printk(KERN_ERR "schedule_timeout: wrong timeout " "value %lx from %p\n", timeout, __builtin_return_address(0));

12.2 __builtin_constant_p(EXP)

The built-in function __builtin_constant_p is used to determine whether a value is a compile-time constant. If the value of the parameter EXP is a constant, the function returns 1, otherwise it returns 0. For example, the file include/asm-i386/bitops.h has this definition:

    (__builtin_constant_p(nr) ?     \  
      constant_test_bit((nr),(addr)) :   \  
       variable_test_bit((nr),(addr)))

Note: Many calculations or operations have a more optimized implementation when the parameter is a constant. In GNU C, the method above selects between a constant version and a non-constant version depending on whether the parameter is a compile-time constant, so that the optimized code is compiled when the parameter is constant.
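The same dispatch pattern can be written in a few lines. In the sketch below, is_power_of_two and is_power_of_two_runtime are hypothetical names: for a constant argument the whole expression can be folded at compile time, and otherwise the ordinary function is called. Both paths return the same result, so only the generated code differs.

```c
/* Ordinary run-time implementation. */
static int is_power_of_two_runtime(unsigned long n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* For a compile-time constant, use an expression the compiler can fold;
 * otherwise fall back to the run-time function. */
#define is_power_of_two(n)                                      \
        (__builtin_constant_p(n)                                \
                ? ((n) != 0 && (((n) & ((n) - 1)) == 0))        \
                : is_power_of_two_runtime(n))
```

Whether a variable counts as "constant" here can depend on the optimization level, because gcc may propagate a known value into the macro before evaluating __builtin_constant_p.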

12.3 __builtin_expect(EXP,C)

The built-in function __builtin_expect is used to provide branch prediction information for the compiler. Its return value is the value of the integer expression EXP, and the value of C must be a compile-time constant. There is this definition in the file include/linux/compiler.h:

#define       likely(x)         __builtin_expect((x),1)  
#define       unlikely(x)      __builtin_expect((x),0)

It is used in the file kernel/sched.c as follows:

if(unlikely(in_interrupt())) {  
    printk("Scheduling in interrupt\n");  
    BUG();  
}

The semantics of this built-in function is that the expected value of EXP is C. The compiler can properly rearrange the order of the statement blocks based on this information, so that the program has higher execution efficiency under the expected conditions. The example above shows that it is rare to be in an interrupt context.

AGS-wangdsh
