"Linux Kernel Coding Style" Official Manual!

This is a translation of the official manual, shared here for everyone.


This is a short document describing the preferred coding style for the Linux kernel. Coding style is very personal, but this is the style used for the code that I have to maintain (that is, the Linux kernel), and I would prefer it for most other projects too. Please at least consider the points made here when writing kernel code.

First, I recommend printing out the GNU Coding Standards and then not reading them. Burn them, it's a great symbolic gesture.

Anyway, here we go:

1) Indentation

Tabs are 8 characters, and thus indentations are also 8 characters. There are heretic movements that try to make indentations 4 (or even 2!) characters deep, and that is akin to trying to define the value of PI to be 3.

Reason: The purpose of indentation is to clearly define where control blocks begin and end. Especially when you've been looking at the screen for 20 hours straight, it's more useful if the indents are larger (meaning it's easier to distinguish the indents).

Now, some people will claim that having an 8-character indentation moves the code too far to the right and makes it difficult to read on an 80-character terminal screen. The answer is that if you need more than three indentation levels, then there's something wrong with your code anyway and the program should be fixed.

In short, 8-character indentation makes things easier to read and has the effect of warning when functions are nested too deeply. Heed this warning.

The preferred way to ease multiple indentation levels in a switch statement is to align the switch and its subordinate case labels in the same column instead of double-indenting the case labels. For example:

	switch (suffix) {
	case 'G':
	case 'g':
		mem <<= 30;
		break;
	case 'M':
	case 'm':
		mem <<= 20;
		break;
	case 'K':
	case 'k':
		mem <<= 10;
		fallthrough;
	default:
		break;
	}
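
As a slightly fuller illustration, the switch above can be placed inside a small helper. This is only a userspace sketch: the function name scale_by_suffix() and the fallback definition of fallthrough are assumptions made for this example (kernel code gets fallthrough from the kernel headers).

```c
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's fallthrough pseudo-keyword.
 * This fallback is an assumption for the sketch, not the kernel definition.
 */
#ifndef fallthrough
#define fallthrough do { } while (0) /* fallthrough */
#endif

/* Hypothetical helper: scale mem according to a K/M/G size suffix. */
uint64_t scale_by_suffix(uint64_t mem, char suffix)
{
	switch (suffix) {
	case 'G':
	case 'g':
		mem <<= 30;
		break;
	case 'M':
	case 'm':
		mem <<= 20;
		break;
	case 'K':
	case 'k':
		mem <<= 10;
		fallthrough;
	default:
		break;
	}

	return mem;
}
```

The case labels sit in the same column as switch, so the statements inside each case need only one extra indentation level. An unknown suffix leaves the value unchanged.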

Don't put multiple statements on a single line unless you have something to hide:

	if (condition) do_this;
	  do_something_everytime;

Don't use commas to avoid using braces:

	if (condition)
		do_this(), do_that();

Always use braces for multiple statements:

	if (condition) {
		do_this();
		do_that();
	}

Don't put multiple assignments on a single line either. Kernel coding style is super simple. Avoid tricky expressions.

Outside of comments, documentation and Kconfig, spaces are never used for indentation; the multiple-statement example above is deliberately broken.

Use a good editor and don't leave spaces at the end of lines.

2) Break up long lines and strings

Coding style is about using common tools to maintain readability and maintainability.

The preferred limit on the length of a single line is 80 columns.

Statements longer than 80 columns should be broken into sensible chunks, unless exceeding 80 columns significantly increases readability and does not hide information.

Continuation lines are always substantially shorter than the first line of the statement and are placed substantially to the right. A very common style is to align them with the opening parenthesis of the function.

These same rules apply to function headers with a long argument list, like this:

	/* Note: this example was written by me (Chen Xiaosong) */
	void func(int a, int b, int c, int d, int e, int f, int g, int h, int i,
		  int j, int k)
	{
		...
	}

However, never break user-visible strings such as printk messages, because that breaks the ability to grep for them.

3) Placement of braces and spaces

3.1) Braces

Another issue that always comes up in C styling is the placement of braces. Unlike the indent size, there is little technical reason to choose one placement strategy over another, but the preferred way, as Kernighan and Ritchie show us, is to put the opening brace last on the line and the closing brace first, like this:

	if (x is true) {
		we do y
	}

This applies to all non-function statement blocks (if, switch, for, while, do). For example:

	switch (action) {
	case KOBJ_ADD:
		return "add";
	case KOBJ_REMOVE:
		return "remove";
	case KOBJ_CHANGE:
		return "change";
	default:
		return NULL;
	}

However, there is one special case, namely functions: they have the opening brace at the beginning of the next line, like this:

	int function(int x)
	{
		body of function
	}

Heretics all over the world claim that this inconsistency is... well... inconsistent, but all sane people know that (a) K&R are right, and (b) K&R are right. Besides, functions are special anyway (you can't nest them in C).

Chen Xiaosong's Note: K & R: Kernighan and Ritchie, the authors of "The C Programming Language".

Note that the closing brace is empty on a line of its own, except in the cases where it is followed by a continuation of the same statement, i.e. a while in a do-statement or an else in an if-statement, like this:

	do {
		body of do-loop
	} while (condition);

and

	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}

Reason: K&R.

Also, note that this placement of braces also minimizes the number of empty (or nearly empty) lines without losing any readability. So, since new lines on the screen are a non-renewable resource (consider a 25-line terminal screen), you have more empty lines to place comments.

Do not unnecessarily use braces where a single statement will do.

	if (condition)
		action();

and

	if (condition)
		do_this();
	else
		do_that();

This does not apply if only one branch of a conditional statement is a single statement; in that case use braces in both branches:

	if (condition) {
		do_this();
		do_that();
	} else {
		otherwise();
	}

Also, use braces when a loop contains more than a single simple statement:

	while (condition) {
		if (test)
			do_something();
	}

3.2) Spaces

Linux kernel style for the use of spaces depends (mostly) on function-versus-keyword usage. Use a space after (most) keywords. The notable exceptions are sizeof, typeof, alignof and __attribute__, which look somewhat like functions (and are usually used with parentheses in Linux, although the language does not require them, as in sizeof info after struct fileinfo info; has been declared).

So use a space after these keywords:

	if, switch, case, for, do, while

but not after sizeof, typeof, alignof or __attribute__. For example:

	s = sizeof(struct file);

Do not add spaces around (inside) parenthesized expressions. This example is bad:

	s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the preferred use of * is adjacent to the data name or function name and not adjacent to the type name. Examples:

	char *linux_banner;
	unsigned long long memparse(char *ptr, char **retptr);
	char *match_strdup(substring_t *s);

Use one space around (on each side of) most binary and ternary operators, such as any of these:

	=  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :

but no space after unary operators:

	&  *  +  -  ~  !  sizeof  typeof  alignof  __attribute__  defined

no space before the postfix increment and decrement unary operators:

	++  --

no space after the prefix increment and decrement unary operators:

	++  --

and no space around the . and -> structure member operators.
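
The spacing rules above are easier to absorb when seen together in one place. The following is a minimal sketch; struct sample and the three functions are hypothetical names invented for this illustration.

```c
#include <stddef.h>

/* Hypothetical structure used only for this illustration. */
struct sample {
	int count;
	char *name;		/* '*' sits next to the name, not the type */
};

/* Returns a pointer type: '*' is adjacent to the function name. */
char *sample_name(struct sample *s)
{
	if (!s)			/* space after 'if', none after '!' */
		return NULL;

	return s->name;		/* no spaces around '->' */
}

size_t sample_size(void)
{
	return sizeof(struct sample);	/* no space after sizeof */
}

int sample_bump(struct sample *s, int step)
{
	s->count = s->count + step;	/* spaces around binary operators */
	s->count++;			/* no space before postfix ++ */

	/* spaces around both parts of the ternary, none after unary '-' */
	return s->count > 0 ? s->count : -s->count;
}
```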

Do not leave trailing whitespace at the ends of lines. Some editors with "smart" indentation will insert whitespace at the beginning of new lines as appropriate, so you can start typing the next line of code right away. However, some such editors do not remove the whitespace if you end up not putting a line of code there, for example when you leave a blank line. As a result, you end up with lines containing trailing whitespace.

Git will warn you about patches that introduce trailing whitespace, and can optionally strip the trailing whitespace for you; however, if you are applying a series of patches, this may make later patches in the series fail by changing their context lines.

4) Naming

C is a simple language, and your naming should be as well. Unlike Modula-2 and Pascal programmers, C programmers do not use cute names like ThisVariableIsATemporaryCounter. A C programmer would call that variable tmp, which is much easier to write and no harder to understand.

However, while mixed-case names are frowned upon, descriptive names for global variables are a must. To call a global function foo is an unforgivable mistake.

Global variables (to be used only if you really need them) need to have descriptive names, as do global functions. If you have a function that counts the number of active users, you should call it count_active_users() or similar; you should not call it cntusr().

Encoding the type of a function into its name (so-called Hungarian notation) is stupid: the compiler knows the types anyway and can check them, and it only confuses the programmer.

Chen Xiaosong's Note: There was once another sentence here: No wonder Microsoft always creates problematic programs. On February 12, 2021 this sentence was deleted.

Local variable names should be short and to the point. If you have some random integer loop counter, it should probably be called i. Calling it loop_counter is non-productive if there is no chance of it being misunderstood. Similarly, tmp can be just about any type of variable that is used to hold a temporary value.

If you are afraid of mixing up your local variable names, you have another problem, which is called the function-growth-hormone-imbalance syndrome. See Chapter 6 (Functions).

For symbol names and documentation, avoid introducing new uses of "master/slave" (or "slave" independent of "master") and "blacklist/whitelist".

Recommended replacements for "master/slave" are:

	'{primary,main} / {secondary,replica,subordinate}'
	'{initiator,requester} / {target,responder}'
	'{controller,host} / {device,worker,proxy}'
	'leader / follower'
	'director / performer'

Recommended replacements for "blacklist/whitelist" are:

	'denylist / allowlist'
	'blocklist / passlist'

Exceptions for introducing new usage are maintaining a userspace ABI/API, or updating code for an existing (as of 2020) hardware or protocol specification that mandates those terms. For new specifications, translate the specification's usage of the terminology to the kernel coding standard where possible.

5) typedef

Please don't use things like vps_t. It is a mistake to use typedef for structures and pointers. When you see a

	vps_t a;

in the source, what does it mean? In contrast, if it says

	struct virtual_container *a;

you can actually tell what a is.

Lots of people think that typedefs help readability. Not so. They are useful only for:

1. Totally opaque objects (where the typedef is actively used to hide what the object is).
For example: pte_t and similar opaque objects that you can only access using the proper accessor functions.
Note: Opaqueness and accessor functions are not good in themselves. The reason we have them for things like pte_t is that there really is no portably accessible information there at all.

2. Clear integer types, where the abstraction helps avoid confusion about whether it is int or long.
u8/u16/u32 are perfectly fine typedefs, although they fit into case (4) better than here.
Note again: there needs to be a reason for this. If something is unsigned long, then there is no reason to write

	typedef unsigned long myflags_t;

But if there is a clear reason why it might be an unsigned int under some circumstances and an unsigned long under other configurations, then by all means go ahead and use a typedef.

3. When you use sparse to literally create a new type for type-checking.
Chen Xiaosong's note: sparse was created in 2004 and was developed by Linus. It is a static code checking tool intended to reduce latent problems in the Linux kernel.

4. New types which are identical to standard C99 types, in certain exceptional circumstances.
Although it would only take a short amount of time for the eyes and brain to become accustomed to standard types like uint32_t, some people object to their use anyway.
Therefore, the Linux-specific u8/u16/u32/u64 types and their signed equivalents, which are identical to standard types, are permitted, although they are not mandatory in new code of your own.
When editing existing code which already uses one or the other set of types, you should conform to the existing choices in that code.

5. Types safe for use in userspace.
In certain structures which are visible to userspace, we cannot require C99 types and cannot use the u32 form above. Thus, we use __u32 and similar types in all structures which are shared with userspace.

Maybe there are other cases too, but the rule should basically be to never use a typedef unless you can clearly match one of those rules.

In general, a pointer, or a struct that has elements that can reasonably be directly accessed, should never be a typedef.
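
To make case 1 more concrete, here is a small sketch of an opaque type with accessor functions. It is loosely modelled on the idea behind pte_t, but everything in it (handle_t, HANDLE_VALID, the bit layout and the function names) is hypothetical and chosen only for this example.

```c
/*
 * Hypothetical opaque handle. Wrapping the value in a struct stops
 * callers from treating it as a plain integer by accident.
 */
typedef struct {
	unsigned long val;
} handle_t;

#define HANDLE_VALID	0x1UL	/* assumed flag bit for this sketch */

/* Accessor functions are the only supported way to look inside. */
static inline handle_t handle_make(unsigned long index)
{
	handle_t h = { .val = (index << 1) | HANDLE_VALID };

	return h;
}

static inline int handle_valid(handle_t h)
{
	return (h.val & HANDLE_VALID) != 0;
}

static inline unsigned long handle_index(handle_t h)
{
	return h.val >> 1;
}
```

Callers only ever see handle_t and the three accessors, so the encoding can change without touching them. If the fields were meant to be read directly, a plain struct without the typedef would be the better choice.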

6) Functions

Functions should be short and sweet, and do just one thing. They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know), and do one thing and do it well.

The maximum length of a function is inversely proportional to the complexity and indentation level of that function. So, if you have a conceptually simple function that is just one long (but simple) case statement, where you have to do lots of small things for a lot of different cases, it is OK to have a longer function.

However, if the function is complex, and one suspects that a less gifted first-year high school student may not even understand what the function does, the length limit should be adhered to more strictly. Use helper functions with descriptive names (if you think their performance is critical, you can ask the compiler to inline them, which is better than writing a complex function).

Another measure of a function is the number of local variables. There should be no more than 5-10 of them, otherwise the function will have problems. Rethink the function implementation and split it into smaller functions. The human brain can usually easily keep track of about 7 different things, anything more and it gets confusing. No matter how smart you are, you may not be able to remember what you did two weeks ago.

In source files, separate functions with one blank line. If the function is exported, the EXPORT macro for it should follow immediately after the closing function brace line. For example:

	int system_is_up(void)
	{
		return system_state == SYSTEM_RUNNING;
	}
	EXPORT_SYMBOL(system_is_up);

In a function prototype, it is a good idea to include the parameter names and their data types. Although the C language does not require this, it is recommended in Linux because it can easily provide readers with more valuable information.

Do not use the extern keyword with function prototypes, as this makes lines longer and isn't strictly necessary.

7) Centralized function exit path

Albeit deprecated by some people, the equivalent of the goto statement is used frequently by compilers in the form of the unconditional jump instruction.

Chen Xiaosong's note: the phrase "the equivalent of the goto statement" did not translate smoothly. Please tell me if you have a better translation.

The goto statement comes in handy when a function exits from multiple locations and some common work such as cleanup has to be done. If there is no cleanup needed, then just return directly.

Choose label names which say what the goto does or why the goto exists. An example of a good name could be out_free_buffer: if the goto frees buffer. Avoid using GW-BASIC names like err1: and err2:, as you would have to renumber them if you ever add or remove exit paths, and they make correctness difficult to verify anyway.

Chen Xiaosong's Note: GW-BASIC is a dialect version of BASIC. This version of BASIC was first developed by Microsoft in 1984 for Compaq (Compaq was acquired by Hewlett-Packard in 2002).

The rationale for using gotos is:

◈ Unconditional statements are easier to understand and track

◈ Less nesting

◈ Prevent errors caused by forgetting to update a separate exit point when making changes

◈ Saves the compiler work to optimize redundant code away ;)

	int fun(int a)
	{
		int result = 0;
		char *buffer;

		buffer = kmalloc(SIZE, GFP_KERNEL);
		if (!buffer)
			return -ENOMEM;

		if (condition1) {
			while (loop1) {
				...
			}
			result = 1;
			goto out_free_buffer;
		}
		...
	out_free_buffer:
		kfree(buffer);
		return result;
	}

A common type of bug to be aware of is the "one err" bug, which looks like this:

	err:
		kfree(foo->bar);
		kfree(foo);
		return ret;

The bug in this code is that on some exit paths foo is NULL. Normally the fix for this is to split it up into two error labels, err_free_bar: and err_free_foo:, like this:

	err_free_bar:
		kfree(foo->bar);
	err_free_foo:
		kfree(foo);
		return ret;

Ideally you should simulate errors to test all exit paths.
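
The two-label pattern can be tried out in userspace. The sketch below is an illustration under stated assumptions rather than kernel code: malloc()/free() stand in for kmalloc()/kfree(), and struct foo, foo_create() and the max_len parameter are hypothetical names invented for this example.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical structure owning a second allocation. */
struct foo {
	char *bar;
	size_t len;
};

/*
 * Userspace sketch: malloc()/free() stand in for kmalloc()/kfree().
 * Returns 0 and stores the new object in *out on success, or a
 * negative errno value on failure.
 */
int foo_create(struct foo **out, const char *text, size_t max_len)
{
	struct foo *foo;
	int ret;

	foo = malloc(sizeof(*foo));
	if (!foo)
		return -ENOMEM;

	foo->len = strlen(text);
	foo->bar = malloc(foo->len + 1);
	if (!foo->bar) {
		ret = -ENOMEM;
		goto err_free_foo;	/* bar was never allocated */
	}

	if (foo->len > max_len) {
		ret = -EINVAL;
		goto err_free_bar;	/* both allocations must go */
	}

	memcpy(foo->bar, text, foo->len + 1);
	*out = foo;
	return 0;

err_free_bar:
	free(foo->bar);
err_free_foo:
	free(foo);
	return ret;
}
```

Each label undoes exactly one step, and the labels appear in reverse order of the allocations, so jumping to a later label skips the cleanup of resources that were never acquired. Calling foo_create() with a text longer than max_len exercises the err_free_bar path without needing to simulate an allocation failure.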

8) Comments

Comments are good, but there is a danger of over-commenting. Never try to explain how code works in comments: it's better to let others understand it at a glance, and explaining poorly written code is a waste of time.

Generally, you want your comments to tell others what your code does, not how it does it. Also, try to avoid adding comments inside the function body: if the function is so complex that you need to comment parts of it separately, you should probably go back to Chapter 6 and take a look. You can add small comments to note or warn about particularly clever (or bad) practices, but don't add too many. You should put comments at the head of the function to tell people what it does and why it does it.

When commenting the kernel API functions, please use the kernel-doc format. See the files in Documentation/doc-guide/ and scripts/kernel-doc for details.

The preferred style for long (multi-line) comments is:

	/*
	 * This is the preferred style for multi-line
	 * comments in the Linux kernel source code.
	 * Please use it consistently.
	 *
	 * Description:  A column of asterisks on the left side,
	 * with beginning and ending almost-blank lines.
	 */

For files in net/ and drivers/net/ the preferred style for long (multi-line) comments is a little different.

	/* The preferred comment style for files in net/ and drivers/net
	 * looks like this.
	 *
	 * It is nearly the same as the generally preferred comment style,
	 * but there is no initial almost-blank line.
	 */

It's also important to annotate the data (whether it's a base type or a derived type). To do this, use only one data declaration per line (do not use commas to declare multiple data at once). This leaves room for you to write a small comment on each piece of data to explain its purpose.
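
As a rough illustration of both points, the sketch below declares one member per line with a short comment on each, and documents a function with a kernel-doc style comment that says what it does rather than how. struct ring_stats and ring_stats_used() are hypothetical names made up for this example.

```c
/* One declaration per line leaves room for a short comment on each. */
struct ring_stats {
	unsigned int head;	/* index of the next slot to write */
	unsigned int tail;	/* index of the next slot to read */
	unsigned int size;	/* total number of slots */
};

/**
 * ring_stats_used() - Count the occupied slots in a ring.
 * @rs: Ring bookkeeping data; must not be NULL and size must be non-zero.
 *
 * Return: Number of slots currently holding data.
 */
unsigned int ring_stats_used(const struct ring_stats *rs)
{
	return (rs->head + rs->size - rs->tail) % rs->size;
}
```

The body is short enough that it needs no comments of its own; the comment at the head of the function carries the information a caller needs.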

9) You've already messed up.

That's OK, we all do. You have probably been told by your long-time Unix user helper that GNU emacs automatically formats the C sources for you, and you have noticed that yes, it does do that, but the defaults it uses are less than desirable (in fact, they are worse than random typing: an infinite number of monkeys typing into GNU emacs would never make a good program).

So, you can either get rid of GNU emacs, or change it to use saner values. To do the latter, you can stick the following in your .emacs file:

	(defun c-lineup-arglist-tabs-only (ignored)
	  "Line up argument lists by tabs, not spaces"
	  (let* ((anchor (c-langelem-pos c-syntactic-element))
	         (column (c-langelem-2nd-pos c-syntactic-element))
	         (offset (- (1+ column) anchor))
	         (steps (floor offset c-basic-offset)))
	    (* (max steps 1)
	       c-basic-offset)))

	(dir-locals-set-class-variables
	 'linux-kernel
	 '((c-mode . (
	        (c-basic-offset . 8)
	        (c-label-minimum-indentation . 0)
	        (c-offsets-alist . (
	                (arglist-close         . c-lineup-arglist-tabs-only)
	                (arglist-cont-nonempty .
	                    (c-lineup-gcc-asm-reg c-lineup-arglist-tabs-only))
	                (arglist-intro         . +)
	                (brace-list-intro      . +)
	                (c                     . c-lineup-C-comments)
	                (case-label            . 0)
	                (comment-intro         . c-lineup-comment)
	                (cpp-define-intro      . +)
	                (cpp-macro             . -1000)
	                (cpp-macro-cont        . +)
	                (defun-block-intro     . +)
	                (else-clause           . 0)
	                (func-decl-cont        . +)
	                (inclass               . +)
	                (inher-cont            . c-lineup-multi-inher)
	                (knr-argdecl-intro     . 0)
	                (label                 . -1000)
	                (statement             . 0)
	                (statement-block-intro . +)
	                (statement-case-intro  . +)
	                (statement-cont        . +)
	                (substatement          . +)
	                ))
	        (indent-tabs-mode . t)
	        (show-trailing-whitespace . t)
	        ))))

	(dir-locals-set-directory-class
	 (expand-file-name "~/src/linux-trees")
	 'linux-kernel)

This will make emacs go better with the kernel coding style for C files below ~/src/linux-trees.

But even if you fail in getting emacs to do sane formatting, not everything is lost: use indent.

Now, again, GNU indent has the same problematic settings that GNU emacs has, which is why you need to give it a few command line options. However, that's not too bad, because even the makers of GNU indent recognize the authority of K&R (the GNU people aren't bad people, they are just severely misguided in this matter), so you just give indent the options -kr -i8 (which stands for "K&R, 8 character indents"), or use scripts/Lindent, which indents in the latest style.

indent has a lot of options, and especially when it comes to comment re-formatting you may want to take a look at the man page. But remember: indent is not a fix for bad programming.

Note that you can also use the clang-format tool to help you with these rules, to quickly re-format parts of your code automatically, and to review full files in order to spot coding style mistakes, typos and possible improvements. It is also handy for sorting #includes, for aligning variables/macros, for reflowing text and other similar tasks. See the file Documentation/process/clang-format.rst for more details.

10) Kconfig configuration file

For all of the Kconfig* configuration files throughout the source tree, the indentation is somewhat different. Lines under a config definition are indented with one tab, while help text is indented an additional two spaces. Example:

	config AUDIT
		bool "Auditing support"
		depends on NET
		help
		  Enable auditing infrastructure that can be used with another
		  kernel subsystem, such as SELinux (which requires this for
		  logging of avc messages output).  Does not do system-call
		  auditing without CONFIG_AUDITSYSCALL.

Seriously dangerous features (such as write support for certain filesystems) should advertise this prominently in their prompt string:

	config ADFS_FS_RW
		bool "ADFS write support (DANGEROUS)"
		depends on ADFS_FS
		...

For full documentation on the configuration files, see the file Documentation/kbuild/kconfig-language.rst.

11) Data structure

Data structures that have visibility outside the single-threaded environment in which they are created and destroyed should always have a reference count. In the kernel, there is no garbage collection (and outside the kernel, garbage collection is slow and inefficient), which means you absolutely have to reference count all uses.

Reference counting means that you can avoid locking, and allows multiple users to have access to the data structure in parallel, without having to worry about the structure suddenly going away from under them just because they slept or did something else for a while.

Note that locking is not a substitute for reference counting. Locking is used to maintain the consistency of data structures, while reference counting is a memory management technique. Both are usually required and should not be confused with each other.

Many data structures can indeed have two levels of reference counting, when there are users of different classes. The subclass count counts the number of subclass users, and decrements the global count just once when the subclass count goes to zero.

Examples of this kind of multi-level reference counting can be found in memory management (struct mm_struct: mm_users and mm_count), and in filesystem code (struct super_block: s_count and s_active).

Remember: if another thread can find your data structure, and you don't have a reference count for it, there's almost certainly a bug.
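
The basic get/put discipline can be sketched in userspace with C11 atomics. This is a single-level illustration only, not the kernel's own mechanism: struct session and the session_*() functions are hypothetical, and kernel code would use the kernel's reference counting helpers instead of <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical reference-counted object for this sketch. */
struct session {
	atomic_int refcount;
	int id;
};

struct session *session_alloc(int id)
{
	struct session *s = malloc(sizeof(*s));

	if (!s)
		return NULL;

	atomic_init(&s->refcount, 1);	/* the creator holds one reference */
	s->id = id;
	return s;
}

/* Take an extra reference before handing the object to another user. */
struct session *session_get(struct session *s)
{
	atomic_fetch_add(&s->refcount, 1);
	return s;
}

/*
 * Drop a reference. Returns 1 if this was the last one and the object
 * has been freed, 0 otherwise.
 */
int session_put(struct session *s)
{
	if (atomic_fetch_sub(&s->refcount, 1) == 1) {
		free(s);
		return 1;
	}

	return 0;
}
```

Every user that keeps a pointer calls session_get() first and session_put() when done, so the object is freed only after the last user has let go. The count says nothing about the consistency of the fields themselves; that is still the job of locking.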

12) Macros, enumerations and RTL

Names of macros defining constants and labels in enums are capitalized.

	#define CONSTANT 0x12345

Enums are preferred when defining several related constants.

CAPITALIZED macro names are appreciated, but macros resembling functions may be named in lower case.

Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do - while block:

	#define macrofun(a, b, c)			\
		do {					\
			if (a == 5)			\
				do_this(b, c);		\
		} while (0)
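
The reason for the do - while wrapper shows up as soon as the macro is used in an unbraced if/else. The following sketch uses made-up names (reset_and_log(), maybe_reset() and the counters) purely to demonstrate the effect.

```c
/* Counters bumped by the demo functions; all names are hypothetical. */
static int resets;
static int logs;
static int skipped;

static void do_reset(void)
{
	resets++;
}

static void do_log(void)
{
	logs++;
}

/*
 * Wrapped in do { } while (0), the two statements behave as a single
 * statement, so the macro is safe in an unbraced if/else.
 */
#define reset_and_log()		\
	do {			\
		do_reset();	\
		do_log();	\
	} while (0)

void maybe_reset(int condition)
{
	if (condition)
		reset_and_log();
	else
		skipped++;
}
```

If the macro body were two bare statements, or a plain { } block followed by the caller's semicolon, the else above would no longer attach to the if and the code would fail to compile.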

Things to avoid when using macros:

1. Macros that affect control flow:

	#define FOO(x)					\
		do {					\
			if (blah(x) < 0)		\
				return -EBUGGERED;	\
		} while (0)

is a very bad idea. It looks like a function call but exits the calling function; don't break the internal parsers of those who will read the code.

2. Macros that depend on having a local variable with a magic name:

	#define FOO(val) bar(index, val)

might look like a good thing, but it is confusing when one reads the code and it is prone to breakage from seemingly innocent changes.

3. Macros with arguments that are used as l-values: FOO(x) = y; will bite you if somebody, for example, turns FOO into an inline function.

4. Forgetting about precedence: macros defining constants using expressions must enclose the expression in parentheses. Beware of similar issues with macros using parameters.

	#define CONSTANT 0x4000
	#define CONSTEXP (CONSTANT | 3)

5. Namespace collisions when defining local variables in macros resembling functions:

	#define FOO(x)				\
	({					\
		typeof(x) ret;			\
		ret = calc_ret(x);		\
		(ret);				\
	})

ret is a common name for a local variable; __foo_ret is less likely to collide with an existing variable.

The cpp manual deals with macros exhaustively. The gcc internals manual also covers RTL, which is used frequently with assembly language in the kernel.

Chen Xiaosong's note: RTL stands for register transfer language, an intermediate language used in compilers.
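
The precedence pitfall from item 4 is easy to demonstrate. In the sketch below every macro and function name is invented for the example; the BAD_ variants are written incorrectly on purpose.

```c
#define HEADER_LEN	8

#define BAD_FRAME_LEN	HEADER_LEN + 4		/* unparenthesized: fragile */
#define GOOD_FRAME_LEN	(HEADER_LEN + 4)	/* parenthesized: safe */

/* Macro parameters need parentheses too. */
#define BAD_DOUBLE(x)	x * 2
#define GOOD_DOUBLE(x)	((x) * 2)

int bad_two_frames(void)
{
	return 2 * BAD_FRAME_LEN;	/* expands to 2 * 8 + 4 */
}

int good_two_frames(void)
{
	return 2 * GOOD_FRAME_LEN;	/* expands to 2 * (8 + 4) */
}

int bad_double_sum(int a, int b)
{
	return BAD_DOUBLE(a + b);	/* expands to a + b * 2 */
}

int good_double_sum(int a, int b)
{
	return GOOD_DOUBLE(a + b);	/* expands to ((a + b) * 2) */
}
```

The unparenthesized versions compile without complaint and silently compute something else, which is why the parentheses belong in the macro definition rather than at each call site.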

13) Print kernel messages

Kernel developers like to be seen as literate. Do mind the spelling of kernel messages to make a good impression. Do not use incorrect contractions like dont; use do not or don't instead. Make the messages concise, clear, and unambiguous.

Kernel messages do not have to be terminated with a period.

Printing numbers in parentheses (%d) adds no value and should be avoided.

There are a number of driver model diagnostic macros in <linux/device.h> which you should use to make sure messages are matched to the right device and driver, and are tagged with the right level: dev_err(), dev_warn(), dev_info(), and so forth. For messages that are not associated with a particular device, <linux/printk.h> defines pr_notice(), pr_info(), pr_warn(), pr_err(), etc.

Coming up with good debugging messages can be quite a challenge; and once you have them, they can be a huge help for remote troubleshooting. However, debug message printing is handled differently than printing other non-debug messages. While the other pr_XXX() functions print unconditionally, pr_debug() does not; it is compiled out by default, unless either DEBUG is defined or CONFIG_DYNAMIC_DEBUG is set. That is true for dev_dbg() also, and a related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to the ones already enabled by DEBUG.

Many subsystems have Kconfig debug options to turn on -DDEBUG in the corresponding Makefile; in other cases specific files #define DEBUG. And when a debug message should be unconditionally printed, such as if it is already inside a debug-related #ifdef section, printk(KERN_DEBUG ...) can be used.
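
The "compiled out unless DEBUG is defined" behaviour can be imitated in userspace. The sketch below is a stand-in, not the kernel implementation: it borrows the name pr_debug() for familiarity, uses fprintf() in place of printk(), does not model CONFIG_DYNAMIC_DEBUG, and probe_device() is a hypothetical caller.

```c
#include <stdio.h>

/*
 * Userspace stand-in for pr_debug(): compiled out unless DEBUG is
 * defined before this point (for example with -DDEBUG).
 */
#ifdef DEBUG
#define pr_debug(...)	fprintf(stderr, __VA_ARGS__)
#else
#define pr_debug(...)	do { } while (0)
#endif

/* Hypothetical function carrying a debug message. */
int probe_device(int id)
{
	pr_debug("probing device %d\n", id);

	if (id < 0)
		return -1;

	return 0;
}
```

Built normally, the message costs nothing at run time; built with -DDEBUG, the same source prints it to stderr.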

14) Allocate memory

The kernel provides the following general purpose memory allocators: kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and vzalloc(). Please refer to the API documentation for further information about them: Documentation/core-api/memory-allocation.rst.

The preferred form for passing the size of a struct is the following:

	p = kmalloc(sizeof(*p), ...);

The alternative form, where the struct name is spelled out as the operand of sizeof, hurts readability and introduces an opportunity for a bug when the pointer variable type is changed but the corresponding sizeof that is passed to a memory allocator is not.

Chen Xiaosong's note: the following situation may occur:

	int *p; /* originally char *p, later changed to int *p */
	p = kmalloc(sizeof(char), ...);

Casting the return value, which is a void pointer, is redundant. The conversion from a void pointer to any other pointer type is guaranteed by the C programming language.

The preferred form for allocating an array is the following:

	p = kmalloc_array(n, sizeof(...), ...);

The preferred form for allocating a zeroed array is the following:

	p = kcalloc(n, sizeof(...), ...);

Both forms check for overflow on the allocation size n * sizeof(...), and return NULL if that occurred.

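These generic allocation functions all emit a stack dump on failure when used without __GFP_NOWARN, so there is no use in emitting an additional failure message when NULL is returned.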
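
The overflow check and the sizeof(*p) idiom can both be sketched in userspace. The code below is an analogue under stated assumptions, not the kernel allocator: alloc_array(), struct point and alloc_points() are hypothetical names, and malloc() stands in for the kernel functions.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of an overflow-checked array
 * allocation: returns NULL instead of allocating a wrapped-around size.
 */
void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

	return malloc(n * size);
}

struct point {
	int x;
	int y;
};

struct point *alloc_points(size_t n)
{
	struct point *p;

	/* sizeof(*p) keeps tracking the type if p's declaration changes. */
	p = alloc_array(n, sizeof(*p));
	return p;
}
```

No cast is needed on the void pointer returned by alloc_array(), and a request whose total size would not fit in size_t is refused up front rather than producing a buffer that is too small.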

15) Inline Disadvantages

There appears to be a common misperception that gcc has a magic "make me faster" speedup option called inline. While the use of inlines can be appropriate (for example as a means of replacing macros, see Chapter 12), it very often is not. Abundant use of the inline keyword leads to a much bigger kernel, which in turn slows the system as a whole down, due to a bigger icache footprint for the CPU and simply because there is less memory available for the pagecache. Just think about it: a pagecache miss causes a disk seek, which easily takes 5 milliseconds. The CPU can execute a lot of instructions in those 5 milliseconds.

A reasonable rule of thumb is to not put inline on functions that have more than 3 lines of code in them. An exception to this rule is the case where a parameter is known to be a compile-time constant, and as a result of this constantness you know the compiler will be able to optimize most of your function away at compile time. For a good example of this latter case, see the kmalloc() inline function.

Often people argue that adding inline to functions that are static and used only once is always a win since there is no space tradeoff. While this is technically correct, gcc is capable of inlining these automatically without help, and the maintenance issue of removing the inline when a second user appears outweighs the potential value of the hint that tells gcc to do something it would have done anyway.

16) Function return value and naming

Functions can return many different types of values, one of the most common is a value that indicates whether the function succeeded or failed. Such a value can be represented as an error code integer (-Exxx = failure, 0 = success) or a boolean value of success (0 = failure, non-zero = success).

Mixing up these two representations is a fertile source of hard-to-find bugs. If C strictly distinguished between integers and booleans, the compiler would find these mistakes for us, but C doesn't. To help prevent such bugs, always follow this convention:

If the name of a function is an action or an imperative command, the function should return an error-code integer. If the name is a predicate, the function should return a "succeeded" boolean.

For example, "add work" is a command, and the add_work() function returns 0 for success or -EBUSY for failure. In the same way, "PCI device present" is a predicate, and the pci_dev_present() function returns 1 if it succeeds in finding a matching device or 0 if it doesn't.

All EXPORTed functions must respect this convention, and so should all public functions. Private (static) functions need not, but it is recommended that they do.

Functions whose return value is the actual result of a computation, rather than an indication of whether the computation succeeded, are not subject to this rule. Generally they indicate failure by returning some out-of-range result. Typical examples are functions that return pointers; they use NULL or the ERR_PTR mechanism to report failure.
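As an illustration of the naming convention, here is a small userspace sketch. struct work_queue, queue_is_full() and queue_add_work() are hypothetical names invented for this example; only the 0 / -EBUSY convention itself comes from the text.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_DEPTH 2

struct work_queue {
	int items[QUEUE_DEPTH];
	size_t count;
};

/* Predicate: the name asks a question, so it returns a boolean. */
static bool queue_is_full(const struct work_queue *q)
{
	return q->count == QUEUE_DEPTH;
}

/* Command: the name is an action, so it returns 0 or -Exxx. */
static int queue_add_work(struct work_queue *q, int item)
{
	if (queue_is_full(q))
		return -EBUSY;

	q->items[q->count++] = item;
	return 0;
}
```

With this split, "if (queue_add_work(...))" always reads as the failure path, while "if (queue_is_full(...))" always reads as a yes answer, so the two are hard to confuse at the call site.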

17) Use Boolean

The Linux kernel bool type is an alias for the C99 _Bool type. bool values can only evaluate to 0 or 1, and implicit or explicit conversion to bool automatically converts the value to true or false. When using bool types, the !! construction is not needed, which eliminates a class of bugs.

When working with bool values, use the true and false definitions instead of 1 and 0.

bool function return types and stack variables are always fine to use when appropriate. Use of bool is encouraged for readability and is often a better option than an integer type for storing boolean values.

Do not use bool if cache line layout or the size of the value matters, as its size and alignment vary with the architecture it is compiled for. Structures that are optimized for alignment and size should not use bool.

If a structure has many true/false values, consider consolidating them into a bitfield with 1-bit members, or using an appropriate fixed-width type such as u8.

Similarly, for function arguments, many true/false values can be consolidated into a single bitwise flags argument, and flags is often a more readable alternative when the call sites would otherwise pass bare true/false constants.

Otherwise, limited use of bool in structures and arguments can improve readability.
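The flags advice can be sketched like this. All names here (struct conn, the CONN_* bits, conn_init(), conn_is_verbose()) are hypothetical and exist only to illustrate the idea.

```c
#include <stdbool.h>

/* Hypothetical option bits, one per former bool parameter. */
#define CONN_NONBLOCK	(1u << 0)
#define CONN_KEEPALIVE	(1u << 1)
#define CONN_VERBOSE	(1u << 2)

struct conn {
	unsigned int flags;
};

/* Compare with a call such as conn_init(c, true, false, true). */
static void conn_init(struct conn *c, unsigned int flags)
{
	c->flags = flags;
}

/* Conversion to bool yields true or false, so no !! is needed. */
static bool conn_is_verbose(const struct conn *c)
{
	return c->flags & CONN_VERBOSE;
}
```

A call such as conn_init(&c, CONN_NONBLOCK | CONN_VERBOSE) documents itself, whereas a row of bare true/false arguments forces the reader to look up the prototype.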

18) Don’t reinvent kernel macros

The header file include/linux/kernel.h contains a number of macros that you should use rather than explicitly coding some variant of them yourself. For example, if you need to calculate the length of an array, take advantage of the macro:

	#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

Similarly, if you need to calculate the size of some structure member, use:

	#define sizeof_field(t, f) (sizeof(((t*)0)->f))

There are also min() and max() macros that do strict type checking if you need them. Feel free to look through that header file to see what else is already defined; if a macro is defined there, you should not redefine it in your own code.
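A short usage sketch of the two macros quoted above. In kernel code they come from the kernel headers; here they are redefined locally only so that the example compiles in userspace. struct packet, primes[] and sum_primes() are hypothetical.

```c
#include <stddef.h>

/* Userspace copies of the two macros, for illustration only. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define sizeof_field(t, f) (sizeof(((t *)0)->f))

struct packet {
	unsigned char header[4];
	unsigned short length;
};

static const int primes[] = { 2, 3, 5, 7, 11 };

/* Sums the table without hard-coding its length. */
static int sum_primes(void)
{
	int sum = 0;
	size_t i;

	for (i = 0; i < ARRAY_SIZE(primes); i++)
		sum += primes[i];
	return sum;
}
```

Adding an entry to primes[] needs no other change, and sizeof_field(struct packet, header) gives the size of a member without declaring a dummy variable of the structure type.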

19) Editor modeline (configuration information) and other content

Some editors can interpret configuration information embedded in source files, indicated with special markers. For example, emacs interprets lines marked like this:

	-*- mode: c -*-

Or like this:

	/*
	Local Variables:
	compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
	End:
	*/

Vim interprets markers that look like this:

	/* vim:set sw=8 noet */

Do not include any of these in source files. People have their own personal editor configurations, and your source files should not override them. This includes markers for indentation and mode configuration. People may use their own custom mode, or may have some other method for making indentation work correctly.

20) Inline assembly

In architecture-specific code, you may need to use inline assembly to interface with CPU or platform functionality. Don't hesitate to do so when necessary. However, don't use inline assembly gratuitously when C can do the job. You can and should poke hardware from C when possible.

Consider writing simple helper functions that wrap common bits of inline assembly, rather than repeatedly writing them with slight variations. Remember that inline assembly can use C parameters.

Large, non-trivial assembly functions should go in .S files, with corresponding C prototypes defined in C header files. The C prototypes for assembly functions should use asmlinkage.

You may need to mark your asm statement as volatile to prevent GCC from removing it when GCC doesn't notice any side effects. You don't always need to do so, though, and doing so unnecessarily can limit optimization.

When writing a single inline assembly statement containing multiple instructions, put each instruction on a separate line in a separate quoted string, and end each string except the last with \n\t to properly indent the next instruction in the assembly output:

	asm ("magic %reg1, #42\n\t"
	     "more_magic %reg2, %reg3"
	     : /* outputs */ : /* inputs */ : /* clobbers */);
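The advice to wrap inline assembly in a small helper that takes C parameters can be sketched as follows. This assumes a compiler that supports GCC-style extended asm (GCC or Clang); hide_from_optimizer() is a hypothetical name, and the empty template is used so that the example does not depend on any particular instruction set.

```c
/*
 * Hypothetical helper wrapping one piece of inline assembly.
 * The empty template emits no instructions; the "+r" constraint
 * passes the C variable in and out through a register, so the
 * compiler must assume 'v' may have been modified.
 */
static inline unsigned long hide_from_optimizer(unsigned long v)
{
	__asm__ volatile("" : "+r" (v));
	return v;
}
```

Callers use the helper like any other C function, and the constraint syntax lives in exactly one place instead of being repeated with slight variations.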

21) Conditional compilation

Wherever possible, don't use preprocessor conditionals (#if, #ifdef) in .c files; doing so makes code harder to read and logic harder to follow. Instead, use such conditionals in a header file defining functions for use in those .c files, providing no-op stub versions in the #else case, and then call those functions unconditionally from the .c files. The compiler will avoid generating any code for the stub calls, producing identical results, but the logic will remain easy to follow.

Prefer to compile out entire functions, rather than portions of functions or portions of expressions. Rather than putting an #ifdef in an expression, factor out part or all of the expression into a separate helper function and apply the conditional to that function.
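The header-stub pattern described above might look like the following sketch. CONFIG_FOO_DEBUG, foo_debug_log() and foo_start() are hypothetical names, and the two halves are shown in one block only to keep the example self-contained.

```c
#include <stdio.h>

/* --- would live in a header file --- */
#ifdef CONFIG_FOO_DEBUG
static inline void foo_debug_log(const char *msg)
{
	fprintf(stderr, "foo: %s\n", msg);
}
#else
/* No-op stub: same signature, nothing to generate at call sites. */
static inline void foo_debug_log(const char *msg)
{
}
#endif

/* --- would live in the .c file: no #ifdef needed here --- */
static int foo_start(void)
{
	foo_debug_log("starting");
	return 0;
}
```

The conditional appears once, around whole functions, and foo_start() reads the same in every configuration.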

If you have a function or variable that may go unused in a particular configuration, and the compiler would warn about its definition being unused, mark the definition as __maybe_unused rather than wrapping it in a preprocessor conditional. (However, if a function or variable always goes unused, delete it.)

Within code, where possible, use the IS_ENABLED macro to convert a Kconfig symbol into a C boolean expression, and use it in a normal C conditional:

	if (IS_ENABLED(CONFIG_SOMETHING)) {
		...
	}

The compiler will constant-fold the conditional away and include or exclude the block of code just as with an #ifdef, so this does not add any runtime overhead. However, this approach still allows the C compiler to see the code inside the block and check it for correctness (syntax, types, symbol references, etc.). Therefore, you still have to use an #ifdef if the code inside the block references symbols that will not exist when the condition is not met.
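The same effect can be observed in userspace with an ordinary constant. This is only a stand-in sketch: FEATURE_TRACE is a hypothetical 0/1 macro playing the role that IS_ENABLED(CONFIG_...) plays in the kernel, and process() and trace_calls are invented for the example.

```c
#include <stdio.h>

/*
 * Stand-in for a Kconfig symbol. In the kernel the test would be
 * IS_ENABLED(CONFIG_...); here a plain 0/1 macro plays that role.
 */
#define FEATURE_TRACE 0

static int trace_calls;

static int process(int value)
{
	if (FEATURE_TRACE) {
		/* Still parsed and type-checked, then dropped as dead code. */
		trace_calls++;
		printf("process(%d)\n", value);
	}
	return value * 2;
}
```

A typo inside the if block would break the build even though FEATURE_TRACE is 0, which is exactly the checking that an #ifdef would have hidden.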

At the end of any non-trivial #if or #ifdef block (more than a few lines), place a comment after the #endif on the same line, noting the conditional expression used. For example:

	#ifdef CONFIG_SOMETHING
	...
	#endif /* CONFIG_SOMETHING */

Appendix I) Reference

The C Programming Language, Second Edition by Brian W. Kernighan and Dennis M. Ritchie. Prentice Hall, Inc., 1988. ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).

The Practice of Programming by Brian W. Kernighan and Rob Pike. Addison-Wesley, Inc., 1999. ISBN 0-201-61586-X.

GNU manuals - where in compliance with K&R and this text - for cpp, gcc, gcc internals and indent, all available from https://www.gnu.org/manual/

WG14 is the international standardization working group for the programming language C, URL: http://www.open-std.org/JTC1/SC22/WG14/

Kernel process/coding-style.rst, by Greg Kroah-Hartman at OLS 2002: http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/

Origin blog.csdn.net/liuxing__jacker/article/details/132932676