This is an official manual translation, shared with everyone
This is a short document describing the preferred coding style for the Linux kernel. Coding style is very personal; this is the style used for the code that I have to maintain (the Linux kernel), and I would prefer it for most other projects too. Please at least consider the points made here when writing kernel code.
First, I recommend printing out the GNU Coding Standards and then not reading them. Burn them, it's a great symbolic gesture.
Anyway, here we go:
1) Indentation
Tabs are 8 characters, so indentations are also 8 characters. There are heretic movements that try to make indentations 4 (or even 2!) characters deep, which is akin to trying to define the value of PI to be 3.
Reason: The purpose of indentation is to clearly define where control blocks begin and end. Especially when you've been looking at the screen for 20 hours straight, it's more useful if the indents are larger (meaning it's easier to distinguish the indents).
Now, some people will claim that having an 8-character indentation moves the code too far to the right and makes it difficult to read on an 80-character terminal screen. The answer is that if you need more than three indentation levels, then there's something wrong with your code anyway and the program should be fixed.
In short, 8-character indentation makes things easier to read and has the effect of warning when functions are nested too deeply. Heed this warning.
The preferred way to ease multiple indentation levels in a switch statement is to align the switch and its subordinate case labels in the same column, rather than double-indenting the case labels. For example:
	switch (suffix) {
	case 'G':
	case 'g':
		mem <<= 30;
		break;
	case 'M':
	case 'm':
		mem <<= 20;
		break;
	case 'K':
	case 'k':
		mem <<= 10;
		fallthrough;
	default:
		break;
	}
Don't put multiple statements on a single line unless you have something to hide:
	if (condition) do_this;
	  do_something_everytime;
Don't use commas to avoid using braces:
	if (condition)
		do_this(), do_that();
Always use braces for multiple statements:
	if (condition) {
		do_this();
		do_that();
	}
Don't put multiple assignments on a single line either. Kernel coding style is very simple. Avoid tricky expressions.
Outside of comments, documentation and Kconfig, spaces are never used for indentation; the example above that puts two statements on one line is deliberately broken.
Use a good editor and don't leave spaces at the end of lines.
2) Break up long lines and strings
Coding style is about using common tools to maintain readability and maintainability.
The preferred limit on the length of a single line is 80 columns.
Statements longer than 80 columns should be broken into sensible chunks, unless exceeding 80 columns significantly increases readability and does not hide information.
Continuation lines are always substantially shorter than the first line and are placed substantially to the right. A very common style is to align continuation lines with the opening parenthesis of the function.
These same rules apply to function headers with a long argument list, like this:
	/* Note: this example was written by me (Chen Xiaosong) */
	void func(int a, int b, int c, int d, int e, int f, int g, int h, int i,
		  int j, int k)
	{
		...
	}
However, never break user-visible strings such as printk messages, because that breaks the ability to grep for them.
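A minimal user-space sketch of the string rule follows. It assumes snprintf() as a stand-in for printk(); the function name format_mount_error() and the message text are hypothetical and exist only for illustration. The arguments are wrapped and aligned with the opening parenthesis, but the format string stays on one line.

```c
#include <stdio.h>

/*
 * Hypothetical helper: snprintf() stands in for printk() so that the
 * sketch builds in user space. Returns the formatted length.
 */
int format_mount_error(char *buf, size_t len, const char *dev, int err)
{
	/* The line exceeds 80 columns, but the message remains greppable. */
	return snprintf(buf, len, "mount: failed to read superblock on device %s (error %d)\n",
			dev, err);
}
```

Searching the tree for "failed to read superblock" would find this line; it would not if the string were split in the middle.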
3) Placement of braces and spaces
3.1) Braces
Another issue that always comes up in C styling is the placement of braces. Unlike the indent size, there are few technical reasons to choose one placement strategy over another, but the preferred way, as shown to us by Kernighan and Ritchie, is to put the opening brace last on the line and the closing brace first, thus:
	if (x is true) {
		we do y
	}
This applies to all non-function statement blocks (if, switch, for, while, do). For example:
	switch (action) {
	case KOBJ_ADD:
		return "add";
	case KOBJ_REMOVE:
		return "remove";
	case KOBJ_CHANGE:
		return "change";
	default:
		return NULL;
	}
However, there is one special case, namely functions: they have the opening brace at the beginning of the next line, thus:
	int function(int x)
	{
		body of function
	}
Heretics all over the world claim that this inconsistency is...well...inconsistent, but all sane people know that (a) K&R is correct, and (b) K&R is correct. In addition, functions are very special (functions cannot be nested in C language).
Chen Xiaosong's Note: K & R: Kernighan and Ritchie, the authors of "The C Programming Language".
Note that the closing brace is on a line of its own, except when it is followed by a continuation of the same statement, that is, a while in a do statement or an else in an if statement, for example:
	do {
		body of do-loop
	} while (condition);
and
	if (x == y) {
		..
	} else if (x > y) {
		...
	} else {
		....
	}
Reason: K&R.
Also, note that this placement of braces also minimizes the number of empty (or nearly empty) lines without losing any readability. So, since new lines on the screen are a non-renewable resource (consider a 25-line terminal screen), you have more empty lines to place comments.
Do not use braces unnecessarily where a single statement will do.
	if (condition)
		action();
and
	if (condition)
		do_this();
	else
		do_that();
This does not apply if only one branch of a conditional statement is a single statement; in that case, use braces in both branches:
	if (condition) {
		do_this();
		do_that();
	} else {
		otherwise();
	}
Also, use braces when a loop contains more than a single simple statement:
	while (condition) {
		if (test)
			do_something();
	}
3.2) Spaces
The way the Linux kernel uses spaces depends (mostly) on whether they follow a function or a keyword. Use a space after (most) keywords. The notable exceptions are sizeof, typeof, alignof, and __attribute__, which look somewhat like functions (and are usually used with parentheses in Linux, although the language does not require them, as in sizeof info after struct fileinfo info; has been declared).
So use a space after these keywords:
	if, switch, case, for, do, while
but not after sizeof, typeof, alignof, or __attribute__. For example:
	s = sizeof(struct file);
Do not add spaces around (inside) parenthesized expressions. This example is bad:
	s = sizeof( struct file );
When declaring pointer data or a function that returns a pointer type, the preferred placement of * is adjacent to the data name or function name, not adjacent to the type name. Examples:
	char *linux_banner;
	unsigned long long memparse(char *ptr, char **retptr);
	char *match_strdup(substring_t *s);
Use one space on each side of most binary and ternary operators, such as any of these:
	=  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :
but no space after unary operators:
	&  *  +  -  ~  !  sizeof  typeof  alignof  __attribute__  defined
no space before the postfix increment and decrement unary operators:
	++  --
no space after the prefix increment and decrement unary operators:
	++  --
and no space around the . and -> structure member operators.
Do not leave trailing whitespace at the ends of lines. Some editors with "smart" indentation insert whitespace at the beginning of new lines as appropriate, so you can start typing the next line of code right away. However, some of these editors do not remove the whitespace if you end up not putting any code there, for instance when you leave a blank line. As a result, you end up with lines containing trailing whitespace.
Git will warn you when a patch introduces trailing whitespace, and can optionally strip it for you; however, if you are applying a series of patches, this may make later patches in the series fail because their context lines have changed.
4) Naming
C is a Spartan language, and your naming should be as well. Unlike Modula-2 and Pascal programmers, C programmers do not use cute names like ThisVariableIsATemporaryCounter. A C programmer would call that variable tmp, which is much easier to write and not the least bit harder to understand.
However, while mixed-case names are frowned upon, descriptive names for global variables are a must. Naming a global function foo is an unforgivable mistake.
Global variables (to be used only if you really need them) need to have descriptive names, as do global functions. If you have a function that counts the number of active users, you should call it count_active_users() or similar, not cntusr().
Encoding the type of a function into its name (so-called Hungarian notation) is stupid: the compiler knows the types anyway and can check them, and it only confuses the programmer.
Chen Xiaosong's Note: There was once another sentence here: "No wonder Microsoft always creates problematic programs." This sentence was deleted on February 12, 2021.
Local variable names should be short and to the point. If you have some random integer loop counter, it should probably be called i. Calling it loop_counter is unproductive if there is no chance of it being misunderstood. Similarly, tmp can be just about any type of variable that is used to hold a temporary value.
If you are afraid of mixing up your local variable names, you have another problem, which is called function-growth-hormone-imbalance syndrome. See Chapter 6 (Functions).
For symbol names and documentation, avoid introducing new usage of "master/slave" (or "slave" independent of "master") and "blacklist/whitelist".
Recommended replacements for "master/slave" are:
	'{primary,main} / {secondary,replica,subordinate}'
	'{initiator,requester} / {target,responder}'
	'{controller,host} / {device,worker,proxy}'
	'leader / follower'
	'director / performer'
Recommended replacements for "blacklist/whitelist" are:
	'denylist / allowlist'
	'blocklist / passlist'
Exceptions for introducing new usage are maintaining a user-space ABI/API, or updating code for an existing (as of 2020) hardware or protocol specification that mandates those terms. For new specifications, translate the specification's usage of the terminology to the kernel coding standard where possible.
5) typedef
Please don't use things like vps_t. It is a mistake to use typedef for structures and pointers. When you see a
	vps_t a;
in the source, what does it mean? In contrast, if it says
	struct virtual_container *a;
you can actually tell what a is.
Many people think that typedefs help readability. Not so. They are useful only for:
1. Totally opaque objects (where the typedef is actively used to hide what the object is).
For example: pte_t and similar opaque objects that you can only access using the proper accessor functions.
Note: Opaqueness and "accessor functions" are not good in themselves. The reason we have them for things like pte_t is that there really is absolutely zero portably accessible information there.
2. Clear integer types, where the abstraction helps avoid confusion about whether it is int or long.
u8/u16/u32 are perfectly fine typedefs, although they fit case (4) better than this one.
Note again: there needs to be a reason for this. If something is unsigned long, then there is no reason to do
	typedef unsigned long myflags_t;
But if there is a clear reason why it might be an unsigned int under some circumstances and an unsigned long under others, then by all means go ahead and use a typedef.
3. When you use sparse to literally create a new type for type checking.
Chen Xiaosong's note: sparse was born in 2004 and was developed by Linus. Its purpose is to provide a static code checking tool that reduces hidden problems in the Linux kernel.
4. New types that are identical to standard C99 types, in certain exceptional circumstances.
Although it would only take a short time for the eyes and brain to get used to standard types like uint32_t, some people object to their use anyway.
Therefore, the Linux-specific u8/u16/u32/u64 types and their signed equivalents, which are identical to standard types, are permitted -- although they are not mandatory in new code of your own.
When editing existing code that already uses one or the other set of types, you should conform to the existing choices in that code.
5. Types that are safe for use in user space.
In certain structures that are visible to user space, we cannot require C99 types and cannot use the u32 form above. Thus, we use __u32 and similar types in all structures that are shared with user space.
There may be other cases too, but the basic rule is to never use a typedef unless you can clearly match one of the rules above.
In general, a pointer, or a struct that has elements that can reasonably be accessed directly, should never be a typedef.
6) Functions
Functions should be short and sweet, and do just one thing. They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know), and do one thing and do it well.
The maximum length of a function is inversely proportional to the complexity and indentation level of that function. So if you have a conceptually simple function that is just one long (but simple) case statement, where you have to do lots of small things for a lot of different cases, it is fine for the function to be longer.
However, if the function is complex, and one suspects that a less gifted first-year high school student may not even understand what the function does, the length limit should be adhered to more strictly. Use helper functions with descriptive names (if you think their performance is critical, you can ask the compiler to inline them, which is better than writing a complex function).
Another measure of a function is the number of local variables. There should be no more than 5-10 of them, otherwise the function will have problems. Rethink the function implementation and split it into smaller functions. The human brain can usually easily keep track of about 7 different things, anything more and it gets confusing. No matter how smart you are, you may not be able to remember what you did two weeks ago.
In source files, separate functions with one blank line. If the function is exported, the EXPORT macro for it should follow immediately after the closing brace line. For example:
	int system_is_up(void)
	{
		return system_state == SYSTEM_RUNNING;
	}
	EXPORT_SYMBOL(system_is_up);
In a function prototype, it is a good idea to include the parameter names and their data types. Although the C language does not require this, it is recommended in Linux because it can easily provide readers with more valuable information.
Do not use the extern keyword with function prototypes, as this makes lines longer and is not strictly necessary.
7) Centralized function exit path
Although deprecated by some people, the equivalent of the goto statement is used frequently by compilers in the form of the unconditional jump instruction.
Chen Xiaosong's note: The phrase "the equivalent of the goto statement" does not read very smoothly in translation. Please tell me if you have a better translation.
The goto statement comes in handy when a function exits from multiple locations and some common work, such as cleanup, has to be done. If no cleanup is needed, just return directly.
Choose label names that say what the goto does or why the goto exists. An example of a good name is out_free_buffer: if the goto frees buffer. Avoid using GW-BASIC names such as err1: and err2:, because you would have to renumber them if you ever add or remove exit paths, and they make correctness difficult to verify anyway.
Chen Xiaosong's Note: GW-BASIC is a dialect of BASIC. It was first developed by Microsoft in 1984 for Compaq (Compaq was acquired by Hewlett-Packard in 2002).
The rationale for using gotos is:
◈ Unconditional statements are easier to understand and follow
◈ Nesting is reduced
◈ Errors caused by forgetting to update individual exit points when making modifications are prevented
◈ It saves the compiler work to optimize redundant code away ;)
	int fun(int a)
	{
		int result = 0;
		char *buffer;

		buffer = kmalloc(SIZE, GFP_KERNEL);
		if (!buffer)
			return -ENOMEM;

		if (condition1) {
			while (loop1) {
				...
			}
			result = 1;
			goto out_free_buffer;
		}
		...
	out_free_buffer:
		kfree(buffer);
		return result;
	}
A common type of bug to be aware of is the "one err" bug, which looks like this:
	err:
		kfree(foo->bar);
		kfree(foo);
		return ret;
The bug in this code is that on some exit paths foo is NULL. Normally the fix is to split it up into two error labels, err_free_bar: and err_free_foo:, like this:
	err_free_bar:
		kfree(foo->bar);
	err_free_foo:
		kfree(foo);
		return ret;
Ideally you should simulate errors to test all exit paths.
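The two-label pattern can be sketched in user space as follows, with malloc() and free() standing in for kmalloc() and kfree(). The struct foo layout, the function name foo_create(), and the "length below 2 is invalid" check are hypothetical; the check only simulates a later step that can fail after both allocations have succeeded.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical structure, used only for this sketch. */
struct foo {
	char *bar;
	size_t len;
};

/*
 * Allocate a foo and its buffer. Returns 0 on success or a negative
 * errno value on failure; *out is written only on success.
 */
int foo_create(size_t len, struct foo **out)
{
	struct foo *foo;
	int ret;

	foo = malloc(sizeof(*foo));
	if (!foo)
		return -ENOMEM;		/* nothing to clean up yet */

	foo->bar = malloc(len);
	if (!foo->bar) {
		ret = -ENOMEM;
		goto err_free_foo;	/* foo->bar was never allocated */
	}

	if (len < 2) {			/* stand-in for a later failing step */
		ret = -EINVAL;
		goto err_free_bar;
	}

	foo->len = len;
	*out = foo;
	return 0;

err_free_bar:
	free(foo->bar);
err_free_foo:
	free(foo);
	return ret;
}
```

Each label undoes exactly one step, so every exit path releases only what was actually acquired before the failure.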
8) Comments
Comments are good, but there is a danger of over-commenting. Never try to explain how code works in comments: it's better to let others understand it at a glance, and explaining poorly written code is a waste of time.
Generally, you want your comments to tell others what your code does, not how it does it. Also, try to avoid adding comments inside the function body: if the function is so complex that you need to comment parts of it separately, you should probably go back to Chapter 6 and take a look. You can add small comments to note or warn about particularly clever (or bad) practices, but don't add too many. You should put comments at the head of the function to tell people what it does and why it does it.
When commenting kernel API functions, please use the kernel-doc format. See the files in Documentation/doc-guide/ and scripts/kernel-doc for details.
The preferred style for long (multi-line) comments is:
	/*
	 * This is the preferred style for multi-line
	 * comments in the Linux kernel source code.
	 * Please use it consistently.
	 *
	 * Description:  A column of asterisks on the left side,
	 * with beginning and ending almost-blank lines.
	 */
For files in net/ and drivers/net/, the preferred style for long (multi-line) comments is slightly different.
	/* The preferred comment style for files in net/ and drivers/net
	 * looks like this.
	 *
	 * It is nearly the same as the generally preferred comment style,
	 * but there is no initial almost-blank line.
	 */
It is also important to comment data, whether it is of a basic type or a derived type. To this end, use just one data declaration per line (no commas for multiple data declarations). This leaves you room for a small comment on each item, explaining its use.
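As a small illustration of one declaration per line with a comment on each item, consider the sketch below. The structure struct ring_stats, its fields, and the helper ring_free_slots() are hypothetical names chosen for this example. The function comment says what the function does, not how.

```c
#include <stddef.h>

/* Hypothetical structure, used only for illustration. */
struct ring_stats {
	size_t capacity;	/* number of slots in the ring */
	size_t used;		/* slots currently holding data */
	unsigned int drops;	/* writes rejected because the ring was full */
};

/* Return how many slots are still free in the ring. */
size_t ring_free_slots(const struct ring_stats *stats)
{
	return stats->capacity - stats->used;
}
```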
9) You've already messed up.
That's okay, we all do. Your long-time Unix user helper has probably told you that GNU emacs automatically formats C sources for you, and you have noticed that it does, but the defaults it uses are less than desirable (in fact, they are worse than random typing: an infinite number of monkeys typing into GNU emacs would never make a good program).
So you can either get rid of GNU emacs, or change it to use saner values. To do the latter, you can paste the following into your .emacs file:
	(defun c-lineup-arglist-tabs-only (ignored)
	  "Line up argument lists by tabs, not spaces"
	  (let* ((anchor (c-langelem-pos c-syntactic-element))
	         (column (c-langelem-2nd-pos c-syntactic-element))
	         (offset (- (1+ column) anchor))
	         (steps (floor offset c-basic-offset)))
	    (* (max steps 1)
	       c-basic-offset)))

	(dir-locals-set-class-variables
	 'linux-kernel
	 '((c-mode . (
	        (c-basic-offset . 8)
	        (c-label-minimum-indentation . 0)
	        (c-offsets-alist . (
	                (arglist-close         . c-lineup-arglist-tabs-only)
	                (arglist-cont-nonempty .
	                    (c-lineup-gcc-asm-reg c-lineup-arglist-tabs-only))
	                (arglist-intro         . +)
	                (brace-list-intro      . +)
	                (c                     . c-lineup-C-comments)
	                (case-label            . 0)
	                (comment-intro         . c-lineup-comment)
	                (cpp-define-intro      . +)
	                (cpp-macro             . -1000)
	                (cpp-macro-cont        . +)
	                (defun-block-intro     . +)
	                (else-clause           . 0)
	                (func-decl-cont        . +)
	                (inclass               . +)
	                (inher-cont            . c-lineup-multi-inher)
	                (knr-argdecl-intro     . 0)
	                (label                 . -1000)
	                (statement             . 0)
	                (statement-block-intro . +)
	                (statement-case-intro  . +)
	                (statement-cont        . +)
	                (substatement          . +)
	                ))
	        (indent-tabs-mode . t)
	        (show-trailing-whitespace . t)
	        ))))

	(dir-locals-set-directory-class
	 (expand-file-name "~/src/linux-trees")
	 'linux-kernel)
This will make emacs match the kernel coding style better for C files below ~/src/linux-trees.
But even if you fail to get emacs to do sane formatting, not everything is lost: use indent.
Now, again, GNU indent has the same problematic settings that GNU emacs has, which is why you need to give it a few command-line options. However, that is not too bad, because even the makers of GNU indent recognize the authority of K&R (the GNU people are not bad people, they are just severely misguided in this matter), so you just give indent the options -kr -i8 (which stands for "K&R, 8 character indents"), or use scripts/Lindent, which indents in the latest style.
indent has a lot of options, and especially when it comes to comment reformatting you may want to take a look at the man page. But remember: indent is not a fix for bad programming.
Note that you can also use the clang-format tool to help you with these rules, to quickly reformat parts of your code automatically, and to review full files in order to spot coding style mistakes, typos and possible improvements. It is also handy for sorting #includes, aligning variables/macros, reflowing text, and other similar tasks. See the file Documentation/process/clang-format.rst for more details.
10) Kconfig configuration file
For all of the Kconfig* configuration files throughout the source tree, the indentation is somewhat different. Lines under a config definition are indented with one tab, while help text is indented an additional two spaces. Example:
	config AUDIT
		bool "Auditing support"
		depends on NET
		help
		  Enable auditing infrastructure that can be used with another
		  kernel subsystem, such as SELinux (which requires this for
		  logging of avc messages output).  Does not do system-call
		  auditing without CONFIG_AUDITSYSCALL.
Seriously dangerous features (such as write support for certain filesystems) should advertise this prominently in their prompt string:
	config ADFS_FS_RW
		bool "ADFS write support (DANGEROUS)"
		depends on ADFS_FS
		...
For full documentation on the configuration files, see the file Documentation/kbuild/kconfig-language.rst.
11) Data structures
Data structures that are visible outside the single-threaded environment in which they are created and destroyed should always have reference counts. The kernel has no garbage collection (and outside the kernel, garbage collection is slow and inefficient), which means that you absolutely have to reference count all your uses.
Reference counting means that you can avoid locking, and it allows multiple users to access the data structure in parallel without having to worry about the structure suddenly going away from under them just because they slept or did something else for a while.
Note that locking is not a replacement for reference counting. Locking is used to keep data structures coherent, while reference counting is a memory management technique. Usually both are needed, and they are not to be confused with each other.
Many data structures can indeed have two levels of reference counting, when there are users of different classes. The subclass count counts the number of subclass users, and decrements the global count just once when the subclass count goes to zero.
Examples of this kind of multi-level reference counting can be found in memory management (struct mm_struct: mm_users and mm_count) and in filesystem code (struct super_block: s_count and s_active).
Remember: if another thread can find your data structure, and you don't have a reference count for it, there's almost certainly a bug.
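The idea can be sketched in user space with C11 atomics. This is only an illustration under stated assumptions: struct session and the session_new(), session_get() and session_put() helpers are hypothetical names, and kernel code would normally rely on the kernel's own reference counting helpers rather than an open-coded counter like this one.

```c
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical object, used only for this sketch. */
struct session {
	atomic_int refcount;	/* number of users holding a reference */
	int id;			/* example payload */
};

/* Create a session; the caller owns the first reference. */
struct session *session_new(int id)
{
	struct session *s = malloc(sizeof(*s));

	if (!s)
		return NULL;
	atomic_init(&s->refcount, 1);
	s->id = id;
	return s;
}

/* Take an additional reference on behalf of a new user. */
void session_get(struct session *s)
{
	atomic_fetch_add(&s->refcount, 1);
}

/* Drop one reference; returns 1 if this call freed the object, else 0. */
int session_put(struct session *s)
{
	if (atomic_fetch_sub(&s->refcount, 1) == 1) {
		free(s);
		return 1;
	}
	return 0;
}
```

A user that may sleep calls session_get() first and session_put() when done, so the object cannot disappear in between, regardless of what other users do.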
12) Macros, enumerations and RTL
Names of macros defining constants and labels in enums are capitalized.
	#define CONSTANT 0x12345
Enums are preferred when defining several related constants.
CAPITALIZED macro names are appreciated, but macros resembling functions may be named in lower case.
Generally, inline functions are preferable to macros resembling functions.
Macros with multiple statements should be enclosed in a do - while block:
	#define macrofun(a, b, c)			\
		do {					\
			if (a == 5)			\
				do_this(b, c);		\
		} while (0)
Things to avoid when using macros:
1. Macros that affect control flow:
	#define FOO(x)					\
		do {					\
			if (blah(x) < 0)		\
				return -EBUGGERED;	\
		} while (0)
This is a very bad idea. It looks like a function call but exits the calling function; don't break the internal parsers of those who will read the code.
2. Macros that depend on having a local variable with a magic name:
	#define FOO(val) bar(index, val)
This might look like a good thing, but it is confusing when one reads the code, and it is prone to breakage from seemingly innocent changes.
3. Macros with arguments that are used as l-values: FOO(x) = y; will bite you if somebody turns FOO into an inline function.
4. Forgetting about precedence: macros defining constants using expressions must enclose the expression in parentheses. Beware of similar issues with macros using parameters.
	#define CONSTANT 0x4000
	#define CONSTEXP (CONSTANT | 3)
5. Namespace collisions when defining local variables in macros resembling functions:
	#define FOO(x)				\
	({					\
		typeof(x) ret;			\
		ret = calc_ret(x);		\
		(ret);				\
	})
ret is a common name for a local variable - __foo_ret is less likely to collide with an existing variable.
The cpp manual deals with macros exhaustively. The gcc internals manual also covers RTL, which is used frequently with assembly language in the kernel.
Chen Xiaosong's Note: RTL stands for register transfer language, an intermediate language used in compilers.
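The precedence problem from item 4 can be demonstrated with a short sketch. All macro and function names below (BAD_TOTAL, TOTAL, BAD_DOUBLE, DOUBLE and the four helpers) are hypothetical and chosen only for this illustration; the unparenthesized variants are deliberately wrong.

```c
/* Deliberately broken: the expression is not parenthesized. */
#define BAD_TOTAL	2 + 3
/* Correct: the whole expression is enclosed in parentheses. */
#define TOTAL		(2 + 3)

/* Deliberately broken: neither the parameter nor the result is protected. */
#define BAD_DOUBLE(x)	x * 2
/* Correct: both the parameter and the result are parenthesized. */
#define DOUBLE(x)	((x) * 2)

int bad_total_times_two(void)
{
	return BAD_TOTAL * 2;		/* expands to 2 + 3 * 2 */
}

int total_times_two(void)
{
	return TOTAL * 2;		/* expands to (2 + 3) * 2 */
}

int bad_double_sum(int a, int b)
{
	return BAD_DOUBLE(a + b);	/* expands to a + b * 2 */
}

int double_sum(int a, int b)
{
	return DOUBLE(a + b);		/* expands to ((a + b) * 2) */
}
```

Because macro expansion is textual, the broken variants silently compute something different from what the name suggests; an inline function would not have this problem at all.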
13) Print kernel messages
Kernel developers like to be seen as literate. Mind the spelling of kernel messages to make a good impression. Do not use incorrect contractions such as dont; use do not or don't instead. Make the messages concise, clear, and unambiguous.
Kernel messages do not have to be terminated with a period.
Printing numbers in parentheses (%d) adds no value and should be avoided.
There are a number of driver model diagnostic macros in <linux/device.h> that you should use to make sure messages are matched to the right device and driver and are tagged with the right level: dev_err(), dev_warn(), dev_info(), and so forth. For messages that are not associated with a particular device, <linux/printk.h> defines pr_notice(), pr_info(), pr_warn(), pr_err(), etc.
Coming up with good debugging messages can be quite a challenge, and once you have them, they can be a huge help for remote troubleshooting. However, debug messages are printed differently from other, non-debug messages. While the other pr_XXX() functions print unconditionally, pr_debug() does not; it is compiled out by default, unless either DEBUG is defined or CONFIG_DYNAMIC_DEBUG is set. The same is true for dev_dbg(), and a related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to the ones already enabled by DEBUG.
Many subsystems have Kconfig debug options that turn on -DDEBUG in the corresponding Makefile; in other cases, specific files #define DEBUG. When a debug message should be printed unconditionally, for example because it is already inside a debug-related #ifdef section, printk(KERN_DEBUG ...) can be used.
14) Allocate memory
The kernel provides the following general-purpose memory allocators: kmalloc(), kzalloc(), kmalloc_array(), kcalloc(), vmalloc(), and vzalloc(). Please refer to the API documentation for further information about them: Documentation/core-api/memory-allocation.rst.
The preferred form for passing the size of a struct is the following:
	p = kmalloc(sizeof(*p), ...);
The alternative form, where the struct name is spelled out as the operand of sizeof, hurts readability and introduces an opportunity for a bug when the pointer variable type is changed but the corresponding sizeof that is passed to the memory allocator is not.
Chen Xiaosong's Note: The following situation may occur:
	int *p; /* originally char *p, later changed to int *p */
	p = kmalloc(sizeof(char), ...);
Casting the return value, which is a void pointer, is redundant. The C language guarantees the conversion from a void pointer to any other pointer type.
The preferred form for allocating an array is the following:
	p = kmalloc_array(n, sizeof(...), ...);
The preferred form for allocating a zeroed array is the following:
	p = kcalloc(n, sizeof(...), ...);
Both forms check for overflow of the allocation size n * sizeof(...) and return NULL if it occurs.
These generic allocation functions all emit a stack dump on failure when used without __GFP_NOWARN, so there is no use in emitting an additional failure message when NULL is returned.
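The sizeof(*p) form and the overflow check can be sketched in user space as follows. This is not the kernel implementation: alloc_array() is a hypothetical analogue of kmalloc_array() built on malloc(), and struct point and alloc_points() are names invented for the example.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical user-space analogue of kmalloc_array(). */
void *alloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would overflow */
	return malloc(n * size);
}

/* Hypothetical element type for the example. */
struct point {
	int x;
	int y;
};

/* Allocate n points; no cast is needed on the void pointer. */
struct point *alloc_points(size_t n)
{
	struct point *p;

	/* sizeof(*p) stays correct even if the type of p changes later. */
	p = alloc_array(n, sizeof(*p));
	return p;
}
```

If p were later changed to point at a different structure, the allocation size would follow automatically, which is exactly the bug the spelled-out form invites.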
15) Inline Disadvantages
There appears to be a common misperception that gcc has a magic "make me faster" speedup option called inline. While the use of inlines can be appropriate (for example, as a means of replacing macros, see Chapter 12), it very often is not. Abundant use of the inline keyword leads to a much bigger kernel, which in turn slows the system as a whole down, due to a bigger icache footprint for the CPU and simply because there is less memory available for the pagecache. Just think about it: a pagecache miss causes a disk seek, which easily takes 5 milliseconds. There are a lot of CPU cycles that can go into those 5 milliseconds.
A reasonable rule of thumb is not to inline functions that have more than 3 lines of code in them. An exception to this rule is the case where a parameter is known to be a compile-time constant, and as a result of this constness you know the compiler will be able to optimize most of your function away at compile time. A good example of this latter case is the kmalloc() inline function.
People often argue that adding inline to functions that are static and used only once is always a win, since there is no space tradeoff. While this is technically correct, gcc is capable of inlining these automatically without help, and the maintenance issue of removing the inline when a second user appears outweighs the potential value of a hint that tells gcc to do something it would have done anyway.
16) Function return value and naming
Functions can return many different types of values, one of the most common is a value that indicates whether the function succeeded or failed. Such a value can be represented as an error code integer (-Exxx = failure, 0 = success) or a boolean value of success (0 = failure, non-zero = success).
Mixing these two expressions is a source of hard-to-find bugs. If C could strictly distinguish between integers and booleans, the compiler would find these errors for us...but C doesn't. To prevent such errors, always follow the following conventions:
If the function's name is an action or mandatory command, the function should return an integer error code. If it is a judgment, the function should return a Boolean value indicating whether it was "successful" or not.
For example, add work
it is a command that add_work()
returns when the function succeeds 0
and returns when it fails -EBUSY
. Again, it is a judgment, the function will return PCI device present
if a matching device is successfully found , otherwise it will return .pci_dev_present()
1
0
All EXPORT
functions must adhere to this convention, and all public functions should adhere to this convention. Private( static
) functions are not required but recommended.
Functions whose return value is the actual result of a computation, rather than an indication of whether the computation succeeded, are not subject to this rule. Generally they indicate failure by returning some out-of-range result. Typical examples are functions that return pointers; they use NULL or the ERR_PTR mechanism to report failure.
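To show how an out-of-range pointer can carry an error code, here is a simplified user-space imitation of the ERR_PTR mechanism. The three helpers below are stand-ins written for this sketch, not the kernel's own definitions, and lookup_name() is a hypothetical function.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_ERRNO 4095	/* error codes occupy the top 4095 addresses */

/* Encode a negative error code as a pointer value. */
static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

/* Decode the error code from such a pointer. */
static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

/* True if the pointer lies in the range reserved for error codes. */
static inline bool IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical lookup: returns a valid pointer or an encoded error. */
static const char *lookup_name(int id)
{
	static const char *const names[] = { "zero", "one" };

	if (id < 0)
		return ERR_PTR(-EINVAL);
	if ((size_t)id >= sizeof(names) / sizeof(names[0]))
		return ERR_PTR(-ENOENT);
	return names[id];
}
```

A caller checks IS_ERR() on the result before dereferencing it, and uses PTR_ERR() to recover the error code to propagate.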
17) Using bool
The Linux kernel bool type is an alias for the C99 _Bool type. bool values can only evaluate to 0 or 1, and implicit or explicit conversion to bool automatically converts the value to true or false. When using bool types the !! construction is not needed, which eliminates a class of bugs.
When working with bool values, the true and false definitions should be used instead of 1 and 0.
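The class of bugs that bool removes can be seen in a small user-space sketch. FLAG_READY and both functions are hypothetical; the point is only that conversion to bool normalizes a value, while conversion to a narrow integer truncates it.

```c
#include <stdbool.h>

#define FLAG_READY 0x100	/* hypothetical flag bit above the low byte */

/* Conversion to bool turns any non-zero value into true; no "!!" needed. */
static bool is_ready(unsigned int flags)
{
	return flags & FLAG_READY;
}

/*
 * An 8-bit integer keeps only the low byte, so the set bit is lost.
 * This version would need "!!(flags & FLAG_READY)" to be correct.
 */
static unsigned char is_ready_truncated(unsigned int flags)
{
	return flags & FLAG_READY;
}
```

With flags equal to 0x100, is_ready() reports true, whereas is_ready_truncated() silently reports 0 on platforms with 8-bit bytes.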
bool function return types and stack variables are always fine to use whenever appropriate. Use of bool is encouraged to improve readability and is often a better option than int for storing boolean values.
Do not use bool if cache line layout or the size of the value matters, as its size and alignment vary depending on the architecture the code is compiled for. Structures that are optimized for alignment and size should not use bool.
If a structure has many true/false values, consider consolidating them into a bitfield with 1-bit members, or using an appropriate fixed-width type such as u8.
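Both alternatives might look like the sketch below. The structures and DEV_* names are hypothetical, and the u8 typedef is a user-space stand-in for the kernel type of the same name.

```c
#include <stdint.h>

typedef uint8_t u8;	/* user-space stand-in for the kernel's u8 */

/* Hypothetical device state: three true/false values as 1-bit members. */
struct dev_state {
	unsigned int enabled:1;
	unsigned int suspended:1;
	unsigned int dirty:1;
};

/* Alternative: one fixed-width byte holding the same three values. */
#define DEV_ENABLED	(1u << 0)
#define DEV_SUSPENDED	(1u << 1)
#define DEV_DIRTY	(1u << 2)

struct dev_state_packed {
	u8 state;
};
```

The bitfield form reads like ordinary member access (s.dirty = 1), while the fixed-width form gives full control over size and layout at the cost of explicit masking.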
Similarly, for function arguments, many true/false values can be consolidated into a single bitwise "flags" argument, and "flags" is often a more readable alternative if the call sites have naked true/false constants.
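The readability difference at the call site can be illustrated with two hypothetical functions that accept the same options, once as naked booleans and once as a flags word. All names here are invented for the sketch.

```c
#include <stdbool.h>

#define XFER_NONBLOCK	(1u << 0)
#define XFER_SYNC	(1u << 1)
#define XFER_VERIFY	(1u << 2)

/* Opaque at the call site: start_xfer_bools(true, false, true) */
static unsigned int start_xfer_bools(bool nonblock, bool sync, bool verify)
{
	return (nonblock ? XFER_NONBLOCK : 0) |
	       (sync ? XFER_SYNC : 0) |
	       (verify ? XFER_VERIFY : 0);
}

/* Self-describing at the call site: start_xfer(XFER_NONBLOCK | XFER_VERIFY) */
static unsigned int start_xfer(unsigned int flags)
{
	return flags & (XFER_NONBLOCK | XFER_SYNC | XFER_VERIFY);
}
```

Both calls in the comments request the same options, but only the second one tells the reader which ones without looking up the prototype.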
Otherwise, limited use of bool in structures and arguments can improve readability.
18) Don’t reinvent kernel macros
The header file include/linux/kernel.h contains a number of macros that you should use, rather than explicitly coding some variant of them yourself. For example, if you need to calculate the length of an array, take advantage of the macro:
	#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
Similarly, if you need to calculate the size of some structure member, use:
	#define sizeof_field(t, f) (sizeof(((t*)0)->f))
There are also min() and max() macros that do strict type checking if you need them. Feel free to look through that header file to see what else is already defined; if a macro exists there, do not redefine it in your own code.
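A short user-space sketch of the two macros in use follows. The definitions are repeated here only so the example builds outside the kernel tree; struct packet, primes and sum_primes() are hypothetical.

```c
#include <stddef.h>

/* Repeated here only so the sketch builds outside the kernel tree. */
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
#define sizeof_field(t, f) (sizeof(((t *)0)->f))

struct packet {
	unsigned char header[4];
	char name[16];
};

static const int primes[] = { 2, 3, 5, 7, 11 };

/* The loop bound follows the array automatically if entries are added. */
static int sum_primes(void)
{
	size_t i;
	int sum = 0;

	for (i = 0; i < ARRAY_SIZE(primes); i++)
		sum += primes[i];
	return sum;
}
```

Here sizeof_field(struct packet, name) gives the size of the name member without needing an instance of the structure.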
19) Editor modelines and other cruft
Some editors can interpret configuration information embedded in source files, indicated with special markers. For example, emacs interprets lines marked like this:
	-*- mode: c -*-
Or like this:
	/*
	Local Variables:
	compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
	End:
	*/
Vim interprets markers that look like this:
	/* vim:set sw=8 noet */
Do not include any of these in source files. People have their own personal editor configurations, and your source files should not override them. This includes markers for indentation and mode configuration. People may use their own custom mode, or may have some other clever method for making indentation work correctly.
20) Inline assembly
In architecture-specific code, you may need to use inline assembly to interface with CPU or platform functionality. Don't hesitate to do so when necessary. However, don't use inline assembly gratuitously when C can do the job. You can and should poke hardware from C when possible.
Consider writing simple helper functions that wrap common bits of inline assembly, rather than repeatedly writing them with slight variations. Remember that inline assembly can take C parameters.
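As a minimal sketch of such a wrapper, the hypothetical helper below hides an asm statement behind an ordinary C function and passes a C variable into it. It assumes a GCC-compatible compiler; the asm body is deliberately empty so the sketch is not tied to any one architecture.

```c
/*
 * Hypothetical helper: an empty asm statement that claims to read and
 * write "val" in a register. The compiler must therefore treat the value
 * as unknown afterwards, although no instruction is actually emitted.
 *
 * "__asm__" is used instead of "asm" so the sketch also builds in strict
 * ISO C modes; kernel code normally writes "asm".
 */
static inline unsigned long hide_from_optimizer(unsigned long val)
{
	__asm__ ("" : "+r" (val));
	return val;
}
```

Callers use hide_from_optimizer(x) like any other function, and the constraint syntax lives in exactly one place.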
Large, non-trivial assembly functions should go in .S files, with corresponding C prototypes defined in C header files. The C prototypes for assembly functions should use asmlinkage.
You may need to mark your asm statement as volatile, to prevent GCC from removing it if GCC doesn't notice any side effects. You don't always need to do so, though, and doing so unnecessarily can limit optimization.
When writing a single inline assembly statement containing multiple instructions, put each instruction on a separate line in a separate quoted string, and end each string except the last with \n\t to properly indent the next instruction in the assembly output:
	asm ("magic %reg1, #42\n\t"
	     "more_magic %reg2, %reg3"
	     : /* outputs */ : /* inputs */ : /* clobbers */);
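For a concrete instance of that layout, the hypothetical function below uses two real instructions on x86-64 and falls back to plain C elsewhere. It exists purely to show the string formatting; real code should simply write this arithmetic in C.

```c
/*
 * Hypothetical example: computes (a + b) * 2.
 * On x86-64 it uses a two-instruction asm statement (AT&T syntax, as
 * expected by GCC-compatible compilers); other targets use plain C.
 */
static inline unsigned long add_then_double(unsigned long a, unsigned long b)
{
#if defined(__x86_64__)
	__asm__ ("add %1, %0\n\t"	/* a += b */
		 "add %0, %0"		/* a += a */
		 : "+r" (a)
		 : "r" (b)
		 : "cc");
	return a;
#else
	return (a + b) * 2;
#endif
}
```

Each instruction sits in its own string, and only the first ends with \n\t, so the generated assembly lists one correctly indented instruction per line.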
21) Conditional compilation
Wherever possible, don't use preprocessor conditionals (#if, #ifdef) in .c files; doing so makes code harder to read and logic harder to follow. Instead, use such conditionals in a header file defining functions for use in those .c files, providing no-op stub versions in the #else case, and then call those functions unconditionally from the .c files. The compiler will avoid generating any code for the stub calls, producing identical results, but the logic will remain easy to follow.
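The header-plus-stub pattern might look like the following sketch. CONFIG_FOO_STATS, struct foo_dev and the foo_* functions are hypothetical, and the header and .c parts are shown in one listing only for brevity.

```c
/* ---- would live in a header, e.g. a hypothetical foo_stats.h ---- */
struct foo_dev {
	unsigned long packets;
	unsigned long stat_updates;
};

#ifdef CONFIG_FOO_STATS
static inline void foo_stats_update(struct foo_dev *dev)
{
	dev->stat_updates++;
}
#else
/* No-op stub: calls to it compile away to nothing. */
static inline void foo_stats_update(struct foo_dev *dev)
{
	(void)dev;
}
#endif

/* ---- would live in the .c file: no preprocessor conditionals ---- */
static void foo_receive(struct foo_dev *dev)
{
	dev->packets++;
	foo_stats_update(dev);
}
```

foo_receive() reads the same in every configuration; only the header knows whether statistics are compiled in.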
Prefer to compile out entire functions, rather than portions of functions or portions of expressions. Rather than putting an ifdef in an expression, factor out part or all of the expression into a separate helper function and apply the conditional to that function.
If you have a function or variable which may go unused in a particular configuration, and the compiler would warn about its definition going unused, mark the definition as __maybe_unused rather than wrapping it in a preprocessor conditional. (However, if a function or variable always goes unused, delete it.)
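A sketch of that situation, assuming a GCC-compatible compiler: the function below is referenced only when the hypothetical CONFIG_FOO_PM option is set, so in other configurations it would otherwise trigger an unused-function warning. The __maybe_unused definition is a user-space stand-in; the kernel provides its own. All foo_* names are invented.

```c
/* User-space stand-in; the kernel supplies __maybe_unused itself. */
#ifndef __maybe_unused
#define __maybe_unused __attribute__((__unused__))
#endif

typedef int (*foo_hook_t)(int);

/* The hook macro drops the reference when the option is disabled. */
#ifdef CONFIG_FOO_PM
#define FOO_PM_HOOK(fn) (fn)
#else
#define FOO_PM_HOOK(fn) ((foo_hook_t)0)
#endif

/* Defined unconditionally; unused (without a warning) if PM is off. */
static int __maybe_unused foo_suspend(int state)
{
	return state + 1;
}

static const foo_hook_t foo_suspend_hook = FOO_PM_HOOK(foo_suspend);
```

The function body is still parsed and type-checked in every configuration, which an #ifdef around the definition would not give you.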
Within code, where possible, use the IS_ENABLED macro to convert a Kconfig symbol into a C boolean expression, and use it in a normal C conditional:
	if (IS_ENABLED(CONFIG_SOMETHING)) {
		...
	}
The compiler will constant-fold the conditional away, and include or exclude the block of code just as with an #ifdef, so this will not add any runtime overhead. However, this approach still allows the C compiler to see the code inside the block and check it for correctness (syntax, types, symbol references, etc.). Thus, you still have to use an #ifdef if the code inside the block references symbols that will not exist if the condition is not met.
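To experiment with this outside the kernel, a simplified stand-in for IS_ENABLED can be built from the same preprocessor trick. This is a sketch under stated assumptions: the kernel's real macro lives in include/linux/kconfig.h and also covers options built as modules, which this version ignores. CONFIG_FOO_CHECKSUM, CONFIG_FOO_TRACE and foo_features() are hypothetical.

```c
/*
 * Simplified stand-in: expands to 1 if "option" is defined to 1,
 * and to 0 if it is not defined at all.
 */
#define __ARG_PLACEHOLDER_1 0,
#define __take_second_arg(__ignored, val, ...) val
#define __is_defined(x) ___is_defined(x)
#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val)
#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0)
#define IS_ENABLED(option) __is_defined(option)

#define CONFIG_FOO_CHECKSUM 1	/* hypothetical option, enabled */
/* CONFIG_FOO_TRACE is hypothetical too and deliberately left undefined. */

static int foo_features(void)
{
	int features = 0;

	/* Both branches are compiled and checked; dead ones are folded away. */
	if (IS_ENABLED(CONFIG_FOO_CHECKSUM))
		features |= 1;
	if (IS_ENABLED(CONFIG_FOO_TRACE))
		features |= 2;
	return features;
}
```

Note that the disabled branch still has to compile, which is exactly the checking benefit described above.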
At the end of any non-trivial #if or #ifdef block (more than a few lines), place a comment after the #endif on the same line, noting the conditional expression used. For instance:
	#ifdef CONFIG_SOMETHING
	...
	#endif /* CONFIG_SOMETHING */
Appendix I) References
The C Programming Language, Second Edition by Brian W. Kernighan and Dennis M. Ritchie. Prentice Hall, Inc., 1988. ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).
The Practice of Programming by Brian W. Kernighan and Rob Pike. Addison-Wesley, Inc., 1999. ISBN 0-201-61586-X.
GNU manuals - where in compliance with K&R and this text - for cpp, gcc, gcc internals and indent, all available from https://www.gnu.org/manual/
WG14 is the international standardization working group for the programming language C, URL: http://www.open-std.org/JTC1/SC22/WG14/
Kernel process/coding-style.rst, by [email protected] at OLS 2002: http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/