Linux kernel source code analysis 2: Linux kernel version number and source code directory structure

One, Linux versions

1. Stable and development versions

The Linux kernel has traditionally been released in two kinds of versions:

  • Stable version: a stable kernel is production-grade and can be widely used and deployed. Each newly released stable kernel mostly fixes bugs or adds new device drivers. Long-term support (LTS) kernels are stable kernels that continue to be maintained for several years.
  • Development version: kernel developers constantly try out new solutions in the development kernel, so its code changes very quickly.

Linux distinguishes these two kinds of versions by their version numbers.

2. Linux version number

A Linux version number consists mainly of three groups of digits, xxx.yyy.zzz:

  • xxx: major version number (Major Release)
  • yyy: minor version number (Minor Release)
  • zzz: revision number (Revision Count)

Where:

  • New drivers are added and bugs are fixed between minor versions, for example 3.21.x and 3.22.x
  • Once enough minor versions have accumulated, a new major version is released, for example 5.x.x and 6.x.x
  • After each minor version is released, users may find bugs in it, so revised versions of that minor version are released, for example 5.6.13 and 5.6.14

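In the older kernel series (1.x and 2.x), the stable version and the development version were distinguished by the minor version number:

  • An even minor version number indicates a stable kernel
  • An odd minor version number indicates a development (beta) kernel

This even/odd convention was dropped after the 2.6 series. In current kernels every mainline release is intended to be stable, and development happens in release candidates (-rc).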

For example, reading Linux 3.6.34 under this scheme:

  • Major version number: 3
  • Minor version number: 6, an even number, so a stable version
  • Revision number: 34, meaning this is revision 34 of Linux 3.6

In practice, the major and minor version numbers together are enough to identify a kernel version, because later revisions are bug fixes and do not introduce new features.
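The version layout described above can be sketched in a few lines of Python. This is only an illustration: the names KernelVersion, parse_kernel_version and is_stable_by_parity are made up for this example, and the parity check applies the even/odd rule exactly as stated above, without consulting any real release list.

```python
from typing import NamedTuple


class KernelVersion(NamedTuple):
    major: int
    minor: int
    revision: int


def parse_kernel_version(text: str) -> KernelVersion:
    """Split an 'xxx.yyy.zzz' string into its three numeric fields."""
    # Drop any suffix such as '-generic' or '-rc1' before splitting.
    core = text.split("-", 1)[0]
    parts = core.split(".")
    if len(parts) != 3:
        raise ValueError(f"expected major.minor.revision, got {text!r}")
    major, minor, revision = (int(part) for part in parts)
    return KernelVersion(major, minor, revision)


def is_stable_by_parity(version: KernelVersion) -> bool:
    """Apply the even/odd rule: an even minor number means stable."""
    return version.minor % 2 == 0


version = parse_kernel_version("3.6.34")
label = "stable" if is_stable_by_parity(version) else "development"
print(version, label)
```

On a running system, the string to parse would typically come from `uname -r`, which usually carries a distribution suffix; the sketch strips everything after the first hyphen for that reason.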

Two, Linux kernel source code organization structure

The Linux kernel source code is huge. As of 2020 the source tree already contained about 27.8 million lines spread across 66,492 files, the bulk of it C code. The source files are stored in different directories according to a fixed organizational structure, and the code within each directory is logically related.
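Figures like these can be reproduced approximately with a small script. The sketch below assumes that "C files" means files ending in .c or .h, which is a choice made for this example, and the function name count_c_sources is hypothetical. It demonstrates itself on a tiny temporary tree rather than on a real kernel checkout.

```python
import os
import tempfile
from typing import Tuple


def count_c_sources(root: str) -> Tuple[int, int]:
    """Return (number of .c/.h files, total lines in them) under root."""
    files = 0
    lines = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith((".c", ".h")):
                files += 1
                path = os.path.join(dirpath, name)
                # Read as bytes so unusual encodings cannot raise errors.
                with open(path, "rb") as handle:
                    lines += sum(1 for _ in handle)
    return files, lines


# Demonstrate on a throwaway tree with one C file and one other file.
with tempfile.TemporaryDirectory() as tree:
    os.makedirs(os.path.join(tree, "mm"))
    with open(os.path.join(tree, "mm", "page.c"), "w") as handle:
        handle.write("int pages;\nint frames;\n")
    with open(os.path.join(tree, "README"), "w") as handle:
        handle.write("not counted\n")
    print(count_c_sources(tree))
```

Pointing count_c_sources at the root of an unpacked kernel tree gives the per-tree totals; the exact numbers depend on the kernel version and on which file types are counted.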

In addition to directories, there are also some files, which have special roles and purposes.

1. Directories

The root of the Linux source tree contains a series of directories, each holding many C source files. The files are grouped into directories according to their logical function.

A. arch directory

The arch directory holds architecture-specific code, which shields the rest of the kernel from the differences between systems and platforms. For example:

  • On the RISC-V architecture, enabling virtual address translation on the CPU requires reading and writing the satp and mstatus registers
  • On x86, enabling virtual address translation on the CPU requires reading and writing the cr3 and cr0 registers

In Linux, the function that loads a process's page tables and so switches the address space used for translation is switch_mm. Because each architecture needs different operations to do this, switch_mm is implemented in the arch directory.

In fact, there are many functions like switch_mm, such as the interrupt handling functions. Linux therefore places all code that depends on a particular architecture in the arch directory.
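The idea of "one function name, one implementation per architecture" can be modelled as a lookup table. The following is a toy model in Python, not kernel code: the helper names and the returned list of register writes are invented for illustration, and the real switch_mm functions do considerably more than write a single register.

```python
from typing import Callable, Dict, List, Tuple

RegisterWrite = Tuple[str, int]


def _switch_mm_riscv(page_table_root: int) -> List[RegisterWrite]:
    # Toy model: RISC-V selects the active page table through satp.
    return [("satp", page_table_root)]


def _switch_mm_x86(page_table_root: int) -> List[RegisterWrite]:
    # Toy model: x86 selects the active page table through cr3.
    return [("cr3", page_table_root)]


# Plays the role of the arch directory: one entry per supported architecture.
ARCH_SWITCH_MM: Dict[str, Callable[[int], List[RegisterWrite]]] = {
    "riscv": _switch_mm_riscv,
    "x86": _switch_mm_x86,
}


def switch_mm(arch: str, page_table_root: int) -> List[RegisterWrite]:
    """Architecture-independent callers use one name for every platform."""
    if arch not in ARCH_SWITCH_MM:
        raise ValueError(f"no arch support for {arch!r}")
    return ARCH_SWITCH_MM[arch](page_table_root)


print(switch_mm("riscv", 0x80000))
print(switch_mm("x86", 0x80000))
```

In the kernel the selection happens at build time (only one architecture's code is compiled in), whereas this sketch selects at run time purely to keep the example short.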

B. block directory

The block directory holds the Linux kernel's generic block device code (the block layer). A block device stores information in fixed-size blocks, each with its own address, so the CPU can read data of a given length from any location on the device by specifying an address.

A block device is a class of I/O device, that is, a device that stores data, as opposed to a computing device such as a CPU or graphics card. Examples are hard disks, USB flash drives and SD cards.

Note that the root of the kernel source tree also has a drivers directory, described later, which holds the drivers for all kinds of devices. Logically a block device driver is also a driver, so why is block split out instead of being placed under drivers?

The reason is that drivers holds the drivers for specific devices, for example:

  • drivers/cdrom holds the CD-ROM drivers
  • drivers/usb holds the drivers for USB devices, such as USB flash drives

Both CD-ROM drives and USB flash drives are block devices, so the block-level operations they offer, such as reading a block and writing a block, are the same, but how a block is actually read differs from device to device.

Therefore the block directory holds the generic block device code, that is, the common read-block and write-block logic, while the device-specific functions that actually read a block live under drivers/<device>. This is why Linux keeps block as a separate directory in the root of the tree.
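The split between generic block code and device-specific drivers can be sketched as an abstract interface plus one concrete driver. Everything here is hypothetical (BlockDriver, RamDiskDriver, read_bytes, and the 512-byte block size are assumptions for this example) and it models only the division of labour, not the kernel's actual block layer API.

```python
from abc import ABC, abstractmethod

BLOCK_SIZE = 512  # assumed block size for this sketch


class BlockDriver(ABC):
    """What the generic layer expects from every device-specific driver."""

    @abstractmethod
    def read_block(self, index: int) -> bytes: ...

    @abstractmethod
    def write_block(self, index: int, data: bytes) -> None: ...


class RamDiskDriver(BlockDriver):
    """A stand-in for a driver under drivers/<device>: storage is a bytearray."""

    def __init__(self, blocks: int) -> None:
        self._storage = bytearray(blocks * BLOCK_SIZE)

    def read_block(self, index: int) -> bytes:
        start = index * BLOCK_SIZE
        return bytes(self._storage[start:start + BLOCK_SIZE])

    def write_block(self, index: int, data: bytes) -> None:
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block driver only accepts whole blocks")
        start = index * BLOCK_SIZE
        self._storage[start:start + BLOCK_SIZE] = data


def read_bytes(driver: BlockDriver, offset: int, length: int) -> bytes:
    """Generic code: serve any byte range using only whole-block reads."""
    if length <= 0:
        return b""
    first = offset // BLOCK_SIZE
    last = (offset + length - 1) // BLOCK_SIZE
    data = b"".join(driver.read_block(i) for i in range(first, last + 1))
    skip = offset % BLOCK_SIZE
    return data[skip:skip + length]


disk = RamDiskDriver(blocks=4)
disk.write_block(0, b"A" * BLOCK_SIZE)
disk.write_block(1, b"B" * BLOCK_SIZE)
print(read_bytes(disk, 510, 4))  # a range that straddles two blocks
```

The point of the design is that read_bytes never needs to change when a new kind of device is added; only a new BlockDriver subclass is written.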

In terms of implementation, the ipc and net directories follow the same idea as block: they hold only the generic inter-process communication code and the generic networking code.

C. certs directory

The certs directory holds the Linux code related to authentication and signing. It contains pre-installed digital certificates that can be used to verify signed modules, kernel code, user-space applications and so on. These certificates help ensure that such components come from trusted sources, which improves system security.

This directory is needed because the Linux kernel has supported dynamically loadable kernel modules since version 2.2.

Kernel modules

Without modules, all driver, file system and network protocol code has to be compiled into the kernel itself, so the kernel ends up full of drivers for every kind of device. Your computer's hardware may include only a dozen or so kinds of devices, yet the running kernel carries more than 400 device drivers.

For this reason the concept of the kernel module was introduced after version 2.2: file systems, drivers and similar code are separated from the kernel binary and built into standalone .ko files at compile time. Whichever piece is needed later can be made available simply by loading the corresponding .ko file.

For example, suppose we are file system and storage researchers who have improved on the shortcomings of the ext file system and designed our own file system, Iron. To test it, we can write Iron as a kernel module, and the running Linux kernel can then load our Iron kernel module into memory.

Kernel modules can be dynamically loaded and unloaded while the system is running, so they need not be loaded at boot and can instead be loaded on demand. Dynamic loading greatly improves the flexibility and extensibility of the system, but it also introduces potential security risks.

For example, an attacker can add malicious code (stealing sensitive information, mounting denial-of-service attacks, escalating privileges) to the code of an otherwise normal kernel module, disguise the result as a legitimate kernel module, and replace the original .ko file with the compiled malicious one. When the user loads this malicious kernel module, it starts to damage the system.

For this reason the components that make up Linux, including kernel modules, need to be verified, so that an open source, audited Linux kernel cannot have malicious code inserted into it or load a malicious kernel module.
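As a small illustration of module verification: the kernel's module signing facility appends the signature to the end of the module file and terminates it with a fixed marker string. The sketch below only checks for that marker and applies a toy load policy. It does not verify the cryptographic signature itself, and the function names and the require_signature flag are hypothetical.

```python
# Marker that the kernel's module signing facility places at the very end
# of a signed module file.
MODULE_SIG_STRING = b"~Module signature appended~\n"


def has_appended_signature(module_bytes: bytes) -> bool:
    """Return True if the module image ends with the signature marker."""
    return module_bytes.endswith(MODULE_SIG_STRING)


def may_load(module_bytes: bytes, require_signature: bool) -> bool:
    """Toy policy: refuse unsigned modules when signatures are required.

    A real check would go on to validate the signature against the
    trusted certificates; that step is deliberately left out here.
    """
    return has_appended_signature(module_bytes) or not require_signature


unsigned = b"\x7fELF...module body..."
signed = unsigned + b"<signature bytes>" + MODULE_SIG_STRING

print(may_load(unsigned, require_signature=True))
print(may_load(signed, require_signature=True))
```

A marker check alone offers no protection, since an attacker can append the same string; it is the certificate-based signature verification that makes a tampered module detectable.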

D. crypto directory

The crypto directory holds the compression and encryption algorithms commonly used by the Linux kernel.

1. Encryption algorithms in the kernel

The Linux kernel uses encryption algorithms in many places. Essentially every security feature in a Linux system relies on them, for example encrypted file systems, encrypted network transmission, digital signatures, secure login and anything else where secrets such as passwords must be stored.

The crypto directory therefore contains implementations of commonly used cryptographic algorithms, including AES, DES, SHA1, SHA256 and so on.

In addition, because encryption and decryption are used so frequently, the kernel's crypto framework also supports drivers for hardware that accelerates them, such as hardware AES accelerators (in current source trees these drivers are kept under drivers/crypto). Using such hardware through these drivers improves the performance of kernel encryption and decryption.
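To get a feel for two of the hash algorithms named above, the following sketch computes SHA1 and SHA256 digests with Python's standard hashlib module. This is a user-space library, not the kernel's crypto code; it is used here only because the algorithms are the same. The helper name hex_digest is made up for this example.

```python
import hashlib


def hex_digest(algorithm: str, data: bytes) -> str:
    """Hash data with the named algorithm and return the hex digest."""
    return hashlib.new(algorithm, data).hexdigest()


message = b"kernel module contents"
for name in ("sha1", "sha256"):
    digest = hex_digest(name, message)
    # SHA1 yields 160 bits (40 hex digits), SHA256 yields 256 bits (64).
    print(name, len(digest) * 4, digest)
```

A one-byte change in the input produces an unrelated digest, which is the property that signature checks and integrity checks build on.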

2. Compression algorithms in the kernel

The Linux kernel source code is currently about 1.1 GB. Before booting, the kernel exists on disk as an image file, and it must be loaded into memory before it can run. To reduce the size of this image and the time needed to load it, the kernel image is stored compressed and is decompressed during boot.

The main purpose of compression in the kernel is therefore to reduce the size of the kernel image. Without compression, the image would take up more space and take longer to load.

With compression, the kernel image can shrink to half its size or less, which speeds up loading and saves space. In addition, some file systems use compression algorithms to reduce storage usage and improve file system performance.

Of course, both compression and decompression have a cost. Compression reduces the size of the kernel image, but it also adds decompression time at boot. The compression algorithm for the kernel image is therefore chosen to balance compression ratio against decompression time, or is specified directly by the user in the kernel configuration.

Specifically, the compression algorithms implemented in the Linux crypto directory include LZO, LZ4, Zlib, Deflate, …
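The trade-off between compression ratio and decompression time described above can be observed with a small experiment. The sketch uses zlib and lzma from Python's standard library as stand-ins, because LZO and LZ4 are not available there; the payload is artificial, and the measure helper is hypothetical. Timings vary from machine to machine, so no expected numbers are shown.

```python
import lzma
import time
import zlib
from typing import Callable, Dict


def measure(name: str,
            compress: Callable[[bytes], bytes],
            decompress: Callable[[bytes], bytes],
            payload: bytes) -> Dict[str, object]:
    """Compress once, then time a single decompression of the result."""
    packed = compress(payload)
    start = time.perf_counter()
    restored = decompress(packed)
    elapsed = time.perf_counter() - start
    if restored != payload:
        raise RuntimeError(f"{name} did not round-trip the payload")
    return {
        "name": name,
        "ratio": len(packed) / len(payload),  # smaller is better
        "decompress_seconds": elapsed,
    }


# Repetitive text stands in for a kernel image in this sketch.
payload = b"static int example_init(void) { return 0; }\n" * 2000

for result in (
    measure("zlib", zlib.compress, zlib.decompress, payload),
    measure("lzma", lzma.compress, lzma.decompress, payload),
):
    print(result)
```

On real kernel images the usual pattern is that the algorithms with the best ratio decompress more slowly, which is why the choice is left to the kernel configuration.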

E. Documentation directory

The Documentation directory holds the Linux kernel's documentation, which mainly describes what the various modules do and defines a number of conventions.

For example:

cat Documentation/riscv/boot-image-header.rst | less

F. drivers directory

The drivers directory holds the Linux kernel's drivers for all kinds of hardware. Take GPIO devices as an example:

ls drivers/gpio | less

Different CPU models configure and initialize GPIO differently, so Linux keeps the GPIO code for the different chips in the drivers/gpio directory.

G. fs directory

The fs directory holds the code of the Linux virtual file system and of the various concrete file systems.

For example, the implementation of the NTFS file system, which is common on Windows, is in the fs/ntfs directory:

ls fs/ntfs | less

H. include directory

The include directory holds most of the header files that the Linux kernel source depends on. These header files contain the kernel's definitions and declarations and provide the support needed to build and develop the kernel.

I. init directory

The init directory holds the kernel's initialization code.

The whole lifetime of a running kernel can be viewed as two phases:

  • From pressing the power button and starting the kernel until the kernel is ready and waiting for the user. This phase is called initialization
  • From the moment the user starts using the system, with the kernel working normally, until shutdown. This phase is the kernel's normal operation

During initialization the kernel has many things to do, such as:

  • Detect the available physical memory, enable virtual address translation, initialize the memory management module...
  • Initialize the thread management module, create the kernel threads, create the init thread that eventually runs the user's interactive shell...
  • ……

Because the kernel is made up of many components, the initialization code actually calls into each component of the kernel in turn to initialize it.

J. ipc directory

The ipc directory is the implementation of inter-process communication and is compiled into the kernel's inter-process communication module. It covers several of the commonly used Linux inter-process communication mechanisms, such as:

  • Semaphores
  • Shared memory
  • Message queues
  • ……

The implementations of these mechanisms are all in the ipc directory. (Anonymous pipes, another common mechanism, are implemented under fs rather than here.)

ls ipc

K. kernel directory

The kernel directory holds the core code of the kernel, including:

  • process management
  • interrupt management
  • clock
  • ……

These are the kernel's core functional components, so they are all placed in the kernel directory.

L. lib directory

The lib directory contains common library functions such as memset and strlen. Other parts of the kernel can call these functions, which simplifies kernel development. Note that a user program we write ourselves can also call memset after #include <string.h>, but that memset is not the same function as the memset in the kernel's lib directory.

In user programs we use the memset defined by the C standard library, but the C standard library itself depends on the operating system, so it is not available when we write kernel code. The kernel has to implement the C standard library functions it needs by hand.

From this point of view, the lib directory in the kernel source is effectively an implementation of a subset of the C standard library.

In addition, the lib directory contains implementations of common data structures and algorithms, such as linked lists, hash tables, red-black trees and bitmaps.

M. mm directory

The mm directory holds the implementation of Linux kernel memory management, including:

  • Physical memory management
  • Page allocation algorithms
  • Page fault handling and page replacement algorithms
  • ……

N. net directory

The net directory is similar to the block directory. There are many kinds of network devices, such as wireless network cards (WiFi), Ethernet cards (wired) and 4G modems, and the driver code for these specific devices is placed in the drivers directory.

However many kinds of devices there are, the network protocol stack above them is the same. For example, wireless cards and Ethernet cards both send and receive packets; the hardware differs, but both carry the same IPv4 protocol. Linux therefore puts the network protocol implementations in the net directory.

The network protocols implemented in the net directory include:

  • TCP
  • IPv6
  • DNS
  • ……

O. samples directory

The samples directory holds sample code and programs for the Linux kernel. They help new kernel engineers understand how the kernel works and how it is implemented.

The samples show developers how to use functionality provided by the Linux kernel, such as features of the network protocol stack, file systems, drivers, the scheduler and other subsystems.

These samples can also serve as references when developers write their own kernel modules and applications.

Finally, some programs in the samples directory act as test cases that can be used to check the correctness and performance of kernel functions and interfaces.

When we first start learning Linux kernel development, or hacking on the Linux kernel, inserting our own code directly into other directories can leave Linux unable to boot and hard to debug. To avoid that, we usually put our code in the samples directory for testing, and only start modifying code in other directories once our skills have improved.

P. scripts directory

The scripts directory in the Linux kernel source tree contains scripting tools that help kernel developers build, debug, analyze and optimize the kernel.

Specifically, the scripts directory contains the following categories of tools:

  1. Build tools: helper scripts used when compiling the kernel, which drive tools such as make, gcc and ld. They help developers compile the kernel source and produce a bootable kernel image.
  2. Debugging tools: scripts that support kernel debugging with tools such as gdb, kgdb and kdb. They help developers debug the kernel and locate and fix problems in it.
  3. Analysis tools: scripts that support kernel analysis with tools such as perf and trace-cmd. They help developers profile, trace and gather statistics on the kernel to find performance bottlenecks and optimization opportunities.
  4. Code checking tools: scripts for checking code, such as checkpatch.pl, and support for checkers such as sparse. They help developers find coding-style violations and other defects in the kernel source, improving code quality and maintainability.

Q. security directory

The security directory in the Linux kernel source tree provides the implementations of the kernel's security mechanisms, such as Access Control Lists (ACLs), SELinux, …

The code in this directory implements security-related functionality, providing modules and interfaces that help developers strengthen the security of a Linux system.

Specifically, the security directory contains the following kinds of modules and interfaces:

  1. Security modules: the security directory contains security modules such as SELinux and AppArmor. These modules implement access control and security policy management for the resources in the system, improving its security.
  2. Security interfaces: the security directory also contains security interfaces (hooks) such as security_inode_permission and security_file_permission. Through these interfaces the kernel performs access control and permission checks on system resources, protecting sensitive data and applications.
  3. Security policies: the security directory also contains security policy mechanisms such as capability and posix_acl. They likewise implement access control and permission management for system resources.

R. sound directory

The sound directory is the part of the Linux kernel source tree that contains sound-related drivers and modules. As with net and block, the code in sound centers on generic sound playback, recording and processing, which the drivers for particular sound cards plug into. Note that, unlike network drivers, most sound card drivers are also kept under sound itself (for example sound/pci and sound/usb) rather than under drivers.

The generic code calls the drivers of the different sound cards, so the Linux kernel can play, record and process sound on many kinds of sound card devices, and it provides a set of interfaces and functions to user programs.

Specifically, what the sound directory mainly implements is ALSA, the architecture Linux currently uses for sound. ALSA stands for Advanced Linux Sound Architecture. It defines a set of drivers and libraries that let Linux support a wide range of sound card devices, and it provides interfaces and functions that let applications record, play and process sound conveniently.

Besides ALSA, the sound directory also contains other sound drivers and modules, such as the oss subdirectory. OSS stands for Open Sound System, the older Linux sound architecture, which has gradually been replaced by ALSA.

The sound directory also contains drivers for particular classes of hardware, such as USB sound card drivers and Bluetooth headset drivers. These drivers and modules support a variety of sound devices and let the Linux system play, record and process sound.

S. tools directory

In the Linux kernel source tree, the tools directory contains the source code of tools and utilities that are typically used for kernel development and debugging. Some of these tools are written in C, others in Python or other scripting languages. Specifically:

  • Programs for system debugging and performance analysis, such as perf and ftrace
  • Tools for kernel configuration and building, such as kconfig and kbuild
  • Programs for simulation and testing, such as ktest and kvm

The tools and utilities in the tools directory are important for kernel development and debugging, and a senior Linux kernel developer is usually expected to be familiar with how to use them and how they work.

T. usr directory

The usr directory holds the code that packs and compresses the initial user-space image (initramfs) built into the kernel.

U. virt directory

The virt directory provides the code for the Linux kernel's virtualization support.

For example, the virt/kvm directory contains the source code of KVM, the kernel's virtualization module, which allows the Linux kernel to act as a virtual machine monitor (VMM) and run virtual machines.

V. LICENSES directory

The LICENSES directory in the Linux kernel source tree contains the text of the various licenses used in the kernel source, including GPL, LGPL, BSD, MIT and so on.

Its purpose is to let developers of the Linux kernel source easily view and understand the exact terms and restrictions of each license.

The directory supports the openness and transparency of the kernel source: everyone can see the conditions and restrictions on using it and can therefore comply with these licenses more easily.

2. Files

In addition to directories, the root of the Linux kernel source tree also contains several files, each with its own purpose.

A. COPYING file

The COPYING file in the Linux kernel source tree is the copyright notice, that is, the license, which sets out the conditions and restrictions on using the kernel source code.

The Linux kernel source code is licensed under GPLv2. GPLv2 states that anyone is free to use, copy, distribute and modify the Linux kernel source code, but derived works that are distributed must also be released as open source under the GPL.

Note that most of the kernel source uses GPLv2, but there are exceptions. For example, the user-space interface for system calls is covered by an exception whose text is in LICENSES/exceptions/Linux-syscall-note.
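Which license applies to an individual source file is recorded in an SPDX-License-Identifier comment near the top of that file. This tagging convention is not described in the article itself but is how current kernel trees mark files, including those that carry the Linux-syscall-note exception. The sketch below extracts the license expression from such a line; the function name and the sample lines are illustrative.

```python
import re
from typing import Iterable, Optional

# Matches '// SPDX-License-Identifier: X' and '/* SPDX-License-Identifier: X */'.
SPDX_PATTERN = re.compile(r"SPDX-License-Identifier:\s*(.+?)\s*(?:\*/)?\s*$")


def spdx_expression(first_lines: Iterable[str]) -> Optional[str]:
    """Return the SPDX license expression found in the given lines, if any."""
    for line in first_lines:
        match = SPDX_PATTERN.search(line)
        if match:
            return match.group(1)
    return None


header_with_exception = [
    "/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */",
    "#ifndef _EXAMPLE_H",
]
plain_source = ["// SPDX-License-Identifier: GPL-2.0", "#include <linux/init.h>"]

print(spdx_expression(header_with_exception))
print(spdx_expression(plain_source))
```

The part after WITH names an exception file under LICENSES/exceptions, which ties the per-file tag back to the directory described earlier.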

B. CREDITS file

The CREDITS file in the Linux kernel source tree lists people who have contributed to the Linux kernel, including:

  • Kernel developers
  • Maintainers
  • Testers
  • ……

The purpose of the CREDITS file is to recognize and thank those who have contributed to the Linux kernel and to record their contributions and achievements.

The CREDITS file is also part of the Linux community culture. It reflects the openness, cooperation and community spirit of kernel development, in which everyone can take part in developing and maintaining the kernel.

C. Kbuild file

The Kbuild file in the Linux kernel source tree is a Makefile fragment used to build the Linux kernel. Note that the Kbuild file mainly defines build configuration, whereas the commands run when compiling the kernel are mainly defined in the Makefile.

The Kbuild file helps automate the process of building the Linux kernel, so developers can compile, build and install the kernel easily.

The Kbuild build system also provides some advanced build features, such as:

  • Modular compilation
  • Cross compilation
  • Parallel compilation
  • ……

These features make building the Linux kernel more flexible and efficient.

D. MAINTAINERS file

The MAINTAINERS file in the Linux kernel source tree keeps the current list of kernel maintainers. It records the maintainers and contributors of the various subsystems in the Linux kernel.

The role of the MAINTAINERS file is to help developers quickly find the maintainer or contributor responsible for a subsystem, which makes it easier to submit code, get fixes in or collaborate.

For each subsystem, the MAINTAINERS file lists the names and email addresses of its maintainers, the files the subsystem covers and other information. With this file, developers can find the maintainers responsible for the subsystems they care about and submit code or report problems to them.

Besides being a reference for developers, the MAINTAINERS file also serves as an organizational and management tool for the Linux kernel community. Through it, the community can coordinate the maintainers and contributors of each subsystem and keep kernel development healthy and stable.
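Entries in MAINTAINERS follow a simple line-oriented layout: a subsystem name followed by tagged lines such as M: (maintainer), L: (mailing list), S: (status) and F: (file patterns). The sketch below parses one such entry. The sample entry is entirely made up (the subsystem, person and addresses do not exist), and parse_maintainers_entry is a hypothetical helper, not a tool shipped with the kernel.

```python
from typing import Dict, List

SAMPLE_ENTRY = """\
EXAMPLE FILE SYSTEM
M:\tJane Doe <jane@example.org>
L:\tlinux-example@example.org
S:\tMaintained
F:\tfs/example/
F:\tinclude/linux/example.h
"""


def parse_maintainers_entry(entry: str) -> Dict[str, object]:
    """Turn one MAINTAINERS-style entry into a name plus tagged fields."""
    lines = [line for line in entry.strip().splitlines() if line.strip()]
    fields: Dict[str, List[str]] = {}
    for line in lines[1:]:
        tag, _, value = line.partition(":")
        # A tag such as F: may appear several times, so collect lists.
        fields.setdefault(tag.strip(), []).append(value.strip())
    return {"name": lines[0].strip(), "fields": fields}


record = parse_maintainers_entry(SAMPLE_ENTRY)
print(record["name"])
print(record["fields"]["M"])
print(record["fields"]["F"])
```

In day-to-day work, developers usually rely on the kernel's own scripts/get_maintainer.pl to look up the right people for a patch rather than parsing the file by hand.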

E. Makefile

The Makefile in the Linux kernel source tree is used to compile the kernel. It contains the commands to run when compiling the kernel, including:

  • Commands for generating kernel images
  • Commands for generating kernel modules
  • ……

The Makefile reads the configuration information in the Kbuild files and generates the compile commands that build the kernel source.

Besides compiling the kernel image and modules, the Makefile defines other operations, such as:

  • Installing the kernel
  • Packaging the kernel
  • Cleaning compiled files
  • ……

The Makefile performs different operations depending on the target given to the make command: for example, make all compiles the kernel and modules, make install installs the kernel, and make clean removes the files generated by compilation.
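The way targets and build options combine on the make command line can be sketched as a small helper that assembles the argument list. ARCH and CROSS_COMPILE are the variables the kernel build system reads for cross compilation and -j enables parallel builds; the helper name make_command and the toolchain prefix used in the example are assumptions. The sketch only builds and prints the command, it does not run make.

```python
import shlex
from typing import List, Optional


def make_command(target: str,
                 jobs: int = 1,
                 arch: Optional[str] = None,
                 cross_compile: Optional[str] = None) -> List[str]:
    """Assemble the argument list for one invocation of make."""
    command = ["make", f"-j{jobs}"]
    if arch:
        command.append(f"ARCH={arch}")
    if cross_compile:
        command.append(f"CROSS_COMPILE={cross_compile}")
    command.append(target)
    return command


# Native build of kernel and modules, then a cross build for RISC-V.
# The 'riscv64-linux-gnu-' prefix is an example; it depends on the
# toolchain installed on the build machine.
for command in (
    make_command("all", jobs=8),
    make_command("all", jobs=8, arch="riscv",
                 cross_compile="riscv64-linux-gnu-"),
    make_command("clean"),
):
    print(shlex.join(command))
```

Passing the resulting list to subprocess.run from the root of a kernel tree would execute the build; that step is left out so the example has no side effects.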

F. README file

The README file in the Linux kernel source tree provides basic information about that particular version of the kernel, such as:

  • Version number of the kernel
  • Supported hardware platforms
  • Installation and configuration guide
  • ……

The README file may also contain other useful information, such as:

  • Known issues and limitations
  • Bug fixes

In general, the README file gives basic information about the kernel to help users get started quickly.

Origin blog.csdn.net/qq_45488242/article/details/130915883