Introduction and use of makefile

what makefile is used for

Usually a large-scale program is composed of multiple program modules. According to their functions, the module files are distributed in different directories, and there are dependencies between modules. In most cases, when we work on a program we only modify some of the files, certainly not all of them at the same time. It stands to reason that only the modified files need to be recompiled, not every file. Recompiling everything is very expensive for large projects such as the Linux kernel, which has tens of millions of lines of code.

So our problem is to automatically recompile only the files that have changed. It divides into two smaller problems:

(1) Which files the target file depends on

(2) Whether the dependent files have been updated

For the first problem, maintaining the dependencies between files by hand is fine while the program is small, but once the program grows, those dependencies will drive people crazy. At that point I would rather just recompile everything.

Fortunately, the almighty Linux provides the make command. It automatically finds the files that have changed, uses the dependencies to find the other files affected by them, and then processes these files according to the rules. Here, rules generally mean compilation, such as calling gcc, but they can also be other actions such as deleting files.

These rules and dependencies are defined in a file called a makefile. The main mission of these two buddies is this: when a file is found to have been updated, only that file and the files affected by it are recompiled. Unaffected files are not recompiled, which improves compilation efficiency.

I want to emphasize that make and the makefile do not compile programs themselves; gcc does. The two brothers just find out which files need to be updated and then call other commands to process them. In most cases they call gcc or nasm to compile, or rm to delete files. You can also run any other command in a rule; that is up to you.

makefile basic syntax

target file: dependent files
[Tab]command

The basic syntax of a makefile has three parts, which together are called a rule. The meaning of each part is explained below.

Target file: the file to be generated by this rule. It can be an object file ending in .o, an executable file, or a pseudo-target. Pseudo-targets are introduced later.

Dependent files: the files required to generate the target file in this rule. Usually there is more than one, so this is a list of dependent files.

Command: the actions to be executed by this rule, which are ordinary shell commands. There can be more than one command, but each command must occupy its own line and must start with a Tab at the beginning of the line. This is the usage specified by make.

The meaning of the rule is: to generate the target file, the dependent files must be ready first, and if the target file does not exist or any file in the list of dependent files is newer than the target file, the commands in the rule are executed.

How does the make program judge that the file has been updated

In Linux, files are divided into attributes and data. Each file has three types of time, which are used to record the time related to file attributes and file data. These three times are atime, mtime, and ctime.

atime, that is, access time, is the time the data part of the file was last accessed. It is updated when the file data (content) is read, for example when viewing the file with cat or less, but not when listing it with ls. (Mount options such as relatime or noatime can make the kernel update atime less often.)

ctime, that is, change time, is the time the file attributes or data last changed. Whenever the attributes or the data are modified, ctime is updated; in other words, ctime tracks changes to both file attributes and file data.

mtime, that is, modify time, is the time the data part of the file was last modified, and it is updated every time the file data is modified. Since ctime also tracks data changes, when the file data is modified, mtime and ctime are updated together.

To view these three times in Linux, you can use the stat command.

lovess@Lyajpunov:~$ stat a.c
  File: a.c
  Size: 119             Blocks: 0          IO Block: 4096   regular file
Device: 2h/2d   Inode: 5629499536446330  Links: 1
Access: (0644/-rw-r--r--)  Uid: ( 1000/  lovess)   Gid: ( 1000/  lovess)
Access: 2023-03-20 20:59:05.709967700 +0800
Modify: 2023-03-20 20:58:41.805699100 +0800
Change: 2023-03-20 20:58:41.806695800 +0800
 Birth: -
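
The difference between mtime and ctime can be checked with a small shell sketch. It assumes GNU coreutils (stat -c and touch -t) and works in a throwaway directory created by mktemp; the file contents are made up for illustration.

```shell
# Work in a throwaway directory so nothing else is touched.
cd "$(mktemp -d)"
echo "int x;" > a.c

# Set mtime (and atime) to a fixed date in the past; ctime becomes "now".
touch -t 202001010000 a.c
mtime_before=$(stat -c %Y a.c)       # %Y: mtime in seconds since the epoch
ctime_before=$(stat -c %Z a.c)       # %Z: ctime in seconds since the epoch

# Changing an attribute (the permission bits) does not touch mtime.
chmod 600 a.c
mtime_after_chmod=$(stat -c %Y a.c)

# Changing the data updates mtime.
echo "int y;" >> a.c
mtime_after_write=$(stat -c %Y a.c)

echo "mtime before chmod: $mtime_before (ctime: $ctime_before)"
echo "mtime after chmod:  $mtime_after_chmod"
echo "mtime after write:  $mtime_after_write"
```

The first two mtime values are identical, because chmod only changes attributes, while the third is larger, because appending to the file changes its data.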

Of a file, we care most about its data part. So make only needs to obtain the mtime of the dependent file and of the target file and check whether the dependent file's mtime is newer than the target's; that tells it whether to execute the commands in the rule. The command does not have to be a compilation command.

for example

1:2
	echo "OKKKKKKKKKK"

We use the touch command to create the two files 1 and 2 (1 first, then 2, so that 2 is newer), then run make, and the terminal shows:

echo "OKKKKKKKKKK"
OKKKKKKKKKK

But the command itself is echoed here too. Adding @ before the command stops make from printing it:

1:2
	@echo "OKKKKKKKKKK"

This simple example also shows that make and the makefile are not specifically for compiling files; they just execute the commands in the rules.
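
The mtime comparison can be watched directly with a sketch like the following. It assumes GNU make and a touch that accepts -t; the file names in.txt and out.txt are hypothetical, and the timestamps are set explicitly so the result does not depend on how fast the commands run.

```shell
cd "$(mktemp -d)"
# One rule: out.txt is rebuilt from in.txt (\t puts the Tab before each command).
printf 'out.txt: in.txt\n\t@echo rebuilding\n\t@cp in.txt out.txt\n' > Makefile
echo hello > in.txt

run1=$(make)        # out.txt does not exist yet, so the commands run

# Dependency older than the target: nothing needs to be done.
touch -t 202101010000 in.txt
touch -t 202201010000 out.txt
run2=$(make -s)     # -s suppresses make's own messages

# Dependency newer than the target: the commands run again.
touch -t 202301010000 in.txt
run3=$(make)

echo "run1: $run1"
echo "run2: $run2"
echo "run3: $run3"
```

The first and third runs print rebuilding; the second run does not, because out.txt is newer than in.txt at that point.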

The file name of the makefile is not fixed; it can be specified with the -f option when running make. If -f is not given, make first looks for a file named GNUmakefile; if that does not exist, it looks for makefile; and if that does not exist either, it looks for Makefile.
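
The lookup order can be tried out with the sketch below. It assumes GNU make and a case-sensitive file system (so makefile and Makefile are different files); custom.mk is a hypothetical name used only to show -f.

```shell
cd "$(mktemp -d)"
# Each candidate file announces its own name when it is used.
for name in GNUmakefile makefile Makefile custom.mk; do
    printf 'all:\n\t@echo "using %s"\n' "$name" > "$name"
done

first=$(make)                   # GNUmakefile is tried first
rm GNUmakefile
second=$(make)                  # then makefile
rm makefile
third=$(make)                   # then Makefile
explicit=$(make -f custom.mk)   # -f names the file directly

printf '%s\n' "$first" "$second" "$third" "$explicit"
```

This prints using GNUmakefile, using makefile, using Makefile and using custom.mk, in that order.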

jump to target

When a makefile contains many targets, we can pass a target name as an argument to make, in the form "make target name", to run the rule for that target alone. Note that make then builds only that target (together with anything it depends on) and exits; other targets in the makefile are not built.

for example

t1:1
	@echo "t1"
t2:1
	@echo "t2"

Run it:

lovess@Lyajpunov:~$ make t1
t1
lovess@Lyajpunov:~$ make t2
t2

When make does not have a target name as an argument, make will start executing at the first target that appears in the makefile.
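
A short sketch of choosing targets on the command line, assuming GNU make. The targets t1 and t2 are the ones from the example above, but their dependent file is left out here so that the snippet is self-contained.

```shell
cd "$(mktemp -d)"
printf 't1:\n\t@echo "t1"\nt2:\n\t@echo "t2"\n' > Makefile

default=$(make)        # no argument: the first target in the file, t1
only_t2=$(make t2)     # only t2
both=$(make t2 t1)     # several targets, built in the order given

echo "make       -> $default"
echo "make t2    -> $only_t2"
echo "make t2 t1 -> $(echo $both)"
```

Without an argument make prints t1, with t2 as the argument it prints t2, and with both names it prints t2 and then t1.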

In general, whether the commands are executed depends on whether the mtime of a dependent file is newer than that of the target file. If every dependent file is older than the target file, the target is already up to date and does not need to be rebuilt, so the commands in the rule are not executed.

pseudo-target

From the example above, you can see that the commands in a rule are not always executed. Sometimes we do not care about generating a real target file; we just want make to always execute some commands without considering mtime.

There is a way to meet this requirement. When a rule has no dependent files and no file with the target's name exists, make runs the rule's commands every time; such a target is called a pseudo-target.

Pseudo-targets, as the name suggests, do not generate a real target file, so they do not need dependent files either. The rule for a pseudo-target therefore just executes commands: as long as you pass the name of the pseudo-target to make as an argument, the commands in its rule are executed directly.

for example

all:
	@echo "test ok"

Since there is only one target all in the makefile, if you execute make all or make at this time, the program will only output test ok.

Note that a pseudo-target must not have the same name as a real file, otherwise it loses its meaning as a pseudo-target. To avoid this clash, the special target .PHONY can be used to mark the pseudo-target, in the form ".PHONY: pseudo-target name". Then make executes the commands of the pseudo-target whether or not a file with the same name exists.

A typical pseudo-target that is explicitly marked with .PHONY is one that deletes the .o files produced during compilation, so that old .o files do not stop the sources from being recompiled. If you have compiled source code under Linux, you will know what make clean does. Usually clean is a pseudo-target that deletes the .o files produced by compilation, such as:

.PHONY:clean
clean:
	rm ./build/*.o
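
The effect of .PHONY can be seen with the sketch below, assuming GNU make. The echo command stands in for the real rm so that nothing is deleted; the file named clean is created on purpose to provoke the name clash.

```shell
cd "$(mktemp -d)"
touch clean                    # a real file that happens to be called "clean"

# Without .PHONY: make sees an up-to-date file named clean and does nothing.
printf 'clean:\n\t@echo "removing objects"\n' > Makefile
without_phony=$(make -s clean)

# With .PHONY: the commands run whether or not the file exists.
printf '.PHONY: clean\nclean:\n\t@echo "removing objects"\n' > Makefile
with_phony=$(make -s clean)

echo "without .PHONY: [$without_phony]"
echo "with .PHONY:    [$with_phony]"
```

Only the second run prints removing objects.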

There are no fixed rules for naming pseudo-targets; users can pick whatever names they like. However, because makefiles are so widely used, some naming conventions have emerged, and pseudo-targets with similar functions are usually given the same name. For example, the clean target mentioned above is recognized by everyone: its job is usually to remove the generated files, so its commands are removal commands such as rm. Here are some other conventional pseudo-target names.

pseudo-target name	Description
all	Usually used to compile all modules, similar to rebuild all
clean	Remove all files generated by compilation, generally implemented with rm
dist	Usually used to compress the packaged tar file
install	Copy the compiled program to the installation directory, which is configured through the --prefix parameter when running the configure script
print	Print the files that have changed
tar	Package the files into a tar file, also known as an archive
test	Test the makefile process

make recursive derivation

make resolves a target recursively, layer by layer, just like finding the way back from the exit of a maze: it works from the effect back to the cause, one step at a time. This is especially true when multiple targets depend on one another.

For example, two files have dependencies

lovess@Lyajpunov:~$ cat test1.c
void my_print(char *);

int main(){
    my_print("hello makefile!");
}
lovess@Lyajpunov:~$ cat test2.c
#include <stdio.h>

void my_print(char *str) {
    printf("%s", str);
}

We try to compile these two files into test.bin

test2.o:test2.c
	gcc -c -o test2.o test2.c
test1.o:test1.c
	gcc -c -o test1.o test1.c
test.bin:test1.o test2.o
	gcc -o test.bin test1.o test2.o
all:test.bin
	@echo "compile done"

Run the command

make all

to generate the final executable test.bin.

What we focus on is the recursive process here

(1) make does not find a file named GNUmakefile, so it goes on to look for makefile. Having found it, make looks in that file for the rule whose target is all, as given on the command line.

(2) make finds that test.bin, the dependent file of all, does not exist, so it looks for the rule with test.bin as its target.

(3) make finds the test.bin rule on line 5, but the dependent files of test.bin, test1.o and test2.o, do not exist either, so it first looks for the rule with test1.o as its target.

(4) The rule for test1.o is found on line 3; its dependent file is test1.c. Since test1.o itself does not exist, there is no need to check the mtime of test1.c, and the command of this rule is executed directly.

(5) After test1.o is generated, execution returns to the test.bin rule on line 5. Now make finds that test2.o does not exist, so it continues recursively with the target test2.o.

(6) The rule for test2.o is found on line 1. Since test2.o itself does not exist, the mtime of its dependent file test2.c is not checked, and the compilation command in the rule is executed directly.

(7) After test2.o is generated, execution returns to line 5 again. make finds that both dependent files, test1.o and test2.o, are ready, so it executes the command of this rule on line 6, gcc -o test.bin test1.o test2.o, which generates the executable test.bin from the two object files.

(8) test.bin has finally been generated. Execution now returns to the rule for the target all from step 2 and executes its command, @echo "compile done", printing a string to show that compilation is complete.

Although all is treated as a real target file, the command we gave does not generate it, so it acts like a pseudo-target.
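
The order derived in the steps above can be printed without compiling anything by using make's -n option, which lists the commands instead of running them. The sketch assumes GNU make; the two .c files are empty placeholders, since gcc is never actually called.

```shell
cd "$(mktemp -d)"
touch test1.c test2.c          # placeholders; with -n no command is executed
{
  printf 'test2.o:test2.c\n\tgcc -c -o test2.o test2.c\n'
  printf 'test1.o:test1.c\n\tgcc -c -o test1.o test1.c\n'
  printf 'test.bin:test1.o test2.o\n\tgcc -o test.bin test1.o test2.o\n'
  printf 'all:test.bin\n\t@echo "compile done"\n'
} > Makefile

plan=$(make -n all)            # the commands make would run, in order
echo "$plan"
```

The output lists the test1.o compile first, then test2.o, then the link of test.bin, and finally the echo command, which matches the walk from all down to the .c files and back up.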

Custom variables and system variables

Since writing a makefile can be called programming, it must offer the basic features of a programming language; for example, variables can be defined in a makefile. That said, make is a build tool rather than a general-purpose programming language.

Format of a variable definition: variable name = value (a string); multiple values are separated by spaces. When processing the value, make splits it on spaces and then handles each word in turn. Values are always strings; even numbers are treated as strings.

Format of a variable reference: $(variable name). Every time a variable is referenced, the reference is replaced by the variable's value (a string).

Note that although the value is treated as a string, it must not be enclosed in double or single quotes, otherwise the quotes become part of the value. For example, with var = 'file.c', the value of var is not file.c but 'file.c'. When $(var) is used as a dependent file, make looks for a target named 'file.c' instead of file.c.
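
That quotes become part of the value can be checked with the sketch below. It assumes GNU make 3.81 or later, because it uses the $(info ...) function to print a value while the makefile is being read; the variable names quoted and plain are made up for the demonstration.

```shell
cd "$(mktemp -d)"
cat > Makefile <<'EOF'
quoted = 'file.c'
plain = file.c
$(info quoted is [$(quoted)])
$(info plain is [$(plain)])
EOF
printf 'all:\n\t@:\n' >> Makefile   # a do-nothing rule so make has a target

shown=$(make)
echo "$shown"
```

The first line of output shows the single quotes inside the brackets, the second line shows the bare file name.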

for example

test2.o:test2.c
	gcc -c -o test2.o test2.c
test1.o:test1.c
	gcc -c -o test1.o test1.c
objfiles = test1.o test2.o
test.bin:$(objfiles)
	gcc -o test.bin $(objfiles)
all:test.bin
	@echo "compile done"

The effect is the same as the makefile above.

In addition to user-defined variables, make predefines some system-level variables, which can be divided by use into command-related variables and parameter-related variables. The table below lists the command-related ones.

variable name	Description
AR	Archive-maintaining program, default is "ar"
AS	Assembler, default is "as"
CC	C compiler, default is "cc"
CXX	C++ compiler, default is "g++"
CPP	C preprocessor, default is "$(CC) -E", for example gcc -E
FC	Fortran and Ratfor compiler and preprocessor, default is "f77"
GET	Program to extract files from SCCS, default is "get"
PC	Pascal compiler, default is "pc"
MAKEINFO	Program to convert a Texinfo file into an Info file, default is "makeinfo"
RM	Delete command, default is "rm -f"
TEX	Program to create TeX DVI files from TeX source files, default is "tex"
WEAVE	Program to convert Web to TeX, default is "weave"
YACC	Yacc parser generator for C programs, default is "yacc"
YACCR	Yacc parser generator for Ratfor programs, default is "yacc -r"

implicit rule

When writing a rule, if a line is too long to fit, you can add a backslash '\' at the end of the line, and the next line is treated as a continuation of the same line. Many compilers and interpreters support this, not only make.

Another necessary feature of a makefile is the comment. As in shell scripts, # starts a single-line comment in a makefile. If the first non-blank character (ignoring spaces and tabs) of a line is '#', the whole line is a comment. If the line ends with a backslash '\', the next line counts as part of the current line, so it is commented out as well.

#test2.o:test2.c
#	gcc -c -o test2.o test2.c
#test1.o:test1.c
#	gcc -c -o test1.o test1.c
objfiles = test1.o test2.o
test.bin:$(objfiles)
	gcc -o test.bin $(objfiles)
all:test.bin
	@echo "compile done"

But when we run make with this makefile:

lovess@Lyajpunov:~$ make all
cc    -c -o test1.o test1.c
cc    -c -o test2.o test2.c
gcc -o test.bin test1.o test2.o
compile done

The two files are still compiled, but with the cc command. The compile command written in the makefile is gcc and the one printed here is cc, so rest assured that the gcc rules in the makefile really are commented out. Incidentally, on this system cc is a symbolic link to gcc: they are the same program, both resolving to /usr/bin/gcc.

For rules that are used very frequently, make treats them as defaults that need not be written out explicitly. When the user does not define a rule for a target in the makefile, make derives it from the implicit rules.

Implicit rules differ from one programming language to another. They are derived automatically from the usual dependencies and are a general method for rebuilding target files.

For each language, make deduces what to build from the part of the file name before the extension, and then applies the implicit rule. In other words, for a target to be derived through implicit rules, the files on the file system must share the same name apart from the extension. For example, the C source file of x.o must be named x.c for x.o to be generated through implicit rules.

Here are some implicit rules for common languages

C program

The generation of "x.o" depends on "x.c". The command to generate x.o is:

$(CC) -c $(CPPFLAGS) $(CFLAGS)

C++ program

The generation of "x.o" depends on "x.cc" or "x.C". The command to generate x.o is:

$(CXX) -c $(CPPFLAGS) $(CXXFLAGS)
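
The implicit rule for C can be observed even in a directory with no makefile at all. The sketch below assumes GNU make and uses -n, so no compiler is actually run; x.c is an empty placeholder file and -O2 is just an example flag.

```shell
cd "$(mktemp -d)"              # note: there is no makefile in this directory
touch x.c                      # an empty source file is enough with -n

plain=$(make -n x.o)                    # the built-in rule builds x.o from x.c
with_flags=$(make -n x.o CFLAGS=-O2)    # CFLAGS is inserted into that command

echo "$plain"
echo "$with_flags"
```

Both lines end in -c -o x.o x.c, and the second one also contains -O2, which shows that the implicit rule is assembled from the variables.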

automatic variables

make also supports automatic variables. An automatic variable stands for a file name or a set of file names taken from the current rule, either the target or the dependent files, and its value is set anew for each rule that is run. Different sets of file names have different automatic variables; some of them are listed below.

$@, the file name of the target of the rule. If the rule has multiple targets, $@ is the name of whichever target is currently being built.

$<, the first file in the list of dependent files of the rule. Mnemonic: '<' points to the leftmost element of the set, which is the first.

$^, the set of all dependent files of the rule. If the set contains duplicates, $^ removes them automatically. Mnemonic: '^' is like a cover pressed down from above that spans a wide range, hence the whole set.

$?, the set of all dependent files in the rule whose mtime is newer than the target file's. Mnemonic: '?' stands for a question, and the biggest question for make is whether a dependent file's mtime is newer than the target's.

for example

objfiles = test1.o test2.o
test.bin:$(objfiles)
	gcc -o $@ $^
all:test.bin
	@echo "compile done"

We replaced test.bin with $@ on line 3 and all dependent files with $^.
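
To see what each automatic variable expands to, a rule can simply echo them. The sketch assumes GNU make; the file names out, a and b are hypothetical, and their timestamps are set explicitly so that a is older than the target and b is newer.

```shell
cd "$(mktemp -d)"
printf 'out: a b\n\t@echo "target=$@ first=$< all=$^ newer=$?"\n' > Makefile

# Fixed timestamps: a is older than out, b is newer than out.
touch -t 202001010000 a
touch -t 202101010000 out
touch -t 202201010000 b

vars=$(make)
echo "$vars"    # target=out first=a all=a b newer=b
```

Only b appears in $? because it is the only dependent file newer than out, while $^ lists both.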

pattern rules

A pattern is a string template. The concept is used in regular expressions to match characters or strings and find the strings that fit the template. make supports this kind of string matching as well.

% matches any non-empty sequence of characters. For example, %.o means all files ending with .o, and g%s.o means all files that start with the character g and end with s.o. make uses the pattern to search for files on the file system, by default in the current directory.

% is usually used in the target of a rule to match all target files. It can also be used in the dependent files of the rule; because the target is the file to be generated, the text matched by % in the dependent files is taken from the target. Take %.o:%.c as an example: if %.o matches the target files a.o and b.o, then %.c in the dependent files matches a.c and b.c respectively.

for example

%.o:%.c
	gcc -c -o $@ $^
objfiles = test1.o test2.o
test.bin:$(objfiles)
	gcc -o $@ $^
all:test.bin
	@echo "compile done"

But does this approach carry other risks? For example, if there are other .c files in the directory, will they be compiled too? The answer is no. The recursion starts from test.bin, so the derivation only reaches test1.c and test2.c; other .c files are not involved and are not compiled.
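
This can be confirmed with a dry run. The sketch assumes GNU make and uses -n so that gcc is never called; other.c is a hypothetical extra source file that no target depends on, and all three .c files are empty placeholders.

```shell
cd "$(mktemp -d)"
touch test1.c test2.c other.c      # other.c is not needed by any target
{
  printf '%%.o:%%.c\n\tgcc -c -o $@ $^\n'
  printf 'objfiles = test1.o test2.o\n'
  printf 'test.bin:$(objfiles)\n\tgcc -o $@ $^\n'
  printf 'all:test.bin\n\t@echo "compile done"\n'
} > Makefile

plan=$(make -n all)                # list the commands without running them
echo "$plan"
```

The listed commands compile test1.c and test2.c and link test.bin; other.c does not appear anywhere, because nothing reachable from all asks for other.o.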