It turns out that the underlying debugging principle of gdb is so simple

I. Introduction

This article is about the famous GDB. There is no need to go over its rich background: like its sibling GCC, it was born with a silver spoon in its mouth, and its status in the GNU family is unshakable. I believe every embedded development engineer has used gdb to debug programs. If you say you haven't, it can only mean that your development experience has not been rough enough yet, and you need a few more beatings from bugs.

We all know that when compiling with gcc, the -g option embeds extra debugging information in the executable file. But what debugging information is embedded? How does it relate to the binary instructions? And while debugging, how is the context of the function call stack recovered from it?

To answer these questions, Brother Dao has written two articles that dig into the lowest-level details, so that you can take it all in at once.

The first article is this one. It introduces the underlying debugging principle of GDB: the mechanism GDB uses to control the execution of the debugged program.

The second article analyzes Lua, a compact but fully equipped language, from source code to function call stack, and from instruction set to modifying the debugging library, all in one go.

There is more content, and it may take longer to finish reading this article. For your health, it is not recommended to read this article while in a squatting position.

II. GDB debugging model

GDB debugging includes two programs: the gdb program and the debugged program. According to whether these two programs are running on the same computer, the debugging model of GDB can be divided into two types:

  1. Local debugging
  2. Remote debugging

Local debugging: the debugger and the debugged program run on the same computer.

Remote debugging: the debugger runs on one computer, and the debugged program runs on another computer.

The graphical debugging front end is not the point here; it is just a shell that wraps GDB. We can type debugging commands by hand in a dark terminal window, or we can use an integrated development environment (IDE) with a built-in debugger and click buttons instead of typing commands.

Compared with local debugging, remote debugging involves one more program: GdbServer. Both it and the target program run on the target machine, which may be an x86 computer or an ARM board. The red line in the figure indicates the communication between GDB and GdbServer over the network or a serial port. Communication requires a protocol, and here it is RSP, whose full name is the GDB Remote Serial Protocol.

We don't need to care about the exact format and content of the protocol. It is enough to know that every packet is a string with a fixed start character ('$') and end character ('#'), followed by two hexadecimal ASCII characters that serve as the checksum. If you have time to kill you can look up the rest. In truth, these protocols, like all kinds of odd rules in society, were dreamed up by a bunch of "experts" sitting on the toilet.

In the second article, on Lua, we will implement a similar remote debugging prototype. Its communication protocol is also string-based: a directly simplified version of HTTP, which is clear and convenient to use.

III. GDB debugging commands

For the sake of completeness, some GDB debugging commands are posted here, just to give a general impression.

Not all commands are listed; the ones shown are commonly used and easy to understand. When explaining Lua, we will pick some of them for a detailed comparison, including the underlying implementation mechanism.

Each debugging command has many options. For breakpoints, for example, you can set, delete, temporarily disable and re-enable them, or make them conditional. The focus of this article is the underlying debugging mechanism of gdb, so the application-level usage of these commands is not listed here. There are plenty of resources about it online.

IV. The relationship between GDB and the debugged program

For the convenience of description, first write the simplest C program:

#include <stdio.h>

int main(int argc, char *argv[])
{
    int a = 1;
    int b = 2;
    int c = a + b;
    printf("c = %d \n", c);
    return 0;
}

Compile command:

$ gcc -g test.c -o test

We debug the executable program test and enter the command:

$ gdb ./test

The output is as follows:

In the last line, you can see that the cursor is blinking. This is the gdb program waiting for us to issue debugging commands to it.

While that dark terminal window was executing gdb ./test, quite a lot happened inside the operating system:

The system first starts the gdb process. This process calls the system call fork() to create a child process, and the child process does two things:

  1. It calls the system call ptrace(PTRACE_TRACEME, [other parameters]);
  2. It loads and executes the executable program test through one of the exec functions, so the test program starts running in this child process.

One point to add: this text sometimes says "program" and sometimes "process". "Program" is a static concept, a pile of data lying on the hard disk, while "process" is a dynamic one: after the program is read and loaded into memory, a task control block (a data structure) is created specifically to manage this process.

After all this groundwork, it is finally time for the protagonist to appear: the system call ptrace (its parameters are explained later). It is ptrace that gives gdb its powerful debugging capabilities. The function prototype is:

#include <sys/ptrace.h>
long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);

Let's take a look at the introduction to this function in man:

The tracer is the debugging program, i.e. gdb; the tracee is the debugged program, which corresponds to the target program test in the figure. English often uses the suffixes -er and -ee to express an active/passive relationship. For example, an employer is the boss who hires, and an employee is the hard-pressed hired worker.

ptrace is a system call for process tracing provided by the Linux kernel. Through it, one process (gdb) can read and write the instruction space, data space, stack, and register values of another process (test). In addition, the gdb process takes over the signals of the test process: signals that the system sends to test (with the exception of SIGKILL) are reported to gdb first. In this way, the execution of the test process is controlled by gdb, which is what makes debugging possible.

In other words, without gdb the operating system interacts directly with the target process. When the program is being debugged with gdb, a signal sent by the operating system to the target process is intercepted by gdb, and gdb decides, according to the attributes of the signal, whether to pass the intercepted signal on to the target program when it resumes it. The target program then acts on whatever signal gdb delivers.

V. How GDB debugs a service process that is already running

Some readers may ask: the program test above was started under gdb from the beginning, so can gdb also debug a service process that is already running? The answer is yes. This involves the first parameter of the ptrace system call. The parameter is an enumerated value, and two of its values are important here: PTRACE_TRACEME and PTRACE_ATTACH.

In the explanation above, the child process calls ptrace with PTRACE_TRACEME. Note that it is the child process that makes this call. In effect the child tells the operating system: the gdb process is my parent, so from now on, whenever you have a signal for me, send it to the gdb process instead!

If you want to debug a process B that is already running, the gdb process must call ptrace(PTRACE_ATTACH, [other parameters]). The gdb process then attaches (binds) to process B and, for tracing purposes, adopts B as its own child; from B's side the effect is as if it had performed a PTRACE_TRACEME operation itself. At the same time a SIGSTOP signal is sent to process B. On receiving SIGSTOP, B suspends execution and enters the TASK_STOPPED state, which means it is ready to be debugged.

Therefore, whether you are debugging a newly started program or a service program that is already running, the end result of the ptrace system call is the same: the gdb program is the parent process, the debugged program is the child process, all signals of the child process are taken over by the parent process gdb, and gdb can view and modify the internal state of the child process, including its stack, registers, and so on.

Regarding attaching, there are a few restrictions to be aware of: a process may not attach to itself, the same process may not be attached more than once, and process 1 may not be attached.

VI. A peek at how GDB implements breakpoints

That is all for the principle. Now let's set a breakpoint (break) and take a peek at the internal debugging mechanism of gdb.
We still use the code above as the example, reposted here:

#include <stdio.h>

int main(int argc, char *argv[])
{
    int a = 1;
    int b = 2;
    int c = a + b;
    printf("c = %d \n", c);
    return 0;
}

Let's take a look at the assembly code the compiler generates for it, using the command:

gcc -S test.c; cat test.s

Only part of the assembly code is shown here; as long as it is enough to explain the underlying principle, it serves our purpose.

As mentioned above, after executing gdb ./test, gdb will fork a child process. This child process first calls ptrace and then executes the test program, so that the debugging environment is ready.

We put the source code and assembly code together for easy understanding:

Enter the breakpoint command "break 5" in the debug window. gdb then does two things:

  1. It saves line 10 of the assembly code, which corresponds to line 5 of the source code, in the breakpoint linked list.
  2. It inserts the interrupt instruction INT3 at line 10 of the assembly code; that is, line 10 of the assembly code is replaced with INT3.

Then enter the command "run" in the debug window (execute until a breakpoint is hit, then pause). When the PC (program counter, the internal pointer to the next instruction to be executed) reaches line 10 of the assembly code, the CPU finds an INT3 instruction there and traps, so the operating system sends a SIGTRAP signal to the test process.

At this moment, line 10 of the assembly code has been executed, and the PC points to line 11.

As mentioned above, any signal the operating system sends to test is taken over by gdb, so gdb receives the SIGTRAP signal first. gdb sees that the instruction just executed is line 10 of the assembly code and looks it up in the breakpoint linked list. It finds line 10 stored there, which means a breakpoint was set on that line. So gdb performs two more operations:

  1. Replace the 10th line "INT3" in the assembly code with the original code in the breakpoint list.
  2. Take the PC pointer back one step, that is, set it to point to line 10.

Then, gdb continues to wait for the user's debugging instructions.

At this moment, the next instruction to be executed is line 10 of the assembly code, which is line 5 of the source code. From the debugger user's point of view, the program being debugged is paused at the breakpoint on line 5. We can now enter other debugging commands, for example to view variable values, view stack information, or modify the values of local variables.

VII. A peek at how GDB implements the single-step command next

Take the source code and assembly code just now as an example, assuming that the program stops at the 6th line of the source code, that is, the 11th line of the assembly code:

Enter the single-step command next in the debugging window. Our goal is to execute one line of code: finish executing line 6 of the source code and then stop at line 7. When gdb receives the next command, it works out that line 7 of the source code corresponds to line 14 of the assembly code. So gdb lets the program run until line 13 of the assembly code has finished, that is, it stops when the PC points to line 14, and then goes back to waiting for the user's next debugging command.

VIII. Summary

Through the two debugging commands break and next, we have seen how gdb handles debugging commands internally. Of course, gdb has many more commands, including more complicated ones such as retrieving stack information and modifying variable values. Interested readers can keep digging.

When I write the debugging library for Lua later, I will discuss this topic in more depth and detail; after all, Lua is smaller and simpler. I will also show the code that sets the PC inside the Lua implementation, so that we get a better understanding and grasp of how a programming language works internally. I may also record a video to better explain the internal details of Lua.


If this article can bring you a little help, welcome to comment, forward, and share with your friends.

I will keep sharing hands-on experience from embedded project development on the official account "IOT Internet of Things Town". I believe you will not be disappointed!

Origin blog.csdn.net/u012296253/article/details/111150497