How gdb works: the ptrace system call


        First, the process states in Linux fall roughly into the following categories (this list comes from the manual page of the ps command):

  1. D (TASK_UNINTERRUPTIBLE), uninterruptible sleep, usually waiting for I/O.
  2. R (TASK_RUNNING), the process is running or runnable (on the run queue).
  3. S (TASK_INTERRUPTIBLE), interruptible sleep, waiting for an event.
  4. T (TASK_STOPPED), stopped by a job-control signal.
  5. t (TASK_TRACED), stopped by a debugger during tracing.
  6. W (TASK_PAGING), paging; not valid since the 2.6 kernel.
  7. X (TASK_DEAD – EXIT_DEAD), dead; the process is about to be destroyed.
  8. Z (TASK_DEAD – EXIT_ZOMBIE), zombie; the process has terminated but has not yet been reaped by its parent.

        Item 5 is the one we want to discuss: while gdb is debugging a program, the program is traced and shows the t state. (For the other process states, please look them up yourself.)
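To observe these state letters from a program rather than from ps, one option is to read them from /proc. The following is only a minimal sketch, assuming a Linux system with /proc mounted; the function name process_state is made up for this illustration.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: return the one-letter state of `pid` as reported
 * in /proc/<pid>/stat, or '?' if it cannot be read. */
char process_state(pid_t pid)
{
    char path[64], buf[512];
    snprintf(path, sizeof path, "/proc/%d/stat", (int)pid);

    FILE *f = fopen(path, "r");
    if (!f)
        return '?';
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';

    /* The line looks like "pid (comm) state ...". The command name may
     * itself contain ')', so search for the last one. */
    char *p = strrchr(buf, ')');
    if (!p || p[1] != ' ' || p[2] == '\0')
        return '?';
    return p[2];
}
```

Calling process_state(getpid()) from a running program returns R, since a process that is reading its own state is by definition running. Pointing it at the pid of a program stopped under gdb should show t.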

1 ptrace

         Prototype of ptrace system call:

long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);

      The meanings of the 4 parameters are:

  1. enum __ptrace_request request: indicates the command to be executed by ptrace.
  2. pid_t pid: indicates the process to be traced by ptrace.
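  3. void *addr: an address in the traced process's memory; its meaning depends on the request (for example, the address to read from or write to).
  4. void *data: request-specific data, such as the value to write, a buffer that receives the result, or a signal number to deliver.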
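To make the four parameters concrete, here is a small sketch that reads one word from a traced child's memory with PTRACE_PEEKDATA. It is an illustration under assumptions, not how gdb itself is written: the variable marker and the function peek_child_word are invented for the example, and the child is a fork of the same program so that marker has the same address in both processes.

```c
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical variable; after fork() it lives at the same address in the child. */
static long marker = 0x1234;

/* Fork a traced child and read one word of its memory at &marker.
 * Returns the word, or -1 on failure. */
long peek_child_word(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);              /* stop so the parent can inspect us */
        _exit(0);
    }

    int status;
    long word = -1;
    if (waitpid(pid, &status, 0) == pid && WIFSTOPPED(status)) {
        /* request = PTRACE_PEEKDATA, pid = the tracee, addr = where to read.
         * data is unused for this request: the word is the return value,
         * so errors have to be detected through errno. */
        errno = 0;
        word = ptrace(PTRACE_PEEKDATA, pid, &marker, NULL);
        if (errno != 0)
            word = -1;
        kill(pid, SIGKILL);          /* the example is done with the child */
        waitpid(pid, &status, 0);
    }
    return word;
}
```

With the value chosen above, peek_child_word() returns 0x1234. For a write request such as PTRACE_POKEDATA, the data argument would carry the word to store instead.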

        gdb works by using the ptrace system call to establish a trace relationship between itself and the debugged program. Every signal (except SIGKILL) sent to the debugged program (the traced thread) is intercepted: the traced process stops, the kernel marks its state as TASK_TRACED, and gdb is notified. gdb can then inspect and modify the stopped child process, and afterwards let it continue to run.

       ptrace is powerful enough that many commonly used tools, such as strace and gdb, are built on it. Next, we look at how gdb uses ptrace.
  The details are in the manual page (man ptrace). The main values of the request parameter are:

PTRACE_TRACEME: Called by the child process to indicate that it is to be traced by its parent. From then on, any signal delivered to this process (except SIGKILL), even one that it ignores, causes it to stop, and the parent process is notified of the stop through wait().

PTRACE_ATTACH: Attach to the specified process, making it a process traced by the current process. The kernel sends the traced process a SIGSTOP so that it stops, and from then on it behaves as if it had called PTRACE_TRACEME. Note that although the current process acts as the parent of the traced process for tracing purposes, getppid() in the traced process still returns the pid of its original parent.
        This explains the attach command of gdb. When you use attach in gdb to trace a given process or thread, gdb becomes the tracer of that process, the traced process behaves as if it had issued PTRACE_TRACEME, and gdb takes over control of it.

PTRACE_CONT: Resume the previously stopped child process. If data is nonzero, it is interpreted as the number of a signal to deliver to the child as it resumes; if data is zero, no signal is delivered.
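The interplay of PTRACE_TRACEME, wait() and PTRACE_CONT can be sketched in a few lines. This is a minimal illustration, not gdb's actual code; the function name run_traced_child and the exit code 7 are arbitrary choices for the example.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that asks to be traced, stop it once, resume it, and
 * return its exit code (or -1 if something unexpected happened). */
int run_traced_child(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* Child: request tracing by the parent, then stop itself. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);
        _exit(7);                    /* arbitrary exit code for the demo */
    }

    int status;
    /* Parent: wait() reports the stop instead of the child's exit. */
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;
    if (WSTOPSIG(status) != SIGSTOP) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    /* data == 0: resume without delivering the intercepted SIGSTOP. */
    ptrace(PTRACE_CONT, pid, NULL, NULL);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Here run_traced_child() returns 7: the SIGSTOP never takes effect in the child because the parent chose not to pass it on.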

2 Common gdb operations

       The most commonly used gdb operations are setting breakpoints and single-stepping. Next, let's analyze how they are implemented.

1. Establish a debugging relationship:

gdb can debug a program in two ways: by starting the program itself, or by attaching to an existing process. These correspond to the following two methods of establishing a debugging relationship:

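  1) fork: Use fork + execve to run the program under test. Before calling execve, the child process calls ptrace(PTRACE_TRACEME) to establish a trace relationship with its parent process (the debugger). After a successful execve, the kernel sends the child a SIGTRAP, so the debugger gains control before the new program starts running.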

  2) attach: The debugger calls ptrace(PTRACE_ATTACH, pid, ...) to establish a trace relationship between itself and the process whose process ID is pid. That is, PTRACE_ATTACH makes the debugger the tracer of the debugged program (ps still shows the original parent; the tracer appears in the TracerPid field of /proc/<pid>/status). A trace relationship established by attach can be released by calling ptrace(PTRACE_DETACH, pid, ...). Pay attention to permissions when attaching: for example, a non-root process cannot attach to a root process.
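The attach path can be sketched as follows. This is only an illustration under assumptions: the target is a child forked by the example itself (so that the attach is permitted even on systems that restrict attaching to unrelated processes), and the name attach_and_detach is made up.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Attach to an already running process, observe the stop that the attach
 * causes, and detach again. Returns the stop signal, or -1 on failure. */
int attach_and_detach(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        for (;;)
            pause();                 /* an ordinary process, unaware of tracing */
    }

    int status, sig = -1;
    if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == 0) {
        /* The attach sends SIGSTOP to the target; wait for that stop. */
        if (waitpid(pid, &status, 0) == pid && WIFSTOPPED(status))
            sig = WSTOPSIG(status);
        /* Release the trace relationship; the target resumes on its own. */
        ptrace(PTRACE_DETACH, pid, NULL, NULL);
    }

    kill(pid, SIGKILL);              /* clean up the demo process */
    waitpid(pid, &status, 0);
    return sig;
}
```

On success the function returns SIGSTOP, the signal that the attach itself generated.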

2. Breakpoint principle:

  1) A breakpoint is implemented by inserting a breakpoint instruction at the specified location. When the debugged program reaches the breakpoint, a SIGTRAP signal is generated. gdb intercepts the signal and checks whether a breakpoint was hit. If this SIGTRAP is a breakpoint hit, gdb waits for the user's next command; otherwise it lets the program continue.

  2) Setting a breakpoint: gdb first saves the original instruction at the location and then writes int 3 there. When execution reaches int 3, a software interrupt occurs and the kernel sends a SIGTRAP signal to the child process. Because the child is being traced, it stops and the parent process (gdb) is notified of the signal. gdb then puts the saved instruction back in place of int 3, moves the program counter back to the breakpoint address, and waits for the user to resume execution.

  3) Breakpoint hit determination: gdb stores all breakpoint locations in a linked list. To decide whether a SIGTRAP was generated by a breakpoint or is an unrelated signal, gdb compares the address where the debugged program stopped with the breakpoint locations in the list.

  4) Conditional breakpoints: The principle is the same as 3), except that after the breakpoint is hit, gdb additionally evaluates the condition. If the expression is true, the breakpoint is triggered; otherwise gdb silently resumes the program. Because the program stops and the condition is evaluated on every pass, a conditional breakpoint affects performance whether or not it ends up triggering. The x86 platform also supports hardware breakpoints: instead of writing int 3 into the code, the debugger stores the breakpoint address in one of the CPU's debug registers, and the CPU raises a debug exception (also reported as SIGTRAP) when execution reaches that address. For a breakpoint location that the program passes frequently, a hardware breakpoint avoids repeatedly patching and restoring the instruction, although the condition itself is still evaluated by gdb at each stop.
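The software breakpoint cycle described above (save the instruction, write int 3, catch SIGTRAP, check the address, restore and rewind) can be sketched roughly as follows. This is a simplified illustration, not gdb's implementation. It assumes x86-64 Linux, where int 3 is the single byte 0xCC and rip points one byte past it after the trap. The names target and breakpoint_demo are invented, and the "breakpoint list" is reduced to a single address.

```c
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/user.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical function the child calls; the breakpoint goes on its first byte. */
__attribute__((noinline)) static void target(void)
{
    __asm__ volatile("" ::: "memory");   /* keep the call from being optimized away */
}

/* Returns 1 if the child hit the breakpoint and then ran to a normal exit. */
int breakpoint_demo(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return 0;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);                  /* let the parent plant the breakpoint */
        target();
        _exit(0);
    }

    int status, hit = 0;
    uintptr_t addr = (uintptr_t)target;  /* same address in the forked child */
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return 0;

    /* Save the original word, then overwrite its lowest byte with int 3. */
    long orig = ptrace(PTRACE_PEEKTEXT, pid, (void *)addr, NULL);
    long patched = (orig & ~0xffL) | 0xcc;
    ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)patched);

    ptrace(PTRACE_CONT, pid, NULL, NULL);
    waitpid(pid, &status, 0);

    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
        struct user_regs_struct regs;
        ptrace(PTRACE_GETREGS, pid, NULL, &regs);
        /* Hit determination: rip is one byte past the int 3 we planted. */
        if (regs.rip - 1 == addr) {
            hit = 1;
            ptrace(PTRACE_POKETEXT, pid, (void *)addr, (void *)orig);
            regs.rip = addr;             /* re-execute the restored instruction */
            ptrace(PTRACE_SETREGS, pid, NULL, &regs);
        }
    }
    if (!hit) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return 0;
    }

    ptrace(PTRACE_CONT, pid, NULL, NULL);
    waitpid(pid, &status, 0);
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A real debugger would re-insert the breakpoint after stepping over the restored instruction so that it fires again on the next pass; the sketch removes it permanently to stay short.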

3. Single-step tracking principle:

This is the simplest case, because ptrace supports single-stepping directly: just call ptrace(PTRACE_SINGLESTEP, pid, ...). The traced process executes one instruction and then stops again with SIGTRAP.
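A short sketch of instruction-level stepping, assuming an architecture on which Linux supports PTRACE_SINGLESTEP (x86 does). The function name single_step_n is hypothetical; it steps a forked child n times and counts how many of those steps ended in the expected SIGTRAP stop.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Single-step a traced child `n` times. Returns the number of steps that
 * ended with a SIGTRAP stop, or -1 if the child could not be set up. */
int single_step_n(int n)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGSTOP);              /* hand control to the parent */
        for (;;) { }                 /* keep executing instructions */
    }

    int status, steps = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;

    for (int i = 0; i < n; i++) {
        /* Run exactly one instruction, then stop again. */
        if (ptrace(PTRACE_SINGLESTEP, pid, NULL, NULL) < 0)
            break;
        if (waitpid(pid, &status, 0) != pid)
            break;
        if (!WIFSTOPPED(status) || WSTOPSIG(status) != SIGTRAP)
            break;
        steps++;
    }

    kill(pid, SIGKILL);              /* the demo child never exits by itself */
    waitpid(pid, &status, 0);
    return steps;
}
```

For example, single_step_n(5) performs five steps and returns 5 when each of them stops with SIGTRAP.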

3 The basis of gdb debugging: signals

       gdb debugging is built on signals. Once the debugging relationship has been established by calling ptrace with PTRACE_TRACEME or PTRACE_ATTACH, any signal delivered to the target program is first intercepted by gdb.

       Therefore, gdb can process the signal first, and decide whether to deliver the signal to the target program according to the properties of the signal. 
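The decision whether to pass a signal on can be sketched like this. It is a minimal illustration, with SIGUSR1 chosen arbitrarily and the function name intercept_sigusr1 invented for the example; the tracer either forwards the intercepted signal through the data argument of PTRACE_CONT or suppresses it by passing 0.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child raises SIGUSR1, whose default action is to terminate it.
 * The tracer sees the signal first. If `deliver` is nonzero the signal is
 * forwarded, otherwise it is suppressed.
 * Returns 1 if the child was killed by SIGUSR1, 0 if it exited normally,
 * and -1 on any unexpected outcome. */
int intercept_sigusr1(int deliver)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        signal(SIGUSR1, SIG_DFL);    /* make sure the default action applies */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) < 0)
            _exit(127);
        raise(SIGUSR1);
        _exit(0);                    /* reached only if the signal was suppressed */
    }

    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;
    if (WSTOPSIG(status) != SIGUSR1) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        return -1;
    }

    /* The tracer decides: pass the signal number on, or pass 0 to drop it. */
    long sig = deliver ? SIGUSR1 : 0;
    ptrace(PTRACE_CONT, pid, NULL, (void *)sig);

    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (WIFSIGNALED(status) && WTERMSIG(status) == SIGUSR1)
        return 1;
    return WIFEXITED(status) ? 0 : -1;
}
```

intercept_sigusr1(0) returns 0 because the child never receives the signal, while intercept_sigusr1(1) returns 1 because the forwarded SIGUSR1 terminates it.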
  1. Setting breakpoints:
  Signals are the basis of breakpoints. After a breakpoint is set with the break command, gdb finds the address corresponding to the given location and writes the breakpoint instruction INT3 (0xCC) to that address.
  When the target program reaches this instruction, a SIGTRAP signal is raised and gdb intercepts it first. gdb then looks up the target program's current stop address in the breakpoint list it maintains. If the address is in the list, the stop is treated as a breakpoint hit.
  To suspend a running target program, gdb sends it the SIGSTOP signal.
  2. Single-step debugging with next:
  The next command implements single-stepping at the source level, that is, only one line of source is executed at a time. One line may correspond to multiple machine instructions. When next is executed, gdb calculates the address of the first instruction of the next line and then makes the target program stop at that address.

