Linux stack traceback

Preface

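  In day-to-day development we often run into program crashes. A crash can have many causes; the usual approach is to locate the code involved first and then investigate the specific cause. Locating the code behind a bug usually depends on stack backtracing, which tells us where the program died.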

1. What is stack traceback?

  On Linux, a stack traceback (also called a backtrace or stack trace) is a technique for tracking function calls during program execution. When an error or exception occurs, it walks back from the current position toward the program's entry point, recording the calling relationship between functions and the return address of each call. A stack traceback helps developers locate a problem quickly. Linux provides several ways to obtain one:

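  1. backtrace function: Declared in the execinfo.h header, backtrace() captures the return addresses on the current thread's call stack, and backtrace_symbols() turns them into printable strings. backtrace_symbols() takes function names from the dynamic symbol table, so the program should be linked with -rdynamic; the -g option is not needed for this, although it lets tools such as addr2line map the printed addresses to source lines. Example: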
#include <stdio.h>
#include <execinfo.h>
#include <stdlib.h>

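void printStackTrace(void) {
  void *callstack[128];
  int i, frames;
  char **strs;

  /* Capture up to 128 return addresses from the current call stack. */
  frames = backtrace(callstack, 128);
  /* Translate the addresses into strings; the returned array is malloc'ed. */
  strs = backtrace_symbols(callstack, frames);
  if (strs == NULL) {
    perror("backtrace_symbols");
    return;
  }

  printf("Stack Trace:\n");
  for (i = 0; i < frames; i++) {
    printf("%s\n", strs[i]);
  }

  /* Only the array itself is freed, not the individual strings. */
  free(strs);
}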

void func1() {
  printf("Entering func1...\n");
  printStackTrace();
  printf("Exiting func1...\n");
}

void func2() {
  printf("Entering func2...\n");
  func1();
  printf("Exiting func2...\n");
}

void func3() {
  printf("Entering func3...\n");
  func2();
  printf("Exiting func3...\n");
}

int main() {
  printf("Entering main...\n");
  func3();
  printf("Exiting main...\n");

  return 0;
}

The compilation instructions are as follows:

gcc -rdynamic stacktrace.c -o stacktrace
./stacktrace

The -rdynamic option makes the linker export all of the executable's global symbols into the dynamic symbol table, which is where backtrace_symbols looks up function names. Without it, most frames that belong to the executable are printed as bare addresses.

  2. pstack command: This command prints the stack traceback of a specified process or thread. pstack relies on gdb (GNU Debugger), so gdb must be installed on the system; most Linux distributions ship it preinstalled or offer it through the package manager. Usage:
pstack <pid>
pstack -T <pid>

  3. gdb debugger: gdb can show the stack traceback at the point where the program crashes. Usage:
gdb <executable>
(gdb) run
<program crashes>
(gdb) bt

 No matter which method is used, stack tracebacks can provide valuable information about program crashes or exceptions, helping developers quickly locate the problem.

2. Implementation principle of stack backtracing

  On Linux ARM64 systems, the principle of implementing stack backtracing is similar to that of other architectures, mainly involving concepts such as registers, stack frames, and symbol tables.

  1. Registers: The ARM64 architecture has a set of general-purpose registers that hold function arguments, local variables, return values, and so on. The Program Counter (PC) holds the address of the current instruction and gives the backtrace its starting point. The return address of each call, however, comes from the Link Register (LR, x30) and from the copies of it that functions save on the stack, while the Frame Pointer (FP, x29) links the stack frames together.
	CPU registers are the key components the ARM64 architecture uses to store and process data. During computation they hold operands, intermediate results, and control information. The common ARM64 CPU registers are:

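1. General-Purpose Registers:
   - 31 64-bit general-purpose registers that hold integer data.
   - They are used for arithmetic, data transfer, passing function arguments, temporary storage, and so on.
   - They are named x0-x30; x29 usually serves as the frame pointer (FP) and x30 as the link register (LR).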

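2. Program Counter:
   - Holds the address of the instruction currently being executed.
   - Usually written `pc`; it is 64 bits wide. On ARM64 it is not one of the general-purpose registers, and most instructions cannot name it directly.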

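3. Flags Register:
   - Holds condition information about the result of an operation, such as whether it overflowed or whether two values compared equal.
   - On ARM64 the condition flags are N, Z, C, and V; they are part of the processor state (PSTATE) and are accessed through the NZCV system register. CPSR is the name of the corresponding register in 32-bit ARM.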

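4. Floating-Point Registers:
   - 32 128-bit registers used to hold floating-point values and perform floating-point arithmetic.
   - They are named v0-v31; each register can hold one 128-bit value or several lower-precision floating-point values.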

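5. SIMD Registers (Single Instruction, Multiple Data):
   - 32 128-bit SIMD registers used to hold data and perform SIMD operations.
   - On ARM64 the SIMD registers are called vector registers.
   - They are named v0-v31 and are the same physical registers as the floating-point registers; each can hold one 128-bit vector or several narrower elements.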

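	Besides the common registers above, ARM64 also has some special-purpose registers, such as the Stack Pointer register (SP).

	These registers play an important role during program execution: they process data, control program flow, pass arguments, and so on. When programming, use them sensibly to optimize performance and implement the required functionality.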

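	x0-x30 are the general-purpose registers of the ARM64 architecture, 31 in total, used to hold integer data and carry out all kinds of operations. They are described in detail below: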

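1. x0-x30:
   - These are general-purpose registers, each 64 bits wide.
   - In a function call, x0-x7 pass the function's arguments (x0 also carries the return value); once these registers are full, the remaining arguments are passed on the stack.
   - x8 carries the system call number on Linux; in ordinary function calls it holds the address for an indirectly returned result.
   - x9-x15 can be used as temporary registers (caller-saved).
   - x16-x17 are special-purpose scratch registers (IP0 and IP1) that may be overwritten between a call and the function it reaches, for example by linker-inserted veneers.
   - x18 is the platform register, reserved for platform-specific uses such as a pointer to global or thread-local data.
   - x19-x28 are callee-saved registers: a function that uses them to hold data must restore them before returning.
   - x29 (FP, Frame Pointer) is the frame pointer; it points to the frame record of the current function, where the caller's FP and the return address are saved.
   - x30 (LR, Link Register) holds the return address of the function call.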

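In general, the x0-x30 registers are used to hold data, perform arithmetic, pass function arguments, and control program flow. During function calls their use is governed by the calling convention, which helps generated code run efficiently. Developers should use these registers according to their needs and follow the convention to keep programs correct and fast.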
  2. Stack frames: During a function call, the function's local variables, return address, and other call-related information are saved in its stack frame. Each stack frame consists of a saved-register area (Saved Registers) and a local-variable area (Local Variables). The stack frame holds the return address, that is, the address in the caller at which execution resumes when the function returns.

  3. Stack backtracing algorithm: Stack backtracing is usually implemented recursively or iteratively. An iterative version looks like this:

    • Get the return address saved in the current stack frame.
    • Match the address against the executable's symbol table and debugging information to find the corresponding function name and line number.
    • Print or record the matched function name and line number.
    • Follow the saved frame pointer to the caller's stack frame and repeat the steps above, until the outermost function is reached or the configured maximum depth is hit.

  4. Symbol table: The symbol table is a mapping between function names and their addresses. During a stack traceback it is used to resolve an address to a function name, and debugging information adds details such as the line number. On Linux, this information is read from the symbol tables and debugging sections of ELF files, or from separate debug symbol files.

  It should be noted that the accuracy and readability of stack tracebacks are affected by the symbol table. If the executable file does not contain debugging symbol information, or the path to the symbol file is not set correctly, the stack traceback may only provide an address but cannot be resolved into a function name and line number. Therefore, when building an executable file, it is recommended to enable the generation of debugging symbol information and properly save the symbol table file.

3. Reference reading

ARM architecture: https://zhuanlan.zhihu.com/p/577979125?utm_id=0
dump_stack in the kernel: https://www.cnblogs.com/pengdonglin137/p/11109427.html
dump_stack: https://blog.csdn.net/weixin_52849254/article/details/130559085

  The dump_stack function is a debugging function in the Linux kernel that prints the call stack of the current function from kernel code. It is used to diagnose and debug kernel problems such as crashes and deadlocks.

  The dump_stack function is implemented in lib/dump_stack.c in the kernel source tree (a .c source file, not a header). Its call chain is roughly:

void dump_stack(void)
	__dump_stack();
		dump_stack_print_info(KERN_DEFAULT);
		show_stack(NULL, NULL); // arch/arm64/kernel/traps.c

  By calling the dump_stack function, the current function call stack information can be output in the kernel log. This is useful for locating problems, especially in the event of a kernel panic, where you can determine which function is causing the problem by looking at the stack trace information in the kernel log.

  To use the dump_stack function in kernel code, simply call it at the appropriate location. For example, when an error or exception is encountered in a driver, the dump_stack function can be called in the corresponding error handling path to get more information about the problem in the kernel log.

Note: I recommend a book on debugging techniques, "Debug Hacks Chinese Edition - In-depth Debugging Technologies and Tools"; it is very good.

Origin blog.csdn.net/weixin_45842280/article/details/132955659