Systems Programmer Growth Plan: Thinking Like a Machine (2)

Original: https://blog.csdn.net/absurd/article/details/4207357

Modified by ChrisZZ, 2019.10.06: same content, with code and output formatted as highlighted markdown code blocks.

Who called me: how backtrace is implemented

Displaying the function call chain (backtrace / call stack) is one of a debugger's essential features; in gdb, for example, the bt command shows the backtrace. When a program crashes, the call chain helps locate the root of the problem quickly. Understanding how it is implemented broadens your knowledge and lets you build your own backtrace when no debugger is available. More importantly, working out how backtrace is implemented is a lot of fun. Let's take a look.

glibc provides a backtrace function that returns the current call chain. Let's first look at how it is used; afterwards we will write our own version modeled on it.

#include <stdio.h>
#include <stdlib.h>
#include <execinfo.h> 

#define MAX_LEVEL 4 

static void test2()
{
    int i = 0;
    void* buffer[MAX_LEVEL] = {0}; 

    int size = backtrace(buffer, MAX_LEVEL); 

    for(i = 0; i < size; i++)
    {
        printf("called by %p\n", buffer[i]);
    } 

    return;
} 

static void test1()
{
    int a=0x11111111;
    int b=0x11111112; 

    test2();
    a = b; 

    return;
} 

static void test()
{
    int a=0x10000000;
    int b=0x10000002; 

    test1();
    a = b; 

    return;
} 

int main(int argc, char* argv[])
{
    test(); 

    return 0;
}

Compile and run it:

gcc -g -Wall bt_std.c -o bt_std
./bt_std

Output:

called by 0x8048440
called by 0x804848a
called by 0x80484ab
called by 0x80484c9

The output above is the list of caller addresses, which is not very intuitive for a programmer. glibc provides another function, backtrace_symbols, that converts these addresses to source locations (usually function names). It is not very helpful, though: without debugging information in particular, it yields almost nothing useful. Here we use another tool, addr2line, to convert the addresses to source locations:

Run:

./bt_std | awk '{print "addr2line "$3" -e bt_std"}' > t.sh; . t.sh; rm -f t.sh

Output:

/home/work/mine/sysprog/think-in-compway/backtrace/bt_std.c:12
/home/work/mine/sysprog/think-in-compway/backtrace/bt_std.c:28
/home/work/mine/sysprog/think-in-compway/backtrace/bt_std.c:39
/home/work/mine/sysprog/think-in-compway/backtrace/bt_std.c:48

How is backtrace implemented? On an x86 machine, the stack looks like this during a function call:

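---------------------------------------------
Argument N
Argument ...     The order in which arguments are pushed depends on the calling convention
Argument 3
Argument 2
Argument 1
---------------------------------------------
EIP        Address of the next instruction to run once this call completes
EBP        The caller's EBP is saved here; EBP is then set to point to the current top of the stack
----------------The new EBP points here---------------
Local variable 1
Local variable 2
Local variable 3
Local variable ...
Local variable 5
---------------------------------------------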

(Note: low addresses are at the bottom and high addresses at the top; the stack grows downward.)

When a function is called, its arguments are pushed onto the stack first. C pushes arguments from right to left: the last argument goes first, then the second to last, and so on, with the first argument pushed last.

Next, EIP and then EBP are pushed. The saved EIP is the address of the next instruction to run after this call completes, so it can be treated as approximately the address of the caller. EBP is the dividing line between the caller and the callee: above the line are the caller's local variables, the callee's arguments, the return address (EIP), and the previous frame's EBP; below the line are the callee's local variables.

Finally, execution enters the callee, which allocates space for its local variables. Different versions of gcc handle this differently. Older versions (such as gcc 3.4) place the first local variable at the highest address, followed by the second, and so on in order. Newer versions (such as gcc 4.3) reverse the layout: the last local variable is at the highest address, followed by the second to last, and so on.

To implement backtrace, we need to:

  1. Get the current function's EBP.
  2. Use EBP to get the caller's EIP.
  3. Use EBP to get the previous frame's EBP.
  4. Repeat this process until done.

We could obtain the current function's EBP with inline assembly, but here we avoid assembly and derive it from the address of a local variable instead. For code generated by gcc 3.4, the slot immediately above the first local variable is where the current function's EBP points. For code generated by gcc 4.3, it is the slot immediately above the last local variable.

With that background, let's implement our own backtrace:

#ifdef NEW_GCC
#define OFFSET 4
#else
#define OFFSET 0
#endif/*NEW_GCC*/ 

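/* Relies on 32-bit x86 with frame pointers: pointers are stored in int. */
int backtrace(void** buffer, int size)
{
    int  n = 0xfefefefe;   /* marker local; its address locates the frame */
    int* p = &n;
    int  i = 0;

    /* Just above the locals: the caller's saved EBP, then the return address. */
    int ebp = p[1 + OFFSET];
    int eip = p[2 + OFFSET];

    for(i = 0; i < size; i++)
    {
        buffer[i] = (void*)eip;   /* record this level's return address */
        p = (int*)ebp;            /* move up to the caller's frame */
        ebp = p[0];               /* saved EBP of the next level up */
        eip = p[1];               /* return address of the next level up */
    }

    return size;
}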

For older versions of gcc, OFFSET is defined as 0, so p + 1 is where EBP points, p[1] is the previous frame's EBP, and p[2] is the caller's EIP. The function has five int-sized local variables in total, so for newer versions of gcc OFFSET is defined as 4, and p + 5 is where EBP points. The loop then repeatedly fetches the previous frame's EBP and EIP until the EIPs of all callers have been collected, which gives us the backtrace.

Now let's test it with a complete program (bt.c):

#include <stdio.h> 

#define MAX_LEVEL 4
#ifdef NEW_GCC
#define OFFSET 4
#else
#define OFFSET 0
#endif/*NEW_GCC*/ 

int backtrace(void** buffer, int size)
{
    int  n = 0xfefefefe;
    int* p = &n;
    int  i = 0; 

    int ebp = p[1 + OFFSET];
    int eip = p[2 + OFFSET]; 

    for(i = 0; i < size; i++)
    {
        buffer[i] = (void*)eip;
        p = (int*)ebp;
        ebp = p[0];
        eip = p[1];
    } 

    return size;
} 

static void test2()
{
    int i = 0;
    void* buffer[MAX_LEVEL] = {0}; 

    backtrace(buffer, MAX_LEVEL); 

    for(i = 0; i < MAX_LEVEL; i++)
    {
        printf("called by %p\n", buffer[i]);
    } 

    return;
} 

static void test1()
{
    int a=0x11111111;
    int b=0x11111112; 

    test2();
    a = b; 

    return;
} 

static void test()
{
    int a=0x10000000;
    int b=0x10000002; 

    test1();
    a = b; 

    return;
} 

int main(int argc, char* argv[])
{
    test(); 

    return 0;
}

Write a simple Makefile:

CFLAGS=-g -Wall
all:
    gcc34 $(CFLAGS) bt.c -o bt34
    gcc $(CFLAGS) -DNEW_GCC  bt.c -o bt
    gcc $(CFLAGS) bt_std.c -o bt_std 

clean:
    rm -f bt bt34 bt_std

Compile and run:

make
./bt | awk '{print "addr2line "$3" -e bt"}' > t.sh; . t.sh;

Output:

/home/work/mine/sysprog/think-in-compway/backtrace/bt.c:37
/home/work/mine/sysprog/think-in-compway/backtrace/bt.c:51
/home/work/mine/sysprog/think-in-compway/backtrace/bt.c:62
/home/work/mine/sysprog/think-in-compway/backtrace/bt.c:71

For executables this method works fine. For shared libraries, however, addr2line cannot map these addresses to source locations. The reason is that addr2line looks up an address by its offset, while the addresses we print are absolute. Because the address at which a shared library is loaded is not fixed, we need the process's maps file to compute the offset.

From the process's maps file (/proc/<process ID>/maps) we can find where the shared library is loaded, for example:


00c5d000-00c5e000 r-xp 00000000 08:05 2129013 /home/work/mine/sysprog/think-in-compway/backtrace/libbt_so.so
00c5e000-00c5f000 rw-p 00000000 08:05 2129013 /home/work/mine/sysprog/think-in-compway/backtrace/libbt_so.so

The code segment of libbt_so.so is loaded at 0x00c5d000-0x00c5e000, and backtrace prints the following addresses:

called by 0xc5d4eb
called by 0xc5d535
called by 0xc5d556
called by 0x80484ca

Subtracting the load address from a printed address gives the offset. For example, 0xc5d4eb minus the load address 0x00c5d000 gives the offset 0x4eb, which we then pass to addr2line:

addr2line 0x4eb -f -s -e ./libbt_so.so

Output:

/home/work/mine/sysprog/think-in-compway/backtrace/bt_so.c:38

The data on the stack is very interesting. In the previous section, analyzing the stack showed us how variadic functions are implemented; in this section, analyzing the stack showed us how backtrace is implemented.

Origin www.cnblogs.com/zjutzz/p/11628807.html