Debugging segfaults with core dump and GDB on Linux

I often run into segfaults, and debugging them is laborious. Unit tests and basic testing catch some of them, but sometimes the crash happens in a production environment that has no basic development or testing tools, and that calls for debugging skills. I previously covered using strace for system-level tracing in "linux dynamic tracking artifact - Strace example introduction". Today I will show how to use core dump files and gdb to debug and trace applications.

segfault

A "segmentation fault" (segfault) occurs when a program tries to access memory that it is not allowed to access, or tries to use memory in a way that is not permitted. The main causes of a segmentation fault are:

1. Attempt to dereference a null pointer (you are not allowed to access memory address 0)

2. Attempt to dereference some other pointer to an address that is not mapped into your process's memory

3. A C++ vtable pointer is corrupted and points to the wrong place, which causes the program to try to execute some non-executable memory.

4. Other situations, such as unaligned memory accesses, may also segfault.
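To see the first cause in action without writing a C program, here is a minimal sketch in Python. It assumes Linux and a Python build with ctypes; the helper names run_null_deref and describe are made up for this illustration. A child interpreter reads from address 0, and the parent reports how the child died.

```python
import signal
import subprocess
import sys

# Child program: read a C string from address 0, i.e. dereference a null pointer.
CHILD_CODE = "import ctypes; ctypes.string_at(0)"


def run_null_deref() -> int:
    """Run the crashing snippet in a child interpreter and return its return code."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def describe(returncode: int) -> str:
    """Translate a subprocess return code into a readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports "killed by signal N" as -N.
        return f"killed by {signal.Signals(-returncode).name}"
    return f"exited with status {returncode}"


print(describe(run_null_deref()))
```

On a typical Linux system the child should be reported as killed by SIGSEGV. Running the crash in a child process keeps the parent alive, which is also how a shell sees a segfaulting program.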

core dump file

On Linux, when an application terminates abnormally or crashes, the kernel can dump the application's memory and related runtime state to disk for troubleshooting or debugging. This file is called a core dump file. It records the process's memory contents, the call stack of each thread, and the register state at the time of the crash, which helps developers and operators understand the environment and state when the failure occurred. Core dumps are therefore very valuable for troubleshooting and debugging.
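On Linux, core dump files are ELF files whose header type is ET_CORE. This gives a quick way to check whether a file found on disk really is a core dump before handing it to a debugger. The sketch below is only an illustration of that check; the function names is_elf_core and file_is_core are assumptions of this example, not part of any tool.

```python
import struct

ET_CORE = 4  # e_type value that the ELF format assigns to core files


def is_elf_core(header: bytes) -> bool:
    """Return True if the first 18 bytes of a file look like an ELF core header."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        return False
    # e_ident[5] (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field that follows the 16-byte e_ident array.
    (e_type,) = struct.unpack_from(byte_order + "H", header, 16)
    return e_type == ET_CORE


def file_is_core(path: str) -> bool:
    """Read just enough of the file at path to classify it."""
    with open(path, "rb") as f:
        return is_elf_core(f.read(18))
```

For example, file_is_core("/tmp/some-core-file") would return False for an ordinary executable or a text file, since their e_type (or magic bytes) differ.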

Debugging segfaults via valgrind

The easiest way to debug a segfault is to use valgrind. Run it like this:

valgrind -v app

An example of its output is as follows:

[Screenshot: example valgrind output]

It prints a stack trace for the application at the point of the fault. However, the information valgrind gives is limited. To dig deeper, you need a core dump file. Let's explore that next:

how to get core dump

As mentioned earlier, a core dump is a copy of the program's memory written to a file when the program fails. It is very useful when you need details about a specific program error.

When a program segfaults, the Linux kernel sometimes writes a core dump file to disk. Many people follow a tutorial step by step and still do not get the core dump they expected. The reason is that the default system settings usually disable core dumps, so no core dump file is generated.

If no core dump file is generated, follow the steps below to set it up:

1. Run ulimit -c unlimited in the Linux terminal

2. Run sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t (as root)

ulimit:

On Linux, ulimit -c sets the maximum size of a core dump. It is commonly 0 by default, in which case the kernel does not generate a core dump. In bash the value is in units of 1 KB. The limit is per process and is inherited by child processes, so ulimit -c affects the current shell and the programs started from it. We can view the limits of a specific process by running cat /proc/PID/limits.

For example, these are the size limits for a random nginx process on my system:

cat /proc/8854/limits (replace the PID with a process ID on your system; on mine it is 8854)

[Screenshot: output of cat /proc/8854/limits]

The kernel uses the soft limit to decide how large a core file may be (for the nginx process above, "max core file size" is 0). ulimit -c unlimited removes that limit, so the core dump file can grow as large as needed. We can also replace unlimited with a specific file size.
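The same soft and hard limits can be read and changed from inside a program, which is handy for a launcher that starts the application to be debugged. The following is a small sketch using Python's standard resource module; the helper names fmt and raise_core_soft_limit are invented for this example. Note that the system call works in bytes, unlike the shell's ulimit -c.

```python
import resource


def fmt(limit: int) -> str:
    """Render an rlimit value, using 'unlimited' the way ulimit does."""
    return "unlimited" if limit == resource.RLIM_INFINITY else f"{limit} bytes"


def raise_core_soft_limit() -> tuple:
    """Raise the soft core-size limit to the hard limit; return the new (soft, hard)."""
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may always move its soft limit up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


before = resource.getrlimit(resource.RLIMIT_CORE)
after = raise_core_soft_limit()
print("before: soft =", fmt(before[0]), "hard =", fmt(before[1]))
print("after:  soft =", fmt(after[0]), "hard =", fmt(after[1]))
```

Because the limit is inherited, any child process started after this call (for example with subprocess) gets the raised soft limit. If the hard limit itself is 0, only a privileged process can raise it.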

kernel.core_pattern

kernel.core_pattern is a kernel parameter, configured through the sysctl command, to control the location and filename format where the Linux kernel writes core dumps to disk.

We can list all kernel parameters and their current values by running sysctl -a, or use sysctl kernel.core_pattern to view only the value of kernel.core_pattern.

[Screenshot: output of sysctl kernel.core_pattern]

After sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t, core dumps are written to /tmp with a file name that starts with core- followed by values that identify the process: %e is the executable name, %p the PID, %h the hostname, and %t the time of the dump in seconds since the Epoch. For the full list of specifiers, see man core.
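To make the file name format concrete, here is a small sketch that expands such a template the way the specifiers are described in man core. It handles only %e, %p, %h, %t and %%; the function name expand_core_pattern, the hostname web01, and the timestamp are hypothetical values chosen for the illustration.

```python
import re


def expand_core_pattern(pattern: str, exe: str, pid: int, host: str, when: int) -> str:
    """Expand %e, %p, %h, %t and %% in a core_pattern template."""
    values = {"e": exe, "p": str(pid), "h": host, "t": str(when), "%": "%"}
    # Specifiers outside this small subset are left untouched in this sketch.
    return re.sub(r"%([epht%])", lambda m: values[m.group(1)], pattern)


print(expand_core_pattern("/tmp/core-%e.%p.%h.%t", "nginx", 8854, "web01", 1700000000))
# prints /tmp/core-nginx.8854.web01.1700000000
```

Including the PID and timestamp in the name keeps repeated crashes of the same program from overwriting each other's core files.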

kernel.core_pattern settings under Ubuntu

By default, on Ubuntu, the content of the kernel.core_pattern setting is:

sysctl kernel.core_pattern

kernel.core_pattern = |/usr/share/apport/apport %p %s %c %d %P

This used to confuse me: what is this program, and what does it do with my core dumps? So I looked it up and found out:

Ubuntu uses a system called "apport" to report crashes in programs installed from apt packages

Setting kernel.core_pattern = |/usr/share/apport/apport %p %s %c %d %P means that core dumps are piped to apport, and its log is /var/log/apport.log

By default, apport ignores crashes from binaries that are not part of Ubuntu packages, so apport.log will not contain core dump information for your own programs. To get the core dump, reset kernel.core_pattern with sysctl -w kernel.core_pattern=/tmp/core-%e.%p.%h.%t.
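A leading | in kernel.core_pattern is what tells the kernel to pipe the dump to a program instead of writing a file. The sketch below reads the current value from /proc/sys/kernel/core_pattern (the file behind the sysctl setting) and reports which case applies. The function name describe_core_pattern is an assumption of this example.

```python
from pathlib import Path

CORE_PATTERN_FILE = Path("/proc/sys/kernel/core_pattern")


def describe_core_pattern(pattern: str) -> str:
    """Say whether a core_pattern value names a file template or a pipe handler."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        # Everything after the bar is a command line; the first word is the program.
        parts = pattern[1:].split()
        handler = parts[0] if parts else "(no program given)"
        return f"piped to {handler}"
    return f"written using file template {pattern}"


try:
    print(describe_core_pattern(CORE_PATTERN_FILE.read_text()))
except OSError:
    print("core_pattern is not readable on this system")
```

If the result says the dump is piped to a handler such as apport, core files will not appear where you expect until the pattern is changed to a file template.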

Tracing with gdb

A core dump can be debugged with gdb, a powerful debugger on Linux. If you are not familiar with it, please look up an introduction first.

Open a core dump file with the following gdb command:

gdb -c my_core_file

Next, we want to know what the stack looked like when the program crashed. Running bt at the gdb prompt prints a stack trace. Because gdb was started with only the core file, it has not loaded any symbols for the binary, so every frame in the trace shows up as ??. As shown below:

[Screenshot: bt output in which every frame is ??]

In this case, we need to load the symbol table so that the trace displays properly. Run the following at the gdb prompt:

symbol-file <absolute path to the application's executable>

sharedlibrary

This loads symbols from the program binary and from the shared libraries it uses. Then run bt again, and gdb prints a stack trace with line numbers.

[Screenshot: bt output with function names and line numbers]

For this to work properly, compile the program with debug symbols (gcc -g). Line numbers in the stack trace are very useful when trying to figure out why a program crashed.

You can also view the stack of every thread in gdb with: thread apply all bt full
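The interactive steps above can also be run non-interactively, which is convenient on a server or in a script. The following sketch only builds the command line; it relies on gdb's --batch and -ex options and on passing the executable together with the core file so that gdb loads symbols itself. The function name gdb_backtrace_command and the paths are placeholders for this illustration.

```python
import shlex


def gdb_backtrace_command(executable: str, core_file: str, all_threads: bool = False) -> list:
    """Build a non-interactive gdb invocation that prints a backtrace and exits."""
    trace = "thread apply all bt full" if all_threads else "bt"
    # Passing the executable along with the core lets gdb load its symbols.
    return ["gdb", "--batch", "-ex", trace, executable, core_file]


cmd = gdb_backtrace_command("/path/to/app", "/tmp/core-app.1234.host.1700000000", all_threads=True)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd) on a machine where gdb is installed, or the printed line can be pasted into a shell.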

Other ways to debug segfaults

ASAN method

Another way to debug a segfault is AddressSanitizer ("ASAN"): compile the program with $CC -fsanitize=address and run it.

dmesg method

ldd method

nm method

objdump method (combined with dmesg to get the faulting address)

catchsegv method

Due to space limitations, this article does not describe these methods in detail. If you are interested, please leave a comment, and a dedicated article may cover them in the future.

Summary

Getting a stack trace from a core dump is fairly simple. To wrap up, here are the steps for getting a stack trace from a program that segfaults:

First consider using valgrind

If that doesn't work, or you want a core dump for debugging:

1. Make sure the binary is compiled with debug symbols

2. Correctly set ulimit and kernel.core_pattern

3. Run the program

4. Open your core dump with gdb, load the symbols, then run bt

5. Try to figure out what's going on!

