Start GDB debugging

There are generally three ways to use GDB to debug a program:

  • gdb filename
  • gdb attach pid
  • gdb filename corename

These correspond to the three topics of this lesson:

  • Directly debug the target program
  • Attach to a running process
  • Debug a core file

We cover each in turn.

Directly debug the target program

During development, or when studying someone else's project, once the target binary has been built you can run gdb filename to start debugging it, where filename is the name of the executable to debug. Strictly speaking, this does not start the program: GDB only loads the executable file, and you still need to enter the run command to actually run it. The run command is covered in detail in a later lesson. The hello_server debugging examples in the previous lesson use this method.

Suppose there is a program called fileserver. Use gdb fileserver to load the program, and then use the run command to start it.

Attach to a running process

In some cases a program is already running and we want to debug it without restarting it. Suppose our chat test server has been running for a while and we find that it can no longer accept new client connections. The program must not be restarted at this point, because restarting it would lose its current state. Instead, use gdb attach pid to attach the GDB debugger to the running process (gdb -p pid is the equivalent option form). For example, if the chat program is called chatserver, use the ps command to obtain the PID of the process, and then use gdb attach to debug it:

[zhangyl@iZ238vnojlyZ flamingoserver]$ ps -ef | grep chatserver
zhangyl  21462 21414  0 18:00 pts/2    00:00:00 grep --color=auto chatserver
zhangyl  26621     1  5 Oct10 ?        2-17:54:42 ./chatserver -d

From the output above, the PID of chatserver is 26621. Use gdb attach 26621 to attach GDB to the chatserver process. The command and its output are as follows:

[zhangyl@localhost flamingoserver]$ gdb attach 26621
Attaching to process 26621
Reading symbols from /home/zhangyl/flamingoserver/chatserver...done.
Reading symbols from /usr/lib64/mysql/libmysqlclient.so.18...Reading symbols from /usr/lib64/mysql/libmysqlclient.so.18...(no debugging symbols found)...done.
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[New LWP 42931]
[New LWP 42930]
[New LWP 42929]
[New LWP 42928]
[New LWP 42927]
[New LWP 42926]
[New LWP 42925]
[New LWP 42924]
[New LWP 42922]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.

To save space, some irrelevant output has been removed from the listing above. The message "Attaching to process 26621" means that GDB has successfully attached to the target process. Note that the program uses some system libraries (such as libc.so). Because this is a release build of the Linux distribution, these libraries have no debugging symbols, so GDB reports that it cannot find them. Our goal is to debug chatserver, not the internals of the system API calls, so these messages can be ignored as long as the chatserver binary itself has debugging information.

Once GDB has attached to the target process, the process is paused. You can then use the continue command to let it resume, or set breakpoints first and then continue (do not worry if commands such as continue are unfamiliar; they are described in detail later).

When you have finished debugging and want to end the session without affecting the chatserver process, that is, you want the program to keep running, enter the detach command at the GDB prompt to separate the program from the debugger so that chatserver can continue to run:

(gdb) detach
Detaching from program: /home/zhangyl/flamingoserver/chatserver, process 42921

Then exit GDB:

(gdb) quit
[zhangyl@localhost flamingoserver]$

Debug a core file

Sometimes a server program crashes unexpectedly after running for a while, and the cause has to be found. As long as a core file is generated when the program crashes, you can use that core file to locate the cause of the crash. However, on many Linux systems core file generation is disabled by default. Use the ulimit -c command to check whether it is enabled.

The ulimit command reports other limits as well, such as the maximum number of open file descriptors; ulimit -a lists them all. Since the other limits are unrelated to this lesson, they are not discussed here.

[zhangyl@localhost flamingoserver]$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15045
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

The core file size line is 0 by default, which means core file generation is disabled. A limit can be changed with ulimit followed by the option and the new value. For example, the core file size can be set to a specific maximum (in the units shown in the listing, blocks). Here we use ulimit -c unlimited, where unlimited is the value of the -c option, to remove the size limit entirely:

[zhangyl@localhost flamingoserver]$ ulimit -c unlimited
[zhangyl@localhost flamingoserver]$ ulimit -a
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 15045
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 4096
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

This command is easy to remember once its two parts are kept apart: ulimit is the name of the command, and unlimited, which follows the -c option, is the option's value and means that the size is not limited. It can also be a specific number. Beginners often confuse the command name ulimit with the value unlimited; knowing what each one means avoids the confusion.

Another problem is that a limit changed this way applies only to the current shell session and the processes started from it. A new session starts with the default value of 0 again, so a server program started from it, which typically runs in the background as a daemon for a long time, cannot generate a core file if it crashes. That makes troubleshooting harder, so we want the setting to be permanent. One way to do this is to add the line "ulimit -c unlimited" as the last line of the /etc/profile file, which is read by login shells.
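Because the core limit is a per-process setting that child processes inherit, a server can also raise its own limit at startup instead of relying on the shell that launched it. The following is a minimal sketch of that idea, assuming a POSIX system that provides getrlimit and setrlimit. The helper names enable_core_dumps and print_core_limit are hypothetical and do not come from the servers discussed in this lesson.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper (not from the original servers): raise this process's
 * soft core-file limit to its hard limit, the programmatic counterpart of
 * running `ulimit -c` in the shell before starting the server.
 * Returns 0 on success, -1 on failure. */
int enable_core_dumps(void)
{
    struct rlimit lim;

    if (getrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("getrlimit");
        return -1;
    }

    /* An unprivileged process may raise the soft limit only up to the hard limit. */
    lim.rlim_cur = lim.rlim_max;

    if (setrlimit(RLIMIT_CORE, &lim) != 0) {
        perror("setrlimit");
        return -1;
    }
    return 0;
}

/* Print the current soft limit; RLIM_INFINITY corresponds to "unlimited". */
void print_core_limit(void)
{
    struct rlimit lim;
    int rc = getrlimit(RLIMIT_CORE, &lim);
    assert(rc == 0);
    (void) rc;

    if (lim.rlim_cur == RLIM_INFINITY)
        printf("core file size: unlimited\n");
    else
        printf("core file size: %llu bytes\n", (unsigned long long) lim.rlim_cur);
}
```

Calling enable_core_dumps() at the top of main would have roughly the same effect as running ulimit -c in the launching shell, up to the hard limit. If the hard limit itself is 0, the soft limit stays 0 and no core file can be written.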

Concrete example

In this setup, the generated core file is named core.pid by default. For example, if a program is running with process ID 16663, the core file generated when it crashes is named core.16663. Here is a specific example. I once found that msg_server had crashed on a server, leaving the following core file:

-rw------- 1 root root 10092544 Sep  9 15:14 core.21985

You can use this core.21985 file to troubleshoot the cause of the crash. The command to debug the core file is:

gdb filename corename

Here filename is the program name, msg_server, and corename is core.21985. Enter gdb msg_server core.21985 to start debugging:

[root@myaliyun msg_server]# gdb msg_server core.21985
Reading symbols from /root/teamtalkserver/src/msg_server/msg_server...done.
[New LWP 21985]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `./msg_server -d'.
Program terminated with signal 11, Segmentation fault.
#0  0x00000000004ceb1f in std::less<CMsgConn*>::operator() (this=0x2283878, __x=@0x7ffca83563a0: 0x2284430, __y=@0x51: <error reading variable>)
    at /usr/include/c++/4.8.2/bits/stl_function.h:235
235           { return __x < __y; }

The program crashed at line 235 of stl_function.h. Use the bt command (described in detail later) to view the call stack at the time of the crash; further analysis of the stack leads to the cause of the crash.

(gdb) bt
#0  0x00000000004ceb1f in std::less<CMsgConn*>::operator() (this=0x2283878, __x=@0x7ffca83563a0: 0x2284430, __y=@0x51: <error reading variable>)
    at /usr/include/c++/4.8.2/bits/stl_function.h:235
#1  0x00000000004cdd70 in std::_Rb_tree<CMsgConn*, CMsgConn*, std::_Identity<CMsgConn*>, std::less<CMsgConn*>, std::allocator<CMsgConn*> >::_M_get_insert_unique_pos
    (this=0x2283878, __k=@0x7ffca83563a0: 0x2284430) at /usr/include/c++/4.8.2/bits/stl_tree.h:1324
#2  0x00000000004cd18a in std::_Rb_tree<CMsgConn*, CMsgConn*, std::_Identity<CMsgConn*>, std::less<CMsgConn*>, std::allocator<CMsgConn*> >::_M_insert_unique<CMsgConn* const&> (this=0x2283878, __v=@0x7ffca83563a0: 0x2284430) at /usr/include/c++/4.8.2/bits/stl_tree.h:1377
#3  0x00000000004cc8bd in std::set<CMsgConn*, std::less<CMsgConn*>, std::allocator<CMsgConn*> >::insert (this=0x2283878, __x=@0x7ffca83563a0: 0x2284430)
    at /usr/include/c++/4.8.2/bits/stl_set.h:463
#4  0x00000000004cb011 in CImUser::AddUnValidateMsgConn (this=0x2283820, pMsgConn=0x2284430) at /root/teamtalkserver/src/msg_server/ImUser.h:42
#5  0x00000000004c64ae in CDBServConn::_HandleValidateResponse (this=0x227f6a0, pPdu=0x22860d0) at /root/teamtalkserver/src/msg_server/DBServConn.cpp:319
#6  0x00000000004c5e3d in CDBServConn::HandlePdu (this=0x227f6a0, pPdu=0x22860d0) at /root/teamtalkserver/src/msg_server/DBServConn.cpp:203
#7  0x00000000005022b3 in CImConn::OnRead (this=0x227f6a0) at /root/teamtalkserver/src/base/imconn.cpp:148
#8  0x0000000000501db3 in imconn_callback (callback_data=0x7f4b20 <g_db_server_conn_map>, msg=3 '\003', handle=8, pParam=0x0)
    at /root/teamtalkserver/src/base/imconn.cpp:47
#9  0x0000000000504025 in CBaseSocket::OnRead (this=0x227f820) at /root/teamtalkserver/src/base/BaseSocket.cpp:178
#10 0x0000000000502f8a in CEventDispatch::StartDispatch (this=0x2279990, wait_timeout=100) at /root/teamtalkserver/src/base/EventDispatch.cpp:386
#11 0x00000000004fddbe in netlib_eventloop (wait_timeout=100) at /root/teamtalkserver/src/base/netlib.cpp:160
#12 0x00000000004d18c2 in main (argc=2, argv=0x7ffca8359978) at /root/teamtalkserver/src/msg_server/msg_server.cpp:213
(gdb)

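Frames #0 to #3 are standard library code. Frame #4 (CImUser::AddUnValidateMsgConn at ImUser.h:42) is the first frame in the project's own code, so that is where to start looking for the cause of the problem.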
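To practice the core file workflow without waiting for a real crash, a program can terminate itself with the same signal. The sketch below is a hypothetical demo, not code from msg_server: it forks a child that raises SIGSEGV and lets the parent inspect how the child ended. The function names crash_child_and_wait and report_crash are made up for illustration, and WCOREDUMP is guarded because POSIX does not specify it.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo helper: fork a child that terminates with SIGSEGV
 * (signal 11 on Linux) and return the child's wait status. */
int crash_child_and_wait(void)
{
    int status = 0;
    pid_t pid = fork();
    assert(pid >= 0);

    if (pid == 0) {
        /* Child: restore the default action, then deliver SIGSEGV to itself.
         * The default action terminates the process and, when the core
         * limit allows it, writes a core file. */
        signal(SIGSEGV, SIG_DFL);
        raise(SIGSEGV);
        _exit(1); /* only reached if the signal did not terminate the child */
    }

    /* Parent: wait for the child and hand its status back to the caller. */
    pid_t waited = waitpid(pid, &status, 0);
    assert(waited == pid);
    (void) waited;
    return status;
}

/* Print how the child ended and whether the kernel reported a core dump. */
void report_crash(int status)
{
    if (!WIFSIGNALED(status)) {
        printf("child exited normally\n");
        return;
    }
    printf("child terminated by signal %d\n", WTERMSIG(status));
#ifdef WCOREDUMP /* not specified by POSIX, so guard its use */
    printf("core dumped: %s\n", WCOREDUMP(status) ? "yes" : "no (check ulimit -c)");
#endif
}
```

A main function that calls report_crash(crash_child_and_wait()) is enough to try this out. If a core file is produced, it can be opened with gdb followed by the program name and the core file name, as described above. raise is used here only to make the crash deterministic; a real crash such as the one in msg_server comes from an invalid memory access.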

Custom core file name

However, careful readers will notice a problem: the PID of a program can be obtained while it is running, but after it crashes, and especially when several programs crash at the same time, we cannot tell from the core file name which service a given PID belonged to. There are two ways to solve this problem:

  • Record the PID when the program starts
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Record this process's PID in xxserver.pid at startup. */
void writePid()
{
      FILE* f = fopen("xxserver.pid", "w");
      assert(f);
      /* pid_t is a signed integer type, so print it as a long. */
      fprintf(f, "%ld", (long) getpid());
      fclose(f);
}

Call the writePid function above when the program starts to record its PID in the xxserver.pid file. When the program crashes, the PID it was running with can be read from this file and matched against the PID suffix of the default core file name.

  • Customize the name and directory of the core file

/proc/sys/kernel/core_uses_pid controls whether the PID is appended as an extension to the name of the generated core file: the file contains 1 if it is appended and 0 otherwise. /proc/sys/kernel/core_pattern sets the location and name format of the generated core file. It can be modified as follows:

echo "/corefile/core-%e-%p-%t" > /proc/sys/kernel/core_pattern

The meaning of each parameter is as follows:

Parameter   Meaning
%p          insert pid into filename
%u          insert current uid into filename
%g          insert current gid into filename
%s          insert signal that caused the coredump into the filename
%t          insert UNIX time that the coredump occurred into filename
%h          insert hostname where the coredump happened into filename
%e          insert coredumping executable name into filename

Suppose the program is called test. We set the core file name used when it crashes as follows:

echo "/root/testcore/core-%e-%p-%t" > /proc/sys/kernel/core_pattern

A core file generated for test in the /root/testcore/ directory then has a name of the following form:

-rw-------. 1 root root 409600 Jan 14 13:54 core-test-13154-1547445291

Note that the user running the program must have write permission to the specified core file directory; otherwise the core file cannot be created.
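The way a pattern turns into a file name can be illustrated with a small sketch. The function below is a hypothetical illustration and not the kernel's implementation: it substitutes only %e, %p and %t, the three specifiers used in the example above, and copies everything else through unchanged. The name expand_core_pattern and its parameters are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustration only: expand %e (program name), %p (PID) and %t (UNIX time)
 * in a core_pattern-style string. Other characters, including specifiers
 * this sketch does not handle, are copied through unchanged.
 * Returns 0 on success, -1 if the result does not fit in out. */
int expand_core_pattern(const char *pattern, const char *exe, long pid,
                        long unix_time, char *out, size_t out_size)
{
    assert(pattern && exe && out && out_size > 0);
    size_t used = 0;
    out[0] = '\0';

    for (const char *p = pattern; *p != '\0'; ++p) {
        char piece[64];
        const char *text = piece;

        if (p[0] == '%' && p[1] == 'e') {
            text = exe;
            ++p; /* skip the specifier letter */
        } else if (p[0] == '%' && p[1] == 'p') {
            snprintf(piece, sizeof(piece), "%ld", pid);
            ++p;
        } else if (p[0] == '%' && p[1] == 't') {
            snprintf(piece, sizeof(piece), "%ld", unix_time);
            ++p;
        } else {
            piece[0] = *p; /* ordinary character */
            piece[1] = '\0';
        }

        size_t len = strlen(text);
        if (used + len >= out_size)
            return -1; /* no room for the text plus the terminator */
        memcpy(out + used, text, len + 1);
        used += len;
    }
    return 0;
}
```

Called with the pattern "/root/testcore/core-%e-%p-%t", the program name test, PID 13154 and time 1547445291, the sketch yields /root/testcore/core-test-13154-1547445291, which matches the file name in the listing above.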

Summary

This lesson introduced three ways to use GDB to debug a program. Understanding these three ways, and being comfortable with each, helps you choose the right debugging method when a problem comes up.

Origin blog.csdn.net/weixin_38293850/article/details/107975379