Article source: http://www.cnblogs.com/cy568searchx/archive/2013/10/28/3391790.html
Your software stops serving requests at some point, and its CPU usage climbs to 100% or more. One possible cause is an infinite loop. Assuming there is a latent infinite loop somewhere in the program that is only triggered under certain conditions, this article walks through an example of locating where the loop occurs.
When a program contains an infinite loop, the usual way to locate the problem and narrow the scope is to add logging to the suspicious code or to comment it out. This works well for problems that are easy to reproduce, but "occasional" failures are hard to debug this way because we can rarely reproduce them on demand. The debugging process described here targets exactly that case: the problem has already occurred and the scene must be preserved, that is, the faulty program is still running.
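To practice the steps below without a real faulty service, a small stand-in program can help. The following is a minimal sketch, not the article's icdn program: the names busyLoop and spinFor are hypothetical, and the stop flag exists only so the sketch can shut down cleanly, which a real runaway loop would not do.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the faulty worker: spins until asked to stop.
// A real runaway loop has no such exit; the flag is only for the sketch.
unsigned long busyLoop(const std::atomic<bool>& stop)
{
    unsigned long n = 0;
    while (!stop.load(std::memory_order_relaxed)) {
        ++n;  // burns CPU, like the loop located later in the article
    }
    return n;
}

// Runs busyLoop on a worker thread for the given duration while the
// calling thread sleeps, then returns the number of iterations performed.
unsigned long spinFor(std::chrono::milliseconds duration)
{
    std::atomic<bool> stop(false);
    unsigned long iterations = 0;
    std::thread worker([&] { iterations = busyLoop(stop); });
    std::this_thread::sleep_for(duration);  // idle thread, as in a healthy service
    stop.store(true);
    worker.join();
    return iterations;
}
```

Calling spinFor(std::chrono::minutes(10)) from main and building with g++ -g -pthread should leave enough time to inspect the process: the worker thread spins while the main thread sleeps, and -g keeps the source line information that gdb needs.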
1. First, we need to know which thread has the problem:
Find the PID of the problematic process, for example:
ovtsvn   11065     1 50 11:57 ?        00:00:07 ./icdn
ovtsvn   11076 10971  0 11:57 pts/2    00:00:00 grep
ovtsvn@ovtsvn:~/MASS4/src/icdn/src$
Then use the top command to view the per-thread information:
top -H -p 11065
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
11073 ovtsvn    25   0  325m 3980 2236 R  100  0.4   1:40.84 icdn
11065 ovtsvn    18   0  325m 3980 2236 S    0  0.4   0:00.01 icdn
11066 ovtsvn    18   0  325m 3980 2236 S    0  0.4   0:00.00 icdn
11067 ovtsvn    15   0  325m 3980 2236 S    0  0.4   0:00.00 icdn
The output shows that the thread using 100% of the CPU has PID 11073. With -H, top lists threads, so this is the thread's id (its LWP), not the PID of the process.
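The per-thread CPU time that top -H displays can also be read directly from /proc on Linux, which is handy when the check has to be scripted. The following is a minimal sketch under that assumption; the function names cpuTicksFromStatLine and cpuTicksOfThread are hypothetical, and the field positions follow the documented layout of /proc/<pid>/stat (utime is field 14, stime is field 15).

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Parses one line in the format of /proc/<pid>/task/<tid>/stat and returns
// utime + stime in clock ticks, or -1 if the line is malformed.
long cpuTicksFromStatLine(const std::string& line)
{
    // Field 2 (the command name) is wrapped in parentheses and may contain
    // spaces, so parsing starts after the last ')'.
    std::string::size_type pos = line.rfind(')');
    if (pos == std::string::npos)
        return -1;

    std::istringstream in(line.substr(pos + 1));
    std::string skipped;
    // Skip fields 3..13 (state through cmajflt): 11 tokens.
    for (int i = 0; i < 11; ++i) {
        if (!(in >> skipped))
            return -1;
    }
    long utime = 0, stime = 0;  // fields 14 and 15
    if (!(in >> utime >> stime))
        return -1;
    return utime + stime;
}

// Reads the stat file of one thread of a process (Linux only).
// Returns -1 if the file cannot be read or parsed.
long cpuTicksOfThread(int pid, int tid)
{
    std::ostringstream path;
    path << "/proc/" << pid << "/task/" << tid << "/stat";
    std::ifstream file(path.str().c_str());
    std::string line;
    if (!std::getline(file, line))
        return -1;
    return cpuTicksFromStatLine(line);
}
```

Sampling cpuTicksOfThread twice for each thread, a second or so apart, and comparing the differences points at the thread that is burning CPU; in the example above that would be cpuTicksOfThread(11065, 11073).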
2. Next, we use gdb to attach to the target process
Run: gdb icdn 11065
In gdb, list the status of the threads:
(gdb) info threads
  9 Thread 47056948181264 (LWP 11066)  0x00002acc4a3dec91 in nanosleep () from /lib/libc.so.6
  8 Thread 47056956573968 (LWP 11067)  0x00002acc4a406fc2 in select () from /lib/libc.so.6
  7 Thread 47056964966672 (LWP 11068)  0x00002acc4a3dec91 in nanosleep () from /lib/libc.so.6
  6 Thread 47056973359376 (LWP 11069)  0x00002acc4a3dec91 in nanosleep () from /lib/libc.so.6
  5 Thread 47056981752080 (LWP 11070)  0x00002acc4a3dec91 in nanosleep () from /lib/libc.so.6
  4 Thread 47056990144784 (LWP 11071)  0x00002acc4a40e63c in recvfrom () from /lib/libc.so.6
  3 Thread 47057194060048 (LWP 11072)  0x00002acc4a406fc2 in select () from /lib/libc.so.6
  2 Thread 47057226893584 (LWP 11073)  CSendFile::SendFile (this=0x2acc5d4aff40, pathname=@0x2acc5d4afee0) at ../src/csendfile.cpp:101
  1 Thread 47056939784832 (LWP 11065)  0x00002acc4a3dec91 in nanosleep () from /lib/libc.so.6
(gdb)
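gdb has listed the function each thread is currently executing, but we need more information. Note the number at the start of the line for LWP 11073: this is the id gdb assigned to the thread, here 2. Then switch to that thread: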
(gdb) thread 2
[Switching to thread 2 (Thread 47057226893584 (LWP 11073))]#0  CSendFile::SendFile (this=0x2acc5d4aff40, pathname=@0x2acc5d4afee0) at ../src/csendfile.cpp:101
101         while(1)
(gdb)
Print the backtrace with bt:
(gdb) bt
#0  CSendFile::SendFile (this=0x2acc5d4aff40, pathname=@0x2acc5d4afee0) at ../src/csendfile.cpp:101
#1  0x000000000040592e in CIcdn::TaskThread (pParam=0x7fff617eafe0) at ../src/cicdn.cpp:128
#2  0x00002acc4a90b73a in start_thread () from /lib/libpthread.so.0
#3  0x00002acc4a40d6dd in clone () from /lib/libc.so.6
#4  0x0000000000000000 in ?? ()
Let's look at the code around line 101:
(gdb) l
96      }
97
98      int CSendFile::SendFile(const string& pathname)
99      {
100         int n;
101         while(1)
102         {
103             n++;
104         }
105         //read file and send
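We have now located the code where the problem occurs. The loop here is only for demonstration.
Finally, don't forget to detach gdb from the process (the detach command) so that it can continue running.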