Getting Started with gdb Debugging: A High-Quality Guide from an Expert

2016/11/23 · Development · Brendan Gregg, GDB, Debugging


This article was translated by 逆旅 for 伯乐在线 and proofread by 艾凌风. Do not republish without permission.
English source: Brendan Gregg.

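I did not expect an expert like Brendan Gregg to write a gdb tutorial like this one: gdb Debugging Full Example (Tutorial): ncurses. But perhaps, as the article says at the start, he was not satisfied with the gdb articles available online, and that is why we now have this high-quality guide, a real gift for gdb beginners. -- 何登成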

If you are a system administrator and do not know Brendan Gregg yet, you have probably still seen his three widely circulated Linux performance tools diagrams. -- 伯小乐

(Brendan Gregg)

gdb Debugging Full Example: ncurses

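I'm a little frustrated with finding "gdb examples" online that show the commands but not their output. gdb is the GNU Debugger, the standard debugger on Linux. I was reminded of this lack of example output when watching Greg Law's CppCon 2015 talk "Give me 15 minutes and I'll change your view of GDB", which, thankfully, includes output! It is well worth the 15 minutes.

It also inspired me to share a full gdb debugging example, with output and every step involved, including dead ends. This isn't a particularly interesting or exotic issue, just a routine gdb debugging session. But it covers the basics and can serve as a tutorial of sorts, bearing in mind that there is a lot more to gdb than I use here.

I'll be running the following commands as root, since I'm debugging a tool that needs root access (for now). Use sudo to get root privileges where needed. You also aren't expected to read through all of this: I've enumerated each step so you can browse them and find the ones of interest.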

1. The Problem

The bcc collection of BPF tools had a pull request for cachetop.py, which uses a top-like display to show page cache statistics by process. Great! However, when I tested it, I hit a segfault:

    # ./cachetop.py
    Segmentation fault

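Note that it says "Segmentation fault" and not "Segmentation fault (core dumped)". I'd like a core dump to debug this. (A core dump is a copy of process memory that can be investigated with a debugger; the name comes from the era of magnetic core memory.)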
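The difference between the two messages is recorded in the wait status that the parent process (here, the shell) receives. As a minimal sketch, assuming the Linux wait-status encoding, a parent could tell the two cases apart as follows. `describe_wait_status` is a hypothetical helper written for this illustration; it is not part of bcc or gdb.

```python
import os
import signal


def describe_wait_status(status):
    """Summarize a raw wait status such as the one returned by os.waitpid()."""
    if os.WIFSIGNALED(status):
        name = signal.Signals(os.WTERMSIG(status)).name
        # WCOREDUMP reports whether the core-dump flag is set in the status.
        suffix = " (core dumped)" if os.WCOREDUMP(status) else ""
        return "killed by " + name + suffix
    if os.WIFEXITED(status):
        return "exited with status %d" % os.WEXITSTATUS(status)
    return "other status 0x%x" % status


if __name__ == "__main__":
    # Hand-built statuses (assumption: Linux layout, where the low 7 bits hold
    # the signal number and bit 0x80 is the core-dump flag).
    print(describe_wait_status(int(signal.SIGSEGV)))
    print(describe_wait_status(int(signal.SIGSEGV) | 0x80))
```

With core dumps disabled, as in the run above, only the first form can occur.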

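Core dump analysis is one approach to debugging, but not the only one. I could run the program live in gdb to inspect the issue. I could also use an external tracer to grab data and stack traces on segfault events. We'll start with core dumps.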

2. Fixing Core Dumps

I'll check the core dump settings:

    # ulimit -c
    0
    # cat /proc/sys/kernel/core_pattern
    core

ulimit -c shows the maximum size of core dump files, and here it is zero: core dumps are disabled (for this process and its children).
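The same limit can be inspected from inside a program. The sketch below uses Python's standard `resource` module, where `RLIMIT_CORE` is the limit that `ulimit -c` reports. The helper names `core_dumps_enabled` and `raise_core_limit` are made up for this example, and raising the soft limit only up to the hard limit is an assumption chosen so that the code does not need extra privileges.

```python
import resource


def core_dumps_enabled(soft_limit):
    """Return True if a soft RLIMIT_CORE value allows a core file to be written."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit > 0


def raise_core_limit():
    """Raise the soft core limit to the hard limit for this process and its children."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    print("soft:", soft, "hard:", hard, "enabled:", core_dumps_enabled(soft))
```

A soft limit of 0, as in the output above, makes `core_dumps_enabled` return False.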

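/proc/…/core_pattern is simply set to "core", which means a core dump file named "core" is written to the current directory. That would be fine for now, but I'll show how to set it up for a global location:

    # ulimit -c unlimited
    # mkdir /var/cores
    # echo "/var/cores/core.%e.%p" > /proc/sys/kernel/core_pattern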

You can customize core_pattern further; for example, %h is the hostname and %t is the time of the dump. The options are documented in the Linux kernel source, in Documentation/sysctl/kernel.txt.
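To make the template concrete, here is a rough Python sketch of how a pattern such as "/var/cores/core.%e.%p" turns into a file name. It only models the specifiers mentioned in this article (%e, %p, %h, %t) plus a literal %%. The function name and default arguments are assumptions for illustration, and the real expansion is done by the kernel, not by code like this.

```python
import re


def expand_core_pattern(pattern, exe, pid, host="localhost", ts=0):
    """Approximate core_pattern expansion for a handful of specifiers."""
    values = {
        "e": exe,        # executable file name
        "p": str(pid),   # PID of the dumped process
        "h": host,       # hostname
        "t": str(ts),    # time of the dump, in seconds since the Epoch
        "%": "%",        # "%%" yields a literal percent sign
    }
    # Specifiers that are not modeled here expand to nothing in this sketch.
    return re.sub(r"%(.?)", lambda m: values.get(m.group(1), ""), pattern)


if __name__ == "__main__":
    print(expand_core_pattern("/var/cores/core.%e.%p", "python", 30520))
```

For a process named python with PID 30520, this yields the same path that appears in the directory listing later in this section.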

To make the core_pattern setting permanent, so that it survives reboots, you can set "kernel.core_pattern" in /etc/sysctl.conf.

Trying again:

    # ./cachetop.py
    Segmentation fault (core dumped)
    # ls -lh /var/cores
    total 19M
    -rw------- 1 root root 20M Aug  7 22:15 core.python.30520
    # file /var/cores/core.python.30520
    /var/cores/core.python.30520: ELF 64-bit LSB core file x86-64, version 1 (SYSV), SVR4-style, from 'python ./cachetop.py'

That's better: we have our core dump.

3. Starting GDB

Now I'll run gdb with the location of the target program (using shell substitution, "`", although you should specify the full path unless you're sure that will work) and the core dump file:

    # gdb `which python` /var/cores/core.python.30520
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    Copyright (C) 2016 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    .
    Find the GDB manual and other documentation resources online at:
    .
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from /usr/bin/python...(no debugging symbols found)...done.
    warning: core file may not match specified executable file.
    [New LWP 30520]
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    warning: JITed object file architecture unknown is not compatible with target architecture i386:x86-64.
    Core was generated by `python ./cachetop.py'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  0x00007f0a37aac40d in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5

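The last two lines are especially interesting: they tell us that the segmentation fault happened in the doupdate() function of the libncursesw library. It's worth a quick web search in case this is a well-known issue. I searched, but did not find a single common cause.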

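I already have a good guess as to what libncursesw is. If it is unfamiliar to you: it lives under "/lib" and its name ends in ".so.*", which shows that it is a shared library, and it may have a man page, a website, a package description, and so on.

    # dpkg -l | grep libncursesw
    ii  libncursesw5:amd64                  6.0+20160213-1ubuntu1                    amd64
    shared libraries for terminal handling (wide character support)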

I happen to be debugging this on Ubuntu, but the choice of Linux distribution shouldn't matter for gdb usage.
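The gdb invocation in this section can also be scripted. The sketch below builds an argument list for a non-interactive run, using gdb's `-batch` and `-ex` options. `build_gdb_command` is a hypothetical helper, and the fallback path in the example is only a placeholder assumption.

```python
import shutil


def build_gdb_command(executable, core_file, commands):
    """Build an argv list that runs gdb non-interactively against a core file."""
    argv = ["gdb", "-batch"]
    for command in commands:
        argv += ["-ex", command]  # each -ex runs one gdb command, in order
    return argv + [executable, core_file]


if __name__ == "__main__":
    # shutil.which("python") plays the role of `which python` in the shell.
    python_path = shutil.which("python") or "/usr/bin/python"
    print(build_gdb_command(python_path, "/var/cores/core.python.30520", ["bt"]))
```

The resulting list can be passed to `subprocess.run` to capture gdb's output as text, which is convenient when many core files need the same first look.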

4. Back Trace

Stack back traces show how we arrived at the point of failure, and are often enough to help identify a common problem. bt (short for backtrace) is usually the first command I use in a gdb session:

    (gdb) bt
    #0  0x00007f0a37aac40d in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
    #1  0x00007f0a37aa07e6 in wrefresh () from /lib/x86_64-linux-gnu/libncursesw.so.5
    #2  0x00007f0a37a99616 in ?? () from /lib/x86_64-linux-gnu/libncursesw.so.5
    #3  0x00007f0a37a9a325 in wgetch () from /lib/x86_64-linux-gnu/libncursesw.so.5
    #4  0x00007f0a37cc6ec3 in ?? () from /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
    #5  0x00000000004c4d5a in PyEval_EvalFrameEx ()
    #6  0x00000000004c2e05 in PyEval_EvalCodeEx ()
    #7  0x00000000004def08 in ?? ()
    #8  0x00000000004b1153 in PyObject_Call ()
    #9  0x00000000004c73ec in PyEval_EvalFrameEx ()
    #10 0x00000000004c2e05 in PyEval_EvalCodeEx ()
    #11 0x00000000004caf42 in PyEval_EvalFrameEx ()
    #12 0x00000000004c2e05 in PyEval_EvalCodeEx ()
    #13 0x00000000004c2ba9 in PyEval_EvalCode ()
    #14 0x00000000004f20ef in ?? ()
    #15 0x00000000004eca72 in PyRun_FileExFlags ()
    #16 0x00000000004eb1f1 in PyRun_SimpleFileExFlags ()
    #17 0x000000000049e18a in Py_Main ()
    #18 0x00007f0a3be10830 in __libc_start_main (main=0x49daf0 <main>, argc=2, argv=0x7ffd33d94838, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
        stack_end=0x7ffd33d94828) at ../csu/libc-start.c:291
    #19 0x000000000049da19 in _start ()

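Read from the bottom up, to go from parent to child. The "??" entries are where symbol translation failed. Stack walking, which produces the stack trace, can also fail. In that case you will likely see a single valid frame followed by a small number of bogus addresses. If symbols or stacks are too badly broken to make sense of the stack trace, there are usually ways to fix that: installing debug info packages (which give gdb more symbols and let it do DWARF-based stack walks), or recompiling the software from source with frame pointers and debugging information (-fno-omit-frame-pointer -g). Most of the "??" entries above can be fixed by installing the python-dbg package.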
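When a back trace is long, it can help to list which frames lack symbols and which object they belong to, since that hints at which debug package to install. The following is a small sketch that scans `bt` output for "??" frames. The regular expression is an assumption fitted to the frame lines shown in this article, not a general parser for gdb output, and `SAMPLE` is just an excerpt of those lines.

```python
import re

# Matches lines like: "#2  0x00007f0a37a99616 in ?? () from /lib/.../libncursesw.so.5"
FRAME_RE = re.compile(
    r"^#(?P<index>\d+)\s+(?:0x[0-9a-f]+ in )?(?P<func>\S+) \(.*?\)"
    r"(?: from (?P<lib>\S+))?"
)

SAMPLE = """\
#1  0x00007f0a37aa07e6 in wrefresh () from /lib/x86_64-linux-gnu/libncursesw.so.5
#2  0x00007f0a37a99616 in ?? () from /lib/x86_64-linux-gnu/libncursesw.so.5
#7  0x00000000004def08 in ?? ()
"""


def unresolved_frames(backtrace_text):
    """Return (frame index, library or None) for every frame printed as '??'."""
    missing = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and match.group("func") == "??":
            missing.append((int(match.group("index")), match.group("lib")))
    return missing


if __name__ == "__main__":
    for index, lib in unresolved_frames(SAMPLE):
        print(index, lib or "(main executable)")
```

Frames without a "from" clause belong to the main executable, which here is the Python interpreter itself.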

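This stack doesn't look very helpful: frames 5 to 17 (the indexes on the left) are Python internals, although we can't see the Python methods yet. Frame 4 is the _curses library, and then we are in libncursesw. The call path looks like wgetch()->wrefresh()->doupdate(). Going by the function names, I would guess this is a window refresh. Why would that core dump?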

5. Disassembly

I'll start by disassembling the function we segfaulted in, doupdate():

    (gdb) disas doupdate
    Dump of assembler code for function doupdate:
       0x00007f0a37aac2e0 <+0>:     push   %r15
       0x00007f0a37aac2e2 <+2>:     push   %r14
       0x00007f0a37aac2e4 <+4>:     push   %r13
       0x00007f0a37aac2e6 <+6>:     push   %r12
       0x00007f0a37aac2e8 <+8>:     push   %rbp
       0x00007f0a37aac2e9 <+9>:     push   %rbx
       0x00007f0a37aac2ea <+10>:    sub    $0xc8,%rsp
    [...]
    ---Type <return> to continue, or q <return> to quit---
    [...]
       0x00007f0a37aac3f7 <+279>:   cmpb   $0x0,0x21(%rcx)
       0x00007f0a37aac3fb <+283>:   je     0x7f0a37aacc3b <doupdate+2395>
       0x00007f0a37aac401 <+289>:   mov    0x20cb68(%rip),%rax        # 0x7f0a37cb8f70
       0x00007f0a37aac408 <+296>:   mov    (%rax),%rsi
       0x00007f0a37aac40b <+299>:   xor    %eax,%eax
    => 0x00007f0a37aac40d <+301>:   mov    0x10(%rsi),%rdi
       0x00007f0a37aac411 <+305>:   cmpb   $0x0,0x1c(%rdi)
       0x00007f0a37aac415 <+309>:   jne    0x7f0a37aac6f7 <doupdate+1047>
       0x00007f0a37aac41b <+315>:   movswl 0x4(%rcx),%ecx
       0x00007f0a37aac41f <+319>:   movswl 0x74(%rdx),%edi
       0x00007f0a37aac423 <+323>:   mov    %rax,0x40(%rsp)
    [...]

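Output truncated. (I could also have typed just "disas", and it would have defaulted to doupdate.)

The "=>" points to the address of our segfault, which is a mov instruction, mov 0x10(%rsi),%rdi: it reads from the memory address held in the %rsi register plus an offset of 0x10, and moves that value into the %rdi register. I'll check the state of the registers next.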

6. Check Registers

Print the register values using i r (short for info registers):

    (gdb) i r
    rax            0x0                 0
    rbx            0x1993060           26816608
    rcx            0x19902a0           26804896
    rdx            0x19ce7d0           27060176
    rsi            0x0                 0
    rdi            0x19ce7d0           27060176
    rbp            0x7f0a3848eb10      0x7f0a3848eb10 <SP>
    rsp            0x7ffd33d93c00      0x7ffd33d93c00
    r8             0x7f0a37cb93e0      139681862489056
    r9             0x0                 0
    r10            0x8                 8
    r11            0x202               514
    r12            0x0                 0
    r13            0x0                 0
    r14            0x7f0a3848eb10      139681870703376
    r15            0x19ce7d0           27060176
    rip            0x7f0a37aac40d      0x7f0a37aac40d <doupdate+301>
    eflags         0x10246             [ PF ZF IF RF ]
    cs             0x33                51
    ss             0x2b                43
    ds             0x0                 0
    es             0x0                 0
    fs             0x0                 0
    gs             0x0                 0

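Well, %rsi is zero. There's our problem! Zero is unlikely to be a valid address, and this type of segfault is a common software bug: dereferencing an uninitialized or NULL pointer.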
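To spell out the arithmetic: the faulting instruction reads memory at the value of %rsi plus 0x10, and with %rsi equal to zero that is address 0x10. The sketch below computes this for simple AT&T-syntax operands of the form disp(%reg). `effective_address` is a hypothetical helper that deliberately ignores index and scale operands.

```python
import re

# Accepts "0x10(%rsi)" or "(%rax)"; index/scale forms are out of scope here.
OPERAND_RE = re.compile(r"^(?P<disp>-?0x[0-9a-f]+)?\(%(?P<base>\w+)\)$")


def effective_address(operand, registers):
    """Compute base register + displacement for a simple AT&T memory operand."""
    match = OPERAND_RE.match(operand)
    if match is None:
        raise ValueError("unsupported operand: " + operand)
    displacement = int(match.group("disp") or "0", 16)
    return registers[match.group("base")] + displacement


if __name__ == "__main__":
    # Register values taken from the "i r" output in this section.
    registers = {"rsi": 0x0, "rcx": 0x19902A0}
    print(hex(effective_address("0x10(%rsi)", registers)))
```

An address that small is far below any mapped region of this process, which is why the access faults.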

7. Memory Mappings

You can double-check whether zero is a valid address using i proc m (short for info proc mappings):

    (gdb) i proc m
    Mapped address spaces:

              Start Addr           End Addr       Size     Offset objfile
                0x400000           0x6e7000   0x2e7000        0x0 /usr/bin/python2.7
                0x8e6000           0x8e8000     0x2000   0x2e6000 /usr/bin/python2.7
                0x8e8000           0x95f000    0x77000   0x2e8000 /usr/bin/python2.7
          0x7f0a37a8b000     0x7f0a37ab8000    0x2d000        0x0 /lib/x86_64-linux-gnu/libncursesw.so.5.9
          0x7f0a37ab8000     0x7f0a37cb8000   0x200000    0x2d000 /lib/x86_64-linux-gnu/libncursesw.so.5.9
          0x7f0a37cb8000     0x7f0a37cb9000     0x1000    0x2d000 /lib/x86_64-linux-gnu/libncursesw.so.5.9
          0x7f0a37cb9000     0x7f0a37cba000     0x1000    0x2e000 /lib/x86_64-linux-gnu/libncursesw.so.5.9
          0x7f0a37cba000     0x7f0a37ccd000    0x13000        0x0 /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
          0x7f0a37ccd000     0x7f0a37ecc000   0x1ff000    0x13000 /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
          0x7f0a37ecc000     0x7f0a37ecd000     0x1000    0x12000 /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
          0x7f0a37ecd000     0x7f0a37ecf000     0x2000    0x13000 /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
          0x7f0a38050000     0x7f0a38066000    0x16000        0x0 /lib/x86_64-linux-gnu/libgcc_s.so.1
          0x7f0a38066000     0x7f0a38265000   0x1ff000    0x16000 /lib/x86_64-linux-gnu/libgcc_s.so.1
          0x7f0a38265000     0x7f0a38266000     0x1000    0x15000 /lib/x86_64-linux-gnu/libgcc_s.so.1
          0x7f0a38266000     0x7f0a3828b000    0x25000        0x0 /lib/x86_64-linux-gnu/libtinfo.so.5.9
          0x7f0a3828b000     0x7f0a3848a000   0x1ff000    0x25000 /lib/x86_64-linux-gnu/libtinfo.so.5.9
    [...]

The first valid virtual address is 0x400000. Any address below it is invalid and, if referenced, will trigger a segmentation fault.
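The check done by eye above amounts to a range lookup. As a sketch, with a few rows copied from the mappings output (the `MAPPINGS` list is deliberately incomplete and `find_mapping` is a name invented for this example):

```python
# (start, end, objfile) rows copied from the "info proc mappings" output above.
# Only a few rows are included, so this table is illustrative, not complete.
MAPPINGS = [
    (0x400000, 0x6E7000, "/usr/bin/python2.7"),
    (0x8E6000, 0x8E8000, "/usr/bin/python2.7"),
    (0x7F0A37A8B000, 0x7F0A37AB8000, "/lib/x86_64-linux-gnu/libncursesw.so.5.9"),
    (0x7F0A37CB8000, 0x7F0A37CB9000, "/lib/x86_64-linux-gnu/libncursesw.so.5.9"),
]


def find_mapping(address, mappings=MAPPINGS):
    """Return the objfile whose [start, end) range contains address, else None."""
    for start, end, objfile in mappings:
        if start <= address < end:
            return objfile
    return None


if __name__ == "__main__":
    print(find_mapping(0x10))             # the address the faulting mov reads
    print(find_mapping(0x7F0A37AAC40D))   # the faulting instruction itself
```

The instruction address resolves to libncursesw, while 0x10 falls in no mapping at all.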

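At this point there are several different ways to dig further. I'll start with some instruction stepping.

8. Breakpoints

Back to the disassembly: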

 
      
       
        
       0x00007f0a37aac401 <+289>:   mov    0x20cb68(%rip),%rax        # 0x7f0a37cb8f70
       0x00007f0a37aac408 <+296>:   mov    (%rax),%rsi
       0x00007f0a37aac40b <+299>:   xor    %eax,%eax
    => 0x00007f0a37aac40d <+301>:   mov    0x10(%rsi),%rdi

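Looking at these four instructions: the first loads a value into %rax from a %rip-relative location (gdb's comment resolves it to 0x7f0a37cb8f70), the second dereferences %rax into %rsi, the third sets %eax to zero (the xor is an optimization, instead of doing a mov of $0), and the last dereferences %rsi with an offset, although we know %rsi is zero. This sequence is for walking data structures. Maybe %rax would be interesting, but it has been set to zero by the prior instruction, so we can't see its value in the core dump register state.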
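The address arithmetic behind these operands can be checked by hand. The sketch below uses only numbers from the disassembly above; the helper names are made up for illustration. It relies on the x86-64 rule that a %rip-relative operand is computed from the address of the next instruction.

```python
# Addresses and displacements copied from the disassembly above.
NEXT_INSN = 0x00007F0A37AAC408   # address of the instruction after doupdate+289
DISPLACEMENT = 0x20CB68          # from `mov 0x20cb68(%rip),%rax`


def rip_relative(next_insn, disp):
    """x86-64 RIP-relative operand: next instruction address plus displacement."""
    return next_insn + disp


def base_plus_offset(base, offset):
    """Effective address of an operand such as 0x10(%rsi)."""
    return base + offset


slot = rip_relative(NEXT_INSN, DISPLACEMENT)
print(hex(slot))  # same value as gdb's "# 0x7f0a37cb8f70" comment

fault_addr = base_plus_offset(0x0, 0x10)  # %rsi is zero at the crash
print(hex(fault_addr), fault_addr < 0x400000)
```

With %rsi at zero, the last instruction reads from address 0x10, far below the first valid address of 0x400000, which is consistent with the segfault.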

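I can set a breakpoint on doupdate+289, then single-step through each instruction to see how the registers are set and change. First, I need to launch gdb so that we're executing the program live: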

 
    # gdb `which python`
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    Copyright (C) 2016 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from /usr/bin/python...(no debugging symbols found)...done.

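Now to set the breakpoint on doupdate+289 using b (short for break).

Oops: gdb rejects it, because the library's symbols have not been loaded yet. I wanted to show this error to explain why we often start out with a breakpoint on main, at which point the symbols are likely loaded, and only then set the real breakpoint of interest. Here I'll go straight to the doupdate function entry, run the program, and set the offset breakpoint once it hits the function: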

 
    (gdb) b doupdate
    Function "doupdate" not defined.
    Make breakpoint pending on future shared library load? (y or [n]) y
    Breakpoint 1 (doupdate) pending.
    (gdb) r cachetop.py
    Starting program: /usr/bin/python cachetop.py
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    warning: JITed object file architecture unknown is not compatible with target architecture i386:x86-64.

    Breakpoint 1, 0x00007ffff34ad2e0 in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
    (gdb) b *doupdate + 289
    Breakpoint 2 at 0x7ffff34ad401
    (gdb) c
    Continuing.

    Breakpoint 2, 0x00007ffff34ad401 in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5

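We've arrived at our breakpoint.

If you haven't done this before: the r (run) command takes arguments that are passed to the gdb target we specified earlier on the command line (python). So this ends up running "python cachetop.py".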
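The same breakpoint sequence could also be scripted instead of typed. The Python sketch below only builds the gdb command line and prints it; it does not run gdb, and I have not run the resulting command against this crash. The function name `build_gdb_argv` is made up for this example. It uses gdb's `-batch`, `-ex`, and `--args` options, and `set breakpoint pending on` so that the pending-breakpoint question is not asked.

```python
import shlex


def build_gdb_argv(program, script, offset):
    """Build a gdb command line that replays the breakpoint steps non-interactively."""
    commands = [
        "set breakpoint pending on",    # accept the pending breakpoint without a y/n prompt
        "break doupdate",               # resolved once the shared library is loaded
        "run",                          # starts: program script (see --args below)
        f"break *doupdate + {offset}",  # the symbol is known by now, so the offset form works
        "continue",
        "info registers",
    ]
    argv = ["gdb", "-batch"]
    for cmd in commands:
        argv += ["-ex", cmd]
    # --args goes last: everything after it is the program and its arguments.
    argv += ["--args", program, script]
    return argv


argv = build_gdb_argv("/usr/bin/python", "cachetop.py", 289)
print(" ".join(shlex.quote(a) for a in argv))
```

Passing the script through `--args` plays the same role as typing `r cachetop.py` at the prompt: the argument ends up on the python command line.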

9. Stepping

I'll step one instruction (si, short for stepi), then inspect the registers:

 
    (gdb) si
    0x00007ffff34ad408 in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
    (gdb) i r
    rax            0x7ffff3e8f948   140737285519688
    rbx            0xaea060         11444320
    rcx            0xae72a0         11432608
    rdx            0xa403d0         10748880
    rsi            0x7ffff7ea8e10   140737352732176
    rdi            0xa403d0         10748880
    rbp            0x7ffff3e8fb10   0x7ffff3e8fb10 <SP>
    rsp            0x7fffffffd390   0x7fffffffd390
    r8             0x7ffff36ba3e0   140737277305824
    r9             0x0              0
    r10            0x8              8
    r11            0x202            514
    r12            0x0              0
    r13            0x0              0
    r14            0x7ffff3e8fb10   140737285520144
    r15            0xa403d0         10748880
    rip            0x7ffff34ad408   0x7ffff34ad408 <doupdate+296>
    eflags         0x202            [ IF ]
    cs             0x33             51
    ss             0x2b             43
    ds             0x0              0
    es             0x0              0
    fs             0x0              0
    gs             0x0              0
    (gdb) p/a 0x7ffff3e8f948
    $1 = 0x7ffff3e8f948 <cur_term>

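Another clue. So the NULL pointer we're dereferencing looks like it's in a symbol called "cur_term" (p/a is short for print/a, where "/a" means format as an address). Given this is ncurses, is our TERM environment variable set to something odd?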

 
    # echo $TERM
    xterm-256color

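I tried setting that to vt100 and running the program, but it hit the same segfault.

Note that I've inspected just the first invocation of doupdate(), but it could be called multiple times, and the issue may be a later invocation. I can step through each one by running c (short for continue). That is fine if it's only called a few times, but if it's called a few thousand times I'll want a different approach. (I'll get back to this in section 15.)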
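As an aside on the `<doupdate+296>` and `<cur_term>` annotations seen in this section: they amount to finding the nearest symbol at or below an address and printing the distance to it. The sketch below is a simplification under stated assumptions. It knows only the two symbol addresses that appear in the session output, it ignores symbol sizes (which a real debugger also consults), and the name `symbolize` is made up for this example.

```python
import bisect

# Symbol start addresses taken from the session output above.
SYMBOLS = sorted([
    (0x7FFFF34AD2E0, "doupdate"),  # from "Breakpoint 1, 0x00007ffff34ad2e0 in doupdate ()"
    (0x7FFFF3E8F948, "cur_term"),  # from "p/a 0x7ffff3e8f948"
])
STARTS = [addr for addr, _ in SYMBOLS]


def symbolize(addr):
    """Format addr as <symbol+offset> using the nearest preceding symbol."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return hex(addr)  # below every known symbol
    start, name = SYMBOLS[i]
    offset = addr - start
    return f"<{name}+{offset}>" if offset else f"<{name}>"


for addr in (0x7FFFF34AD401, 0x7FFFF34AD408, 0x7FFFF3E8F948):
    print(hex(addr), symbolize(addr))
```

The offsets it computes (289 and 296) agree with the breakpoint and register output shown earlier, which is a handy cross-check that the live session and the core dump disassembly refer to the same instructions despite the different load addresses.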

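10. Reverse Stepping

gdb has a great feature called reverse stepping, which Greg Law included in his talk. Here's an example.

I'll start a python session again, to show this from the beginning: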

 
    # gdb `which python`
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    Copyright (C) 2016 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    Type "show configuration" for configuration details.
    For bug reporting instructions, please see:
    <http://www.gnu.org/software/gdb/bugs/>.
    Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.
    For help, type "help".
    Type "apropos word" to search for commands related to "word"...
    Reading symbols from /usr/bin/python...(no debugging symbols found)...done.

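As before, I'll set a breakpoint on doupdate. Once it's hit, I'll enable recording, then continue the program and let it crash. Recording adds considerable overhead, so I don't want to turn it on from main.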

 
    (gdb) b doupdate
    Function "doupdate" not defined.
    Make breakpoint pending on future shared library load? (y or [n]) y
    Breakpoint 1 (doupdate) pending.
    (gdb) r cachetop.py
    Starting program: /usr/bin/python cachetop.py
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    warning: JITed object file architecture unknown is not compatible with target architecture i386:x86-64.

    Breakpoint 1, 0x00007ffff34ad2e0 in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
    (gdb) record
    (gdb) c
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    0x00007ffff34ad40d in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5

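At this point I can reverse-step through lines or instructions. It works by playing back register state from our recording. I'll move back in time two instructions, then print the registers: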

 
    (gdb) reverse-stepi
    0x00007ffff34ad40d in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
    (gdb) reverse-stepi
    0x00007ffff34ad40b in doupdate () from /lib/x86_64-linux-gnu/libncursesw.so.5
    (gdb) i r
    rax            0x7ffff3e8f948   140737285519688
    rbx            0xaea060         11444320
    rcx            0xae72a0         11432608
    rdx            0xa403d0         10748880
    rsi            0x0              0
    rdi            0xa403d0         10748880
    rbp            0x7ffff3e8fb10   0x7ffff3e8fb10 <SP>
    rsp            0x7fffffffd390   0x7fffffffd390
    r8             0x7ffff36ba3e0   140737277305824
    r9             0x0              0
    r10            0x8              8
    r11            0x302            770
    r12            0x0              0
    r13            0x0              0
    r14            0x7ffff3e8fb10   140737285520144
    r15            0xa403d0         10748880
    rip            0x7ffff34ad40b   0x7ffff34ad40b <doupdate+299>
    eflags         0x202            [ IF ]
    cs             0x33             51
    ss             0x2b             43
    ds             0x0              0
    es             0x0              0
    fs             0x0              0
    gs             0x0              0
    (gdb) p/a 0x7ffff3e8f948
    $1 = 0x7ffff3e8f948 <cur_term>
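The idea of playing back recorded register state can be pictured with a toy model. The Python sketch below is not how gdb implements recording. It is a minimal illustration that snapshots a register dictionary before each completed instruction and restores the latest snapshot on a reverse step. The class and function names are made up, the initial %rsi and %rdi values are arbitrary placeholders, and the assumption that cur_term holds NULL is taken from the crash analysis above.

```python
CUR_TERM = 0x7FFFF3E8F948        # address of the cur_term symbol, from the p/a output
memory = {CUR_TERM: 0x0}         # assumption: cur_term holds NULL, as the crash suggests


class Fault(Exception):
    """Raised when the toy machine reads an address that is not mapped."""


def load(addr):
    if addr not in memory:
        raise Fault(hex(addr))
    return memory[addr]


# doupdate offsets mapped to (effect, next offset); each effect returns register updates.
PROGRAM = {
    296: (lambda r: {"rsi": load(r["rax"])}, 299),          # mov (%rax),%rsi
    299: (lambda r: {"rax": 0}, 301),                       # xor %eax,%eax
    301: (lambda r: {"rdi": load(r["rsi"] + 0x10)}, None),  # mov 0x10(%rsi),%rdi (faults here)
}


class Recorder:
    def __init__(self, regs):
        self.regs = dict(regs)
        self.history = []                     # one snapshot per completed instruction

    def stepi(self):
        effect, next_pc = PROGRAM[self.regs["pc"]]
        updates = effect(self.regs)           # may raise Fault before anything changes
        self.history.append(dict(self.regs))  # remember the state before this instruction
        self.regs.update(updates)
        self.regs["pc"] = next_pc

    def reverse_stepi(self):
        self.regs = self.history.pop()        # restore the state before the last instruction


# Initial rsi and rdi are arbitrary placeholders for this illustration.
m = Recorder({"pc": 296, "rax": CUR_TERM, "rsi": 0x1234, "rdi": 0x5678})
try:
    while True:
        m.stepi()
except Fault as e:
    print("fault reading", e, "at pc", m.regs["pc"], "rax", hex(m.regs["rax"]))

m.reverse_stepi()                             # undo the xor that zeroed rax
print("pc", m.regs["pc"], "rax", hex(m.regs["rax"]), "rsi", hex(m.regs["rsi"]))
```

After the reverse step, the toy machine sits at offset 299 with %rax holding the cur_term address again and %rsi at zero, the same picture as the register dump above. This is exactly the value that the core dump could not show, because the xor had already overwritten it.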

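So, back to the "cur_term" clue again. I really want to read the source code at this point, but I'll start with debug info.

11. Debug Info

This is libncursesw, and I don't have debug info installed for it (Ubuntu):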

 
    # apt-cache search libncursesw
    libncursesw5 - shared libraries for terminal handling (wide character support)
    libncursesw5-dbg - debugging/profiling libraries for ncursesw
    libncursesw5-dev - developer's libraries for ncursesw
    # dpkg -l | grep libncursesw
    ii  libncursesw5:amd64                  6.0+20160213-1ubuntu1                    amd64        shared libraries for terminal handling (wide character support)

I'll add that:

 
      
       
        
            1 
          

            2 
          

            3 
          

            4 
          

            5 
          

            6 
          

            7 
          

            8 
          

            9 
          

            10 
          

            11 
          

            12 
          

            13 
          

            14 
          

            15 
          

            16 
          
 
         
           # apt-get install -y libncursesw5-dbg 
          
 
           Reading  
           package 
             
           lists 
           . 
           . 
           . 
             
           Done 
          
 
           Building  
           dependency  
           tree        
          
 
           Reading  
           state  
           information 
           . 
           . 
           . 
             
           Done 
          
 
           [ 
           . 
           . 
           . 
           ] 
          
 
           After  
           this 
             
           operation 
           , 
             
           2 
           , 
           488 
             
           kB  
           of  
           additional  
           disk  
           space  
    will be used.
    Get:1 http://us-west-1.ec2.archive.ubuntu.com/ubuntu xenial/main amd64 libncursesw5-dbg amd64 6.0+20160213-1ubuntu1 [729 kB]
    Fetched 729 kB in 0s (865 kB/s)
    Selecting previously unselected package libncursesw5-dbg.
    (Reading database ... 200094 files and directories currently installed.)
    Preparing to unpack .../libncursesw5-dbg_6.0+20160213-1ubuntu1_amd64.deb ...
    Unpacking libncursesw5-dbg (6.0+20160213-1ubuntu1) ...
    Setting up libncursesw5-dbg (6.0+20160213-1ubuntu1) ...
    # dpkg -l | grep libncursesw
    ii  libncursesw5:amd64                  6.0+20160213-1ubuntu1                 amd64        shared libraries for terminal handling (wide character support)
    ii  libncursesw5-dbg                    6.0+20160213-1ubuntu1                 amd64        debugging/profiling libraries for ncursesw
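
The point of the dpkg listing above is that the library and its debug package must report the same version string, otherwise the debug symbols will not line up with the code that crashed. A minimal Python sketch of that comparison follows. It assumes the usual `dpkg -l` column order (status, name, version, architecture, description), and the function name `installed_versions` is only illustrative, not part of any tool.

```python
# Two rows copied from the `dpkg -l | grep libncursesw` output above.
DPKG_ROWS = """\
ii  libncursesw5:amd64   6.0+20160213-1ubuntu1   amd64   shared libraries for terminal handling (wide character support)
ii  libncursesw5-dbg     6.0+20160213-1ubuntu1   amd64   debugging/profiling libraries for ncursesw
"""


def installed_versions(dpkg_output):
    """Map package name (architecture suffix stripped) to its version.

    Assumes whitespace-separated columns: status, name, version, ...
    """
    versions = {}
    for row in dpkg_output.splitlines():
        fields = row.split()
        if len(fields) < 3:
            continue  # skip blank or malformed rows
        name = fields[1].split(":")[0]  # "libncursesw5:amd64" -> "libncursesw5"
        versions[name] = fields[2]
    return versions


versions = installed_versions(DPKG_ROWS)
# Debug symbols are only trustworthy if both packages share one version.
versions_match = versions["libncursesw5"] == versions["libncursesw5-dbg"]
print(versions, versions_match)
```

On a live system the same function could be fed the real command output instead of the pasted rows.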
          
 
       
 
      
    

Great, the versions match. So what does our segfault look like now?

 
    # gdb `which python` /var/cores/core.python.30520
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    [...]
    warning: JITed object file architecture unknown is not compatible with target architecture i386:x86-64.
    Core was generated by `python ./cachetop.py'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  ClrBlank (win=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1129
    1129        if (back_color_erase)
    (gdb) bt
    #0  ClrBlank (win=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1129
    #1  ClrUpdate () at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1147
    #2  doupdate () at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1010
    #3  0x00007f0a37aa07e6 in wrefresh (win=win@entry=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/base/lib_refresh.c:65
    #4  0x00007f0a37a99499 in recur_wrefresh (win=win@entry=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/base/lib_getch.c:384
    #5  0x00007f0a37a99616 in _nc_wgetch (win=win@entry=0x1993060, result=result@entry=0x7ffd33d93e24, use_meta=1)
        at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/base/lib_getch.c:491
    #6  0x00007f0a37a9a325 in wgetch (win=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/base/lib_getch.c:672
    #7  0x00007f0a37cc6ec3 in ?? () from /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
    #8  0x00000000004c4d5a in PyEval_EvalFrameEx ()
    #9  0x00000000004c2e05 in PyEval_EvalCodeEx ()
    #10 0x00000000004def08 in ?? ()
    #11 0x00000000004b1153 in PyObject_Call ()
    #12 0x00000000004c73ec in PyEval_EvalFrameEx ()
    #13 0x00000000004c2e05 in PyEval_EvalCodeEx ()
    #14 0x00000000004caf42 in PyEval_EvalFrameEx ()
    #15 0x00000000004c2e05 in PyEval_EvalCodeEx ()
    #16 0x00000000004c2ba9 in PyEval_EvalCode ()
    #17 0x00000000004f20ef in ?? ()
    #18 0x00000000004eca72 in PyRun_FileExFlags ()
    #19 0x00000000004eb1f1 in PyRun_SimpleFileExFlags ()
    #20 0x000000000049e18a in Py_Main ()
    #21 0x00007f0a3be10830 in __libc_start_main (main=0x49daf0 <main>, argc=2, argv=0x7ffd33d94838, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
        stack_end=0x7ffd33d94828) at ../csu/libc-start.c:291
    #22 0x000000000049da19 in _start ()

The stack trace looks a bit different: we aren't really in doupdate(), but in ClrBlank(), which has been inlined into ClrUpdate(), which in turn has been inlined into doupdate().
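
To make the inlining picture a little more concrete, here is a small Python sketch that parses a few frames copied from the backtrace above and groups the function names by source file. The regular expression is only a rough approximation of gdb's single-line frame format that I wrote for this illustration; it is not something gdb provides, and it does not handle frames that wrap onto a second line.

```python
import re

# A handful of frames copied verbatim from the backtrace above.
FRAMES = """\
#0  ClrBlank (win=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1129
#1  ClrUpdate () at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1147
#2  doupdate () at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1010
#3  0x00007f0a37aa07e6 in wrefresh (win=win@entry=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/base/lib_refresh.c:65
#7  0x00007f0a37cc6ec3 in ?? () from /usr/lib/python2.7/lib-dynload/_curses.x86_64-linux-gnu.so
#8  0x00000000004c4d5a in PyEval_EvalFrameEx ()
"""

# Hypothetical pattern: frame number, optional address, function name,
# argument list, and an optional "at file:line" suffix.
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+"
    r"(?:(?P<addr>0x[0-9a-f]+) in )?"
    r"(?P<func>\S+) \(.*?\)"
    r"(?: at (?P<file>\S+):(?P<line>\d+))?"
)


def frames_by_file(backtrace):
    """Group function names by source file basename, keeping frame order."""
    grouped = {}
    for text in backtrace.splitlines():
        match = FRAME_RE.match(text)
        if not match:
            continue  # not a frame line this sketch understands
        path = match.group("file")
        key = path.rsplit("/", 1)[-1] if path else "(no source info)"
        grouped.setdefault(key, []).append(match.group("func"))
    return grouped


grouped = frames_by_file(FRAMES)
for source, funcs in grouped.items():
    print(source, funcs)
```

The three innermost frames all resolve to tty_update.c, which fits ClrBlank() and ClrUpdate() being folded into doupdate(), while the frames from the Python binary carry no file or line information in this trace.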

Now I really want to see the source.

12. Source Code

With the debug info package installed, gdb can list the source along with the assembly:

 
      
       
        
    (gdb) disas/s
    Dump of assembler code for function doupdate:
    /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:
    759     {
       0x00007f0a37aac2e0 <+0>:     push   %r15
       0x00007f0a37aac2e2 <+2>:     push   %r14
       0x00007f0a37aac2e4 <+4>:     push   %r13
       0x00007f0a37aac2e6 <+6>:     push   %r12
    [...]
       0x00007f0a37aac3dd <+253>:   jne    0x7f0a37aac6ca <doupdate+1002>

    1009        if (CurScreen(SP_PARM)->_clear || NewScreen(SP_PARM)->_clear) {    /* force refresh ? */
       0x00007f0a37aac3e3 <+259>:   mov    0x80(%rdx),%rax
       0x00007f0a37aac3ea <+266>:   mov    0x88(%rdx),%rcx
       0x00007f0a37aac3f1 <+273>:   cmpb   $0x0,0x21(%rax)
       0x00007f0a37aac3f5 <+277>:   jne    0x7f0a37aac401 <doupdate+289>
       0x00007f0a37aac3f7 <+279>:   cmpb   $0x0,0x21(%rcx)
       0x00007f0a37aac3fb <+283>:   je     0x7f0a37aacc3b <doupdate+2395>

    1129        if (back_color_erase)
       0x00007f0a37aac401 <+289>:   mov    0x20cb68(%rip),%rax        # 0x7f0a37cb8f70
       0x00007f0a37aac408 <+296>:   mov    (%rax),%rsi

    1128        NCURSES_CH_T blank = blankchar;
       0x00007f0a37aac40b <+299>:   xor    %eax,%eax

    1129        if (back_color_erase)
    => 0x00007f0a37aac40d <+301>:   mov    0x10(%rsi),%rdi
       0x00007f0a37aac411 <+305>:   cmpb   $0x0,0x1c(%rdi)
       0x00007f0a37aac415 <+309>:   jne    0x7f0a37aac6f7 <doupdate+1047>
          
 
       
 
      
    

Great! See the arrow "=>" and the line of code above it. So we're segfaulting on "if (back_color_erase)"? That doesn't seem possible.
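
The numbers in the listing can be sanity-checked with a little arithmetic. The Python sketch below uses only addresses copied from the disassembly above. It relies on one fact about x86-64 that the listing itself does not state: a %rip-relative operand is resolved against the address of the next instruction, not the current one.

```python
# Addresses copied from the disas/s listing above.
DOUPDATE_START = 0x00007F0A37AAC2E0  # <+0>, first instruction of doupdate
FAULT_PC = 0x00007F0A37AAC40D        # the instruction marked with "=>"

# The <+N> annotation is the byte offset from the start of the function.
fault_offset = FAULT_PC - DOUPDATE_START

# mov 0x20cb68(%rip),%rax sits at <+289>; the next instruction is at <+296>,
# and that next address is the base for the %rip-relative displacement.
NEXT_INSN = 0x00007F0A37AAC408
DISPLACEMENT = 0x20CB68
rip_target = NEXT_INSN + DISPLACEMENT

print(fault_offset)     # offset of the faulting instruction
print(hex(rip_target))  # address gdb shows in its trailing "#" comment
```

The first value agrees with the <+301> next to the "=>" marker, and the second agrees with the "# 0x7f0a37cb8f70" comment gdb printed beside the <+289> instruction.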

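At this point I double-checked that I had the right debug info version, and re-ran the program inside gdb until it segfaulted. Same place.

Is there something special about back_color_erase? We're in ClrBlank(), so I'll list that source first: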

 
    (gdb) list ClrBlank
    1124
    1125    static NCURSES_INLINE NCURSES_CH_T
    1126    ClrBlank(NCURSES_SP_DCLx WINDOW *win)
    1127    {
    1128        NCURSES_CH_T blank = blankchar;
    1129        if (back_color_erase)
    1130            AddAttr(blank, (AttrOf(BCE_BKGD(SP_PARM, win)) & BCE_ATTRS));
    1131        return blank;
    1132    }
    1133

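Ah, it isn't defined in this function, so is it a global?

13. TUI

It's worth showing how this looks in the gdb text user interface (TUI), which I haven't used much but was inspired to try after seeing Greg's talk.

You can launch it with --tui: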

 
    # gdb --tui `which python` /var/cores/core.python.30520
       ┌───────────────────────────────────────────────────────────────────────────┐
       │                                                                           │
       │                                                                           │
       │                                                                           │
       │                                                                           │
       │                                                                           │
       │                                                                           │
       │                          [ No Source Available ]                          │
       │                                                                           │
       │                                                                           │
       │                                                                           │
       │                                                                           │
       │                                                                           │
       │                                                                           │
       └───────────────────────────────────────────────────────────────────────────┘
    None No process In:                                                L??   PC: ??
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    Copyright (C) 2016 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later
    This is free software: you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
    and "show warranty" for details.
    This GDB was configured as "x86_64-linux-gnu".
    ---Type <return> to continue, or q <return> to quit---

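It's complaining about having no Python source. I could fix that, but we're crashing in libncursesw. Hitting enter lets it finish loading, at which point it shows the source from the libncursesw debug info at the point of the crash: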

 
      
       
        
       ┌──/build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c──────┐
       │1124                                                                       │
       │1125    static NCURSES_INLINE NCURSES_CH_T                                 │
       │1126    ClrBlank(NCURSES_SP_DCLx WINDOW *win)                              │
       │1127    {                                                                  │
       │1128        NCURSES_CH_T blank = blankchar;                                │
      >│1129        if (back_color_erase)                                          │
       │1130            AddAttr(blank, (AttrOf(BCE_BKGD(SP_PARM, win)) & BCE_ATTRS)│
       │1131        return blank;                                                  │
       │1132    }                                                                  │
       │1133                                                                       │
       │1134    /*                                                                 │
       │1135    **      ClrUpdate()                                                │
       │1136    **                                                                 │
       └───────────────────────────────────────────────────────────────────────────┘
    multi-thre Thread 0x7f0a3c5e87 In: doupdate            L1129 PC: 0x7f0a37aac40d
    warning: JITed object file architecture unknown is not compatible with target ar
    chitecture i386:x86-64.
    ---Type <return> to continue, or q <return> to quit---
    Core was generated by `python ./cachetop.py'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    #0  ClrBlank (win=0x1993060)
        at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1129
    (gdb)
          
 
       
 
      
    

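Awesome!

The ">" points to the line of code where we crashed. It gets even better: with the layout split command we can view the source and the disassembly in separate windows: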

 
      
       
        
            1 
          

            2 
          

            3 
          

            4 
          

            5 
          

            6 
          

            7 
          

            8 
          

            9 
          

            10 
          

            11 
          

            12 
          

            13 
          

            14 
          

            15 
          

            16 
          

            17 
          

            18 
          

            19 
          

            20 
          

            21 
          

            22 
          

            23 
          

            24 
          

            25 
          

            26 
          
 
         
              ┌── 
           / 
           build 
           / 
           ncurses 
           - 
           pKZ1BN 
           / 
           ncurses 
           - 
           6.0 
           + 
           20160213 
           / 
           ncurses 
           / 
           tty 
           / 
           tty_update 
           . 
           c──────┐ 
          
 
              
           >│ 
           1129 
                    
           if 
             
           ( 
           back_color_erase 
           ) 
                                                     │ 
          
 
              │ 
           1130 
                        
           AddAttr 
           ( 
           blank 
           , 
             
           ( 
           AttrOf 
           ( 
           BCE_BKGD 
           ( 
           SP_PARM 
           , 
             
           win 
           ) 
           ) 
             
           & 
             
           BCE_ATTRS 
           )│ 
          
 
              │ 
           1131 
                    
           return 
             
           blank 
           ; 
                                                             │ 
          
 
              │ 
           1132 
                
           } 
                                                                             │ 
          
 
              │ 
           1133 
                                                                                  │ 
          
 
              │ 
           1134 
                
           / 
           * 
                                                                            │ 
          
 
              │ 
           1135 
                
           * 
           * 
                  
           ClrUpdate 
           ( 
           ) 
                                                           │ 
          
 
              └───────────────────────────────────────────────────────────────────────────┘ 
          
 
              
           >│ 
           0x7f0a37aac40d 
             
           < 
           doupdate 
           + 
           301 
           > 
               
    mov    0x10(%rsi),%rdi                     │
       │0x7f0a37aac411 <doupdate+305>   cmpb   $0x0,0x1c(%rdi)                     │
       │0x7f0a37aac415 <doupdate+309>   jne    0x7f0a37aac6f7 <doupdate+1047>      │
       │0x7f0a37aac41b <doupdate+315>   movswl 0x4(%rcx),%ecx                      │
       │0x7f0a37aac41f <doupdate+319>   movswl 0x74(%rdx),%edi                     │
       │0x7f0a37aac423 <doupdate+323>   mov    %rax,0x40(%rsp)                     │
       │0x7f0a37aac428 <doupdate+328>   movl   $0x20,0x48(%rsp)                    │
       │0x7f0a37aac430 <doupdate+336>   movl   $0x0,0x4c(%rsp)                     │
       └───────────────────────────────────────────────────────────────────────────┘
    multi-thre Thread 0x7f0a3c5e87 In: doupdate            L1129 PC: 0x7f0a37aac40d
    chitecture i386:x86-64.
    Core was generated by `python ./cachetop.py'.
    Program terminated with signal SIGSEGV, Segmentation fault.
    ---Type <return> to continue, or q <return> to quit---
    #0  ClrBlank (win=0x1993060)
        at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1129
    (gdb) layout split

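Greg demonstrated this with reverse stepping, so you can imagine following both the source and the assembly at the same time (I'd need a video to demonstrate that here).

14. External: cscope

I want to learn more about back_color_erase. I could try gdb's search command, but I've found I'm quicker with an external tool: cscope. cscope is a text-based source code browser from Bell Labs in the 1980s. If you have a modern IDE that you prefer, use that instead.

Setting up cscope: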

 
    # apt-get install -y cscope
    # wget http://archive.ubuntu.com/ubuntu/pool/main/n/ncurses/ncurses_6.0+20160213.orig.tar.gz
    # tar xvf ncurses_6.0+20160213.orig.tar.gz
    # cd ncurses-6.0-20160213
    # cscope -bqR
    # cscope -dq

cscope -bqR builds the lookup database. cscope -dq then launches cscope.
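If cscope isn't at hand, its "Find this global definition" lookup can be roughly approximated for macros with a few lines of Python. This is only a sketch: the find_defines helper and its regular expression are my own illustration, it only understands #define lines, and it knows nothing about functions, typedefs, or conditional compilation.

```python
import os
import re

def find_defines(root, name):
    """Yield (path, line_number, text) for each '#define name' under root."""
    # Allow optional spaces after '#', as in '#  define FOO'.
    pattern = re.compile(r"^\s*#\s*define\s+" + re.escape(name) + r"\b")
    for dirpath, _dirs, files in os.walk(root):
        for filename in sorted(files):
            if not filename.endswith((".c", ".h")):
                continue
            path = os.path.join(dirpath, filename)
            with open(path, errors="replace") as source:
                for number, line in enumerate(source, 1):
                    if pattern.match(line):
                        yield path, number, line.rstrip()

# Hypothetical usage, run from the unpacked ncurses source directory:
#   for path, number, text in find_defines(".", "back_color_erase"):
#       print(f"{path}:{number}: {text}")
```

Unlike cscope, this rescans every file on each query, which is the cost the prebuilt database avoids.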

Searching for the definition of back_color_erase:

 
    Cscope version 15.8b                                   Press the ? key for help

    Find this C symbol:
    Find this global definition: back_color_erase
    Find functions called by this function:
    Find functions calling this function:
    Find this text string:
    Change this text string:
    Find this egrep pattern:
    Find this file:
    Find files #including this file:
    Find assignments to this symbol:

Hitting enter:

 
    [...]
    #define non_dest_scroll_region         CUR Booleans[26]
    #define can_change                     CUR Booleans[27]
    #define back_color_erase               CUR Booleans[28]
    #define hue_lightness_saturation       CUR Booleans[29]
    #define col_addr_glitch                CUR Booleans[30]
    #define cr_cancels_micro_mode          CUR Booleans[31]
    [...]

Oh, a #define. (They could at least have capitalized it, as is the common style for macros.)

OK, so what's CUR? Looking up definitions in cscope is a breeze.

 
    #define CUR cur_term->type.

At least that #define is capitalized!
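Chaining the two definitions shows what back_color_erase really means to the compiler, and why a null cur_term is fatal here. The following is a toy textual expansion, not a C preprocessor: the expand helper is my own illustration, and it only handles object-like macros such as the two found so far.

```python
import re

# The two macro definitions found with cscope so far.
MACROS = {
    "back_color_erase": "CUR Booleans[28]",
    "CUR": "cur_term->type.",
}

def expand(expr, macros, max_passes=10):
    """Substitute whole-word macro names repeatedly until nothing changes."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, macros)) + r")\b")
    for _ in range(max_passes):
        expanded = pattern.sub(lambda m: macros[m.group(1)], expr)
        if expanded == expr:
            break
        expr = expanded
    return expr

print(expand("back_color_erase", MACROS))  # cur_term->type. Booleans[28]
```

So every use of back_color_erase dereferences cur_term, which is exactly what goes wrong if that pointer is zero.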

We found cur_term earlier, by stepping through instructions and examining registers. What is it?

 
    #if 0 && !0
    extern NCURSES_EXPORT_VAR(TERMINAL *) cur_term;
    #elif 0
    NCURSES_WRAPPED_VAR(TERMINAL *, cur_term);
    #define cur_term   NCURSES_PUBLIC_VAR(cur_term())
    #else
    extern NCURSES_EXPORT_VAR(TERMINAL *) cur_term;
    #endif

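cscope read /usr/include/term.h for this. So, more macros. The line I think is taking effect is the declaration in the #else branch (it was bolded in the original post). Why is there an "if 0 && !0 … elif 0"? I don't know (I'd need to read more source). Sometimes programmers put "#if 0" around debug code they want disabled in production; this, however, looks auto-generated.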
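To see why the #else declaration is the one that takes effect, the conditions can be evaluated by hand: 0 && !0 is 0 && 1, which is 0, and #elif 0 is 0 as well. A small Python sketch of that selection logic follows; the select_branch helper is purely illustrative, and I've transcribed the conditions into Python by hand rather than parsing C.

```python
def select_branch(branches):
    """Return the label of the first branch whose condition is true,
    mimicking how #if / #elif / #else picks exactly one block."""
    for label, condition in branches:
        if condition:
            return label
    return None

branches = [
    ("#if 0 && !0", bool(0 and not 0)),  # 0 && 1 is false
    ("#elif 0", bool(0)),                # false
    ("#else", True),                     # taken when nothing above matched
]
print(select_branch(branches))  # prints "#else"
```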

Looking up NCURSES_EXPORT_VAR finds:

 
    #  define NCURSES_EXPORT_VAR(type) NCURSES_IMPEXP type

… and NCURSES_IMPEXP:

 
    /* Take care of non-cygwin platforms */
    #if !defined(NCURSES_IMPEXP)
    #  define NCURSES_IMPEXP /* nothing */
    #endif
    #if !defined(NCURSES_API)
    #  define NCURSES_API /* nothing */
    #endif
    #if !defined(NCURSES_EXPORT)
    #  define NCURSES_EXPORT(type) NCURSES_IMPEXP type NCURSES_API
    #endif
    #if !defined(NCURSES_EXPORT_VAR)
    #  define NCURSES_EXPORT_VAR(type) NCURSES_IMPEXP type
    #endif

… and TERMINAL:

 
    typedef struct term {           /* describe an actual terminal */
        TERMTYPE    type;           /* terminal type description */
        short       Filedes;        /* file description being written to */
        TTY         Ottyb,          /* original state of the terminal */
                    Nttyb;          /* current state of the terminal */
        int         _baudrate;      /* used to compute padding */
        char *      _termname;      /* used for termname() */
    } TERMINAL;

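Gah! Now TERMINAL is capitalized. Together with the macros, this code is not that easy to follow …

OK, who actually sets cur_term? Remember, our problem is that it's zero, maybe because it was never initialized or because it was explicitly set to zero. Browsing the code paths that set it might provide more clues, to help answer why it isn't being set, or why it is set to zero. Using the first option in cscope (Find this C symbol):

Browsing the entries quickly finds: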

 
    NCURSES_EXPORT(TERMINAL *)
    NCURSES_SP_NAME(set_curterm) (NCURSES_SP_DCLx TERMINAL * termp)
    {
        TERMINAL *oldterm;

        T((T_CALLED("set_curterm(%p)"), (void *) termp));

        _nc_lock_global(curses);
        oldterm = cur_term;
        if (SP_PARM)
            SP_PARM->_term = termp;
    #if USE_REENTRANT
        CurTerm = termp;
    #else
        cur_term = termp;
    #endif

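I added highlighting in the original post. Even the function name is wrapped in a macro. But at least we've found how cur_term is set: via set_curterm(). Perhaps that isn't being called?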
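There are two ways cur_term could end up null given this code: set_curterm() is never called, or it is called with a null argument. A toy Python model of the assignment makes the two cases concrete. This is not ncurses code: TermState is a made-up stand-in, None plays the role of a NULL pointer, and returning the saved old value is my assumption, since the excerpt stops before any return statement.

```python
class TermState:
    """Toy stand-in for the library's global state; not ncurses code."""

    def __init__(self):
        self.cur_term = None  # plays the role of an unset (NULL) cur_term

    def set_curterm(self, termp):
        oldterm = self.cur_term
        self.cur_term = termp
        return oldterm  # assumption: the saved old value is handed back

# Case 1: set_curterm() is never called.
never_called = TermState()

# Case 2: it is called, but the last call passes a null argument.
called_with_null = TermState()
called_with_null.set_curterm("terminal-A")
called_with_null.set_curterm(None)

print(never_called.cur_term is None)      # True
print(called_with_null.cur_term is None)  # True
```

Both cases look identical at the point of the crash, so telling them apart means watching the calls themselves.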

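15. External: perf-tools/ftrace/uprobes

I'll cover using gdb for this in a moment, but I can't help trying the uprobe tool from my perf-tools collection, which uses Linux ftrace and uprobes. One advantage of tracers is that they don't pause the target process the way gdb does (although that doesn't matter for this cachetop.py example). Another advantage is that I can trace a few events or a few thousand just as easily.

I should be able to trace calls to set_curterm() in libncursesw, and even print the first argument: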

 
    # /apps/perf-tools/bin/uprobe 'p:/lib/x86_64-linux-gnu/libncursesw.so.5:set_curterm %di'
    ERROR: missing symbol "set_curterm" in /lib/x86_64-linux-gnu/libncursesw.so.5

Well, that didn't work. Where is set_curterm()? There are lots of ways to find it, such as gdb or objdump:

 
    (gdb) info symbol set_curterm
    set_curterm in section .text of /lib/x86_64-linux-gnu/libtinfo.so.5
    # objdump -tT /lib/x86_64-linux-gnu/libncursesw.so.5 | grep cur_term
    0000000000000000      DO *UND*  0000000000000000  NCURSES_TINFO_5.0.19991023 cur_term
    # objdump -tT /lib/x86_64-linux-gnu/libtinfo.so.5 | grep cur_term
    0000000000228948 g    DO .bss   0000000000000008  NCURSES_TINFO_5.0.19991023 cur_term

gdb works better. Plus, if I'd looked more closely at the source, I would have noticed it was being built for libtinfo.
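The objdump output does contain the answer, it just takes a moment to read: *UND* in the section column means the library only references the symbol, while a real section such as .bss means the library defines it. A minimal Python sketch of that reading, using the two lines above as input (classify_symbol is a hypothetical helper of mine, and it assumes the symbol name is the last field on the line):

```python
def classify_symbol(objdump_line):
    """Return (name, defined) for one line of `objdump -tT` output."""
    fields = objdump_line.split()
    name = fields[-1]
    # '*UND*' marks a symbol that is referenced here but defined elsewhere.
    defined = "*UND*" not in fields
    return name, defined

ncursesw_line = ("0000000000000000      DO *UND*  0000000000000000  "
                 "NCURSES_TINFO_5.0.19991023 cur_term")
tinfo_line = ("0000000000228948 g    DO .bss   0000000000000008  "
              "NCURSES_TINFO_5.0.19991023 cur_term")

print(classify_symbol(ncursesw_line))  # ('cur_term', False)
print(classify_symbol(tinfo_line))     # ('cur_term', True)
```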

Trying to trace set_curterm() in libtinfo:

 
      
       
        
    # /apps/perf-tools/bin/uprobe 'p:/lib/x86_64-linux-gnu/libtinfo.so.5:set_curterm %di'
    Tracing uprobe set_curterm (p:set_curterm /lib/x86_64-linux-gnu/libtinfo.so.5:0xfa80 %di). Ctrl-C to end.
              python-31617 [007] d... 24236402.719959: set_curterm: (0x7f116fcc2a80) arg1=0x1345d70
              python-31617 [007] d... 24236402.720033: set_curterm: (0x7f116fcc2a80) arg1=0x13a22e0
              python-31617 [007] d... 24236402.723804: set_curterm: (0x7f116fcc2a80) arg1=0x14cdfa0
              python-31617 [007] d... 24236402.723838: set_curterm: (0x7f116fcc2a80) arg1=0x0
    ^C

That works. So set_curterm() is called, and it's called four times. The last time it is passed zero, which looks like it could be the problem.
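With four events this is easy to spot by eye, but the same check scales to a longer trace. Here is a rough Python sketch that scans trace text for set_curterm events whose first argument is zero. The regular expression is fitted to the four lines shown here and is my own guess at a workable pattern, not a general ftrace parser.

```python
import re

TRACE = """\
python-31617 [007] d... 24236402.719959: set_curterm: (0x7f116fcc2a80) arg1=0x1345d70
python-31617 [007] d... 24236402.720033: set_curterm: (0x7f116fcc2a80) arg1=0x13a22e0
python-31617 [007] d... 24236402.723804: set_curterm: (0x7f116fcc2a80) arg1=0x14cdfa0
python-31617 [007] d... 24236402.723838: set_curterm: (0x7f116fcc2a80) arg1=0x0
"""

EVENT = re.compile(
    r"(?P<time>\d+\.\d+): set_curterm: \(0x[0-9a-f]+\) arg1=(?P<arg1>0x[0-9a-f]+)"
)

def null_calls(trace_text):
    """Return the timestamps of set_curterm events whose arg1 is zero."""
    return [
        match.group("time")
        for match in EVENT.finditer(trace_text)
        if int(match.group("arg1"), 16) == 0
    ]

print(null_calls(TRACE))  # ['24236402.723838']
```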

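If you're wondering how I knew the %di register was the first argument, it comes from the AMD64/x86_64 ABI (and the assumption that this compiled library is ABI compliant). Here's a reminder: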

 
      
       
        
    # man syscall
    [...]
           arch/ABI      arg1  arg2  arg3  arg4  arg5  arg6  arg7  Notes
           ──────────────────────────────────────────────────────────────────
           arm/OABI      a1    a2    a3    a4    v1    v2    v3
           arm/EABI      r0    r1    r2    r3    r4    r5    r6
           arm64         x0    x1    x2    x3    x4    x5    -
           blackfin      R0    R1    R2    R3    R4    R5    -
           i386          ebx   ecx   edx   esi   edi   ebp   -
           ia64          out0  out1  out2  out3  out4  out5  -
           mips/o32      a0    a1    a2    a3    -     -     -     See below
           mips/n32,64   a0    a1    a2    a3    a4    a5    -
           parisc        r26   r25   r24   r23   r22   r21   -
           s390          r2    r3    r4    r5    r6    r7    -
           s390x         r2    r3    r4    r5    r6    r7    -
           sparc/32      o0    o1    o2    o3    o4    o5    -
           sparc/64      o0    o1    o2    o3    o4    o5    -
           x86_64        rdi   rsi   rdx   r10   r8    r9    -
    [...]
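One caveat worth spelling out: this man page table lists the system call convention, while set_curterm() is an ordinary library function. On x86_64 Linux the two conventions differ only in the fourth argument (rcx for function calls, r10 for system calls), so rdi holds the first argument either way. The sketch below records both orderings. FUNCTION_ARGS comes from the System V AMD64 calling convention rather than from the man page output, and arg_register is a hypothetical helper.

```python
# Integer/pointer argument registers on x86_64 Linux, in argument order.
FUNCTION_ARGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]  # user-space function calls
SYSCALL_ARGS = ["rdi", "rsi", "rdx", "r10", "r8", "r9"]   # x86_64 row of man syscall

def arg_register(n, syscall=False):
    """Return the register holding the n-th (1-based) integer argument."""
    table = SYSCALL_ARGS if syscall else FUNCTION_ARGS
    return table[n - 1]

# Argument positions where the two conventions disagree.
differing = [
    position
    for position, (func_reg, sys_reg) in enumerate(zip(FUNCTION_ARGS, SYSCALL_ARGS), 1)
    if func_reg != sys_reg
]

print(arg_register(1))  # rdi
print(differing)        # [4]
```

This only covers integer and pointer arguments passed in registers; floating-point arguments and arguments beyond the sixth are handled differently.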
          
 
       
 
      
    

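I'd also like to see a stack trace for the arg1=0x0 invocation, but this ftrace-based tool doesn't support stack traces yet.

16. External: bcc/BPF

Since we're debugging a bcc tool, cachetop.py, it's worth noting that bcc's trace.py has capabilities similar to my older uprobe tool: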

 
    # ./trace.py 'p:tinfo:set_curterm "%d", arg1'
    TIME     PID    COMM         FUNC             -
    01:00:20 31698  python       set_curterm      38018416
    01:00:20 31698  python       set_curterm      38396640
    01:00:20 31698  python       set_curterm      39624608
    01:00:20 31698  python       set_curterm      0

Yes, we're using bcc to debug bcc!
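Because the format string here is "%d", trace.py prints the pointer argument in decimal, whereas uprobe printed it in hex. The two runs are different processes (PIDs 31617 and 31698), so the addresses differ anyway, but converting to hex makes this output easier to compare with what gdb shows. A quick conversion in Python:

```python
# Values copied from the last column of the trace.py output.
decimal_args = [38018416, 38396640, 39624608, 0]

hex_args = [hex(value) for value in decimal_args]
for text in hex_args:
    print(text)
# 0x2441d70
# 0x249e2e0
# 0x25c9fa0
# 0x0
```

Either way, the pattern is the same as before: three non-zero pointers followed by a zero.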

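If you are new to bcc, it's worth checking out. It provides Python and lua interfaces for the new BPF tracing features in the Linux 4.x series. In short, it enables many performance tools that were previously impossible or prohibitively expensive to run. I've posted instructions for running it on Ubuntu Xenial.

bcc's trace.py tool should have a switch for printing a user stack trace, since the kernel has had BPF stack capabilities since Linux 4.6, although at the time of writing we haven't added this switch yet.

17. More Breakpoints

I should really have started with gdb breakpoints on set_curterm(), but I hope that was an interesting detour through ftrace and BPF.

Back to live running mode: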

 
      
       
        
    # gdb `which python`
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    [...]
    (gdb) b set_curterm
    Function "set_curterm" not defined.
    Make breakpoint pending on future shared library load? (y or [n]) y
    Breakpoint 1 (set_curterm) pending.
    (gdb) r cachetop.py
    Starting program: /usr/bin/python cachetop.py
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
    Breakpoint 1, set_curterm (termp=termp@entry=0xa43150) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=termp@entry=0xab5870) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=termp@entry=0xbecb90) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=0x0) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {

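OK, at this breakpoint we can see that set_curterm() is being invoked with a termp=0x0 argument, thanks to debuginfo for that information. If I didn't have debuginfo, I could just print the registers at each breakpoint.

I'll print the stack trace so that we can see who is setting cur_term to zero.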

 
    (gdb) bt
    #0  set_curterm (termp=0x0) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    #1  0x00007ffff5a44e75 in llvm::sys::Process::FileDescriptorHasColors(int) () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    #2  0x00007ffff45cabb8 in clang::driver::tools::Clang::ConstructJob(clang::driver::Compilation&, clang::driver::JobAction const&, clang::driver::InputInfo const&, llvm::SmallVector<clang::driver::InputInfo, 4u> const&, llvm::opt::ArgList const&, char const*) const () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    #3  0x00007ffff456ffa5 in clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, char const*, bool, bool, char const*, clang::driver::InputInfo&) const () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    #4  0x00007ffff4570501 in clang::driver::Driver::BuildJobs(clang::driver::Compilation&) const () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    #5  0x00007ffff457224a in clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    #6  0x00007ffff4396cda in ebpf::ClangLoader::parse(std::unique_ptr<llvm::Module, std::default_delete<llvm::Module> >*, std::unique_ptr<std::vector<ebpf::TableDesc, std::allocator<ebpf::TableDesc> >, std::default_delete<std::vector<ebpf::TableDesc, std::allocator<ebpf::TableDesc> > > >*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, char const**, int) () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    #7  0x00007ffff4344314 in ebpf::BPFModule::load_cfile(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool, char const**, int) ()
       from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    #8  0x00007ffff4349e5e in ebpf::BPFModule::load_string(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, char const**, int) ()
       from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    #9  0x00007ffff43430c8 in bpf_module_create_c_from_string () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    #10 0x00007ffff690ae40 in ffi_call_unix64 () from /usr/lib/x86_64-linux-gnu/libffi.so.6
          
           #11 0x00007ffff690a8ab in ffi_call () from /usr/lib/x86_64-linux-gnu/libffi.so.6 
          
           #12 0x00007ffff6b1a68c in _ctypes_callproc () from /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so 
          
           #13 0x00007ffff6b1ed82 in ?? () from /usr/lib/python2.7/lib-dynload/_ctypes.x86_64-linux-gnu.so 
          
           #14 0x00000000004b1153 in PyObject_Call () 
          
           #15 0x00000000004ca5ca in PyEval_EvalFrameEx () 
          
           #16 0x00000000004c2e05 in PyEval_EvalCodeEx () 
          
           #17 0x00000000004def08 in ?? () 
          
           #18 0x00000000004b1153 in PyObject_Call () 
          
           #19 0x00000000004f4c3e in ?? () 
          
           #20 0x00000000004b1153 in PyObject_Call () 
          
           #21 0x00000000004f49b7 in ?? () 
          
           #22 0x00000000004b6e2c in ?? () 
          
           #23 0x00000000004b1153 in PyObject_Call () 
          
           #24 0x00000000004ca5ca in PyEval_EvalFrameEx () 
          
           #25 0x00000000004c2e05 in PyEval_EvalCodeEx () 
          
           #26 0x00000000004def08 in ?? () 
          
           #27 0x00000000004b1153 in PyObject_Call () 
          
           #28 0x00000000004c73ec in PyEval_EvalFrameEx () 
          
           #29 0x00000000004c2e05 in PyEval_EvalCodeEx () 
          
           #30 0x00000000004caf42 in PyEval_EvalFrameEx () 
          
           #31 0x00000000004c2e05 in PyEval_EvalCodeEx () 
          
           #32 0x00000000004c2ba9 in PyEval_EvalCode () 
          
           #33 0x00000000004f20ef in ?? () 
          
           #34 0x00000000004eca72 in PyRun_FileExFlags () 
          
           #35 0x00000000004eb1f1 in PyRun_SimpleFileExFlags () 
          
           #36 0x000000000049e18a in Py_Main () 
          
           #37 0x00007ffff7811830 in __libc_start_main (main=0x49daf0 <main>, argc=2, argv=0x7fffffffdfb8, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,  
          
           stack_end 
           = 
           0x7fffffffdfa8 
           ) 
             
           at 
             
           . 
           . 
           / 
           csu 
           / 
           libc 
           - 
           start 
           . 
           c 
           : 
           291 
          
           #38 0x000000000049da19 in _start ()

OK, more clues... I think. We're in llvm::sys::Process::FileDescriptorHasColors(). Is the problem in the llvm compiler?
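One quick way to read a long backtrace like this is to tally which shared object each frame comes from. The sketch below does that in Python over a few frames copied from the output above (arguments are unchanged, only the selection is shortened). The helper name `frames_by_library` and the "(no library shown)" label are made up for illustration, and the pattern assumes only the `... from /path/lib.so` layout seen in this session.

```python
import re
from collections import Counter

# A handful of frames copied from the backtrace above.
BACKTRACE = """\
#0  set_curterm (termp=0x0) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
#1  0x00007ffff5a44e75 in llvm::sys::Process::FileDescriptorHasColors(int) () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
#9  0x00007ffff43430c8 in bpf_module_create_c_from_string () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
#11 0x00007ffff690a8ab in ffi_call () from /usr/lib/x86_64-linux-gnu/libffi.so.6
#14 0x00000000004b1153 in PyObject_Call ()
"""


def frames_by_library(backtrace):
    """Count frames per shared object named after 'from' on each frame line."""
    counts = Counter()
    for line in backtrace.splitlines():
        line = line.rstrip()
        # Only lines starting with '#' are frames; wrapped continuation
        # lines are skipped, so a wrapped frame counts as "no library shown".
        if not line.startswith("#"):
            continue
        match = re.search(r"\sfrom\s+(\S+)$", line)
        counts[match.group(1) if match else "(no library shown)"] += 1
    return counts


for library, count in frames_by_library(BACKTRACE).most_common():
    print(count, library)
```

On the full backtrace, the same tally makes it easy to see that the frames just above set_curterm() all sit in libbcc.so.0, which bundles the llvm and clang code.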

18. External Tool: cscope, Take 2

More source browsing with cscope, this time in llvm. Here is the code behind FileDescriptorHasColors():

 
    static bool terminalHasColors(int fd) {
    [...]
      // Now extract the structure allocated by setupterm and free its memory
      // through a really silly dance.
      struct term *termp = set_curterm((struct term *)nullptr);
      (void)del_curterm(termp); // Drop any errors here.

Here is what that code looked like in an earlier version:

 
    static bool terminalHasColors() {
      if (const char *term = std::getenv("TERM")) {
        // Most modern terminals support ANSI escape sequences for colors.
        // We could check terminfo, or have a list of known terms that support
        // colors, but that would be overkill.
        // The user can always ask for no colors by setting TERM to dumb, or
        // using a commandline flag.
        return strcmp(term, "dumb") != 0;
      }
      return false;
    }

It turned into a "silly dance" that involves calling set_curterm() with a null pointer.
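For comparison, the earlier check is simple enough to restate in a few lines. The following is only a rough Python rendering of that older C++ logic, not llvm code; the function name `terminal_has_colors` and its optional `env` parameter are assumptions added so the behavior can be tried without touching the real environment.

```python
import os


def terminal_has_colors(env=None):
    """Older logic: any TERM value other than "dumb" is assumed to have colors."""
    env = os.environ if env is None else env
    term = env.get("TERM")
    if term is not None:
        # Matches strcmp(term, "dumb") != 0 in the C++ version.
        return term != "dumb"
    # TERM is not set at all.
    return False


print(terminal_has_colors({"TERM": "xterm-256color"}))
print(terminal_has_colors({"TERM": "dumb"}))
print(terminal_has_colors({}))
```

Note that this version never touches terminfo or ncurses, so it has no set_curterm() call that could go wrong.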

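19. Writing Memory

As an experiment, and to explore a possible workaround, I'll modify the memory of the running process so that set_curterm() is not called with zero.

I'll run gdb, set a breakpoint on set_curterm(), and run to the invocation with the zero argument: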

 
      
       
        
    # gdb `which python`
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    [...]
    (gdb) b set_curterm
    Function "set_curterm" not defined.
    Make breakpoint pending on future shared library load? (y or [n]) y
    Breakpoint 1 (set_curterm) pending.
    (gdb) r cachetop.py
    Starting program: /usr/bin/python cachetop.py
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

    Breakpoint 1, set_curterm (termp=termp@entry=0xa43150) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=termp@entry=0xab5870) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=termp@entry=0xbecb90) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=0x0) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {

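Here I'll use the set command to overwrite memory, replacing the zero with 0xbecb90, the set_curterm() argument seen in the previous call, in the hope that it is still valid.

WARNING: Writing memory is not safe! gdb won't ask "are you sure?". If you get it wrong or make a typo, you will corrupt the application. Best case, your application crashes immediately and you realize your mistake. Worst case, your application keeps running with silently corrupted data that is only discovered years later.

In this case I'm experimenting on a lab machine with no production data, so I'll continue.
I'll print the value of the %rdi register as hex (p/x), set it to the previous address, print it again, and then print all registers: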

 
    (gdb) p/x $rdi
    $1 = 0x0
    (gdb) set $rdi=0xbecb90
    (gdb) p/x $rdi
    $2 = 0xbecb90
    (gdb) i r
    rax            0x100    256
    rbx            0x1      1
    rcx            0xe71    3697
    rdx            0x0      0
    rsi            0x7ffff5dd45d3   140737318307283
    rdi            0xbecb90 12503952
    rbp            0x100    0x100
    rsp            0x7fffffffa5b8   0x7fffffffa5b8
    r8             0xbf0050 12517456
    r9             0x1999999999999999       1844674407370955161
    r10            0xbf0040 12517440
    r11            0x7ffff7bb4b78   140737349634936
    r12            0xbecb70 12503920
    r13            0xbeaea0 12496544
    r14            0x7fffffffa9a0   140737488333216
    r15            0x7fffffffa8a0   140737488332960
    rip            0x7ffff3c76a80   0x7ffff3c76a80 <set_curterm>
    eflags         0x246    [ PF ZF IF ]
    cs             0x33     51
    ss             0x2b     43
    ds             0x0      0
    es             0x0      0
    fs             0x0      0
    gs             0x0      0

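(Since I had installed debug info by this point, I didn't need to work with registers here: I could have used set on "termp", the name of the argument to set_curterm(), instead of $rdi.)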
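Why $rdi? On x86-64 Linux, the System V calling convention passes the first six integer or pointer arguments in registers, starting with %rdi, so on entry to set_curterm() that register holds termp. The snippet below is a small Python reference for that mapping, plus a check of the hex and decimal pair that `i r` prints for rdi. The names `SYSV_INT_ARG_REGS` and `arg_register` are illustrative only.

```python
# Integer/pointer argument registers, in order, for the System V AMD64 ABI.
SYSV_INT_ARG_REGS = ("rdi", "rsi", "rdx", "rcx", "r8", "r9")


def arg_register(position):
    """Return the register that carries the 1-based integer/pointer argument."""
    if not 1 <= position <= len(SYSV_INT_ARG_REGS):
        raise ValueError("arguments beyond the sixth are passed on the stack")
    return SYSV_INT_ARG_REGS[position - 1]


# termp is the first (and only) argument of set_curterm().
print(arg_register(1))

# `i r` shows each general-purpose register twice: hex, then decimal.
print(hex(12503952))
```

This is also why the conditional breakpoint later in this walkthrough can test $rdi when no debug info is available.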

%rdi is now populated, so the registers look OK to continue.

 
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=termp@entry=0x0) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {

OK, we survived a call to set_curterm()! However, we've hit another one, also with an argument of zero. Let's try the same write trick again:

 
      
       
        
    (gdb) set $rdi=0xbecb90
    (gdb) c
    Continuing.
    warning: JITed object file architecture unknown is not compatible with target architecture i386:x86-64.

    Program received signal SIGSEGV, Segmentation fault.
    0x00007ffff34ad411 in ClrBlank (win=0xaea060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1129
    1129        if (back_color_erase)

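Ah. That's what I get for writing memory. So this experiment ended in another segfault.

20. Conditional Breakpoints

In the previous section, I had to use three continues to reach the right invocation of the breakpoint. If that had been hundreds of invocations, I'd have used a conditional breakpoint. Here's an example.

As before, I'll run the program and break on set_curterm():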

 
      
       
        
    # gdb `which python`
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    [...]
    (gdb) b set_curterm
    Function "set_curterm" not defined.
    Make breakpoint pending on future shared library load? (y or [n]) y
    Breakpoint 1 (set_curterm) pending.
    (gdb) r cachetop.py
    Starting program: /usr/bin/python cachetop.py
    [Thread debugging using libthread_db enabled]
    Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".

    Breakpoint 1, set_curterm (termp=termp@entry=0xa43150) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {

Now I'll turn breakpoint 1 into a conditional breakpoint, so that it only fires when the %rdi register is zero:

 
      
       
        
    (gdb) cond 1 $rdi==0x0
    (gdb) i b
    Num     Type           Disp Enb Address            What
    1       breakpoint     keep y   0x00007ffff3c76a80 in set_curterm at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
            stop only if $rdi==0x0
            breakpoint already hit 1 time
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=0x0) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    (gdb)

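Neat! cond is short for conditional. So why didn't I set the condition right away, when I first created the "pending" breakpoint? I found that conditions don't work on pending breakpoints, at least in this version of gdb. (Either that, or I'm doing it wrong.) I also used i b (info breakpoints) here to list the breakpoints with their details.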
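The same condition can also be attached through gdb's Python scripting interface, where `gdb.Breakpoint` objects expose a writable `condition` attribute. The sketch below is a minimal illustration, not something taken from this debugging session: the helper name `break_on_null_first_arg` is hypothetical, and the `gdb` module is passed in as a parameter only so the file can be imported outside gdb as well. Given the pending-breakpoint caveat above, it assumes the helper is called once the library that defines set_curterm() is already loaded.

```python
def break_on_null_first_arg(gdb_module, function="set_curterm"):
    """Break on `function`, stopping only when its first argument is zero.

    `gdb_module` is expected to be gdb's built-in Python module. The first
    integer/pointer argument is assumed to be in $rdi (x86-64 Linux).
    """
    bp = gdb_module.Breakpoint(function)  # like: b set_curterm
    bp.condition = "$rdi == 0x0"          # like: cond <N> $rdi==0x0
    return bp
```

Inside gdb, after loading the file with `source`, this could be invoked as `python break_on_null_first_arg(gdb)`, and `i b` should then list the "stop only if" line as in the session above.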

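21. Returns

I did try another write-like hack, this time changing the instruction path rather than the data.

WARNING: see the previous warning, which applies here too.

I'll set up the breakpoint on the zero invocation of set_curterm() as before, and then issue a ret (short for return), which returns from the function immediately without executing it. My hope is that by not executing it, the global curterm won't be set to zero.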

 
    [...]
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=0x0) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    (gdb) ret
    Make set_curterm return now? (y or n) y
    #0  0x00007ffff5a44e75 in llvm::sys::Process::FileDescriptorHasColors(int) () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    (gdb) c
    Continuing.

    Program received signal SIGSEGV, Segmentation fault.
    _nc_free_termtype (ptr=ptr@entry=0x100) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/free_ttype.c:52
    52          FreeIfNeeded(ptr->str_table);

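Another crash. That's what I get for messing around like this.

One more try. After browsing the code a bit more, I want to try doing ret twice, in case the parent function is also involved. Again, this is just a hacky experiment: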

 
    [...]
    (gdb) c
    Continuing.

    Breakpoint 1, set_curterm (termp=0x0) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tinfo/lib_cur_term.c:80
    80      {
    (gdb) ret
    Make set_curterm return now? (y or n) y
    #0  0x00007ffff5a44e75 in llvm::sys::Process::FileDescriptorHasColors(int) () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    (gdb) ret
    Make selected stack frame return now? (y or n) y
    #0  0x00007ffff45cabb8 in clang::driver::tools::Clang::ConstructJob(clang::driver::Compilation&, clang::driver::JobAction const&, clang::driver::InputInfo const&, llvm::SmallVector<clang::driver::InputInfo, 4u> const&, llvm::opt::ArgList const&, char const*) const () from /usr/lib/x86_64-linux-gnu/libbcc.so.0
    (gdb) c

The screen goes blank and pauses... then redraws:

Wow! It worked!

22. A Better Workaround

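I had been posting the debugging output to GitHub, because the lead BPF engineer, Alexei Starovoitov, also knows llvm internals well, and the root cause looked like a bug in llvm. While I was messing about with memory writes and the return command, he suggested adding the llvm option -fno-color-diagnostics to bcc to avoid the problem. It worked! Adding it to bcc serves as a workaround. (I still hope the llvm bug gets fixed.)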
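For a tool author stuck on a bcc build that predates this fix, a per-program version of the same idea might look like the sketch below. It assumes the `cflags` keyword of bcc's Python `BPF` class; `bpf_cflags` and `load_program` are hypothetical helper names, and whether a flag passed this way avoids the same llvm code path as the in-tree fix is an untested assumption.

```python
def bpf_cflags(extra=()):
    """Build a clang flag list with color diagnostics disabled (hypothetical helper)."""
    flags = ["-fno-color-diagnostics"]
    for flag in extra:
        if flag not in flags:  # keep the list free of duplicates
            flags.append(flag)
    return flags


def load_program(text, extra=()):
    """Compile a BPF program with the flags above (hypothetical helper)."""
    # Imported here so bpf_cflags() stays usable on machines without bcc.
    from bcc import BPF
    return BPF(text=text, cflags=bpf_cflags(extra))
```

Keeping the flag list in one small function means every tool that calls `load_program` picks up the workaround, and it can be dropped in one place once the underlying bug is fixed.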

23. Python Context

At this point the problem is fixed, but you may be curious to see the stack trace with its symbols fully resolved.

Installing python-dbg:

 
      
       
        
    # apt-get install -y python-dbg
    Reading package lists... Done
    [...]
    The following additional packages will be installed:
      libpython-dbg libpython2.7-dbg python2.7-dbg
    Suggested packages:
      python2.7-gdbm-dbg python2.7-tk-dbg python-gdbm-dbg python-tk-dbg
    The following NEW packages will be installed:
      libpython-dbg libpython2.7-dbg python-dbg python2.7-dbg
    0 upgraded, 4 newly installed, 0 to remove and 20 not upgraded.
    Need to get 11.9 MB of archives.
    After this operation, 36.4 MB of additional disk space will be used.
    [...]

Now I go back to gdb and look at the stack trace:

 
    # gdb `which python` /var/cores/core.python.30520
    GNU gdb (Ubuntu 7.11.1-0ubuntu1~16.04) 7.11.1
    [...]
    Reading symbols from /usr/bin/python...Reading symbols from /usr/lib/debug/.build-id/4e/a0539215b2a9e32602f81c90240874132c1a54.debug...done.
    [...]
    (gdb) bt
    #0  ClrBlank (win=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1129
    #1  ClrUpdate () at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1147
    #2  doupdate () at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/tty/tty_update.c:1010
    #3  0x00007f0a37aa07e6 in wrefresh (win=win@entry=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/base/lib_refresh.c:65
    #4  0x00007f0a37a99499 in recur_wrefresh (win=win@entry=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/base/lib_getch.c:384
    #5  0x00007f0a37a99616 in _nc_wgetch (win=win@entry=0x1993060, result=result@entry=0x7ffd33d93e24, use_meta=1)
        at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/base/lib_getch.c:491
    #6  0x00007f0a37a9a325 in wgetch (win=0x1993060) at /build/ncurses-pKZ1BN/ncurses-6.0+20160213/ncurses/base/lib_getch.c:672
    #7  0x00007f0a37cc6ec3 in PyCursesWindow_GetCh.lto_priv.109 (self=0x7f0a3c57b198, args=()) at /build/python2.7-HpIZBG/python2.7-2.7.11/Modules/_cursesmodule.c:853
    #8  0x00000000004c4d5a in call_function (oparg=<optimized out>, pp_stack=0x7ffd33d93f30) at ../Python/ceval.c:4350
    #9  PyEval_EvalFrameEx () at ../Python/ceval.c:2987
    #10 0x00000000004c2e05 in PyEval_EvalCodeEx () at ../Python/ceval.c:3582
    #11 0x00000000004def08 in function_call.lto_priv () at ../Objects/funcobject.c:523
    #12 0x00000000004b1153 in PyObject_Call () at ../Objects/abstract.c:2546
    #13 0x00000000004c73ec in ext_do_call (nk=0, na=<optimized out>, flags=<optimized out>, pp_stack=0x7ffd33d941e8, func=<function at remote 0x7f0a37edcc80>)
        at ../Python/ceval.c:4662
    #14 PyEval_EvalFrameEx () at ../Python/ceval.c:3026
    #15 0x00000000004c2e05 in PyEval_EvalCodeEx () at ../Python/ceval.c:3582
    #16 0x00000000004caf42 in fast_function (nk=0, na=<optimized out>, n=<optimized out>, pp_stack=0x7ffd33d943f0, func=<function at remote 0x7f0a38039140>)
        at ../Python/ceval.c:4445
    #17 call_function (oparg=<optimized out>, pp_stack=0x7ffd33d943f0) at ../Python/ceval.c:4370
    #18 PyEval_EvalFrameEx () at ../Python/ceval.c:2987
    #19 0x00000000004c2e05 in PyEval_EvalCodeEx () at ../Python/ceval.c:3582
    #20 0x00000000004c2ba9 in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at ../Python/ceval.c:669
    #21 0x00000000004f20ef in run_mod.lto_priv () at ../Python/pythonrun.c:1376
    #22 0x00000000004eca72 in PyRun_FileExFlags () at ../Python/pythonrun.c:1362
    #23 0x00000000004eb1f1 in PyRun_SimpleFileExFlags () at ../Python/pythonrun.c:948
    #24 0x000000000049e18a in Py_Main () at ../Modules/main.c:640
    #25 0x00007f0a3be10830 in __libc_start_main (main=0x49daf0 <main>, argc=2, argv=0x7ffd33d94838, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
        stack_end=0x7ffd33d94828) at ../csu/libc-start.c:291
    #26 0x000000000049da19 in _start ()

No more "??" frames, but this is still not very helpful.
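If you save backtraces to a file and want to check them for unresolved frames without reading every line, a few lines of Python are enough. The following is a minimal sketch, not part of the original session: `FRAME_RE`, `frame_names`, and `unresolved_frames` are made-up names, and the pattern only handles plain C-style frames like the ones above (C++ signatures that contain spaces are skipped).

```python
import re

# Matches the start of a gdb backtrace frame, for example
# "#6  0x00007f0a37a9a325 in wgetch (win=0x1993060) at ..."
# or "#0  ClrBlank (win=0x1993060) at ...".
FRAME_RE = re.compile(r"^\s*#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(")


def frame_names(bt_text):
    """Return (frame number, function name) pairs from gdb 'bt' output."""
    frames = []
    for line in bt_text.splitlines():
        match = FRAME_RE.match(line)
        if match:  # continuation lines ("    at ...") do not match
            frames.append((int(match.group(1)), match.group(2)))
    return frames


def unresolved_frames(bt_text):
    """Return the numbers of frames that gdb could not symbolize ('??')."""
    return [num for num, name in frame_names(bt_text) if name == "??"]
```

An empty list from `unresolved_frames` means every matched frame has a symbol name, which is the situation in the listing above.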

The Python debug packages add other capabilities to gdb. Now we can look at the Python backtrace:

 
    (gdb) py-bt
    Traceback (most recent call first):
      File "./cachetop.py", line 188, in handle_loop
        s = stdscr.getch()
      File "/usr/lib/python2.7/curses/wrapper.py", line 43, in wrapper
        return func(stdscr, *args, **kwds)
      File "./cachetop.py", line 260, in <module>
        curses.wrapper(handle_loop, args)

... and the Python source listing:

 
    (gdb) py-list
     183        b.attach_kprobe(event="mark_buffer_dirty", fn_name="do_count")
     184
     185        exiting = 0
     186
     187        while 1:
    >188            s = stdscr.getch()
     189            if s == ord('q'):
     190                exiting = 1
     191            elif s == ord('r'):
     192                sort_reverse = not sort_reverse
     193            elif s == ord('<'):

It identifies where in the Python code we were executing when the segfault hit. That is really nice!

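The problem with the initial stack trace was that we saw Python internals executing the methods, but not the methods themselves. If you are debugging another language, what you see depends on its compiler options and runtime, and on how it ends up executing code. If you search the web for the language name plus "gdb", you may find a gdb extension like the one Python has. If there is none, the bad news is that you will need to write your own, and the good news is that this is possible! Such extensions can be written in Python, so search for documentation on "adding new GDB commands in Python".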
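To give a feel for what such an extension involves, here is a minimal sketch of a custom command built on gdb's Python API (`gdb.Command`, `gdb.newest_frame`, `Frame.find_sal`). The command name `short-bt`, the class `ShortBacktrace`, and the helper `frame_summary` are invented for illustration. The `gdb` module only exists inside a gdb process, so the import is guarded.

```python
try:
    import gdb  # provided by gdb itself; absent in a normal Python interpreter
except ImportError:
    gdb = None


def frame_summary(index, name, filename, line):
    """Format one frame as a compact single line."""
    return "#%d %s at %s:%d" % (index, name, filename, line)


if gdb is not None:

    class ShortBacktrace(gdb.Command):
        """Print a compact backtrace: one short line per frame."""

        def __init__(self):
            # "short-bt" is a made-up command name for this sketch.
            super(ShortBacktrace, self).__init__("short-bt", gdb.COMMAND_STACK)

        def invoke(self, arg, from_tty):
            frame = gdb.newest_frame()
            index = 0
            while frame is not None:
                sal = frame.find_sal()  # source file and line for this frame
                filename = sal.symtab.filename if sal.symtab else "??"
                name = frame.name() or "??"
                gdb.write(frame_summary(index, name, filename, sal.line) + "\n")
                frame = frame.older()
                index += 1

    ShortBacktrace()  # instantiating the class registers the command
```

Saved as, say, short_bt.py, it could be loaded at the (gdb) prompt with `source short_bt.py`, after which `short-bt` should be available alongside the built-in commands. A language-specific extension such as py-bt does the same kind of frame walk, but additionally decodes the interpreter's own data structures.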

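24. More Commands

It might look as though I have written a comprehensive tour of gdb, but I really have not: gdb has many more commands that I did not mention. The help command lists the major sections: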

 
    (gdb) help
    List of classes of commands:

    aliases -- Aliases of other commands
    breakpoints -- Making program stop at certain points
    data -- Examining data
    files -- Specifying and examining files
    internals -- Maintenance commands
    obscure -- Obscure features
    running -- Running the program
    stack -- Examining the stack
    status -- Status inquiries
    support -- Support facilities
    tracepoints -- Tracing of program execution without stopping the program
    user-defined -- User-defined commands

    Type "help" followed by a class name for a list of commands in that class.
    Type "help all" for the list of all commands.
    Type "help" followed by command name for full documentation.
    Type "apropos word" to search for commands related to "word".
    Command name abbreviations are allowed if unambiguous.

You can run help on each command class. For example, here is the full listing for the breakpoints class:

 
    (gdb) help breakpoints
    Making program stop at certain points.

    List of commands:

    awatch -- Set a watchpoint for an expression
    break -- Set breakpoint at specified location
    break-range -- Set a breakpoint for an address range
    catch -- Set catchpoints to catch events
    catch assert -- Catch failed Ada assertions
    catch catch -- Catch an exception
    catch exception -- Catch Ada exceptions
    catch exec -- Catch calls to exec
    catch fork -- Catch calls to fork
    catch load -- Catch loads of shared libraries
    catch rethrow -- Catch an exception
    catch signal -- Catch signals by their names and/or numbers
    catch syscall -- Catch system calls by their names and/or numbers
    catch throw -- Catch an exception
    catch unload -- Catch unloads of shared libraries
    catch vfork -- Catch calls to vfork
    clear -- Clear breakpoint at specified location
    commands -- Set commands to be executed when a breakpoint is hit
    condition -- Specify breakpoint number N to break only if COND is true
    delete -- Delete some breakpoints or auto-display expressions
    delete bookmark -- Delete a bookmark from the bookmark list
    delete breakpoints -- Delete some breakpoints or auto-display expressions
    delete checkpoint -- Delete a checkpoint (experimental)
    delete display -- Cancel some expressions to be displayed when program stops
    delete mem -- Delete memory region
    delete tracepoints -- Delete specified tracepoints
    delete tvariable -- Delete one or more trace state variables
    disable -- Disable some breakpoints
    disable breakpoints -- Disable some breakpoints
    disable display -- Disable some expressions to be displayed when program stops
    disable frame-filter -- GDB command to disable the specified frame-filter
    disable mem -- Disable memory region
    disable pretty-printer -- GDB command to disable the specified pretty-printer
    disable probes -- Disable probes
    disable tracepoints -- Disable specified tracepoints
    disable type-printer -- GDB command to disable the specified type-printer
    disable unwinder -- GDB command to disable the specified unwinder
    disable xmethod -- GDB command to disable a specified (group of) xmethod(s)
    dprintf -- Set a dynamic printf at specified location
    enable -- Enable some breakpoints
    enable breakpoints -- Enable some breakpoints
    enable breakpoints count -- Enable breakpoints for COUNT hits
    enable breakpoints delete -- Enable breakpoints and delete when hit
    enable breakpoints once -- Enable breakpoints for one hit
    enable count -- Enable breakpoints for COUNT hits
    enable delete -- Enable breakpoints and delete when hit
    enable display -- Enable some expressions to be displayed when program stops
    enable frame-filter -- GDB command to disable the specified frame-filter
    enable mem -- Enable memory region
    enable once -- Enable breakpoints for one hit
    enable pretty-printer -- GDB command to enable the specified pretty-printer
    enable probes -- Enable probes
    enable tracepoints -- Enable specified tracepoints
    enable type-printer -- GDB command to enable the specified type printer
          
           enable  
           unwinder 
             
           -- 
             
           GDB  
           command  
           to 
             
           enable  
           unwinders 
          
           enable  
           xmethod 
             
           -- 
             
           GDB  
           command  
           to 
             
           enable 
             
           a 
             
           specified 
             
           ( 
           group  
           of 
           ) 
             
           xmethod 
           ( 
           s 
           ) 
          
           ftrace 
             
           -- 
             
           Set 
             
           a 
             
           fast  
           tracepoint  
           at  
           specified  
           location 
          
           hbreak 
             
           -- 
             
           Set 
             
           a 
             
           hardware  
           assisted  
           breakpoint 
          
           ignore 
             
           -- 
             
           Set  
           ignore 
           - 
           count  
           of  
           breakpoint  
           number 
             
           N 
             
           to 
             
           COUNT 
          
           rbreak 
             
           -- 
             
           Set 
             
           a 
             
           breakpoint  
           for 
             
           all  
           functions  
           matching  
           REGEXP 
          
           rwatch 
             
           -- 
             
           Set 
             
           a 
             
           read  
           watchpoint  
           for 
             
           an  
           expression 
          
           save 
             
           -- 
             
           Save  
           breakpoint  
           definitions  
           as 
             
           a 
             
           script 
          
           save  
           breakpoints 
             
           -- 
             
           Save  
           current  
           breakpoint  
           definitions  
           as 
             
           a 
             
           script 
          
           save  
           gdb 
           - 
           index 
             
           -- 
             
           Save 
             
           a 
             
           gdb 
           - 
           index  
           file 
          
           save  
           tracepoints 
             
           -- 
             
           Save  
           current  
           tracepoint  
           definitions  
           as 
             
           a 
             
           script 
          
           skip 
             
           -- 
             
           Ignore 
             
           a 
             
           function 
             
           while 
             
           stepping 
          
           skip  
           delete 
             
           -- 
             
           Delete  
           skip  
           entries 
          
           skip  
           disable 
             
           -- 
             
           Disable  
           skip  
           entries 
          
           skip  
           enable 
             
           -- 
             
           Enable  
           skip  
           entries 
          
           skip  
           file 
             
           -- 
             
           Ignore 
             
           a 
             
           file  
           while 
             
           stepping 
          
           skip  
           function 
             
           -- 
             
           Ignore 
             
           a 
             
           function 
             
           while 
             
           stepping 
          
           strace 
             
           -- 
             
           Set 
             
           a 
             
           static 
             
           tracepoint  
           at  
           location  
           or 
             
           marker 
          
           tbreak 
             
           -- 
             
           Set 
             
           a 
             
           temporary  
           breakpoint 
          
           tcatch 
             
           -- 
             
           Set  
           temporary  
           catchpoints  
           to 
             
           catch 
             
           events 
          
           tcatch  
           assert 
             
           -- 
             
           Catch 
             
           failed  
           Ada  
           assertions 
          
           tcatch  
           catch 
             
           -- 
             
           Catch 
             
           an  
           exception 
          
           tcatch  
           exception 
             
           -- 
             
           Catch 
             
           Ada  
           exceptions 
          
           tcatch  
           exec 
             
           -- 
             
           Catch 
             
           calls  
           to 
             
           exec 
          
           tcatch  
           fork 
             
           -- 
             
           Catch 
             
           calls  
           to 
             
           fork 
          
           tcatch  
           load 
             
           -- 
             
           Catch 
             
           loads  
           of  
           shared  
           libraries 
          
           tcatch  
           rethrow 
             
           -- 
             
           Catch 
             
           an  
           exception 
          
           tcatch  
           signal 
             
           -- 
             
           Catch 
             
           signals  
           by  
           their  
           names  
           and 
           / 
           or 
             
           numbers 
          
           tcatch  
           syscall 
             
           -- 
             
           Catch 
             
           system  
           calls  
           by  
           their  
           names  
           and 
           / 
           or 
             
           numbers 
          
           tcatch  
           throw 
             
           -- 
             
           Catch 
             
           an  
           exception 
          
           tcatch  
           unload 
             
           -- 
             
           Catch 
             
           unloads  
           of  
           shared  
           libraries 
          
           tcatch  
           vfork 
             
           -- 
             
           Catch 
             
           calls  
           to 
             
           vfork 
          
           thbreak 
             
           -- 
             
           Set 
             
           a 
             
           temporary  
           hardware  
           assisted  
           breakpoint 
          
           trace 
             
           -- 
             
           Set 
             
           a 
             
           tracepoint  
           at  
           specified  
           location 
          
           watch 
             
           -- 
             
           Set 
             
           a 
             
           watchpoint  
           for 
             
           an  
           expression 
          
           Type 
             
           "help" 
             
           followed  
           by  
           command  
           name  
           for 
             
           full  
           documentation 
           . 
          
           Type 
             
           "apropos word" 
             
           to 
             
           search  
           for 
             
           commands  
           related  
           to 
             
           "word" 
           . 
          
           Command  
           name  
           abbreviations  
           are  
           allowed  
           if 
             
           unambiguous 
           .

This help output shows how many features gdb has, and that my example used only a small fraction of them.

25. Conclusion

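Well, this was a rather nasty problem: an LLVM bug broke ncurses and caused a Python program to segfault. But the commands and steps I used to debug it were routine: viewing stack traces, checking registers, setting breakpoints, stepping, and browsing source code.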
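Those routine steps map onto a handful of gdb commands. The following is only a minimal sketch in the form of a gdb command file, assuming a live process rather than a core dump. The file name `steps.gdb`, the target `./program`, and the function `my_function` are hypothetical placeholders, not names taken from this session.

```
# Hypothetical command file; load it with: gdb -x steps.gdb --args ./program
# Set a breakpoint (my_function is a placeholder for the function you suspect).
break my_function
# Run until the breakpoint or a fault is hit.
run
# View the stack trace.
bt
# Check the register contents.
info registers
# Step a single machine instruction.
stepi
# Step one source line, stepping over function calls.
next
# Show the source around the current line (needs debug info and the source files).
list
```

Comments sit on their own lines because gdb does not reliably accept a trailing `#` comment after a command such as `break`.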

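When I first used gdb (years ago), I really did not like it. It felt clumsy and limited. gdb has improved a lot since then, and so have my gdb skills, and I now consider it a powerful modern debugger. Feature sets differ between debuggers, but gdb may be the most powerful text-based debugger available today, with lldb catching up.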

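I hope that this gdb example with its full output, along with the various caveats I mentioned along the way, will be useful to anyone who finds it by searching. I may post more gdb examples when I get the chance, especially for other runtimes such as Java.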

Use q to quit gdb.