(Repost) An introduction to gprof, a performance analysis tool for Linux C++ programs

Reprinted from https://blog.csdn.net/garfier/article/details/12489953#

 

Performance Analysis Tool

Performance is an important aspect of software quality. Whether the software is an online service, an offline program, or even a desktop application, performance is key to the user experience. Performance in the broad sense covers both speed and stability, and both need attention when testing a release. When a performance problem is found during testing, there are many ways to locate it: developers can review the code, or tools can be used to profile it. What common performance analysis and tuning tools are there? The following two articles give a detailed summary:

 

My own work mainly involves performance analysis of Linux C++ code, and gprof is one of the tools that can be used to profile it. This article describes my experience learning and using gprof.

 

Fundamentals of Gprof

Gprof can tell you where your code spends its time and which functions are called most often, and it lets you see the calling relationships between functions at a glance. gprof is a performance diagnostic tool supported by the gcc/g++ compiler. As long as the -pg option is added at compile time, the compiler inserts a call to the mcount function at the entry of every function. This mcount call records the caller/callee relationship (the call graph), the time spent in each function, the number of calls, and related information; when the program exits, this information is saved to the gmon.out file. Note that the program must exit normally or terminate through exit(), because writing gmon.out is only triggered when exit() is called.
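Because gmon.out is only written on a normal exit, a long-running program that is stopped with Ctrl-C or kill will usually produce no profile data at all. A common workaround, sketched below on the assumption of a single-threaded glibc program (this sketch is not from the original article), is to install a signal handler that calls exit() so the profiling runtime still gets a chance to flush gmon.out:

#include <csignal>
#include <cstdlib>

// Sketch: force gmon.out to be written when the program is interrupted.
// On glibc, the -pg runtime typically registers its cleanup routine
// (which dumps gmon.out) with atexit(), so calling exit() from the
// handler triggers the dump.  Calling exit() inside a signal handler is
// not strictly async-signal-safe, but is usually acceptable for a
// profiling run.
static void on_sigint(int /*signum*/)
{
        exit(0);
}

int main()
{
        signal(SIGINT, on_sigint);  // install before the real work starts

        // ... long-running work to be profiled goes here ...

        return 0;
}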

Using gprof therefore boils down to the following three steps (a condensed command sequence follows the list):

 

  • Compile the program with the -pg option
  • Run the program and let it exit normally
  • Use gprof to analyze the generated gmon.out file
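Condensed into commands, the workflow looks roughly like this (hello.cpp and hello are placeholder names used only for illustration; a full worked example follows in the next section):

g++ -pg -g -o hello hello.cpp    # 1. compile and link with -pg
./hello                          # 2. run; the program must exit normally to produce gmon.out
gprof -b ./hello gmon.out        # 3. analyze the recorded profile data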

Gprof usage example

#include<iostream>
using namespace std;

int add(int a, int b)
{
        return a+b;
}

int sub(int a, int b)
{
        return a-b;
}

void call()
{
        std::cout << add(1,2) << std::endl;
        std::cout << sub(2,4) << std::endl;
}

int main()
{
        int a=1, b=2;
        cout << add(a,b) << endl;
        for (int i=0; i<10000; i++)
                call();
        return 0;
}

 

Compile with g++ and add the -pg parameter:

g++ -o hello hello_grof.cpp -pg -g

With the resulting executable, we can use readelf to see how its symbol and relocation information differs from a binary compiled without -pg: compare the output of readelf -r ./hello with that of readelf -r ./hello_normal.
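For instance (assuming hello_normal is the same source built without -pg, as in the comparison above), grepping the relocation entries for mcount should show a hit only for the -pg build on a typical glibc system:

g++ -o hello_normal hello_grof.cpp -g        # same source, built without -pg
readelf -r ./hello | grep -i mcount          # the -pg build references mcount
readelf -r ./hello_normal | grep -i mcount   # no match for the normal build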

 

You can also confirm the mechanism with gdb: debug the hello program, set a breakpoint in the mcount function, and you will see that mcount is called before add executes, which exposes the calling relationship.
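A minimal gdb session along these lines illustrates it; the exact symbol name can vary by platform (e.g. mcount, _mcount or __gnu_mcount_nc), so adjust the breakpoint if needed:

gdb ./hello
(gdb) break mcount
(gdb) run
(gdb) backtrace      # when the breakpoint hits, the backtrace shows the instrumented function and its caller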
 
Next, run the program with ./hello; this generates the gmon.out file in the current directory. Use gprof to inspect it:
gprof -b ./hello gmon.out
This produces the following output:
Flat profile:

Each sample counts as 0.01 seconds.
 no time accumulated

  %   cumulative   self              self     total
 time   seconds   seconds    calls  Ts/call  Ts/call  name
  0.00      0.00     0.00    10001     0.00     0.00  add(int, int)
  0.00      0.00     0.00    10000     0.00     0.00  sub(int, int)
  0.00      0.00     0.00    10000     0.00     0.00  call()
  0.00      0.00     0.00        1     0.00     0.00  global constructors keyed to _Z3addii
  0.00      0.00     0.00        1     0.00     0.00  __static_initialization_and_destruction_0(int, int)

            Call graph


granularity: each sample hit covers 2 byte(s) no time propagated

index % time    self  children    called     name
                0.00    0.00       1/10001       main [7]
                0.00    0.00   10000/10001       call() [10]
[8]      0.0    0.00    0.00   10001         add(int, int) [8]
-----------------------------------------------
                0.00    0.00   10000/10000       call() [10]
[9]      0.0    0.00    0.00   10000         sub(int, int) [9]
-----------------------------------------------
                0.00    0.00   10000/10000       main [7]
[10]     0.0    0.00    0.00   10000         call() [10]
                0.00    0.00   10000/10001       add(int, int) [8]
                0.00    0.00   10000/10000       sub(int, int) [9]
-----------------------------------------------
                0.00    0.00       1/1           __do_global_ctors_aux [13]
[11]     0.0    0.00    0.00       1         global constructors keyed to _Z3addii [11]
                0.00    0.00       1/1           __static_initialization_and_destruction_0(int, int) [12]
-----------------------------------------------
                0.00    0.00       1/1           global constructors keyed to _Z3addii [11]
[12]     0.0    0.00    0.00       1         __static_initialization_and_destruction_0(int, int) [12]
-----------------------------------------------

Index by function name

  [11] global constructors keyed to _Z3addii (hello_grof.cpp)   [9] sub(int, int)   [10] call()
   [8] add(int, int)   [12] __static_initialization_and_destruction_0(int, int) (hello_grof.cpp)

You can also run the command:

gprof -b ./hello gmon.out | gprof2dot.py > ~/WWW/hello.dot

This generates a call graph file in dot format, which you can open with the Windows version of GVEdit for Graphviz to view the call graph.
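Alternatively, staying on Linux, the graph can be rendered straight to a PNG (assuming gprof2dot.py and Graphviz's dot are on the PATH) without GVEdit:

gprof -b ./hello gmon.out | gprof2dot.py | dot -Tpng -o hello.png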
 
Here is the call graph of a more complex program: the calling relationships and hot spots are clear at a glance.
 

Interpretation of Gprof output

You can get this information yourself by dropping the -b option (i.e., running gprof ./hello gmon.out); gprof then appends a detailed description of every field to its output:

 %          the percentage of the total running time of the
time        program used by this function.

cumulative  a running sum of the number of seconds accounted
 seconds    for by this function and those listed above it.

self        the number of seconds accounted for by this
seconds     function alone.  This is the major sort for this
            listing.

calls       the number of times this function was invoked, if
            this function is profiled, else blank.

self        the average number of milliseconds spent in this
ms/call     function per call, if this function is profiled,
            else blank.

total       the average number of milliseconds spent in this
ms/call     function and its descendents per call, if this
            function is profiled, else blank.

name        the name of the function.  This is the minor sort
            for this listing. The index shows the location of
            the function in the gprof listing. If the index is
            in parenthesis it shows where it would appear in
            the gprof listing if it were to be printed.

 

Summary

gprof is a commonly used performance analysis tool. It does have some shortcomings, summarized here (partly collected from the Internet):

  • 1. Poor support for multi-threaded programs; results can be inaccurate.
  • 2. The program must exit normally or call exit() so that gmon.out is written.
  • 3. It only accounts for the user-space time consumed by the application; time spent in kernel space is not measured, so it cannot analyze kernel-side calls. If the program spends a large share of its time in system calls, gprof is not a good fit.
