Introduction to the basic functions of valgrind and instructions on how to use it


1. Overview of Valgrind
Valgrind is a collection of open-source (GPL v2) simulation and debugging tools for Linux.
Valgrind consists of a core and a set of debugging tools built on that core. The core acts like a framework: it simulates a CPU environment and provides services to the tools. The tools act like plug-ins that use the services provided by the core to carry out specific memory debugging tasks.


2. Tool download and installation
Reference address: https://www.valgrind.org/downloads/
Installation:

1. tar -xf valgrind-3.17.0.tar.bz2
2. cd valgrind-3.17.0
3. ./configure // Run the configuration script to generate a Makefile. Use ./configure --help to view the configuration items and set them as needed, for example to change the compilation tools or the installation path.
4. make
5. make install // Installs the executables. The installation path is set by the --prefix parameter of ./configure; if you set it, add the resulting bin directory to PATH. If you do not pass --prefix, the default location is used and normally needs no extra PATH setup.

After installation you can run:
valgrind --help
to see how to use it.

3. Use basic options
3.1 Introduction to basic tools
(1) Memcheck. This is the most widely used Valgrind tool. It is a heavyweight memory checker that can detect most incorrect memory usage in development, such as using uninitialized memory, using memory that has been released, and out-of-bounds memory access. It is also the part this article focuses on.
(2) Callgrind. A call-graph profiler, mainly used to analyze the cost of function calls in the program.
(3) Cachegrind. A cache profiler, mainly used to find problems with cache usage in programs.
(4) Helgrind. Mainly used to detect data races and other synchronization problems in multi-threaded programs.
(5) Massif. A heap profiler, mainly used to find problems with heap usage in the program.
(6) Extension. You can use the functions provided by the core to write your own specialized debugging tools.

3.2 Commonly used options
(1) Applicable to all Valgrind tools
--tool=<name> The most commonly used option. Run the Valgrind tool called <name>. Default: memcheck.
-h --help Display help information.
--version Display the version of the Valgrind core.
-q --quiet Run quietly, printing only error messages.
-v --verbose Print more detailed information and add error count statistics.
--trace-children=no|yes Trace child processes? [no]
--track-fds=no|yes Track open file descriptors? [no]
--time-stamp=no|yes Add timestamps to log messages? [no]
--log-fd=<number> Write the log to file descriptor <number> [2=stderr]
--log-file=<file> Write the log to <file>. Old releases appended the process ID (filename.PID); in current releases, including 3.17, the name is used as given and %p in it expands to the process ID of the running program.
--log-file-exactly=<file> Write the log to exactly <file> (old releases only).
--log-file-qualifier=<VAR> Use the value of the environment variable <VAR> in the output file name (old releases only; current releases use %q{VAR} inside --log-file). [none]
--log-socket=ipaddr:port Write the log to the socket ipaddr:port

(2) LOG information output
--xml=yes Output the information in XML format (Memcheck; recent releases also accept it for Helgrind and DRD)
--num-callers=<number> Show <number> callers in stack traces [12]
--error-limit=no|yes If there are too many errors, stop displaying new errors? [yes]
--error-exitcode=<number> If an error is found, return this exit code [0=disable]
--db-attach=no|yes When an error occurs, Valgrind automatically starts the debugger gdb. [no] (old releases only; current releases have removed this option and provide --vgdb for attaching gdb)
--db-command=<command> Command line used to start the debugger [gdb -nw %f %p] (old releases only)

(3) Related options applicable to the Memcheck tool:
--leak-check=no|summary|full How much detail to report about leaks? [summary]
--leak-resolution=low|med|high How much backtrace merging to do in the leak check [low in old releases; high in current ones]
--show-reachable=no|yes Show reachable blocks in the leak check? [no]
For more detailed usage information, see the help output, the man page, or the official website: http://valgrind.org/docs/manual/manual-core.html

(4) Note:
① Valgrind does not automatically check every line of code in the program; it only checks the code paths that actually run, so unit tests and functional test cases are very important.
② Valgrind can be regarded as a sandbox: a program run through valgrind actually executes inside it, so do not use it to measure performance, or you will be disappointed. It is recommended for functional testing only.
③ When compiling the code, it is recommended to add the -g -O0 options and not to use -O1 or -O2.

3.3 Examples of commonly used options
--tool=<name> : use the Valgrind tool named <name> [memcheck]
--log-file=<file> : log messages to <file>

Example:

valgrind --tool=memcheck --log-file=log.txt --leak-check=yes ./test

Description: Use the memcheck tool to check the test program for memory leaks and save the log to log.txt

4. Introduction to the Memcheck tool
Memcheck is the most widely used tool in Valgrind and can detect most incorrect memory usage in development. It mainly checks for the following errors:
(1) Use of uninitialized memory
(2) Use of memory that has been released (Reading/writing memory after it has been free'd)
(3) Use of memory beyond what malloc allocated (Reading/writing off the end of malloc'd blocks)
(4) Illegal access to the stack (Reading/writing inappropriate areas on the stack)
(5) Allocated memory that is never released (Memory leaks – where pointers to malloc'd blocks are lost forever)
(6) Mismatched allocation and release functions (Mismatched use of malloc/new/new [] vs free/delete/delete [])
(7) Overlapping src and dst (Overlapping src and dst pointers in memcpy() and related functions)

#include <iostream>
int main()
{
    int *pInt;
    std::cout << "Use uninitialized memory";
    int a = *pInt; // Use uninitialized memory
}

#include <cstdlib>
#include <iostream>
int main()
{
    int *pArray = (int *)malloc(sizeof(int) * 5);
    std::cout << "Use freed memory";
    free(pArray);
    pArray[0] = 0; // Use the released memory
}

#include <cstdlib>
#include <iostream>
int main()
{
    int *pArray = (int *)malloc(sizeof(int) * 5);
    std::cout << "Use more memory space than malloc allocated";
    pArray[5] = 5; // Use more memory space than malloc allocated
    free(pArray);
}

#include <cstdlib>
#include <iostream>
int main()
{
    int *pArray = (int *)malloc(sizeof(int) * 5); // Never freed
    std::cout << "malloc lacks free";
}

#include <cstring>
#include <iostream>
int main()
{
    char a[10];
    for (char c = 0; c < sizeof(a); c++)
    {
        a[c] = c;
    }
    std::cout << "The copied src and dst overlap";
    memcpy(&a[4], &a[0], 6); // src and dst overlap
}

Note: A program sometimes allocates long-lived (resident) blocks that it keeps until it exits; these unreleased blocks should not be considered a problem.
Generally, malloc or new operations whose allocations only keep growing as the program runs are considered memory leaks.
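
For the overlapping-copy case among the examples above, the usual remedy is to call memmove, which is specified to handle overlapping ranges. The following is only a minimal sketch; the helper name shift_within_buffer and its parameters are assumptions made for illustration and do not come from the original article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copy `count` bytes from offset `from` to offset `to`
// inside one buffer. The two ranges may overlap, so memmove is used
// instead of memcpy.
void shift_within_buffer(char *buf, std::size_t buf_size,
                         std::size_t from, std::size_t to, std::size_t count)
{
    // Both ranges must stay inside the buffer.
    assert(from + count <= buf_size);
    assert(to + count <= buf_size);
    std::memmove(buf + to, buf + from, count);
}
```

Calling shift_within_buffer(a, sizeof(a), 0, 4, 6) performs the same copy as the memcpy(&a[4], &a[0], 6) line, but with defined behavior for the overlap.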

4.1 Example 1:
Source code:

#include <cstdlib>
#include <iostream>
int main()
{
    int *pArray = (int *)malloc(sizeof(int) * 5);
    std::cout << "Use more memory space than malloc allocated";
    pArray[5] = 5; // Use more than the memory space allocated by malloc
    free(pArray);
}

Compile:

g++ test1.cpp -g -o test1_g // -g: lets the memcheck tool report the exact line number of the error

Debugging:

valgrind --leak-check=yes --log-file=1_g ./test1_g

This generates the log file 1_g, which contains:

(1) The process ID of the current program (./test1_g)
(2) The license description of the valgrind memcheck tool
(3) How the loader runs
(4) The parent process ID, that is, the process of the current terminal
(5) The error message detected
(6) The heap summary. In this example there are two allocs and two frees in total, and there is no memory leak.
(7) The number of errors detected; here it is 1
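
One way to turn the out-of-bounds write of Example 1 into a detectable failure in the program itself is a bounds-checked container. This is only a sketch under assumptions: the helper names checked_store and demo_checked_store are made up for illustration, and std::vector replaces the malloc'd array of the original.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper: store `value` at `index`, reporting failure instead of
// writing past the end of the buffer.
bool checked_store(std::vector<int> &values, std::size_t index, int value)
{
    try {
        values.at(index) = value; // at() throws std::out_of_range on a bad index
        return true;
    } catch (const std::out_of_range &) {
        return false;
    }
}

// Usage sketch mirroring Example 1: five elements, so index 5 is one past the end.
void demo_checked_store()
{
    std::vector<int> values(5);
    assert(checked_store(values, 4, 4));  // last valid index
    assert(!checked_store(values, 5, 5)); // rejected instead of overflowing
}
```

The vector also releases its memory automatically, so no matching free is needed.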

4.2 Example 2
#include <cstdlib>
#include <iostream>
int main()
{
    int *pArray = (int *)malloc(sizeof(int) * 5);
    std::cout << "Use freed memory";
    free(pArray);
    pArray[0] = 0; // Use the released memory
}

Compile:

g++ test7.cpp -g -o test7_g // -g: lets the memcheck tool report the exact line number of the error

Debugging:

valgrind --leak-check=yes --log-file=7_g ./test7_g

This generates the log file 7_g:

(1) Because the same terminal is still used, the parent process is still 8248
(2) There are two illegal read and write errors

Compile again with optimization:

g++ test7.cpp -g -o test7_g_O2 -O2

Debugging:

valgrind --leak-check=yes --log-file=7_g_O2 ./test7_g_O2

This generates the log file 7_g_O2:

For the same program, after adding -O2 the pArray[0]=0; statement is optimized out, so the error is not detected.
For stricter detection, make sure the compiler does not optimize, that is, use optimization level -O0. gcc and g++ use -O0 by default, but most real projects add -O1 or -O2 in the Makefile, so it is best to check.

4.3 Summary of the overall format of the Memcheck report output document

(1) Copyright statement
(2) Abnormal read and write report
    Abnormal read and write report of the main thread
    Abnormal read and write report of thread A
    Abnormal read and write report of thread B
    Other threads
(3) Heap memory leak report
    Heap memory usage overview (HEAP SUMMARY)
    Confirmed memory leak report (definitely lost)
    Suspicious memory operation report (off by default with --show-reachable=no; turn on with --show-reachable=yes)
    Leak summary (LEAK SUMMARY)

4.4 Basic format of Memcheck log report

{Problem description}
at {address, function name, module or line of code}
by {address, function name, line of code}
by …{Display the call stack layer by layer}
Address 0x??? {Describe the relative relationship of addresses}

4.5 7 types of errors included in memcheck
(1) illegal read/illegal write errors
prompt information: [Invalid read of size 4]
(2) use of uninitialised values
prompt information: [Conditional jump or move depends on uninitialised value]
(3) use of uninitialised or unaddressable values in system calls
prompt information: [Syscall param write(buf) points to uninitialised byte(s)]
(4) illegal frees
prompt information: [Invalid free()]
(5) when a heap block is freed with an inappropriate deallocation function
prompt information: [Mismatched free() / delete / delete []]
(6) overlapping source and destination blocks
prompt information: [Source and destination overlap in memcpy(,)]
(7) memory leak detection
① Still reachable
A pointer to the memory still exists, so the program could still have used or released it, but the program exited before the dynamic memory was released.
② Definitely lost
A confirmed memory leak; this memory can no longer be accessed.
③ Indirectly lost
The pointers to this memory are all located inside blocks that have themselves been leaked.
④ Possibly lost
A possible memory leak; there is still a pointer that can reach the block, but it no longer points to the start of the block.

4.6 Principle of the memcheck tool
Memcheck implements a simulated CPU. The monitored program is interpreted and executed by the simulated CPU. The simulated CPU can detect the legality of the address and the legality of the read operation when all memory read and write instructions occur.

The key to Memcheck's ability to detect memory problems is that it creates two global tables.
(1) Valid-Value table:
For each byte in the entire address space of the process, there are 8 corresponding bits; for each register of the CPU, there is also a corresponding bit vector. These bits are responsible for recording whether the byte or register value has a valid, initialized value.
(2) Valid-Address table
For each byte in the entire address space of the process, there is a corresponding bit, which is responsible for recording whether the address can be read or written.
Detection principle:
When you want to read or write a byte in the memory, first check the A bit corresponding to this byte. If the A bit shows that the location is an invalid location, memcheck reports a read and write error.
The core is similar to a virtual CPU environment, so when a certain byte in the memory is loaded into the real CPU, the V bit corresponding to the byte is also loaded into the virtual CPU environment. Once the value in the register is used to generate a memory address, or the value can affect the program output, memcheck will check the corresponding V bits. If the value has not been initialized, an uninitialized memory error will be reported.

Simply put:
(1) How to know which addresses are legal (memory has been allocated)?
Maintain a valid address table (Valid-address (A) bits), in which all addresses that can currently be legally read and written (allocated) have corresponding entries. This table is maintained through the following measures:
① Global data (data, bss section) – marked as legal addresses when the program starts
② Local variables – monitor changes in sp (stack pointer) and update dynamically
③ Dynamically allocated memory – intercept the calls that allocate and release memory: malloc, calloc, realloc, valloc, memalign, free, new, new[], delete and delete[]
④ System calls – intercept the addresses mapped by mmap
⑤ Others – you can explicitly tell memcheck whether a certain region is legal
(2) How to know whether a certain piece of memory has been assigned a value?
① Maintain a valid value table (Valid-value (V) bits) that indicates whether each corresponding bit has been assigned a value. Because the virtual CPU can capture all write instructions to memory, this table is easy to maintain.
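
The A-bit and V-bit bookkeeping described above can be illustrated with a toy model. This is not Memcheck's implementation: the class ShadowMemory, its method names and the convention that 0xFF means "all 8 bits defined" are assumptions chosen only to make the idea concrete.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy shadow memory (illustrative only): one A bit and 8 V bits per byte.
class ShadowMemory {
public:
    explicit ShadowMemory(std::size_t size)
        : a_bits_(size, false), v_bits_(size, 0x00) {}

    // malloc-like event: the range becomes addressable but holds no defined value.
    void mark_allocated(std::size_t start, std::size_t len) {
        assert(start + len <= a_bits_.size());
        for (std::size_t i = start; i < start + len; ++i) {
            a_bits_[i] = true;
            v_bits_[i] = 0x00;
        }
    }

    // free-like event: the range is no longer addressable.
    void mark_freed(std::size_t start, std::size_t len) {
        assert(start + len <= a_bits_.size());
        for (std::size_t i = start; i < start + len; ++i) a_bits_[i] = false;
    }

    // A write checks the A bit first, then marks the byte as fully defined.
    std::string write(std::size_t addr) {
        if (addr >= a_bits_.size() || !a_bits_[addr]) return "invalid write";
        v_bits_[addr] = 0xFF; // toy convention: 0xFF = all 8 bits defined
        return "ok";
    }

    // A read whose value then influences the program (for example a branch).
    std::string read_and_use(std::size_t addr) const {
        if (addr >= a_bits_.size() || !a_bits_[addr]) return "invalid read";
        if (v_bits_[addr] != 0xFF) return "uninitialised value";
        return "ok";
    }

private:
    std::vector<bool> a_bits_;         // Valid-address (A) bits
    std::vector<std::uint8_t> v_bits_; // Valid-value (V) bits
};
```

In this model a freshly allocated byte is addressable but undefined, a write makes it defined, and any access after mark_freed is reported as invalid, which mirrors the order of checks described in the text.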

5. Introduction to the Callgrind tool
Callgrind is a performance analysis tool that does not require special options when compiling source code. Callgrind uses cachegrind's statistic Ir (I cache reads, that is, the number of instructions executed) to count function calls in the program, build a function call graph, and optionally perform cache simulation. At the end of the run it writes the analysis data to a file, and callgrind_annotate can convert the contents of this file into a readable form.

5.1 Basic operations of Callgrind text analysis
Example:
(1) cd linux/bin
(2) valgrind --tool=callgrind ./Devtest
This generates a file: callgrind.out.27439
or
valgrind --tool=callgrind --separate-threads=yes ./Devtest
This generates three files: callgrind.out.1234 (empty), callgrind.out.1234-01 (thread 1), callgrind.out.1234-02 (thread 2)
(3) callgrind_annotate callgrind.out.27439 > log
callgrind_annotate converts the contents of the callgrind.out.pid file into a readable form, which is redirected here to the file log. Open callgrind.out.pid and log side by side to see the difference: callgrind.out.pid is in a format that is not easy for humans to read directly, and callgrind_annotate acts as a translator that presents it in a friendlier way.
In the log file that callgrind_annotate produces from callgrind.out.pid, you can see the dynamic library to which each function belongs. By default, functions are sorted by the number of instructions they consumed, from largest to smallest.
callgrind_annotate also has several optional parameters:
① --inclusive=yes: Not only counts the cost of each function separately but also takes the calling relationships into account. For example, if function foo calls bar, then the cost of bar is added to the cost of foo.
② --tree=both: Display the calling relationships.
③ --auto=yes: Automatically associate the statistics with the source code. It displays the source code of each function, with the running cost of each statement shown in front of it.
(4) You can annotate individual files:
callgrind_annotate callgrind.out.9441 main.c | grep -v "???"
Note: The calls with the "???" prefix are all low-level calls into system libraries. They are not important and can be filtered out with grep -v.
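
To see what --inclusive=yes changes, it helps to profile a program in which one function does almost nothing except call another. The sketch below follows the foo/bar naming used in the option description; the function bodies, the workload size and the name run_workload are assumptions for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Callee: does the actual arithmetic, so most executed instructions land here.
std::uint64_t bar(std::uint64_t n)
{
    std::uint64_t sum = 0;
    for (std::uint64_t i = 1; i <= n; ++i) sum += i * i;
    return sum;
}

// Caller: little work of its own, it only loops over bar().
std::uint64_t foo(std::uint64_t rounds, std::uint64_t n)
{
    std::uint64_t total = 0;
    for (std::uint64_t r = 0; r < rounds; ++r) total += bar(n);
    return total;
}

// Hypothetical entry point to call from main() when profiling.
std::uint64_t run_workload()
{
    const std::uint64_t total = foo(1000, 1000);
    assert(total == 1000 * bar(1000));
    return total;
}
```

When built without optimization, the default (exclusive) listing would be expected to attribute most of the Ir count to bar, while with --inclusive=yes the count shown for foo also contains everything spent in bar. An optimizing build may inline bar into foo, in which case the two no longer appear separately.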

5.2 Basic operations of Callgrind call-graph analysis
Taking the "sample code provided by the official website" on the right side of the project as an example makes this more intuitive:
(1) gcc -g test.c -o test
(2) valgrind --tool=callgrind ./test
This generates a file: callgrind.out.pid
(3) python gprof2dot.py -f callgrind -n10 -s callgrind.out.[pid] > valgrind.dot
(4) dot -Tpng valgrind.dot -o valgrind.png
(5) Open the picture to see the distribution of running-time cost at a glance.


6. Cachegrind tool introduction
6.1 Basic introduction
(1) Cachegrind is a Valgrind-based profiler. As computer systems become more and more complex, the memory system is often the bottleneck, so the cache needs to be profiled.
(2) Functions
① Simulate L1 and L2 caches
② Analyze cache behavior, execution counts, miss rate, etc.
③ Analyze by file, function, line of code, and assembly instruction
(3) Uses
① Detailed cache analysis to find program bottlenecks
② Guidance for improving programs and raising execution efficiency
③ Trace-driven cache simulator
(4) Advantages
① Easy to use, no need to recompile
② Analyzes all executed code, including libraries
③ No language restrictions
④ Relatively fast
⑤ Flexible; can simulate different cache configurations

6.2 Usage steps
(1) valgrind --tool=cachegrind ./test

At the same time, the file cachegrind.out.pid is generated
(2) callgrind_annotate cachegrind.out.4599 | grep -v "???"

As with callgrind, the output can be translated into readable information through callgrind_annotate (Cachegrind also ships its own annotation script, cg_annotate). You can see the hit status of the I1 cache (instruction cache), the D1 cache (data cache), and the LL cache (shared last-level cache).
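
A common way to see the D1 numbers move is to traverse the same matrix in two different orders. The sketch below is an assumption-laden illustration: the function names and the flat row-major std::vector layout are chosen here and are not taken from the original article.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Walks memory sequentially: consecutive accesses are adjacent in the vector.
long long sum_row_major(const std::vector<int> &m, std::size_t rows, std::size_t cols)
{
    assert(m.size() == rows * cols);
    long long sum = 0;
    for (std::size_t r = 0; r < rows; ++r)
        for (std::size_t c = 0; c < cols; ++c)
            sum += m[r * cols + c];
    return sum;
}

// Walks memory with a stride of `cols` elements: same result, different access pattern.
long long sum_column_major(const std::vector<int> &m, std::size_t rows, std::size_t cols)
{
    assert(m.size() == rows * cols);
    long long sum = 0;
    for (std::size_t c = 0; c < cols; ++c)
        for (std::size_t r = 0; r < rows; ++r)
            sum += m[r * cols + c];
    return sum;
}
```

Both functions return the same sum. For a matrix much larger than the simulated D1 cache, the strided version would typically be expected to show more D1 read misses in the Cachegrind output, although the exact figures depend on the simulated cache configuration.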

7. Introduction to Massif tool
Massif is a heap memory analysis tool. It monitors a program's memory allocation by repeatedly taking snapshots of the program's heap.

7.1 Example
(1) g++ test.cc -o test
(2) valgrind --tool=massif ./test produces a massif output file: massif.out.pid
(3) Use ms_print to parse the output file: ms_print massif.out.pid
(4) The graph of snapshots shows how the heap memory changes over time:
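
A program whose heap usage rises and then falls gives Massif something recognizable to plot. The following is a sketch only: the function grow_then_release and its parameters are hypothetical and stand in for whatever test.cc contains in the original.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical workload: allocate `steps` blocks of `block_bytes` each and keep
// them all alive (heap grows), then release them one by one (heap shrinks).
// Returns the peak number of payload bytes held at once.
std::size_t grow_then_release(std::size_t steps, std::size_t block_bytes)
{
    std::vector<std::unique_ptr<char[]>> blocks;
    std::size_t peak = 0;
    for (std::size_t i = 0; i < steps; ++i) {
        blocks.push_back(std::make_unique<char[]>(block_bytes));
        peak = blocks.size() * block_bytes;
    }
    while (!blocks.empty()) blocks.pop_back(); // each pop frees one block
    assert(blocks.empty());
    return peak;
}
```

Called from main() with, say, a few hundred steps, this should produce a heap profile that climbs to a peak and then drops back; the vector's own storage adds a small amount on top of the returned payload figure.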


8. Introduction to Helgrind tool
Helgrind is a key function of Valgrind. This section focuses on detecting basic safety issues related to multi-threading:
① Unsafe access to shared resources
② Deadlock problems
③ Incorrect use of the POSIX pthreads API
④ Once the basics above are safe and error-free, a multi-threaded program should also keep its synchronized blocks as small as possible

8.1 Helgrind: unsafe resource access
Problems solved:
Problem 1: This can be detected well by running Helgrind. Take the following basic program as an example.

#include <pthread.h>

int var = 0;
void* child_fn (void* arg)
{
   var++;
   return NULL;
}
int main (void)
{
    pthread_t child;
    pthread_t child2;
 
    pthread_create(&child,NULL, child_fn, NULL);
    pthread_create(&child2,NULL,child_fn,NULL);
 
    pthread_join(child,NULL);
    pthread_join(child2,NULL);
 
    return 0;
}

Obviously var is shared data that is accessed unsafely. Call Helgrind to see how it is detected:

gcc -g test.c -o test -lpthread
valgrind --tool=helgrind ./test

After running helgrind, its report is printed. From the messages you can see that there are two errors, both caused by the threads competing for the global variable var.
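
The usual fix for this kind of race is to guard every access to the shared variable with a mutex. The sketch below rewrites the example with std::thread and std::mutex rather than the pthread calls of the original; the names var_mutex and run_two_children are assumptions introduced for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

int var = 0;          // shared counter, as in the example above
std::mutex var_mutex; // hypothetical name: guards every access to var

void child_fn()
{
    std::lock_guard<std::mutex> guard(var_mutex); // unlocked automatically on return
    ++var;
}

// Hypothetical driver: starts two threads and waits for both.
int run_two_children()
{
    var = 0;
    std::thread child(child_fn);
    std::thread child2(child_fn);
    child.join();
    child2.join();
    assert(var == 2); // both increments are now ordered by the mutex
    return var;
}
```

With the increment protected this way, the two threads no longer touch var without synchronization, which is the condition Helgrind reports in the unprotected version.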






Question 2:
Deadlock should be avoided as much as possible. Helgrind can detect deadlocks caused by problems in the order of locking and unlocking. For a closer look at this problem, see: https://blog.csdn.net/sfwtoms11/article/details/38438253
First, look at the case of locking the same mutex twice in a row:

#include <pthread.h>

pthread_mutex_t mut_thread;
int var = 0;
void* child_fn ( void* arg )
{
    pthread_mutex_lock(&mut_thread);
    var++;
    pthread_mutex_lock(&mut_thread);
    return NULL;
}

int main ( void )
{
    pthread_t child;
    pthread_t child2;
    pthread_mutex_init(&mut_thread,NULL);
    pthread_create(&child,NULL, child_fn, NULL);
    pthread_create(&child2,NULL,child_fn,NULL);
    pthread_join(child,NULL);
    pthread_join(child2,NULL);
    return 0;
}

Problems caused by the order of locking and unlocking mutexes:

#include <pthread.h>

pthread_mutex_t mut_thread;
pthread_mutex_t mut_thread1;
int var = 0;
void* child_fn ( void* arg ) {
    pthread_mutex_lock(&mut_thread);
    pthread_mutex_lock(&mut_thread1);
    var++;
    pthread_mutex_unlock(&mut_thread);
    pthread_mutex_unlock(&mut_thread1);
    return NULL;
}
void* child_fn1(void *arg)
{
    pthread_mutex_lock(&mut_thread1);
    pthread_mutex_lock(&mut_thread);
    var++;
    pthread_mutex_unlock(&mut_thread1);
    pthread_mutex_unlock(&mut_thread);
    return NULL;
}

int main ( void ) {
    pthread_t child;
    pthread_t child2;
    pthread_mutex_init(&mut_thread,NULL);
    pthread_mutex_init(&mut_thread1,NULL);
    pthread_create(&child,NULL, child_fn, NULL);
    pthread_create(&child2,NULL,child_fn1,NULL);
    pthread_join(child,NULL);
    pthread_join(child2,NULL);
    return 0;
}
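
The lock-order problem in the program above comes from the two thread functions taking the same pair of mutexes in opposite orders. A common remedy is to make every thread acquire them in one agreed order. The sketch below uses std::mutex and std::thread instead of the pthread calls of the original, and the names mut_a, mut_b, shared_total, worker_one, worker_two and run_both_workers are assumptions made for illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

std::mutex mut_a; // hypothetical names for the two mutexes
std::mutex mut_b;
int shared_total = 0;

// Both workers lock mut_a first and mut_b second. The guards release the
// mutexes in reverse order of acquisition when the function returns.
void worker_one()
{
    std::lock_guard<std::mutex> guard_a(mut_a);
    std::lock_guard<std::mutex> guard_b(mut_b);
    ++shared_total;
}

void worker_two()
{
    std::lock_guard<std::mutex> guard_a(mut_a); // same order as worker_one
    std::lock_guard<std::mutex> guard_b(mut_b);
    ++shared_total;
}

// Hypothetical driver: runs both workers once and waits for them.
int run_both_workers()
{
    shared_total = 0;
    std::thread child(worker_one);
    std::thread child2(worker_two);
    child.join();
    child2.join();
    assert(shared_total == 2);
    return shared_total;
}
```

Because no thread ever holds mut_b while waiting for mut_a, the circular wait that makes the original version deadlock-prone cannot form.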

——————————————
Copyright statement: This article is an original article by CSDN blogger "HNU Latecomer" and follows the CC 4.0 BY-SA copyright agreement. Please attach the original source link and this statement when reprinting.
Original link: https://blog.csdn.net/weixin_45518728/article/details/119865117


Origin blog.csdn.net/thanklife/article/details/130992946