[Code Quality] Static analysis of C/C++ code and common analysis tools


1 Introduction

  Large C/C++ projects are usually developed by teams, split into modules, and run to hundreds of thousands of lines of code or more. As the codebase and the number of developers grow, so does the likelihood of bugs. This is a matter of probability rather than a failing of individual programmers. Some of these defects are static problems, meaning they can be found without running the program: memory leaks, out-of-bounds memory access, wild pointers, ambiguous logic, deadlocks and so on. Running detection tools over the code before release helps eliminate such static bugs. Beyond finding bugs, these tools can also assess code complexity, code quality and execution efficiency, which provides a basis for improving the code.


2 What is static analysis

  Static program analysis is a code analysis technique that scans source code without executing it, using lexical analysis, syntax analysis, semantic analysis, control flow analysis, data flow analysis and related techniques, to verify whether the code meets requirements for standards compliance, security, reliability and maintainability. Reviewing and analyzing code in this way checks its function and performance and improves its quality. Static analysis can be done in two ways: manual review and software tool analysis.


  • Manual review depends on people and suits small projects or small amounts of code; it is slow and prone to missing defects.

  • Software tool analysis is the preferred approach; its accuracy, reliability and efficiency are generally much higher than those of manual review.


3 Static analysis methods

  • Lexical analysis scans the character stream of the source code in order, uses regular expressions to convert it into equivalent tokens, and produces a token list.

  • Syntax analysis checks that the structure of the source code is correct and organizes the tokens into a syntax tree according to a context-free grammar.

  • Abstract syntax tree analysis organizes the source code into a tree structure in which related code is represented by the nodes of the tree.

  • Semantic analysis examines structurally correct source code and analyzes its context-dependent properties.

  • Control flow analysis reflects how functions nest and call each other and can produce a function call graph. From the source code it also builds a directed control flow graph: nodes represent basic blocks, directed edges between nodes represent control flow paths, and back edges indicate possible loops.

  • Data flow analysis traverses the control flow graph produced by control flow analysis, records where variables are initialized and where they are referenced, and saves this as slice-related data to build a data flow graph.

  • Taint analysis uses the data flow graph produced by data flow analysis to infer which variables in the source code could be controlled by an attacker, and so identifies possible defects in the code.

  • Dead code analysis. In the control flow graph produced by control flow analysis, isolated nodes with no incoming edges (nodes that cannot be reached from the entry) are dead code. This method can reveal problems in the code's logic.
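
  The control flow and dead code bullets above can be sketched together. The example below is a minimal illustration, not how any particular tool is implemented: the ControlFlowGraph type, the function names and the four-block example graph are all hypothetical, and "dead" is taken to mean "not reachable from the entry block".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG representation (an assumption for this sketch):
// node i is a basic block, successors[i] lists the blocks control can flow to.
struct ControlFlowGraph {
    std::vector<std::vector<std::size_t>> successors;
};

// Returns the indices of all blocks that cannot be reached from `entry`,
// in ascending order. Uses a simple worklist traversal of the graph.
std::vector<std::size_t> unreachable_blocks(const ControlFlowGraph& cfg,
                                            std::size_t entry) {
    const std::size_t n = cfg.successors.size();
    std::vector<bool> seen(n, false);
    std::vector<std::size_t> worklist;
    if (entry < n) {
        seen[entry] = true;
        worklist.push_back(entry);
    }
    while (!worklist.empty()) {
        const std::size_t block = worklist.back();
        worklist.pop_back();
        for (std::size_t next : cfg.successors[block]) {
            if (next < n && !seen[next]) {
                seen[next] = true;
                worklist.push_back(next);
            }
        }
    }
    std::vector<std::size_t> dead;
    for (std::size_t i = 0; i < n; ++i) {
        if (!seen[i]) dead.push_back(i);
    }
    return dead;
}

// Example graph for a function shaped like:
//   B0: if (x) goto B1; else goto B2;
//   B1: return 1;
//   B2: return 0;
//   B3: cleanup();   // placed after both returns, so nothing flows into it
ControlFlowGraph example_cfg() {
    ControlFlowGraph cfg;
    cfg.successors = { {1, 2}, {}, {}, {} };
    return cfg;
}
```

  Calling unreachable_blocks(example_cfg(), 0) reports block 3 only, which corresponds to the statement placed after both returns.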


4 Static analysis content

  In most cases static analysis is applied to program source code; in a few cases it is applied to compiled object code (executable files). What static analysis checks can be roughly divided into three categories:

[1] Fatal category (memory related)

[2] Logic

[3] Coding standards and other categories


4.1 Memory Related

  Because C/C++ exposes raw pointers and programmers usually manage dynamic memory by hand, memory defects such as leaks are common. Typical memory-related issues are:

  • Null pointer access: dereferencing a pointer for which no memory has been allocated
  • Accessing memory through a pointer after it has been released (wild pointer)
  • Out-of-bounds memory access
  • Memory leak: allocated memory is never released
  • Double free: the same memory is released more than once
  • File descriptor leak: an opened descriptor is never closed
  • Unsafe format strings, which can cause out-of-bounds memory access

  Memory problems are also often checked at run time with Valgrind; see the article "How to use Valgrind to detect memory leaks".
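
  To make a few of the memory issues listed above concrete, here is a small sketch. All function names are hypothetical. has_prefix_leaky is deliberately defective and is included only to show the kind of early-return leak a static analyzer is expected to flag; the other two functions show one possible way to avoid a leak and an out-of-bounds write.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <memory>

// Deliberately defective (illustration only): the early return on a short
// `text` leaks `buf`, because free() is skipped on that path.
bool has_prefix_leaky(const char* text, const char* prefix) {
    const std::size_t n = std::strlen(prefix);
    char* buf = static_cast<char*>(std::malloc(n + 1));
    if (buf == nullptr) return false;
    if (std::strlen(text) < n) return false;  // memory leak: buf not freed
    std::memcpy(buf, text, n);
    buf[n] = '\0';
    const bool same = std::strcmp(buf, prefix) == 0;
    std::free(buf);
    return same;
}

// One possible fix: check inputs before allocating and let unique_ptr
// release the buffer on every return path.
bool has_prefix_fixed(const char* text, const char* prefix) {
    if (text == nullptr || prefix == nullptr) return false;  // null guard
    const std::size_t n = std::strlen(prefix);
    if (std::strlen(text) < n) return false;  // nothing allocated yet
    std::unique_ptr<char[]> buf(new char[n + 1]);
    std::memcpy(buf.get(), text, n);
    buf[n] = '\0';
    return std::strcmp(buf.get(), prefix) == 0;
}

// Bounded copy: never writes past dst[dst_size - 1] and always terminates.
// Returns true only if the whole source string fit.
bool copy_bounded(char* dst, std::size_t dst_size, const char* src) {
    if (dst == nullptr || src == nullptr || dst_size == 0) return false;
    std::size_t n = std::strlen(src);
    const bool fits = n < dst_size;
    if (!fits) n = dst_size - 1;  // truncate instead of overflowing
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return fits;
}
```

  The leaky and fixed variants return the same results for valid inputs; the difference is only visible to a leak checker, which is why this class of defect is easy to miss in manual review.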


4.2 Logic

  • Logic errors: duplicated code branches, missing branch statements (such as a switch case missing its break), comparisons between variables of inconsistent types, conditions that are always true or always false
  • Operation errors: division by 0, testing whether an unsigned number is less than 0, incrementing a bool
  • Suspicious constructs: infinite loops, deadlocks, "=" used in place of "==" in an if statement, returning the address of a local variable, variable overflow
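
  Two of the logic defects above, a switch case missing its break and "=" written where "==" was intended, can be sketched as follows. The enum and function names are hypothetical, and the _buggy functions are deliberately defective; they are well-defined C++ that compiles, which is exactly why a static analyzer (or compiler warning) is needed to catch them.

```cpp
#include <cassert>

enum class Light { Red, Yellow, Green };

// Deliberately defective: the Red case has no break, so control falls
// through into the Yellow case and the value 30 is overwritten with 5.
int wait_seconds_buggy(Light light) {
    int seconds = 0;
    switch (light) {
        case Light::Red:    seconds = 30;         // missing break
        case Light::Yellow: seconds = 5;  break;
        case Light::Green:  seconds = 0;  break;
    }
    return seconds;
}

// Fixed: every case ends with break.
int wait_seconds_fixed(Light light) {
    int seconds = 0;
    switch (light) {
        case Light::Red:    seconds = 30; break;
        case Light::Yellow: seconds = 5;  break;
        case Light::Green:  seconds = 0;  break;
    }
    return seconds;
}

// Deliberately defective: "=" assigns 1 to state, so the condition is
// always true regardless of the argument.
bool is_ready_buggy(int state) {
    if (state = 1) return true;
    return false;
}

// Fixed: "==" compares instead of assigning.
bool is_ready_fixed(int state) {
    return state == 1;
}
```

  Here wait_seconds_buggy(Light::Red) yields 5 instead of 30, and is_ready_buggy(0) yields true; the fixed versions return 30 and false respectively.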

4.3 Programming style and others

  • Programming style: naming, conformance to standards, readability, portability and reusability
  • Execution problems: unused functions, unused variables, unreachable code (for example, code after an early return)
  • Hidden risks: syntax problems, ambiguous logic, type casts, compiler warnings, volatile problems
  • Efficiency issues: time complexity, space complexity, loop logic
  • Industry standards, such as MISRA C

5 Common static analysis tools


| Tool | Supported languages | Supported platforms | License | Description |
|------|---------------------|---------------------|---------|-------------|
| AdLint | C | Linux, Windows, Mac OS | Open source | Visualizes code quality evaluation; supports multiple software quality metrics |
| Coverity Prevent | C/C++, C#, Java | Linux, Windows, Mac OS | Paid | Provides a range of auxiliary tools; focuses on accurately finding the most serious and hardest-to-detect defects |
| Flawfinder | C/C++ | Linux, Windows | Open source | A C/C++ security review tool written in Python; performs lexical scanning against a built-in database of vulnerability patterns such as buffer overflows and format string vulnerabilities; scans quickly and ranks findings by risk level, so existing problems can be found quickly |
| Klocwork | C/C++, C#, Java | Linux, Windows | Paid | The most widely used analysis tool in China |
| RATS | C/C++, Python, Perl, PHP | Linux, Windows | Open source | Scanning rules are fairly coarse |
| PC-Lint | C/C++ | Windows | Paid | A commercial static analyzer for C/C++ from Gimpel Software |
| Cppcheck | C/C++ | Linux, Windows | Open source | Supports both a graphical interface and the command line |
| Splint | C | Linux | Open source | A static checker for security vulnerabilities in C programs; supports a range of routine checks |
| what | C/C++ | Linux | Open source | Lightweight static analyzer that runs on Linux-like systems |
| BLAST | C | Linux | Open source | A C analyzer that uses counterexample-driven automatic abstraction refinement to build an abstract model and verify its safety properties |
| Frama-C | C | Linux, Windows, Mac OS | Open source | Static analyzer for the C language |
| ITS4 | C/C++ | Linux, Windows | Open source | An automated source code review tool developed by Cigital; it does not understand program context and produces many false positives |
| CoBot | C/C++ | Linux, Windows | Open source | Developed by Peking University; China's first software security testing tool certified by CWE |
| TscanCode | C/C++, C#, Lua | Linux, Windows, Mac OS | Open source | Static analysis tool developed by Tencent |

Recommended tools:

CoBot, TscanCode, Cppcheck, Flawfinder

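  For personal use, open-source tools are the first choice. Paid tools are very powerful but expensive, which makes them better suited to companies.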


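  In essence, both manual review and software analysis can produce false positives and can even miss defects. The fundamental way to improve code quality therefore lies in the coding process itself: building good coding habits is the most reliable way to ensure code quality.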




Origin blog.csdn.net/qq_20553613/article/details/108608856