2312 LLVM: building a Clang tool with matchers

Building tools with LibTooling and LibASTMatchers

This article shows how to build a useful source-to-source translation tool based on Clang's LibTooling.

Step 0: Obtaining Clang

Because Clang is part of the LLVM project, you need to download the LLVM source code first. Clang and LLVM live in the same git repository, in different directories. See the Getting Started Guide for more information.

mkdir ~/clang-llvm && cd ~/clang-llvm
git clone https://github.com/llvm/llvm-project.git

Next, you need to obtain the CMake build system and the Ninja build tool.

cd ~/clang-llvm
git clone https://github.com/martine/ninja.git
cd ninja
git checkout release
./bootstrap.py
sudo cp ninja /usr/bin/
cd ~/clang-llvm
git clone git://cmake.org/stage/cmake.git
cd cmake
git checkout next
./bootstrap
make
sudo make install

OK. Now build Clang!

cd ~/clang-llvm
mkdir build && cd build
# -DLLVM_BUILD_TESTS=ON enables the tests, which are off by default.
cmake -G Ninja ../llvm-project/llvm -DLLVM_ENABLE_PROJECTS="clang;clang-tools-extra" -DLLVM_BUILD_TESTS=ON
ninja
ninja check       # Test LLVM only.
ninja clang-test  # Test Clang only.
ninja install

OK, that's it. All tests should pass.
Finally, set Clang as its own compiler.

cd ~/clang-llvm/build
ccmake ../llvm-project/llvm

The second command opens a GUI for configuring Clang. You need to set the CMAKE_CXX_COMPILER entry. Press "t" to turn on advanced mode, scroll down to CMAKE_CXX_COMPILER, and set it to /usr/bin/clang++ or wherever you installed it.
Press "c" to configure, then "g" to generate the CMake files.
Finally, run ninja one last time and you are done.

Step 1: Create a ClangTool

Let's create the simplest useful ClangTool: a syntax checker. It already exists as clang-check, but building it shows what is going on.

First, create a new directory for the tool and tell CMake that it exists. Since this will not be a core clang tool, it lives in clang-tools-extra.

cd ~/clang-llvm/llvm-project
mkdir clang-tools-extra/loop-convert
echo 'add_subdirectory(loop-convert)' >> clang-tools-extra/CMakeLists.txt
vim clang-tools-extra/loop-convert/CMakeLists.txt

CMakeLists.txt should contain the following:

set(LLVM_LINK_COMPONENTS support)
add_clang_executable(loop-convert
  LoopConvert.cpp
  )
target_link_libraries(loop-convert
  PRIVATE
  clangAST
  clangASTMatchers
  clangBasic
  clangFrontend
  clangSerialization
  clangTooling
  )

Once that is done, Ninja can compile the tool. Let's give it something to compile! Put the following into clang-tools-extra/loop-convert/LoopConvert.cpp.
See the LibTooling documentation for a detailed explanation of why the different parts are needed.

// Declares clang::SyntaxOnlyAction.
#include "clang/Frontend/FrontendActions.h"
#include "clang/Tooling/CommonOptionsParser.h"
#include "clang/Tooling/Tooling.h"
// Declares llvm::cl::extrahelp.
#include "llvm/Support/CommandLine.h"

using namespace clang::tooling;
using namespace llvm;

// Apply a custom category to all command-line options so that they are the
// only ones displayed.
static llvm::cl::OptionCategory MyToolCategory("my-tool options");

// CommonOptionsParser declares HelpMessage with a description of the common
// command-line options related to the compilation database and input files.
// It is nice to have this help message in all tools.
static cl::extrahelp CommonHelp(CommonOptionsParser::HelpMessage);

// A help message for this specific tool can be added afterwards.
static cl::extrahelp MoreHelp("\nMore help text...\n");

int main(int argc, const char **argv) {
  auto ExpectedParser = CommonOptionsParser::create(argc, argv, MyToolCategory);
  if (!ExpectedParser) {
    // Fail gracefully for unsupported options.
    llvm::errs() << ExpectedParser.takeError();
    return 1;
  }
  CommonOptionsParser& OptionsParser = ExpectedParser.get();
  ClangTool Tool(OptionsParser.getCompilations(), OptionsParser.getSourcePathList());
  return Tool.run(newFrontendActionFactory<clang::SyntaxOnlyAction>().get());
}

That's it! You can compile the new tool by running ninja from the build directory.

cd ~/clang-llvm/build
ninja

You should now be able to run the syntax checker, located in ~/clang-llvm/build/bin, on any source file. Try it!

echo "int main() { return 0; }" > test.cpp
bin/loop-convert test.cpp --

Note the two dashes after the source file. Additional compiler options are passed after the dashes instead of being loaded from a compilation database; no options are needed right now.

Intermezzo: Learn AST matcher basics

Clang recently introduced the ASTMatcher library, which provides a simple, powerful, and concise way to describe specific patterns in the AST.

Implemented as a DSL built on macros and templates (see ASTMatchers.h), matchers have the feel of the algebraic data types common in functional programming languages.

For example, suppose you only want to examine binary operators. There is a matcher for exactly that, named binaryOperator:

binaryOperator(hasOperatorName("+"), hasLHS(integerLiteral(equals(0))))

It matches addition expressions whose left-hand side is exactly the literal 0. It does not match other forms of 0, such as '\0' or NULL, but it does match macros that expand to 0.

The matcher also does not match calls to an overloaded operator "+", because a separate operatorCallExpr matcher (named cxxOperatorCallExpr in recent Clang releases) handles overloaded operators.
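To make these rules concrete, here is a small input file one could run such a matcher over. It is only a sketch: every name in it (ZERO, Meters, the function names) is invented for illustration, and the comments state the behaviour expected from the description above, not output captured from a tool.

```cpp
// Hypothetical input file for experimenting with
//   binaryOperator(hasOperatorName("+"), hasLHS(integerLiteral(equals(0))))
// The annotations describe expected behaviour; they are not tool output.

#define ZERO 0  // a macro that expands to the integer literal 0

// A small class with an overloaded operator+ (hypothetical).
struct Meters {
  int value;
};

Meters operator+(Meters lhs, Meters rhs) {
  return Meters{lhs.value + rhs.value};
}

int literalZero(int x) {
  return 0 + x;  // expected to match: the LHS is the integer literal 0
}

int macroZero(int x) {
  return ZERO + x;  // expected to match: the macro expands to 0
}

int charZero(int x) {
  return '\0' + x;  // not expected to match: the LHS is a character literal
}

int zeroOnRight(int x) {
  return x + 0;  // not expected to match: the literal is on the RHS
}

Meters overloaded(Meters a, Meters b) {
  return a + b;  // not expected to match: this is an overloaded operator call
}
```

All five functions compute ordinary sums; the only thing that differs is the shape of the AST node the matcher sees.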

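There are node matchers for all the different kinds of AST nodes, narrowing matchers that only match AST nodes fulfilling specific criteria, and traversal matchers that get from one kind of AST node to another.

For a complete list of AST matchers, see the AST Matcher Reference.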

All matchers that are nouns describe entities in the AST and can be bound, so that they can be referred to whenever a match is found. To do this, call the bind method on these matchers, for example:

varDecl(hasType(isInteger())).bind("intvar")

Step 2: Using AST matchers

Okay, on to using matchers for real. First define a matcher that captures all for statements that define a new variable initialized to zero. Start by matching all for loops:

forStmt()

Next, specify that a single variable is declared in the first part of the loop, so the matcher can be extended to:

forStmt(hasLoopInit(declStmt(hasSingleDecl(varDecl()))))

Finally, add the condition that the variable is initialized to zero:

forStmt(hasLoopInit(declStmt(hasSingleDecl(varDecl(
  hasInitializer(integerLiteral(equals(0))))))))

The matcher definition is easy to read and understand ("match loops whose init part declares a single variable initialized to the integer literal 0"), but deciding that every part is necessary is harder.

Note that this matcher will not match loops whose variable is initialized to '\0', 0.0, NULL, or any form of zero other than the integer literal 0. The last step is to give the matcher a name and bind the ForStmt, because we want to do something with it:

StatementMatcher LoopMatcher =
  forStmt(hasLoopInit(declStmt(hasSingleDecl(varDecl(
    hasInitializer(integerLiteral(equals(0)))))))).bind("forLoop");

After you have defined your matchers, you need a little more scaffolding to run them. Matchers are paired with a MatchCallback and registered with a MatchFinder object, which is then run from a ClangTool.
Add the following to LoopConvert.cpp:

#include "clang/ASTMatchers/ASTMatchers.h"
#include "clang/ASTMatchers/ASTMatchFinder.h"

using namespace clang;
using namespace clang::ast_matchers;

StatementMatcher LoopMatcher =
  forStmt(hasLoopInit(declStmt(hasSingleDecl(varDecl(
    hasInitializer(integerLiteral(equals(0)))))))).bind("forLoop");

class LoopPrinter : public MatchFinder::MatchCallback {
public:
  // Called once for every match; dumps the AST of the bound for loop.
  void run(const MatchFinder::MatchResult &Result) override {
    if (const ForStmt *FS = Result.Nodes.getNodeAs<clang::ForStmt>("forLoop"))
      FS->dump();
  }
};

Then change main() to:

int main(int argc, const char **argv) {
  auto ExpectedParser = CommonOptionsParser::create(argc, argv, MyToolCategory);
  if (!ExpectedParser) {
    // Fail gracefully for unsupported options.
    llvm::errs() << ExpectedParser.takeError();
    return 1;
  }
  CommonOptionsParser& OptionsParser = ExpectedParser.get();
  ClangTool Tool(OptionsParser.getCompilations(), OptionsParser.getSourcePathList());
  LoopPrinter Printer;
  MatchFinder Finder;
  Finder.addMatcher(LoopMatcher, &Printer);
  return Tool.run(newFrontendActionFactory(&Finder).get());
}

Now you should be able to recompile and run the code to discover for loops. Create a new file containing a few examples, and test the new handiwork:

cd ~/clang-llvm/build
ninja loop-convert
vim ~/test-files/simple-loops.cc
bin/loop-convert ~/test-files/simple-loops.cc --
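The article does not show the test file itself, so the following is only one possible sketch of simple-loops.cc. The function names are hypothetical, and the comments describe what the Step 2 LoopMatcher is expected to do based on how it was built, not recorded output.

```cpp
// Hypothetical contents for ~/test-files/simple-loops.cc.

// Expected to match: the init part declares one variable initialized to 0.
int sumBelow(int n) {
  int total = 0;
  for (int i = 0; i < n; ++i)
    total += i;
  return total;
}

// Not expected to match: the variable is initialized to 1, not 0.
int factorial(int n) {
  int result = 1;
  for (int i = 1; i <= n; ++i)
    result *= i;
  return result;
}

// Not expected to match: the init part assigns to an existing variable
// instead of declaring a new one, so there is no declStmt.
int countUp(int n) {
  int i;
  int steps = 0;
  for (i = 0; i < n; ++i)
    ++steps;
  return steps;
}

// Not expected to match: two variables are declared, so hasSingleDecl fails.
int meetInMiddle(int n) {
  int steps = 0;
  for (int lo = 0, hi = n; lo < hi; ++lo, --hi)
    ++steps;
  return steps;
}
```

If the tool behaves as described, only the loop in sumBelow should be dumped.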

Step 3.5: More complex matchers

The simple matcher can find for loops, but many more loops still need to be filtered out. A good part of the remaining work can be done with some cleverly chosen matchers, but first we need to decide exactly which properties to allow.

How can we characterize for loops over arrays that are eligible for conversion to range-based syntax? Range-based loops over an array of size N:
1. start at index 0
2. iterate consecutively
3. end at index N-1

We already check (1), so all that is left is to add a check on the loop condition to ensure that the loop's index variable is compared against N, and another check to ensure that the increment step just increments this same variable.
The matcher for (2) is simple: require a pre-increment or post-increment of the same variable that is declared in the init part.

Unfortunately, such a matcher cannot be written. Matchers contain no logic for comparing two arbitrary AST nodes and determining whether they are equal, so the best we can do is match more than we want to allow and do the extra comparisons in the callback.
We can still start building the sub-matcher. Require that the increment step be a unary increment, as follows:

hasIncrement(unaryOperator(hasOperatorName("++")))

Specifying what is incremented introduces another quirk of Clang's AST: uses of variables are represented as DeclRefExpr ("declaration reference expressions"), because they are expressions that refer to variable declarations.

To find a unaryOperator that refers to a specific declaration, simply add a second condition to it:

hasIncrement(unaryOperator(
  hasOperatorName("++"),
  hasUnaryOperand(declRefExpr())))

Additionally, you can restrict the matcher to match only when the incremented variable is an integer:

hasIncrement(unaryOperator(
  hasOperatorName("++"),
  hasUnaryOperand(declRefExpr(to(varDecl(hasType(isInteger())))))))

The last step is to attach an identifier to this variable so that it can be retrieved in the callback:

hasIncrement(unaryOperator(
  hasOperatorName("++"),
  hasUnaryOperand(declRefExpr(to(
    varDecl(hasType(isInteger())).bind("incrementVariable"))))))

Add this code to the definition of LoopMatcher and make sure that the program, equipped with the new matcher, only prints loops that declare a single variable initialized to zero and whose increment step is a unary increment of some variable.

Now we just need to add a matcher that checks whether the condition part of the for loop compares a variable against the size of the array. There is just one problem: without looking at the loop body, we do not know which array the loop iterates over!

Again we are limited to approximating the desired result with matchers and filling in the details in the callback. So, start with:

hasCondition(binaryOperator(hasOperatorName("<")))

It makes sense to ensure that the left-hand side is a reference to a variable and that the right-hand side has integer type:

hasCondition(binaryOperator(
  hasOperatorName("<"),
  hasLHS(declRefExpr(to(varDecl(hasType(isInteger()))))),
  hasRHS(expr(hasType(isInteger())))))

Why? Because it doesn't work. Of the three loops provided in test-files/simple.cpp, none has a matching condition. A quick look at the AST dump of the first for loop, produced by the previous iteration of loop-convert, shows the answer:

(ForStmt 0x173b240
  (DeclStmt 0x173afc8
    0x173af50 "int i =
      (IntegerLiteral 0x173afa8 'int' 0)")
  <<>>
  (BinaryOperator 0x173b060 '_Bool' '<'
    (ImplicitCastExpr 0x173b030 'int'
      (DeclRefExpr 0x173afe0 'int' lvalue Var 0x173af50 'i' 'int'))
    (ImplicitCastExpr 0x173b048 'int'
      (DeclRefExpr 0x173b008 'const int' lvalue Var 0x170fa80 'N' 'const int')))
  (UnaryOperator 0x173b0b0 'int' lvalue prefix '++'
    (DeclRefExpr 0x173b088 'int' lvalue Var 0x173af50 'i' 'int'))
  (CompoundStatement ...

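We already know that the declaration and the increment both match, otherwise this loop would not have been dumped. The culprit is the implicit cast applied to the first operand (i.e. the LHS) of the less-than operator: an lvalue-to-rvalue conversion applied to the expression referencing i.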

Thankfully, the matcher library offers a solution to this problem in ignoringParenImpCasts, which tells the matcher to ignore implicit casts and parentheses before continuing to match.

Adjusting the condition operator restores the desired match:

hasCondition(binaryOperator(
  hasOperatorName("<"),
  hasLHS(ignoringParenImpCasts(declRefExpr(
    to(varDecl(hasType(isInteger())))))),
  hasRHS(expr(hasType(isInteger())))))

After adding bindings to the expressions we want to capture and extracting the identifier strings into variables, step 2 for arrays is complete.

Step 4: Extract matching nodes

So far, the matcher callback is not very interesting: it just dumps the loop's AST. At some point, we will need to make changes to the input source code. Next, we use the nodes bound in the previous step.

The MatchFinder::run() callback takes a MatchFinder::MatchResult& as its parameter. Of most interest are its Context and Nodes members.

Clang uses the ASTContext class to represent contextual information about the AST; its most important practical role is that several operations require an ASTContext* parameter.
More immediately useful is the set of matched nodes and how to retrieve them.
Because three variables are bound (the condition, init, and increment variables, bound in the code further down as "condVarName", "initVarName", and "incVarName"), the matched nodes can be obtained with the getNodeAs() member function. In LoopConvert.cpp, add:

#include "clang/AST/ASTContext.h"

Change LoopMatcher to:

StatementMatcher LoopMatcher =
    forStmt(hasLoopInit(declStmt(
                hasSingleDecl(varDecl(hasInitializer(integerLiteral(equals(0)))).bind("initVarName")))),
            hasIncrement(unaryOperator(
                hasOperatorName("++"),
                hasUnaryOperand(declRefExpr(
                    to(varDecl(hasType(isInteger())).bind("incVarName")))))),
            hasCondition(binaryOperator(
                hasOperatorName("<"),
                hasLHS(ignoringParenImpCasts(declRefExpr(
                    to(varDecl(hasType(isInteger())).bind("condVarName"))))),
                hasRHS(expr(hasType(isInteger())))))).bind("forLoop");

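and change LoopPrinter::run to the following (if you define it outside the class as shown here, keep only its declaration inside LoopPrinter):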

void LoopPrinter::run(const MatchFinder::MatchResult &Result) {
  ASTContext *Context = Result.Context;
  const ForStmt *FS = Result.Nodes.getNodeAs<ForStmt>("forLoop");
  // We do not want to convert header files!
  if (!FS || !Context->getSourceManager().isWrittenInMainFile(FS->getForLoc()))
    return;
  const VarDecl *IncVar = Result.Nodes.getNodeAs<VarDecl>("incVarName");
  const VarDecl *CondVar = Result.Nodes.getNodeAs<VarDecl>("condVarName");
  const VarDecl *InitVar = Result.Nodes.getNodeAs<VarDecl>("initVarName");
  // The matcher cannot compare nodes, so check here that all three parts of
  // the loop refer to the same variable.
  if (!areSameVariable(IncVar, CondVar) || !areSameVariable(IncVar, InitVar))
    return;
  llvm::outs() << "Potential array-based loop discovered.\n";
}

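Clang associates a VarDecl with each variable to represent the variable's declaration. Since the "canonical" form of each declaration is unique by address, it is only necessary to ensure that neither ValueDecl (the base class of VarDecl) is NULL and to compare the canonical declarations. Place this helper above LoopPrinter::run so that it is declared before it is used.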

static bool areSameVariable(const ValueDecl *First, const ValueDecl *Second) {
  return First && Second &&
         First->getCanonicalDecl() == Second->getCanonicalDecl();
}

If execution reaches the end of LoopPrinter::run(), we know that the loop shell looks like this:

for (int i = 0; i < expr(); ++i) { ... }

For now, just print a message stating that a loop was found.
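To see how the work is split between the matcher and the callback, it can help to try loops like the ones below. This is a sketch with hypothetical function names, not the contents of test-files/simple.cpp, and the comments state the outcome expected from the Step 4 matcher and callback as described in this article.

```cpp
// Hypothetical test loops for the Step 4 matcher and callback.

// Expected to reach the end of LoopPrinter::run(): the same variable is
// declared, compared, and incremented.
int sumArray(const int *data, int n) {
  int total = 0;
  for (int i = 0; i < n; ++i)
    total += data[i];
  return total;
}

// Also expected to reach the end: ignoringParenImpCasts looks through the
// parentheses around i in the condition.
int parenthesized(int n) {
  int iterations = 0;
  for (int i = 0; (i) < n; ++i)
    ++iterations;
  return iterations;
}

// Expected to match the matcher but be rejected by areSameVariable: the
// condition tests j while the increment step changes i.
int mismatchedCondition(int n) {
  int j = 0;
  int iterations = 0;
  for (int i = 0; j < n; ++i) {
    ++j;  // the body advances j so that the loop terminates
    ++iterations;
  }
  return iterations;
}

// Not expected to match at all: the condition uses != rather than <.
int notEqualCondition(int n) {
  int iterations = 0;
  for (int i = 0; i != n; ++i)
    ++iterations;
  return iterations;
}

// Not expected to match at all: i += 1 is not a unary ++.
int compoundIncrement(int n) {
  int iterations = 0;
  for (int i = 0; i < n; i += 1)
    ++iterations;
  return iterations;
}
```

The third loop is the interesting case: each sub-matcher is satisfied on its own, and only the comparison in the callback notices that different variables are involved.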

By the way, testing whether two expressions are the same is not that simple, although Clang has already done the hard work by providing a way to canonicalize expressions:

static bool areSameExpr(ASTContext *Context, const Expr *First, const Expr *Second) {
  if (!First || !Second)
    return false;
  llvm::FoldingSetNodeID FirstID, SecondID;
  First->Profile(FirstID, *Context, true);
  Second->Profile(SecondID, *Context, true);
  return FirstID == SecondID;
}

This code relies on comparing two llvm::FoldingSetNodeIDs. As the documentation for Stmt::Profile() indicates, the Profile() member function builds a description of an AST node based on its properties and those of its children. FoldingSetNodeID then serves as a hash that can be used to compare expressions. areSameExpr will be needed later. Before running the new code on the additional loops added to test-files/simple.cpp, try to figure out which loops will be considered potentially convertible.

Origin blog.csdn.net/fqbqrr/article/details/135231376