2.4.2.4.2. PatFrag的展开
在第一步完成后,我们为所有的PatFrag生成了TreePattern实例。接着,2490行的for循环对其中的片段进行展开。在第一次调用ParsePatternFragments()时(CodeGenDAGPatterns构造函数的2355行),展开输入模式。在第二次调用时(CodeGenDAGPatterns构造函数的2358行),展开输出模式(OutPatFrag定义)。展开由TreePattern的InlinePatternFragments()方法执行:
601 void InlinePatternFragments() {
602 for (unsigned i = 0, e = Trees.size(); i != e; ++i)
603 Trees[i] = Trees[i]->InlinePatternFragments(*this);
604 }
实际上PatFrag都只包含一棵TreePatternNode树。不过其他结构,比如Pattern,Instruction都会调用这个函数,它们可能有多棵树。每棵树的展开由TreePatternNode::InlinePatternFragments()方法来完成。
1358 TreePatternNode *TreePatternNode::InlinePatternFragments(TreePattern &TP) {
1359 if (TP.hasError())
1360 return nullptr;
1361
1362 if (isLeaf())
1363 return this; // nothing to do.
1364 Record *Op = getOperator();
1365
1366 if (!Op->isSubClassOf("PatFrag")) {
1367 // Just recursively inline children nodes.
1368 for (unsigned i = 0, e = getNumChildren(); i != e; ++i) {
1369 TreePatternNode *Child = getChild(i);
1370 TreePatternNode *NewChild = Child->InlinePatternFragments(TP);
1371
1372 assert((Child->getPredicateFns().empty() ||
1373 NewChild->getPredicateFns() == Child->getPredicateFns()) &&
1374 "Non-empty child predicate clobbered!");
1375
1376 setChild(i, NewChild);
1377 }
1378 return this;
1379 }
1380
1381 // Otherwise, we found a reference to a fragment. First, look up its
1382 // TreePattern record.
1383 TreePattern *Frag = TP.getDAGPatterns().getPatternFragment(Op);
1384
1385 // Verify that we are passing the right number of operands.
1386 if (Frag->getNumArgs() != Children.size()) {
1387 TP.error("'" + Op->getName() + "' fragment requires " +
1388 utostr(Frag->getNumArgs()) + " operands!");
1389 return nullptr;
1390 }
1391
1392 TreePatternNode *FragTree = Frag->getOnlyTree()->clone();
1393
1394 TreePredicateFn PredFn(Frag);
1395 if (!PredFn.isAlwaysTrue())
1396 FragTree->addPredicateFn(PredFn);
1397
1398 // Resolve formal arguments to their actual value.
1399 if (Frag->getNumArgs()) {
1400 // Compute the map of formal to actual arguments.
1401 std::map<std::string, TreePatternNode*> ArgMap;
1402 for (unsigned i = 0, e = Frag->getNumArgs(); i != e; ++i)
1403 ArgMap[Frag->getArgName(i)] = getChild(i)->InlinePatternFragments(TP);
1404
1405 FragTree->SubstituteFormalArguments(ArgMap);
1406 }
1407
1408 FragTree->setName(getName());
1409 for (unsigned i = 0, e = Types.size(); i != e; ++i)
1410 FragTree->UpdateNodeType(i, getExtType(i), TP);
1411
1412 // Transfer in the old predicates.
1413 for (unsigned i = 0, e = getPredicateFns().size(); i != e; ++i)
1414 FragTree->addPredicateFn(getPredicateFns()[i]);
1415
1416 // Get a new copy of this fragment to stitch into here.
1417 //delete this; // FIXME: implement refcounting!
1418
1419 // The fragment we inlined could have recursive inlining that is needed. See
1420 // if there are any pattern fragments in it and inline them as needed.
1421 return FragTree->InlinePatternFragments(TP);
1422 }
1362行的isLeaf方法首先筛选出模式树的叶子节点,它们不可能是展开的对象(因为SDNode与PatFrag都不产生叶子节点)。非叶子节点总是由一个DagInit值生成的。
如果这个DagInit值的操作符不是一个PatFrag定义(比如一个SDNode定义),那么操作符本身不可展开,因此在1370行对其操作数进行展开,并替换为展开的结果。注意,这些展开可能没有执行任何操作,得到的结果与原来是一样的。
而如果操作符是一个PatFrag定义,首先确保操作数与子节点个数相同(正常情况下,这是可以保证的,因为子节点就是根据操作数生成的)。然后克隆这棵模式树(TreePatternNode),这样做是因为这个PatFrag可能有多处使用,我们必须保留一个未展开的形式,以备后续展开这个PatFrag的其他地方使用。
如果存在操作数,在1403行将展开的操作数子节点保存在ArgMap容器,然后对克隆树进行参数值的替换。
1328 void TreePatternNode::
1329 SubstituteFormalArguments(std::map<std::string, TreePatternNode*> &ArgMap) {
1330 if (isLeaf()) return;
1331
1332 for (unsigned i = 0, e = getNumChildren(); i != e; ++i) {
1333 TreePatternNode *Child = getChild(i);
1334 if (Child->isLeaf()) {
1335 Init *Val = Child->getLeafValue();
1336 // Note that, when substituting into an output pattern, Val might be an
1337 // UnsetInit.
1338 if (isa<UnsetInit>(Val) || (isa<DefInit>(Val) &&
1339 cast<DefInit>(Val)->getDef()->getName() == "node")) {
1340 // We found a use of a formal argument, replace it with its value.
1341 TreePatternNode *NewChild = ArgMap[Child->getName()];
1342 assert(NewChild && "Couldn't find formal argument!");
1343 assert((Child->getPredicateFns().empty() ||
1344 NewChild->getPredicateFns() == Child->getPredicateFns()) &&
1345 "Non-empty child predicate clobbered!");
1346 setChild(i, NewChild);
1347 }
1348 } else {
1349 getChild(i)->SubstituteFormalArguments(ArgMap);
1350 }
1351 }
1352 }
SubstituteFormalArguments()只替换代表未初始化值(UnsetInit,对应(ops?:$name)或(ops $name)形式)以及(ops node:$name)形式的叶子节点。注意,1341行的Child->getName()返回的是name部分,比如(ops node:$ptr),就是字符串”ptr”。
用一个例子能更容易理解这个过程。比如这些定义:
def P1: PatFrag<(node:$src, node:$dst), (P2 node:$src, node:$dst), [{…}]>;
def P2: PatFrag<(node:$src, node:$dst), (P3 node:$src, (P4 node:$dst)), [{…}]>;
def P3: PatFrag<(node:$src, node:$dst), (S1 node:$src, node:$dst), [{…}]>;
def P4: PatFrag<(node:$val), (S2 node:$val), [{…}]>;
def S1: SDNode;
def S2: SDNode;
对P1解析的结果是:一棵以P2为根,node:$src、node:$dst为叶子的树(称为P1树)。
对P2解析的结果是:一棵以P3为根,node:$src与(P4 node:$dst)为子树的树(称为P2树)。
对P3解析的结果是:一棵以S1为根,node:$src、node:$dst为叶子的树(称为P3树)。
对P4解析的结果是:一棵以S2为根,node:$val为叶子的树(称为P4树)。
开始内联P2树时,InlinePatternFragments()的1383行获取P1的克隆树(称为P1clone)。在1403行递归进入叶子节点node:$src,node:$dst,并在1363行返回。这时,ArgMap[dst]ànode:$dst,ArgMap[src]ànode:$src。然后对P1clone进行参数替换,这一步没有实质的变化。
接着在1421行,对P1clone的P2操作符进行内联。类似的,在1383行获取P2克隆树(称为P2clone),在1403行进入P2clone的子树node:$src与(P4 node:$dst)。node:$src直接返回,得到P2clone的ArgMap[src] ànode:$src。(P4 node:$dst)则再次调用InlinePatternFragments()。在这个过程中,得到P4的克隆树(称为P4clone),因为P2clone的P4子树的叶子节点是node:$dst。因此P2clone有ArgMap[val]ànode:$dst,P4clone进行参数替换,得到(S2 node:$dst)。然后在1421行,内联P4clone的S2操作符。S2是SDNode的定义,且node:$dst是叶子节点,因此S2不改变。自此,P4clone处理完成。
现在P2clone的ArgMap[dst]àP4clone,ArgMap[src]ànode:$src。接着在1421行对P2clone的S1操作符进行内联,也无改变。自此,P2clone内联展开完成。进而P1clone内联展开也完成,得到:(S1 node:$src, (S2 node: $dst)))——同时也实现了node:$dst对node:$val的替换。
V7.0的实现如下。第一个版本的InlinePatternFragments()是所有外部调用的入口,即让一个TreePattern实例内联其中的片段。类型TreePatternNodePtr是std::shared_ptr<TreePatternNode>。 864 void InlinePatternFragments() { 865 std::vector<TreePatternNodePtr> Copy = Trees; 866 Trees.clear(); 867 for (unsigned i = 0, e = Copy.size(); i != e; ++i) 868 Copy[i]->InlinePatternFragments(Copy[i], *this, Trees); 869 } 容器Trees里保存的是由当前PatFrags派生定义里的Fragments域中(list<dag>)每个片段所创建的TreePatternNode实例。在上面865行Trees容器里的对象被移到了Copy临时容器中,然后对每个片段的TreePatternNode实例调用下面TreePatternNode的InlinePatternFragments()进行展开。 1858 void TreePatternNode::InlinePatternFragments( 1859 TreePatternNodePtr T, TreePattern &TP, 1860 std::vector<TreePatternNodePtr> &OutAlternatives) { 1861 1862 if (TP.hasError()) 1863 return; 1864 1865 if (isLeaf()) { 1866 OutAlternatives.push_back(T); // nothing to do. 1867 return; 1868 } 1869 1870 Record *Op = getOperator(); 1871 1872 if (!Op->isSubClassOf("PatFrags")) { 1873 if (getNumChildren() == 0) { 1874 OutAlternatives.push_back(T); 1875 return; 1876 } 1877 1878 // Recursively inline children nodes. 1879 std::vector<std::vector<TreePatternNodePtr> > ChildAlternatives; 1880 ChildAlternatives.resize(getNumChildren()); 1881 for (unsigned i = 0, e = getNumChildren(); i != e; ++i) { 1882 TreePatternNodePtr Child = getChildShared(i); 1883 Child->InlinePatternFragments(Child, TP, ChildAlternatives[i]); 1884 // If there are no alternatives for any child, there are no 1885 // alternatives for this expression as whole. 1886 if (ChildAlternatives[i].empty()) 1887 return; 1888 1889 for (auto NewChild : ChildAlternatives[i]) 1890 assert((Child->getPredicateFns().empty() || 1891 NewChild->getPredicateFns() == Child->getPredicateFns()) && 1892 "Non-empty child predicate clobbered!"); 1893 } 1894 1895 // The end result is an all-pairs construction of the resultant pattern. 1896 std::vector<unsigned> Idxs; 1897 Idxs.resize(ChildAlternatives.size()); 1898 bool NotDone; 1899 do { 1900 // Create the variant and add it to the output list. 1901 std::vector<TreePatternNodePtr> NewChildren; 1902 for (unsigned i = 0, e = ChildAlternatives.size(); i != e; ++i) 1903 NewChildren.push_back(ChildAlternatives[i][Idxs[i]]); 1904 TreePatternNodePtr R = std::make_shared<TreePatternNode>( 1905 getOperator(), std::move(NewChildren), getNumTypes()); 1906 1907 // Copy over properties. 1908 R->setName(getName()); 1909 R->setPredicateFns(getPredicateFns()); 1910 R->setTransformFn(getTransformFn()); 1911 for (unsigned i = 0, e = getNumTypes(); i != e; ++i) 1912 R->setType(i, getExtType(i)); 1913 1914 // Register alternative. 1915 OutAlternatives.push_back(R); 1916 1917 // Increment indices to the next permutation by incrementing the 1918 // indices from last index backward, e.g., generate the sequence 1919 // [0, 0], [0, 1], [1, 0], [1, 1]. 1920 int IdxsIdx; 1921 for (IdxsIdx = Idxs.size() - 1; IdxsIdx >= 0; --IdxsIdx) { 1922 if (++Idxs[IdxsIdx] == ChildAlternatives[IdxsIdx].size()) 1923 Idxs[IdxsIdx] = 0; 1924 else 1925 break; 1926 } 1927 NotDone = (IdxsIdx >= 0); 1928 } while (NotDone); 1929 1930 return; 1931 } 1932 1933 // Otherwise, we found a reference to a fragment. First, look up its 1934 // TreePattern record. 1935 TreePattern *Frag = TP.getDAGPatterns().getPatternFragment(Op); 1936 1937 // Verify that we are passing the right number of operands. 1938 if (Frag->getNumArgs() != Children.size()) { 1939 TP.error("'" + Op->getName() + "' fragment requires " + 1940 Twine(Frag->getNumArgs()) + " operands!"); 1941 return; 1942 } 1943 1944 // Compute the map of formal to actual arguments. 1945 std::map<std::string, TreePatternNodePtr> ArgMap; 1946 for (unsigned i = 0, e = Frag->getNumArgs(); i != e; ++i) { 1947 const TreePatternNodePtr &Child = getChildShared(i); 1948 ArgMap[Frag->getArgName(i)] = Child; 1949 } 1950 1951 // Loop over all fragment alternatives. 1952 for (auto Alternative : Frag->getTrees()) { 1953 TreePatternNodePtr FragTree = Alternative->clone(); 1954 1955 TreePredicateFn PredFn(Frag); 1956 if (!PredFn.isAlwaysTrue()) 1957 FragTree->addPredicateFn(PredFn); 1958 1959 // Resolve formal arguments to their actual value. 1960 if (Frag->getNumArgs()) 1961 FragTree->SubstituteFormalArguments(ArgMap); 1962 1963 // Transfer types. Note that the resolved alternative may have fewer 1964 // (but not more) results than the PatFrags node. 1965 FragTree->setName(getName()); 1966 for (unsigned i = 0, e = FragTree->getNumTypes(); i != e; ++i) 1967 FragTree->UpdateNodeType(i, getExtType(i), TP); 1968 1969 // Transfer in the old predicates. 1970 for (const TreePredicateFn &Pred : getPredicateFns()) 1971 FragTree->addPredicateFn(Pred); 1972 1973 // The fragment we inlined could have recursive inlining that is needed. See 1974 // if there are any pattern fragments in it and inline them as needed. 1975 FragTree->InlinePatternFragments(FragTree, TP, OutAlternatives); 1976 } 1977 } TreePatternNode如果是PagFrags定义里的叶子节点,它的Val域是非空的,Operator域是空的;如果是非叶子节点,则反之。因此,到达1870行的都是非叶子节点。1872~1931行处理非PatFrags的派生定义。我们在前面看到对这样的TreePatternNode节点,其Children容器里保存的是对应dag定义里的操作数的TreePatternNode实例。 在1881行循环,对这些操作数执行深度优先的递归处理。这里,除非处理出错,ChildAlternatives容器是不可能空的,另外,由于操作数有可能展开为多个,因此这个容器是二维的。操作数展开后,如果原来有谓词,展开前后谓词必须相同。 1889~1928行循环执行这样的操作:这是一个深度自递归过程。假设这个dag有两个PatFrags操作数,分别展开为2个及3个结果。那么该循环要生成这两个操作数展开结果的完全组合,并对每对组合生成TreePatternNode实例。所生成的组合依次是:([0][0], [1][0]), ([0][0], [1][1]), ([0][0], [1][2]), ([0][1], [1][0]), ([0][1], [1][1]), ([0][1], [1][2]), ([0][2], [1][0]), ([0][2], [1][1]), ([0][2], [1][2])。(假若PatFrags操作数展开的结果仍然是PatFrags,这些PatFrags结果在1883行的递归调用里先得到处理,与上面类似的结果保存在ChildAlternatives容器里,再进入1899行循环)。每对括号里的操作数与原来的操作符将一起生成一个TreePatternNode实例,然后这些实例保存在OutAlternatives容器里返回。 1935行以下是处理PatFrags的派生定义作为片段dag操作符的情形。在这一行上获取该PatFrags派生定义对应的TreePattern实例,而它所在dag的操作数个数必须与这个TreePattern实例里Children容器保存的TreePatternNode实例个数相同(它们就是从dag的操作数解析来的)。剩下代码执行实参替换。 由上面的处理看到,一个片段的展开可能有多个结果,在这些结果里操作数已经组合好了,因此在1946行循环先将实参的TreePatternNode实例与实参名通过临时容器ArgMap关联,在1952行对片段的每个TreePatternNode实例执行实参替换(通过SubstituteFormalArguments()方法,与上面v3.6.1的处理相同)。最后在1975行对替换后的片段调用InlinePatternFragments()进行最后一轮的(可能的)展开。 |