LLVM Study Notes (36)

3.5.1.6.       Inferring Scheduling Classes

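So far, the relevant definitions have been linked together through the various containers in CodeGenSchedModels. From these definitions we now infer all scheduling classes. The loop at line 840 walks every scheduling class. Note that its termination condition, SchedClasses.size(), is re-evaluated on each iteration: the processing below appends new scheduling classes to the container, and the loop visits those as well.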

836      void CodeGenSchedModels::inferSchedClasses() {
837        DEBUG(dbgs() << NumInstrSchedClasses << " instr sched classes.\n");
838
839        // Visit all existing classes and newly created classes.
840        for (unsigned Idx = 0; Idx != SchedClasses.size(); ++Idx) {
841          assert(SchedClasses[Idx].Index == Idx && "bad SCIdx");
842
843          if (SchedClasses[Idx].ItinClassDef)
844            inferFromItinClass(SchedClasses[Idx].ItinClassDef, Idx);
845          if (!SchedClasses[Idx].InstRWs.empty())
846            inferFromInstRWs(Idx);
847          if (!SchedClasses[Idx].Writes.empty()) {
848            inferFromRW(SchedClasses[Idx].Writes, SchedClasses[Idx].Reads,
849                        Idx, SchedClasses[Idx].ProcIndices);
850          }
851          assert(SchedClasses.size() < (NumInstrSchedClasses*6) &&
852                 "too many SchedVariants");
853        }
854      }
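
The "visit existing and newly created entries" pattern of the loop at line 840 can be tried out in isolation. The following is only a minimal sketch under stated assumptions: ToyClass and visitAll are hypothetical names that do not exist in LLVM, and "spawning" stands in for whatever inference appends new scheduling classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scheduling class: visiting an entry
// appends `Spawn` new entries to the container.
struct ToyClass {
  unsigned Index;
  unsigned Spawn;
};

// Visits every element, including those appended during the walk,
// and returns how many elements were visited.
std::size_t visitAll(std::vector<ToyClass> &Classes) {
  std::size_t Visited = 0;
  // size() is re-read on every iteration, so appended entries are seen.
  // Indexing stays valid across push_back; iterators would not.
  for (std::size_t Idx = 0; Idx != Classes.size(); ++Idx) {
    unsigned Spawn = Classes[Idx].Spawn; // copy before the vector may grow
    for (unsigned I = 0; I < Spawn; ++I)
      Classes.push_back({static_cast<unsigned>(Classes.size()), 0});
    ++Visited;
  }
  return Visited;
}
```

Holding a reference to Classes[Idx] across the push_back would be unsafe, which is the same hazard the WARNING comment in inferFromRW (lines 1377 and 1378) points out for SchedClasses.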

3.5.1.6.1.  Inferring from ItinRW Definitions

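The first step handles ItinRW definitions, which map a set of InstrItinClass (execution stages) to a set of SchedReadWrite (resource usage); see line 843. For convenience, the definitions involved are listed below: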

402      class ItinRW<list<SchedReadWrite> rw, list<InstrItinClass> iic> {
403        list<InstrItinClass> MatchedItinClasses = iic;
404        list<SchedReadWrite> OperandReadWrites = rw;
405        SchedMachineModel SchedModel = ?;
406      }

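Members of OperandReadWrites may be SchedWriteVariant or SchedReadVariant definitions, both of which derive from SchedVariant. A SchedVariant holds a list of Variants; each variant (a SchedVar) carries a predicate, and when that predicate holds, the SchedReadWrite objects referenced by Selected apply.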

361      class SchedVariant<list<SchedVar> variants> {
362        list<SchedVar> Variants = variants;
363        bit Variadic = 0;
364        SchedMachineModel SchedModel = ?;
365      }

355      class SchedVar<SchedPredicate pred, list<SchedReadWrite> selected> {
356        SchedPredicate Predicate = pred;
357        list<SchedReadWrite> Selected = selected;
358      }

374      class SchedWriteVariant<list<SchedVar> variants> : SchedWrite,
375        SchedVariant<variants> {
376      }

383      class SchedReadVariant<list<SchedVar> variants> : SchedRead,
384        SchedVariant<variants> {
385      }

In addition, a SchedAlias can redirect one SchedReadWrite definition (match) to another (alias).

415      class SchedAlias<SchedReadWrite match, SchedReadWrite alias> {
416        SchedReadWrite MatchRW = match;
417        SchedReadWrite AliasRW = alias;
418        SchedMachineModel SchedModel = ?;
419      }

Members of OperandReadWrites can also be WriteSequence definitions:

224      class WriteSequence<list<SchedWrite> writes, int rep = 1> : SchedWrite {
225        list<SchedWrite> Writes = writes;
226        int Repeat = rep;
227        SchedMachineModel SchedModel = ?;
228      }

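To complicate matters, the SchedWrite and SchedReadWrite entries inside these definitions may themselves be SchedWriteVariant, SchedReadVariant, or WriteSequence definitions, possibly with SchedAlias definitions attached (deeply nested definitions are unlikely in practice, of course). As an ARM example, here is a fragment of the instruction scheduling definitions for the ARM Cortex-A9 processor (ARMScheduleA9.td):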

1912   let SchedModel = CortexA9Model in {

1926   def A9WriteIssue : SchedWriteRes<[]> { let Latency = 0; }

2037   def A9WriteLMAdr : SchedWriteVariant<[
2038     SchedVar<A9LMAdr1Pred, [A9WriteAdr1]>,
2039     SchedVar<A9LMAdr2Pred, [A9WriteAdr2]>,
2040     SchedVar<A9LMAdr3Pred, [A9WriteAdr3]>,
2041     SchedVar<A9LMAdr4Pred, [A9WriteAdr4]>,
2042     SchedVar<A9LMAdr5Pred, [A9WriteAdr5]>,
2043     SchedVar<A9LMAdr6Pred, [A9WriteAdr6]>,
2044     SchedVar<A9LMAdr7Pred, [A9WriteAdr7]>,
2045     SchedVar<A9LMAdr8Pred, [A9WriteAdr8]>,
2046     // For unknown LDM/VLDM/VSTM, assume 2 32-bit registers.
2047     SchedVar<A9LMUnknownPred, [A9WriteAdr2]>]>;

2086   def A9WriteLM : SchedWriteVariant<[
2087     SchedVar<A9LMAdr1Pred, A9WriteLMOpsList.Writes[0-1]>,
2088     SchedVar<A9LMAdr2Pred, A9WriteLMOpsList.Writes[0-3]>,
2089     SchedVar<A9LMAdr3Pred, A9WriteLMOpsList.Writes[0-5]>,
2090     SchedVar<A9LMAdr4Pred, A9WriteLMOpsList.Writes[0-7]>,
2091     SchedVar<A9LMAdr5Pred, A9WriteLMOpsList.Writes[0-9]>,
2092     SchedVar<A9LMAdr6Pred, A9WriteLMOpsList.Writes[0-11]>,
2093     SchedVar<A9LMAdr7Pred, A9WriteLMOpsList.Writes[0-13]>,
2094     SchedVar<A9LMAdr8Pred, A9WriteLMOpsList.Writes[0-15]>,
2095     // For unknown LDMs, define the maximum number of writes, but only
2096     // make the first two consume resources.
2097     SchedVar<A9LMUnknownPred, [A9WriteL1, A9WriteL1Hi,
2098                                A9WriteL2, A9WriteL2Hi,
2099                                A9WriteL3Hi, A9WriteL3Hi,
2100                                A9WriteL4Hi, A9WriteL4Hi,
2101                                A9WriteL5Hi, A9WriteL5Hi,
2102                                A9WriteL6Hi, A9WriteL6Hi,
2103                                A9WriteL7Hi, A9WriteL7Hi,
2104                                A9WriteL8Hi, A9WriteL8Hi]>]> {
2105     let Variadic = 1;
2106   }

2339   def : ItinRW<[A9WriteLM, A9WriteLMAdr, A9WriteIssue], [IIC_iLoad_m, IIC_iPop]>;

2529   }

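We will use this example to follow the code below. It takes the scheduling behavior of an instruction such as LDM, which was originally described by an itinerary class such as IIC_iLoad_m, and splits it into 9 cases. Each case consists of the SchedReadWrite definitions that the given predicate selects from A9WriteLM, A9WriteLMAdr, and A9WriteIssue.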

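First, at line 844 of CodeGenSchedModels::inferSchedClasses, if the scheduling class has a non-null ItinClassDef (taken from the Itinerary of the instruction definition), we must check whether some processor remaps that itinerary class through an ItinRW definition.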

857      void CodeGenSchedModels::inferFromItinClass(Record *ItinClassDef,
858                                                  unsigned FromClassIdx) {
859        for (unsigned PIdx = 0, PEnd = ProcModels.size(); PIdx != PEnd; ++PIdx) {
860          const CodeGenProcModel &PM = ProcModels[PIdx];
861          // For all ItinRW entries.
862          bool HasMatch = false;
863          for (RecIter II = PM.ItinRWDefs.begin(), IE = PM.ItinRWDefs.end();
864               II != IE; ++II) {
865            RecVec Matched = (*II)->getValueAsListOfDefs("MatchedItinClasses");
866            if (!std::count(Matched.begin(), Matched.end(), ItinClassDef))
867              continue;
868            if (HasMatch)
869              PrintFatalError((*II)->getLoc(), "Duplicate itinerary class "
870                            + ItinClassDef->getName()
871                            + " in ItinResources for " + PM.ModelName);
872            HasMatch = true;
873            IdxVec Writes, Reads;
874            findRWs((*II)->getValueAsListOfDefs("OperandReadWrites"), Writes, Reads);
875            IdxVec ProcIndices(1, PIdx);
876            inferFromRW(Writes, Reads, FromClassIdx, ProcIndices);
877          }
878        }
879      }

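Each ItinRW definition is stored in the ItinRWDefs container of the CodeGenProcModel object for the processor it refers to. The loop at line 859 therefore walks all CodeGenProcModel instances (ProcModels), looks for ItinRW definitions that match ItinClassDef, and records the processor index (line 875). A processor must not have more than one ItinRW definition mapping the same InstrItinClass, since that would be ambiguous (line 868). Line 874 collects the SchedReadWrite list of the matching ItinRW, from which new scheduling classes are then inferred.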

For the example above, line 874 yields Writes = [A9WriteLM, A9WriteLMAdr, A9WriteIssue].

1325   void CodeGenSchedModels::inferFromRW(const IdxVec &OperWrites,
1326                                        const IdxVec &OperReads,
1327                                        unsigned FromClassIdx,
1328                                        const IdxVec &ProcIndices) {
1329     DEBUG(dbgs() << "INFER RW proc("; dumpIdxVec(ProcIndices); dbgs() << ") ");
1330
1331     // Create a seed transition with an empty PredTerm and the expanded sequences
1332     // of SchedWrites for the current SchedClass.
1333     std::vector<PredTransition> LastTransitions;
1334     LastTransitions.resize(1);
1335     LastTransitions.back().ProcIndices.append(ProcIndices.begin(),
1336                                               ProcIndices.end());
1337
1338     for (IdxIter I = OperWrites.begin(), E = OperWrites.end(); I != E; ++I) {
1339       IdxVec WriteSeq;
1340       expandRWSequence(*I, WriteSeq, /*IsRead=*/false);
1341       unsigned Idx = LastTransitions[0].WriteSequences.size();
1342       LastTransitions[0].WriteSequences.resize(Idx + 1);
1343       SmallVectorImpl<unsigned> &Seq = LastTransitions[0].WriteSequences[Idx];
1344       for (IdxIter WI = WriteSeq.begin(), WE = WriteSeq.end(); WI != WE; ++WI)
1345         Seq.push_back(*WI);
1346       DEBUG(dbgs() << "("; dumpIdxVec(Seq); dbgs() << ") ");
1347     }
1348     DEBUG(dbgs() << " Reads: ");
1349     for (IdxIter I = OperReads.begin(), E = OperReads.end(); I != E; ++I) {
1350       IdxVec ReadSeq;
1351       expandRWSequence(*I, ReadSeq, /*IsRead=*/true);
1352       unsigned Idx = LastTransitions[0].ReadSequences.size();
1353       LastTransitions[0].ReadSequences.resize(Idx + 1);
1354       SmallVectorImpl<unsigned> &Seq = LastTransitions[0].ReadSequences[Idx];
1355       for (IdxIter RI = ReadSeq.begin(), RE = ReadSeq.end(); RI != RE; ++RI)
1356         Seq.push_back(*RI);
1357       DEBUG(dbgs() << "("; dumpIdxVec(Seq); dbgs() << ") ");
1358     }
1359     DEBUG(dbgs() << '\n');
1360
1361     // Collect all PredTransitions for individual operands.
1362     // Iterate until no variant writes remain.
1363     while (hasVariant(LastTransitions, *this)) {
1364       PredTransitions Transitions(*this);
1365       for (std::vector<PredTransition>::const_iterator
1366              I = LastTransitions.begin(), E = LastTransitions.end();
1367            I != E; ++I) {
1368         Transitions.substituteVariants(*I);
1369       }
1370       DEBUG(Transitions.dump());
1371       LastTransitions.swap(Transitions.TransVec);
1372     }
1373     // If the first transition has no variants, nothing to do.
1374     if (LastTransitions[0].PredTerm.empty())
1375       return;
1376
1377     // WARNING: We are about to mutate the SchedClasses vector. Do not refer to
1378     // OperWrites, OperReads, or ProcIndices after calling inferFromTransitions.
1379     inferFromTransitions(LastTransitions, FromClassIdx, *this);
1380   }

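The code above first prepares a temporary container, LastTransitions, at line 1333; initially it holds a single entry. As the SchedVariant definitions in it are processed, it grows to one entry per non-conflicting combination of variants. Its element type, PredTransition, is defined as follows.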

927      struct PredTransition {
928        // A predicate term is a conjunction of PredChecks.
929        SmallVector<PredCheck, 4> PredTerm;
930        SmallVector<SmallVector<unsigned,4>, 16> WriteSequences;
931        SmallVector<SmallVector<unsigned,4>, 16> ReadSequences;
932        SmallVector<unsigned, 4> ProcIndices;
933      };

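A SchedVariant definition contains a list of SchedVar, and each SchedVar in turn contains a list of SchedReadWrite. To support expanding SchedVariant definitions, the WriteSequences and ReadSequences containers at lines 930 and 931 are therefore vectors of vectors. The PredTerm container holds the predicates of the SchedVar entries selected so far; its element type, PredCheck, is shown below.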

918      struct PredCheck {
919        bool IsRead;
920        unsigned RWIdx;
921        Record *Predicate;
922
922        PredCheck(bool r, unsigned w, Record *p): IsRead(r), RWIdx(w), Predicate(p) {}
923      };

RWIdx at line 920 records the index of the SchedReadWrite definition (a SchedVariant) that the predicate belongs to.

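At lines 1340 and 1351 of inferFromRW, expandRWSequence is called on each SchedWrite and SchedRead in the ItinRW's OperandReadWrites to expand any WriteSequence definitions. A definition that is not derived from WriteSequence is simply recorded, and the method returns at line 406. For a definition derived from WriteSequence, the Repeat field gives the number of times its Writes list is repeated, so the loop at line 410 expands the Writes list that many times, recursing into each member.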

401      void CodeGenSchedModels::expandRWSequence(unsigned RWIdx, IdxVec &RWSeq,
402                                                bool IsRead) const {
403        const CodeGenSchedRW &SchedRW = getSchedRW(RWIdx, IsRead);
404        if (!SchedRW.IsSequence) {
405          RWSeq.push_back(RWIdx);
406          return;
407        }
408        int Repeat =
409          SchedRW.TheDef ? SchedRW.TheDef->getValueAsInt("Repeat") : 1;
410        for (int i = 0; i < Repeat; ++i) {
411          for (IdxIter I = SchedRW.Sequence.begin(), E = SchedRW.Sequence.end();
412               I != E; ++I) {
413            expandRWSequence(*I, RWSeq, IsRead);
414          }
415        }
416      }
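
The recursive flattening can be exercised outside TableGen. What follows is a minimal sketch under stated assumptions, not the LLVM implementation: ToyRW and expandSequence are hypothetical names, a plain vector index replaces the getSchedRW lookup, and the Repeat value is stored directly instead of being read from a Record.

```cpp
#include <cassert>
#include <vector>

// Hypothetical simplified entry: either a leaf, or a sequence of other
// entries (by index) that is repeated `Repeat` times.
struct ToyRW {
  bool IsSequence;
  int Repeat;
  std::vector<unsigned> Sequence;
};

// Appends the flattened leaves of Table[RWIdx] to Out.
void expandSequence(const std::vector<ToyRW> &Table, unsigned RWIdx,
                    std::vector<unsigned> &Out) {
  const ToyRW &RW = Table[RWIdx];
  if (!RW.IsSequence) {
    Out.push_back(RWIdx); // a leaf stands for itself
    return;
  }
  for (int I = 0; I < RW.Repeat; ++I)
    for (unsigned Member : RW.Sequence)
      expandSequence(Table, Member, Out); // members may be sequences too
}
```

With entries 0 and 1 as leaves, entry 2 as the sequence [0, 1] repeated twice, and entry 3 as the sequence [2, 0], expanding entry 3 appends 0, 1, 0, 1, 0 to Out.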

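Back in inferFromRW, note lines 1343 and 1354: the expansion of each operand occupies exactly one entry in WriteSequences or ReadSequences of the current PredTransition object (here LastTransitions[0]). The loop at line 1363 then processes LastTransitions repeatedly until it no longer contains any SchedWriteVariant or SchedReadVariant.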

The hasVariant function used there is defined below. It walks LastTransitions and checks whether any entry in the expanded sequences is a SchedWriteVariant or SchedReadVariant.

1016   static bool hasVariant(ArrayRef<PredTransition> Transitions,
1017                          CodeGenSchedModels &SchedModels) {
1018     for (ArrayRef<PredTransition>::iterator
1019            PTI = Transitions.begin(), PTE = Transitions.end();
1020          PTI != PTE; ++PTI) {
1021       for (SmallVectorImpl<SmallVector<unsigned,4> >::const_iterator
1022              WSI = PTI->WriteSequences.begin(), WSE = PTI->WriteSequences.end();
1023            WSI != WSE; ++WSI) {
1024         for (SmallVectorImpl<unsigned>::const_iterator
1025                WI = WSI->begin(), WE = WSI->end(); WI != WE; ++WI) {
1026           if (hasAliasedVariants(SchedModels.getSchedWrite(*WI), SchedModels))
1027             return true;
1028         }
1029       }
1030       for (SmallVectorImpl<SmallVector<unsigned,4> >::const_iterator
1031              RSI = PTI->ReadSequences.begin(), RSE = PTI->ReadSequences.end();
1032            RSI != RSE; ++RSI) {
1033         for (SmallVectorImpl<unsigned>::const_iterator
1034                RI = RSI->begin(), RE = RSI->end(); RI != RE; ++RI) {
1035           if (hasAliasedVariants(SchedModels.getSchedRead(*RI), SchedModels))
1036             return true;
1037         }
1038       }
1039     }
1040     return false;
1041   }

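Despite the word "aliased" in its name, hasAliasedVariants begins at line 993 by testing whether the SchedReadWrite passed in is itself a SchedWriteVariant or SchedReadVariant. If not, it goes on to check the aliases: if an alias is a SchedWriteVariant or SchedReadVariant definition, or a sequence whose expansion contains one (lines 1001 to 1011), then that variant is what needs to be processed.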

991      static bool hasAliasedVariants(const CodeGenSchedRW &RW,
992                                     CodeGenSchedModels &SchedModels) {
993        if (RW.HasVariants)
994          return true;
995
996        for (RecIter I = RW.Aliases.begin(), E = RW.Aliases.end(); I != E; ++I) {
997          const CodeGenSchedRW &AliasRW =
998            SchedModels.getSchedRW((*I)->getValueAsDef("AliasRW"));
999          if (AliasRW.HasVariants)
1000         return true;
1001       if (AliasRW.IsSequence) {
1002         IdxVec ExpandedRWs;
1003         SchedModels.expandRWSequence(AliasRW.Index, ExpandedRWs, AliasRW.IsRead);
1004         for (IdxIter SI = ExpandedRWs.begin(), SE = ExpandedRWs.end();
1005              SI != SE; ++SI) {
1006           if (hasAliasedVariants(SchedModels.getSchedRW(*SI, AliasRW.IsRead),
1007                                  SchedModels)) {
1008             return true;
1009           }
1010         }
1011       }
1012     }
1013     return false;
1014   }

If there are SchedWriteVariant or SchedReadVariant definitions to process, line 1364 of inferFromRW first creates a PredTransitions instance, which encapsulates the methods needed for the processing.

937      class PredTransitions {
938        CodeGenSchedModels &SchedModels;
939
940      public:
941        std::vector<PredTransition> TransVec;
942
943        PredTransitions(CodeGenSchedModels &sm): SchedModels(sm) {}
944
945        void substituteVariantOperand(const SmallVectorImpl<unsigned> &RWSeq,
946                                      bool IsRead, unsigned StartIdx);
947
948        void substituteVariants(const PredTransition &Trans);
949
950      #ifndef NDEBUG
951        void dump() const;
952      #endif
953
954      private:
955        bool mutuallyExclusive(Record *PredDef, ArrayRef<PredCheck> Term);
956        void getIntersectingVariants(
957          const CodeGenSchedRW &SchedRW, unsigned TransIdx,
958          std::vector<TransVariant> &IntersectingVariants);
959        void pushVariant(const TransVariant &VInfo, bool IsRead);
960      };

The TransVec container at line 941 has the same type as LastTransitions.

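If definitions derived from SchedVariant are present, line 1368 calls the method below on each entry of LastTransitions. For the SchedVar list of each SchedVariant definition, it picks the applicable Selected members according to their Predicate.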

1247   void PredTransitions::substituteVariants(const PredTransition &Trans) {
1248     // Build up a set of partial results starting at the back of
1249     // PredTransitions. Remember the first new transition.
1250     unsigned StartIdx = TransVec.size();
1251     TransVec.resize(TransVec.size() + 1);
1252     TransVec.back().PredTerm = Trans.PredTerm;
1253     TransVec.back().ProcIndices = Trans.ProcIndices;
1254
1255     // Visit each original write sequence.
1256     for (SmallVectorImpl<SmallVector<unsigned,4> >::const_iterator
1257            WSI = Trans.WriteSequences.begin(), WSE = Trans.WriteSequences.end();
1258          WSI != WSE; ++WSI) {
1259       // Push a new (empty) write sequence onto all partial Transitions.
1260       for (std::vector<PredTransition>::iterator I =
1261              TransVec.begin() + StartIdx, E = TransVec.end(); I != E; ++I) {
1262         I->WriteSequences.resize(I->WriteSequences.size() + 1);
1263       }
1264       substituteVariantOperand(*WSI, /*IsRead=*/false, StartIdx);
1265     }
1266     // Visit each original read sequence.
1267     for (SmallVectorImpl<SmallVector<unsigned,4> >::const_iterator
1268            RSI = Trans.ReadSequences.begin(), RSE = Trans.ReadSequences.end();
1269          RSI != RSE; ++RSI) {
1270       // Push a new (empty) read sequence onto all partial Transitions.
1271       for (std::vector<PredTransition>::iterator I =
1272              TransVec.begin() + StartIdx, E = TransVec.end(); I != E; ++I) {
1273         I->ReadSequences.resize(I->ReadSequences.size() + 1);
1274       }
1275       substituteVariantOperand(*RSI, /*IsRead=*/true, StartIdx);
1276     }
1277   }

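The method first adds one entry to the TransVec container of PredTransitions for the transition being processed. The loops at lines 1256 and 1267 then handle the previously expanded sequences one by one. In our example, WSI refers in turn to the expansions of A9WriteLM, A9WriteLMAdr, and A9WriteIssue. The loops at lines 1260 and 1271 are needed because substituteVariantOperand, called at lines 1264 and 1275, enlarges TransVec to hold the non-conflicting expansions of a SchedVariant, and every one of those partial transitions needs a new (empty) sequence for the next operand.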

The actual expansion of a SchedVariant is thus done by substituteVariantOperand. Its RWSeq parameter is the expansion of one SchedReadWrite from the ItinRW's OperandReadWrites; in our example it is one of A9WriteLM, A9WriteLMAdr, or A9WriteIssue.

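An entry that is neither a SchedWriteVariant or SchedReadVariant itself nor aliased to one is appended directly to the last ReadSequences or WriteSequences entry of each partial transition being built; no special handling is needed (the loop at line 1216).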

1206   void PredTransitions::substituteVariantOperand(
1207     const SmallVectorImpl<unsigned> &RWSeq, bool IsRead, unsigned StartIdx) {
1208
1209     // Visit each original RW within the current sequence.
1210     for (SmallVectorImpl<unsigned>::const_iterator
1211            RWI = RWSeq.begin(), RWE = RWSeq.end(); RWI != RWE; ++RWI) {
1212       const CodeGenSchedRW &SchedRW = SchedModels.getSchedRW(*RWI, IsRead);
1213       // Push this RW on all partial PredTransitions or distribute variants.
1214       // New PredTransitions may be pushed within this loop which should not be
1215       // revisited (TransEnd must be loop invariant).
1216       for (unsigned TransIdx = StartIdx, TransEnd = TransVec.size();
1217            TransIdx != TransEnd; ++TransIdx) {
1218         // In the common case, push RW onto the current operand's sequence.
1219         if (!hasAliasedVariants(SchedRW, SchedModels)) {
1220           if (IsRead)
1221             TransVec[TransIdx].ReadSequences.back().push_back(*RWI);
1222           else
1223             TransVec[TransIdx].WriteSequences.back().push_back(*RWI);
1224           continue;
1225         }
1226         // Distribute this partial PredTransition across intersecting variants.
1227         // This will push a copies of TransVec[TransIdx] on the back of TransVec.
1228         std::vector<TransVariant> IntersectingVariants;
1229         getIntersectingVariants(SchedRW, TransIdx, IntersectingVariants);
1230         // Now expand each variant on top of its copy of the transition.
1231         for (std::vector<TransVariant>::const_iterator
1232                IVI = IntersectingVariants.begin(),
1233                IVE = IntersectingVariants.end();
1234              IVI != IVE; ++IVI) {
1235           pushVariant(*IVI, IsRead);
1236         }
1237       }
1238     }
1239   }

For a definition that is a SchedWriteVariant or SchedReadVariant, or is aliased to one, line 1228 first prepares a container of type std::vector<TransVariant>. It records which variant goes with which partial transition. TransVariant is defined as follows:

906      struct TransVariant {
907        Record *VarOrSeqDef;  // Variant or sequence.
908        unsigned RWIdx;       // Index of this variant or sequence's matched type.
909        unsigned ProcIdx;     // Processor model index or zero for any.
910        unsigned TransVecIdx; // Index into PredTransitions::TransVec.
911
912        TransVariant(Record *def, unsigned rwi, unsigned pi, unsigned ti):
913          VarOrSeqDef(def), RWIdx(rwi), ProcIdx(pi), TransVecIdx(ti) {}
914      };

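VarOrSeqDef at line 907 is either one entry of a SchedVariant's Variants list (a SchedVar) or a WriteSequence reached through a SchedAlias; TransVecIdx at line 910 is the index in TransVec of the transition that this SchedVar or WriteSequence is expanded into.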

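getIntersectingVariants is then called. Note that the call sits inside the loop starting at line 1216, so for a given SchedRW it is invoked once for every partial transition being processed.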

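Lines 1053 to 1090 record in Variants every entry of the SchedVariant's Variants list, and every entry reached through the AliasRW of a SchedAlias. The loop at line 1091 then walks the Variants container. VarProcIdx and AliasProcIdx are both indices of CodeGenProcModel instances. The SchedModel field may be left unset (the question mark in the definitions), which marks the definition as generic and leaves the index at 0. If it is set, only definitions whose processor index matches may be used.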

1046   void PredTransitions::getIntersectingVariants(
1047     const CodeGenSchedRW &SchedRW, unsigned TransIdx,
1048     std::vector<TransVariant> &IntersectingVariants) {
1049
1050     bool GenericRW = false;
1051
1052     std::vector<TransVariant> Variants;
1053     if (SchedRW.HasVariants) {
1054       unsigned VarProcIdx = 0;
1055       if (SchedRW.TheDef->getValueInit("SchedModel")->isComplete()) {
1056         Record *ModelDef = SchedRW.TheDef->getValueAsDef("SchedModel");
1057         VarProcIdx = SchedModels.getProcModel(ModelDef).Index;
1058       }
1059       // Push each variant. Assign TransVecIdx later.
1060       const RecVec VarDefs = SchedRW.TheDef->getValueAsListOfDefs("Variants");
1061       for (RecIter RI = VarDefs.begin(), RE = VarDefs.end(); RI != RE; ++RI)
1062         Variants.push_back(TransVariant(*RI, SchedRW.Index, VarProcIdx, 0));
1063       if (VarProcIdx == 0)
1064         GenericRW = true;
1065     }
1066     for (RecIter AI = SchedRW.Aliases.begin(), AE = SchedRW.Aliases.end();
1067          AI != AE; ++AI) {
1068       // If either the SchedAlias itself or the SchedReadWrite that it aliases
1069       // to is defined within a processor model, constrain all variants to
1070       // that processor.
1071       unsigned AliasProcIdx = 0;
1072       if ((*AI)->getValueInit("SchedModel")->isComplete()) {
1073         Record *ModelDef = (*AI)->getValueAsDef("SchedModel");
1074         AliasProcIdx = SchedModels.getProcModel(ModelDef).Index;
1075       }
1076       const CodeGenSchedRW &AliasRW =
1077         SchedModels.getSchedRW((*AI)->getValueAsDef("AliasRW"));
1078
1079       if (AliasRW.HasVariants) {
1080         const RecVec VarDefs = AliasRW.TheDef->getValueAsListOfDefs("Variants");
1081         for (RecIter RI = VarDefs.begin(), RE = VarDefs.end(); RI != RE; ++RI)
1082           Variants.push_back(TransVariant(*RI, AliasRW.Index, AliasProcIdx, 0));
1083       }
1084       if (AliasRW.IsSequence) {
1085         Variants.push_back(
1086           TransVariant(AliasRW.TheDef, SchedRW.Index, AliasProcIdx, 0));
1087       }
1088       if (AliasProcIdx == 0)
1089         GenericRW = true;
1090     }
1091     for (unsigned VIdx = 0, VEnd = Variants.size(); VIdx != VEnd; ++VIdx) {
1092       TransVariant &Variant = Variants[VIdx];
1093       // Don't expand variants if the processor models don't intersect.
1094       // A zero processor index means any processor.
1095       SmallVectorImpl<unsigned> &ProcIndices = TransVec[TransIdx].ProcIndices;
1096       if (ProcIndices[0] && Variants[VIdx].ProcIdx) {
1097         unsigned Cnt = std::count(ProcIndices.begin(), ProcIndices.end(),
1098                                   Variant.ProcIdx);
1099         if (!Cnt)
1100           continue;
1101         if (Cnt > 1) {
1102           const CodeGenProcModel &PM =
1103             *(SchedModels.procModelBegin() + Variant.ProcIdx);
1104           PrintFatalError(Variant.VarOrSeqDef->getLoc(),
1105                           "Multiple variants defined for processor " +
1106                           PM.ModelName +
1107                           " Ensure only one SchedAlias exists per RW.");
1108         }
1109       }
1110       if (Variant.VarOrSeqDef->isSubClassOf("SchedVar")) {
1111         Record *PredDef = Variant.VarOrSeqDef->getValueAsDef("Predicate");
1112         if (mutuallyExclusive(PredDef, TransVec[TransIdx].PredTerm))
1113           continue;
1114       }
1115       if (IntersectingVariants.empty()) {
1116         // The first variant builds on the existing transition.
1117         Variant.TransVecIdx = TransIdx;
1118         IntersectingVariants.push_back(Variant);
1119       }
1120       else {
1121         // Push another copy of the current transition for more variants.
1122         Variant.TransVecIdx = TransVec.size();
1123         IntersectingVariants.push_back(Variant);
1124         TransVec.push_back(TransVec[TransIdx]);
1125       }
1126     }
1127     if (GenericRW && IntersectingVariants.empty()) {
1128       PrintFatalError(SchedRW.TheDef->getLoc(), "No variant of this type has "
1129                       "a matching predicate on any processor");
1130     }
1131   }
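
The processor filter at lines 1096 to 1100 relies on the convention that index 0 means "any processor". A minimal sketch of that test follows; procsIntersect is a hypothetical helper that does not exist in LLVM, and it assumes, as the original code does by reading ProcIndices[0], that the index list is never empty.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns true if a variant bound to processor VariantProc (0 = any) may
// apply to a transition restricted to ProcIndices (first element 0 = any).
bool procsIntersect(const std::vector<unsigned> &ProcIndices,
                    unsigned VariantProc) {
  assert(!ProcIndices.empty() && "assumed non-empty, as in the original");
  // If either side is generic, the variant is never filtered out here.
  if (ProcIndices[0] == 0 || VariantProc == 0)
    return true;
  // Both sides are processor-specific: the variant's processor must be
  // one of the processors the transition is restricted to.
  return std::count(ProcIndices.begin(), ProcIndices.end(), VariantProc) > 0;
}
```

The sketch leaves out the Cnt > 1 check at line 1101, which reports an error when the same processor index appears more than once.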

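For a variant that is a SchedVar definition, mutuallyExclusive first checks whether its predicate is compatible with the predicates already recorded in the current PredTransition object.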

972      bool PredTransitions::mutuallyExclusive(Record *PredDef,
973                                              ArrayRef<PredCheck> Term) {
974
975        for (ArrayRef<PredCheck>::iterator I = Term.begin(), E = Term.end();
976             I != E; ++I) {
977          if (I->Predicate == PredDef)
978            return false;
979
980          const CodeGenSchedRW &SchedRW = SchedModels.getSchedRW(I->RWIdx, I->IsRead);
981          assert(SchedRW.HasVariants && "PredCheck must refer to a SchedVariant");
982          RecVec Variants = SchedRW.TheDef->getValueAsListOfDefs("Variants");
983          for (RecIter VI = Variants.begin(), VE = Variants.end(); VI != VE; ++VI) {
984            if ((*VI)->getValueAsDef("Predicate") == PredDef)
985              return true;
986          }
987        }
988        return false;
989      }

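The PredTerm container of a PredTransition object starts out empty. Once a SchedVar definition has been processed, its predicate is stored there together with the index of the SchedVariant containing that SchedVar (line 1148 of pushVariant).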

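The logic expressed by the code above is this: all predicates associated with a given SchedVariant are treated as mutually exclusive, even if the conditions they express are not. This is safe because the predicates that select a given SchedWrite are always checked in the order in which they are defined in the .td file, so a later condition implicitly negates every earlier one.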

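If the predicate is not mutually exclusive with the term, the variant is recorded in IntersectingVariants: the first such variant builds on the existing transition, and each further one gets a new copy of that transition appended to TransVec (lines 1115 to 1125).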
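
The mutual-exclusion rule is easy to misread, so here is a minimal stand-alone sketch of it. It is an illustration under assumptions, not the TableGen code: ToyCheck and the string-based predicate names are hypothetical, and a variant is reduced to the list of predicate names of its SchedVar entries.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record of one accepted predicate and the variant it belongs to.
struct ToyCheck {
  unsigned VariantIdx;
  std::string Pred;
};

// Variants[i] lists the predicate names of toy variant i in definition order.
bool mutuallyExclusive(const std::string &Pred,
                       const std::vector<ToyCheck> &Term,
                       const std::vector<std::vector<std::string>> &Variants) {
  for (const ToyCheck &C : Term) {
    if (C.Pred == Pred)
      return false; // the same predicate is already part of the term
    for (const std::string &Other : Variants[C.VariantIdx])
      if (Other == Pred)
        return true; // a sibling of an accepted predicate: treated as exclusive
  }
  return false; // unrelated predicate, or empty term
}
```

This is why the example ends up with 9 transitions rather than 81: A9WriteLM and A9WriteLMAdr use the same predicates, so once a transition carries A9LMAdr1Pred, every other predicate of A9WriteLMAdr is rejected for it.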

In our example, A9WriteLM is processed first, and Variants contains:

TransIdx = 0
[0]: SchedVar<A9LMAdr1Pred, A9WriteLMOpsList.Writes[0-1]>
[1]: SchedVar<A9LMAdr2Pred, A9WriteLMOpsList.Writes[0-3]>
[2]: SchedVar<A9LMAdr3Pred, A9WriteLMOpsList.Writes[0-5]>
[3]: SchedVar<A9LMAdr4Pred, A9WriteLMOpsList.Writes[0-7]>
[4]: SchedVar<A9LMAdr5Pred, A9WriteLMOpsList.Writes[0-9]>
[5]: SchedVar<A9LMAdr6Pred, A9WriteLMOpsList.Writes[0-11]>
[6]: SchedVar<A9LMAdr7Pred, A9WriteLMOpsList.Writes[0-13]>
[7]: SchedVar<A9LMAdr8Pred, A9WriteLMOpsList.Writes[0-15]>
[8]: SchedVar<A9LMUnknownPred, [A9WriteL1, A9WriteL1Hi, A9WriteL2, A9WriteL2Hi,
                             A9WriteL3Hi, A9WriteL3Hi, A9WriteL4Hi, A9WriteL4Hi,
                             A9WriteL5Hi, A9WriteL5Hi, A9WriteL6Hi, A9WriteL6Hi,
                             A9WriteL7Hi, A9WriteL7Hi, A9WriteL8Hi, A9WriteL8Hi]>

Each entry of IntersectingVariants holds the corresponding entry of Variants, and TransVec now has 9 entries as well.
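
How a single transition fans out into several can be sketched in isolation. The following is a minimal sketch, not the TableGen code: Operand, Transition, and distribute are hypothetical names, predicates are ignored (every alternative is treated as compatible), and each operand is assumed to have at least one alternative. It mirrors only the bookkeeping of lines 1115 to 1125: the first alternative extends the existing transition in place, and every further alternative is appended as a copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

using Operand = std::vector<std::string>;    // alternatives; size 1 = no variant
using Transition = std::vector<std::string>; // one chosen alternative per operand

std::vector<Transition> distribute(const std::vector<Operand> &Operands) {
  std::vector<Transition> Partial(1); // seed: a single empty transition
  for (const Operand &Alts : Operands) {
    // Only the transitions that exist now are extended; copies pushed
    // below must not be revisited for this operand.
    std::size_t End = Partial.size();
    for (std::size_t T = 0; T != End; ++T) {
      Transition Base = Partial[T]; // copy first: push_back may reallocate
      for (std::size_t A = 0; A != Alts.size(); ++A) {
        if (A == 0) {
          Partial[T].push_back(Alts[A]); // first alternative reuses the slot
        } else {
          Partial.push_back(Base); // the others build on a fresh copy
          Partial.back().push_back(Alts[A]);
        }
      }
    }
  }
  return Partial;
}
```

Without predicate filtering this yields the full cross product of all alternatives; in the real code, mutuallyExclusive prunes the combinations whose predicates conflict.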

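Back in substituteVariantOperand, IntersectingVariants now holds the TransVariant instances for this SchedVariant definition. The loop at line 1231 walks them and calls pushVariant, shown below, on each one to record the selected SchedReadWrite definitions.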

1135   void PredTransitions::
1136   pushVariant(const TransVariant &VInfo, bool IsRead) {
1137
1138     PredTransition &Trans = TransVec[VInfo.TransVecIdx];
1139
1140     // If this operand transition is reached through a processor-specific alias,
1141     // then the whole transition is specific to this processor.
1142     if (VInfo.ProcIdx != 0)
1143       Trans.ProcIndices.assign(1, VInfo.ProcIdx);
1144
1145     IdxVec SelectedRWs;
1146     if (VInfo.VarOrSeqDef->isSubClassOf("SchedVar")) {
1147       Record *PredDef = VInfo.VarOrSeqDef->getValueAsDef("Predicate");
1148       Trans.PredTerm.push_back(PredCheck(IsRead, VInfo.RWIdx, PredDef));
1149       RecVec SelectedDefs = VInfo.VarOrSeqDef->getValueAsListOfDefs("Selected");
1150       SchedModels.findRWs(SelectedDefs, SelectedRWs, IsRead);
1151     }
1152     else {
1153       assert(VInfo.VarOrSeqDef->isSubClassOf("WriteSequence") &&
1154              "variant must be a SchedVariant or aliased WriteSequence");
1155       SelectedRWs.push_back(SchedModels.getSchedRWIdx(VInfo.VarOrSeqDef, IsRead));
1156     }
1157
1158     const CodeGenSchedRW &SchedRW = SchedModels.getSchedRW(VInfo.RWIdx, IsRead);
1159
1160     SmallVectorImpl<SmallVector<unsigned,4> > &RWSequences = IsRead
1161       ? Trans.ReadSequences : Trans.WriteSequences;
1162     if (SchedRW.IsVariadic) {
1163       unsigned OperIdx = RWSequences.size()-1;
1164       // Make N-1 copies of this transition's last sequence.
1165       for (unsigned i = 1, e = SelectedRWs.size(); i != e; ++i) {
1166         // Create a temporary copy the vector could reallocate.
1167         RWSequences.reserve(RWSequences.size() + 1);
1168         RWSequences.push_back(RWSequences[OperIdx]);
1169       }
1170       // Push each of the N elements of the SelectedRWs onto a copy of the last
1171       // sequence (split the current operand into N operands).
1172       // Note that write sequences should be expanded within this loop--the entire
1173       // sequence belongs to a single operand.
1174       for (IdxIter RWI = SelectedRWs.begin(), RWE = SelectedRWs.end();
1175            RWI != RWE; ++RWI, ++OperIdx) {
1176         IdxVec ExpandedRWs;
1177         if (IsRead)
1178           ExpandedRWs.push_back(*RWI);
1179         else
1180           SchedModels.expandRWSequence(*RWI, ExpandedRWs, IsRead);
1181         RWSequences[OperIdx].insert(RWSequences[OperIdx].end(),
1182                                     ExpandedRWs.begin(), ExpandedRWs.end());
1183       }
1184       assert(OperIdx == RWSequences.size() && "missed a sequence");
1185     }
1186     else {
1187       // Push this transition's expanded sequence onto this transition's last
1188       // sequence (add to the current operand's sequence).
1189       SmallVectorImpl<unsigned> &Seq = RWSequences.back();
1190       IdxVec ExpandedRWs;
1191       for (IdxIter RWI = SelectedRWs.begin(), RWE = SelectedRWs.end();
1192            RWI != RWE; ++RWI) {
1193         if (IsRead)
1194           ExpandedRWs.push_back(*RWI);
1195         else
1196           SchedModels.expandRWSequence(*RWI, ExpandedRWs, IsRead);
1197       }
1198       Seq.insert(Seq.end(), ExpandedRWs.begin(), ExpandedRWs.end());
1199     }
1200   }

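A TransVariant instance comes either from the Variants of a SchedVariant (a list<SchedVar>) or from the AliasRW of a SchedAlias (a WriteSequence or a SchedVariant), which explains the assertion at line 1153. For a TransVariant created from a SchedVar, the predicate is recorded in the associated PredTransition instance at line 1148 (this is where TransVec[TransIdx].PredTerm, used at line 1112 of getIntersectingVariants, gets its contents).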

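The SelectedRWs container records the indices of the relevant SchedReadWrite definitions (for a SchedVar, its Selected list; for a WriteSequence, the sequence itself). Line 1158 uses VInfo.RWIdx to fetch the CodeGenSchedRW instance of the SchedVariant definition that contains this SchedVar (for an aliased WriteSequence, line 1086 stores the index of the aliased SchedReadWrite instead).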

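If this definition is not variadic (the condition at line 1162 fails), the loop at line 1191 walks the Selected list of the chosen SchedVar, or the AliasRW of the SchedAlias, expands any WriteSequence it meets, and records the resulting CodeGenSchedRW indices in ReadSequences or WriteSequences of TransVec[VInfo.TransVecIdx]. All of these indices go into the last entry of the container; this is the main difference from the variadic case.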

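If the definition is variadic, the last entry of ReadSequences or WriteSequences in TransVec[VInfo.TransVecIdx] is first copied N-1 times (N being the size of SelectedRWs), so that the last entry and its copies correspond one to one with the members of SelectedRWs. Each member of SelectedRWs (a SchedReadWrite) is then recorded in its own entry.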
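
The difference between the two branches can be seen on plain vectors. This is a minimal sketch under assumptions, not the LLVM code: Seq and pushSelected are hypothetical names, the selected entries are assumed to be already expanded (no nested WriteSequence), and both inputs are assumed non-empty.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Seq = std::vector<unsigned>;

// Appends the selected entries to the operand sequences of one transition.
void pushSelected(std::vector<Seq> &Sequences, const Seq &Selected,
                  bool IsVariadic) {
  assert(!Sequences.empty() && !Selected.empty());
  if (!IsVariadic) {
    // Non-variadic: everything lands in the last operand's sequence.
    Seq &Last = Sequences.back();
    Last.insert(Last.end(), Selected.begin(), Selected.end());
    return;
  }
  // Variadic: split the last operand into Selected.size() operands.
  std::size_t OperIdx = Sequences.size() - 1;
  for (std::size_t I = 1; I < Selected.size(); ++I) {
    Seq Copy = Sequences[OperIdx]; // copy first: push_back may reallocate
    Sequences.push_back(Copy);
  }
  // One selected entry per operand, starting at the original last operand.
  for (unsigned RW : Selected)
    Sequences[OperIdx++].push_back(RW);
}
```

Starting from the sequences [[7], []] and the selection [1, 2, 3], the non-variadic branch gives [[7], [1, 2, 3]], while the variadic branch gives [[7], [1], [2], [3]].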

For our example, TransVec then looks like this:

TransVec[0].WriteSequences[0]: A9WriteLMOpsList.Writes[0-1], Pred: A9LMAdr1Pred
TransVec[1].WriteSequences[0]: A9WriteLMOpsList.Writes[0-3], Pred: A9LMAdr2Pred
TransVec[2].WriteSequences[0]: A9WriteLMOpsList.Writes[0-5], Pred: A9LMAdr3Pred
TransVec[3].WriteSequences[0]: A9WriteLMOpsList.Writes[0-7], Pred: A9LMAdr4Pred
TransVec[4].WriteSequences[0]: A9WriteLMOpsList.Writes[0-9], Pred: A9LMAdr5Pred
TransVec[5].WriteSequences[0]: A9WriteLMOpsList.Writes[0-11], Pred: A9LMAdr6Pred
TransVec[6].WriteSequences[0]: A9WriteLMOpsList.Writes[0-13], Pred: A9LMAdr7Pred
TransVec[7].WriteSequences[0]: A9WriteLMOpsList.Writes[0-15], Pred: A9LMAdr8Pred

TransVec[8]: Pred: A9LMUnknownPred

WriteSequences[0]: A9WriteL1

WriteSequences[1]: A9WriteL1Hi

WriteSequences[2]: A9WriteL2

WriteSequences[3]: A9WriteL2Hi

WriteSequences[4]: A9WriteL3Hi

WriteSequences[5]: A9WriteL3Hi

WriteSequences[6]: A9WriteL4Hi

WriteSequences[7]: A9WriteL4Hi

WriteSequences[8]: A9WriteL5Hi

WriteSequences[9]: A9WriteL5Hi

WriteSequences[10]: A9WriteL6Hi

WriteSequences[11]: A9WriteL6Hi

WriteSequences[12]: A9WriteL7Hi

WriteSequences[13]: A9WriteL7Hi

WriteSequences[14]: A9WriteL8Hi

WriteSequences[15]: A9WriteL8Hi

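After returning from substituteVariantOperand, we re-enter the loop at line 1256 of substituteVariants to process the next SchedReadWrite in the ItinRW's OperandReadWrites. In the example above, that is A9WriteLMAdr. At line 1262, the WriteSequences container of every TransVec entry above grows by one element. After processing, we get the following (it is worth comparing this with the mutuallyExclusive code):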

TransVec[0] A9LMAdr1Pred x 2:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-1]

WriteSequences[1]: A9WriteAdr1

TransVec[1] A9LMAdr2Pred x 2:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-3]

WriteSequences[1]: A9WriteAdr2

TransVec[2] A9LMAdr3Pred x 2:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-5]

WriteSequences[1]: A9WriteAdr3

TransVec[3] A9LMAdr4Pred x 2:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-7]

WriteSequences[1]: A9WriteAdr4

TransVec[4] A9LMAdr5Pred x 2:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-9]

WriteSequences[1]: A9WriteAdr5

TransVec[5] A9LMAdr6Pred x 2:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-11]

WriteSequences[1]: A9WriteAdr6

TransVec[6] A9LMAdr7Pred x 2:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-13]

WriteSequences[1]: A9WriteAdr7

TransVec[7] A9LMAdr8Pred x 2:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-15]

WriteSequences[1]: A9WriteAdr8

TransVec[8] A9LMUnknownPred x 2:

WriteSequences[0]: A9WriteL1

WriteSequences[1]: A9WriteL1Hi

WriteSequences[2]: A9WriteL2

WriteSequences[3]: A9WriteL2Hi

WriteSequences[4]: A9WriteL3Hi

WriteSequences[5]: A9WriteL3Hi

WriteSequences[6]: A9WriteL4Hi

WriteSequences[7]: A9WriteL4Hi

WriteSequences[8]: A9WriteL5Hi

WriteSequences[9]: A9WriteL5Hi

WriteSequences[10]: A9WriteL6Hi

WriteSequences[11]: A9WriteL6Hi

WriteSequences[12]: A9WriteL7Hi

WriteSequences[13]: A9WriteL7Hi

WriteSequences[14]: A9WriteL8Hi

WriteSequences[15]: A9WriteL8Hi

WriteSequences[16]: A9WriteAdr2
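The step just shown, where every transition gains one new sequence and then receives the alternative that matches its predicate, can be sketched as follows. This is a simplified C++ illustration under stated assumptions: `Trans` and `applyVariantOperand` are hypothetical names, each transition is assumed to already carry exactly one predicate, and the variant is assumed to be non-variadic. The real getIntersectingVariants also handles transitions without a predicate and mutually exclusive predicates, which this sketch ignores.

```cpp
#include <map>
#include <string>
#include <vector>

struct Trans {
  std::string Pred;                                 // single guarding predicate
  std::vector<std::vector<std::string>> WriteSeqs;  // one sequence per operand
};

// Hypothetical helper: apply one non-variadic variant operand to all
// transitions. Alternatives maps a predicate name to the writes selected
// by the SchedVar guarded by that predicate.
inline void applyVariantOperand(
    std::vector<Trans> &TransVec,
    const std::map<std::string, std::vector<std::string>> &Alternatives) {
  for (Trans &T : TransVec) {
    T.WriteSeqs.emplace_back();        // new, empty sequence for this operand
    auto It = Alternatives.find(T.Pred);
    if (It != Alternatives.end())      // same predicate: take that alternative
      T.WriteSeqs.back().insert(T.WriteSeqs.back().end(),
                                It->second.begin(), It->second.end());
  }
}
```

In this simplified model a transition whose predicate has no matching alternative simply keeps the new sequence empty.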

Likewise, after the final A9WriteIssue has been processed, TransVec becomes:

TransVec[0] A9LMAdr1Pred x 3:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-1]

WriteSequences[1]: A9WriteAdr1

WriteSequences[2]: A9WriteIssue

TransVec[1] A9LMAdr2Pred x 3:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-3]

WriteSequences[1]: A9WriteAdr2

WriteSequences[2]: A9WriteIssue

TransVec[2] A9LMAdr3Pred x 3:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-5]

WriteSequences[1]: A9WriteAdr3

WriteSequences[2]: A9WriteIssue

TransVec[3] A9LMAdr4Pred x 3:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-7]

WriteSequences[1]: A9WriteAdr4

WriteSequences[2]: A9WriteIssue

TransVec[4] A9LMAdr5Pred x 3:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-9]

WriteSequences[1]: A9WriteAdr5

WriteSequences[2]: A9WriteIssue

TransVec[5] A9LMAdr6Pred x 3:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-11]

WriteSequences[1]: A9WriteAdr6

WriteSequences[2]: A9WriteIssue

TransVec[6] A9LMAdr7Pred x 3:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-13]

WriteSequences[1]: A9WriteAdr7

WriteSequences[2]: A9WriteIssue

TransVec[7] A9LMAdr8Pred x 3:

WriteSequences[0]: A9WriteLMOpsList.Writes[0-15]

WriteSequences[1]: A9WriteAdr8

WriteSequences[2]: A9WriteIssue

TransVec[8] A9LMUnknownPred x 3:

WriteSequences[0]: A9WriteL1

WriteSequences[1]: A9WriteL1Hi

WriteSequences[2]: A9WriteL2

WriteSequences[3]: A9WriteL2Hi

WriteSequences[4]: A9WriteL3Hi

WriteSequences[5]: A9WriteL3Hi

WriteSequences[6]: A9WriteL4Hi

WriteSequences[7]: A9WriteL4Hi

WriteSequences[8]: A9WriteL5Hi

WriteSequences[9]: A9WriteL5Hi

WriteSequences[10]: A9WriteL6Hi

WriteSequences[11]: A9WriteL6Hi

WriteSequences[12]: A9WriteL7Hi

WriteSequences[13]: A9WriteL7Hi

WriteSequences[14]: A9WriteL8Hi

WriteSequences[15]: A9WriteL8Hi

WriteSequences[16]: A9WriteAdr2

WriteSequences[17]: A9WriteIssue

Back in CodeGenSchedModels::inferFromRW: at line 1371, Transitions.TransVec and LastTransitions swap contents, and the loop at line 1363 is re-entered until LastTransitions no longer contains any SchedVariant.
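The loop-and-swap pattern described here is a fixed-point iteration. Below is a minimal generic C++ sketch of it; `expandUntilStable` and its two callbacks are hypothetical names introduced only for illustration and do not appear in LLVM.

```cpp
#include <vector>

// Generic fixed-point loop: keep rewriting the work list until no element
// needs further expansion. HasVariant and Substitute are caller-supplied.
template <typename T, typename HasVariantFn, typename SubstituteFn>
std::vector<T> expandUntilStable(std::vector<T> Last, HasVariantFn HasVariant,
                                 SubstituteFn Substitute) {
  while (HasVariant(Last)) {
    std::vector<T> Next;         // plays the role of Transitions.TransVec
    for (const T &Item : Last)
      Substitute(Item, Next);    // may append one or several results
    Last.swap(Next);             // the results become the next round's input
  }
  return Last;
}
```

The swap avoids copying the freshly built vector. The loop terminates only if each round of substitution eventually leaves nothing for HasVariant to find.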

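Once any SchedVariant has been processed, inferFromTransitions is called to build the corresponding new scheduling classes. In our example, the content of LastTransitions is now identical to the TransVec shown above.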

1280   static void inferFromTransitions(ArrayRef<PredTransition> LastTransitions,

1281                                    unsigned FromClassIdx,

1282                                    CodeGenSchedModels &SchedModels) {

1283     // For each PredTransition, create a new CodeGenSchedTransition, which usually

1284     // requires creating a new SchedClass.

1285     for (ArrayRef<PredTransition>::iterator

1286            I = LastTransitions.begin(), E = LastTransitions.end(); I != E; ++I) {

1287       IdxVec OperWritesVariant;

1288       for (SmallVectorImpl<SmallVector<unsigned,4> >::const_iterator

1289              WSI = I->WriteSequences.begin(), WSE = I->WriteSequences.end();

1290            WSI != WSE; ++WSI) {

1291         // Create a new write representing the expanded sequence.

1292         OperWritesVariant.push_back(

1293           SchedModels.findOrInsertRW(*WSI, /*IsRead=*/false));

1294       }

1295       IdxVec OperReadsVariant;

1296       for (SmallVectorImpl<SmallVector<unsigned,4> >::const_iterator

1297              RSI = I->ReadSequences.begin(), RSE = I->ReadSequences.end();

1298            RSI != RSE; ++RSI) {

1299         // Create a new read representing the expanded sequence.

1300         OperReadsVariant.push_back(

1301           SchedModels.findOrInsertRW(*RSI, /*IsRead=*/true));

1302       }

1303       IdxVec ProcIndices(I->ProcIndices.begin(), I->ProcIndices.end());

1304       CodeGenSchedTransition SCTrans;

1305       SCTrans.ToClassIdx =

1306         SchedModels.addSchedClass(/*ItinClassDef=*/nullptr, OperWritesVariant,

1307                                  OperReadsVariant, ProcIndices);

1308       SCTrans.ProcIndices = ProcIndices;

1309       // The final PredTerm is unique set of predicates guarding the transition.

1310       RecVec Preds;

1311       for (SmallVectorImpl<PredCheck>::const_iterator

1312              PI = I->PredTerm.begin(), PE = I->PredTerm.end(); PI != PE; ++PI) {

1313         Preds.push_back(PI->Predicate);

1314       }

1315       RecIter PredsEnd = std::unique(Preds.begin(), Preds.end());

1316       Preds.resize(PredsEnd - Preds.begin());

1317       SCTrans.PredTerm = Preds;

1318       SchedModels.getSchedClass(FromClassIdx).Transitions.push_back(SCTrans);

1319     }

1320   }

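In other words, each TransVec entry above produces one scheduling class. In addition, a CodeGenSchedTransition object is created for each newly generated scheduling class. It is defined as: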

    struct CodeGenSchedTransition {
      unsigned ToClassIdx;
      IdxVec ProcIndices;
      RecVec PredTerm;
    };

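For the specified processors, the scheduling class denoted by FromClassIdx is changed by the current ItinRW definition into the scheduling class denoted by ToClassIdx. To record this, CodeGenSchedClass defines the Transitions container. Note that the CodeGenSchedClass object built at line 1306 has a null ItinClassDef, so the loop at line 840 of inferSchedClasses will not call inferFromItinClass for this instance (line 843 tests ItinClassDef). The call to inferFromRW at line 848 depends only on Writes being non-empty.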
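The bookkeeping described here can be sketched with plain standard containers. The following C++ is a simplified illustration, not the LLVM implementation: `Transition`, `SchedClass`, `findOrAddClass`, and `addTransition` are hypothetical names, predicates are stored as strings rather than Record pointers, reads are omitted, and the find-or-insert behaviour of addSchedClass is an assumption based on the "usually requires creating a new SchedClass" comment in the listing.

```cpp
#include <string>
#include <vector>

struct Transition {            // mirrors the three fields of CodeGenSchedTransition
  unsigned ToClassIdx;
  std::vector<unsigned> ProcIndices;
  std::vector<std::string> PredTerm;  // predicate names instead of Record*
};

struct SchedClass {
  std::vector<unsigned> Writes;
  std::vector<Transition> Transitions;
};

// Find a class with the same writes, or append a new one; return its index.
inline unsigned findOrAddClass(std::vector<SchedClass> &Classes,
                               const std::vector<unsigned> &Writes) {
  for (unsigned I = 0; I < Classes.size(); ++I)
    if (Classes[I].Writes == Writes)
      return I;
  Classes.push_back(SchedClass{Writes, {}});
  return static_cast<unsigned>(Classes.size() - 1);
}

// Record that, under Preds on Procs, class From resolves to the class
// identified by Writes.
inline void addTransition(std::vector<SchedClass> &Classes, unsigned From,
                          const std::vector<unsigned> &Writes,
                          const std::vector<unsigned> &Procs,
                          const std::vector<std::string> &Preds) {
  // Compute the target index first: findOrAddClass may reallocate Classes,
  // so no reference into Classes may be held across the call.
  unsigned To = findOrAddClass(Classes, Writes);
  Classes[From].Transitions.push_back(Transition{To, Procs, Preds});
}
```

Two transitions that resolve to the same writes share one target class in this sketch, so only the first call grows the class list.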

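Also note that line 1315 removes duplicate predicates. As our example shows, the same predicate may be repeated several times. Because std::unique only collapses adjacent duplicates, this relies on the repeated predicates sitting next to each other in PredTerm.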
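The behaviour of std::unique on the predicate list can be checked with a small stand-alone sketch. `uniquePreds` is a hypothetical helper that works on predicate names rather than Record pointers.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Collapse runs of equal adjacent predicates, in the spirit of lines
// 1315-1316. std::unique removes only consecutive duplicates, so
// {A, A, A} becomes {A} while {A, B, A} is left unchanged.
inline std::vector<std::string> uniquePreds(std::vector<std::string> Preds) {
  auto End = std::unique(Preds.begin(), Preds.end());
  Preds.erase(End, Preds.end());  // same effect as the resize at line 1316
  return Preds;
}
```

For a PredTerm such as A9LMAdr1Pred repeated three times, the result contains the predicate once.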


Reposted from blog.csdn.net/wuhui_gdnt/article/details/80854987