LLVM Study Notes (41)

3.6.2.2. Describing resources and their usage

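We already know that there are two ways to describe how an instruction executes. One is the itinerary: a series of InstrItinData definitions, each containing a list of InstrStage definitions; the InstrItinClass definitions that tie InstrItinData to instruction definitions; and a ProcessorItineraries definition that groups the related definitions together. The other describes resource usage directly, through a set of interrelated definitions derived from SchedReadWrite.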

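Both approaches describe the processor as a set of resources and state how each instruction uses them. It is now time to emit the corresponding data structures.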

1250   void SubtargetEmitter::EmitSchedModel(raw_ostream &OS) {

1251     OS << "#ifdef DBGFIELD\n"

1252        << "#error \"<target>GenSubtargetInfo.inc requires a DBGFIELD macro\"\n"

1253        << "#endif\n"

1254        << "#ifndef NDEBUG\n"

1255        << "#define DBGFIELD(x) x,\n"

1256        << "#else\n"

1257        << "#define DBGFIELD(x)\n"

1258        << "#endif\n";

1259  

1260     if (SchedModels.hasItineraries()) {

1261       std::vector<std::vector<InstrItinerary> > ProcItinLists;

1262       // Emit the stage data

1263       EmitStageAndOperandCycleData(OS, ProcItinLists);

1264       EmitItineraries(OS, ProcItinLists);

1265     }

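As in earlier sections, the ProcModels container in the SchedModels object holds the CodeGenProcModel objects of all processors in the same family. If any processor is described with itineraries, the condition on line 1260 holds and the stage data for those processors is emitted. Just as the TD files have an InstrStage definition, LLVM has a C++ type of the same name that serves a similar purpose.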

59        struct InstrStage {

60          enum ReservationKinds {

61            Required = 0,

62            Reserved = 1

63          };

64       

65          unsigned Cycles_;  ///< Length of stage in machine cycles

66          unsigned Units_;   ///< Choice of functional units

67          int NextCycles_;   ///< Number of machine cycles to next stage

68          ReservationKinds Kind_; ///< Kind of the FU reservation

69       

70          /// \brief Returns the number of cycles the stage is occupied.

71          unsigned getCycles() const {

72            return Cycles_;

73          }

74       

75          /// \brief Returns the choice of FUs.

76          unsigned getUnits() const {

77            return Units_;

78          }

79       

80          ReservationKinds getReservationKind() const {

81            return Kind_;

82          }

83       

84          /// \brief Returns the number of cycles from the start of this stage to the

85          /// start of the next stage in the itinerary

86          unsigned getNextCycles() const {

87            return (NextCycles_ >= 0) ? (unsigned)NextCycles_ : Cycles_;

88          }

89        };

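InstrStage represents one non-pipelined step in the execution of an instruction. Cycles_ is the number of cycles needed to complete the stage, and Units_ is the set of functional units (FUs) that may be chosen to carry it out, for example IntUnit1 or IntUnit2. NextCycles_ is the number of cycles that should elapse between the start of this stage and the start of the next one. The value -1 means the next stage starts immediately after the current one finishes. For example: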

{ 1, x, -1 }: the stage occupies FU x for one cycle, and the next stage starts immediately after it.

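{ 2, x|y, 1 }: the stage occupies FU x or FU y for two consecutive cycles, and the next stage starts one cycle after this one starts. In other words, the two stages overlap in time.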

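{ 1, x, 0 }: the stage occupies FU x for one cycle, and the next stage starts in the same cycle as this one. This can express an instruction that needs several stages at the same time.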
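
The NextCycles rule can be made concrete with a small sketch. StageSketch below is a hypothetical, reduced stand-in for InstrStage (it is not the LLVM type), and stageStartCycles is an illustrative helper that only shows how getNextCycles() decides when each stage begins.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for llvm::InstrStage, reduced to the fields used here.
struct StageSketch {
  unsigned Cycles;  // how long the stage occupies its unit
  unsigned Units;   // bitmask of candidate functional units
  int NextCycles;   // cycles from this stage's start to the next stage's start;
                    // -1 means "start the next stage when this one finishes"

  // Same rule as InstrStage::getNextCycles() in the listing above.
  unsigned getNextCycles() const {
    return NextCycles >= 0 ? static_cast<unsigned>(NextCycles) : Cycles;
  }
};

// Returns the cycle in which each stage starts, counting the first stage as
// cycle 0.
std::vector<unsigned> stageStartCycles(const std::vector<StageSketch> &Stages) {
  std::vector<unsigned> Starts;
  unsigned Cycle = 0;
  for (const StageSketch &S : Stages) {
    Starts.push_back(Cycle);
    Cycle += S.getNextCycles();
  }
  return Starts;
}
```

With the three example stages in order, followed by one more stage, the start cycles are 0, 1, 2 and 2: the second stage begins once the first finishes, the third begins one cycle into the second, and the fourth shares a start cycle with the third.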

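There are two kinds of FU reservation: FUs that the instruction actually requires (Required) and FUs that it only reserves (Reserved). A reserved unit is unavailable to other instructions for execution, but several instructions may reserve the same unit several times. The two kinds together are used to model instruction domain change stalls, FUs that share a resource (for example the same register file), and so on.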

97        struct InstrItinerary {

98          int      NumMicroOps;        ///< # of micro-ops, -1 means it's variable

99          unsigned FirstStage;         ///< Index of first stage in itinerary

100        unsigned LastStage;          ///< Index of last + 1 stage in itinerary

101        unsigned FirstOperandCycle;  ///< Index of first operand rd/wr

102        unsigned LastOperandCycle;   ///< Index of last + 1 operand rd/wr

103      };

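InstrItinerary holds the scheduling information for an instruction: the range of stages it occupies and the pipeline cycles in which its operands are read and written. It is the LLVM-side counterpart of an InstrItinData definition. One level up is InstrItineraryData, which wraps this data for a subtarget; its data members and constructors are shown below.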

109      class InstrItineraryData {

110      public:

111        MCSchedModel          SchedModel;     ///< Basic machine properties.

112        const InstrStage     *Stages;         ///< Array of stages selected

113        const unsigned       *OperandCycles;  ///< Array of operand cycles selected

114        const unsigned       *Forwardings;    ///< Array of pipeline forwarding pathes

115        const InstrItinerary *Itineraries;    ///< Array of itineraries selected

116     

117        /// Ctors.

118        InstrItineraryData() : SchedModel(MCSchedModel::GetDefaultSchedModel()),

119                               Stages(nullptr), OperandCycles(nullptr),

120                               Forwardings(nullptr), Itineraries(nullptr) {}

121     

122        InstrItineraryData(const MCSchedModel &SM, const InstrStage *S,

123                           const unsigned *OS, const unsigned *F)

124          : SchedModel(SM), Stages(S), OperandCycles(OS), Forwardings(F),

125            Itineraries(SchedModel.InstrItineraries) {}

3.6.2.2.1. Functional unit and bypass definitions

We already know that the ItinsDef member of a processor's CodeGenProcModel object is the Record of the ProcessorItineraries definition actually used by its Processor-derived definition (Processor→ProcItin or Processor→SchedModel→Itineraries).

359      void SubtargetEmitter::

360      EmitStageAndOperandCycleData(raw_ostream &OS,

361                                   std::vector<std::vector<InstrItinerary> >

362                                     &ProcItinLists) {

363     

364        // Multiple processor models may share an itinerary record. Emit it once.

365        SmallPtrSet<Record*, 8> ItinsDefSet;

366     

367        // Emit functional units for all the itineraries.

368        for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

369               PE = SchedModels.procModelEnd(); PI != PE; ++PI) {

370     

371          if (!ItinsDefSet.insert(PI->ItinsDef).second)

372            continue;

373     

374          std::vector<Record*> FUs = PI->ItinsDef->getValueAsListOfDefs("FU");

375          if (FUs.empty())

376            continue;

377     

378          const std::string &Name = PI->ItinsDef->getName();

379          OS << "\n// Functional units for \"" << Name << "\"\n"

380             << "namespace " << Name << "FU {\n";

381     

382          for (unsigned j = 0, FUN = FUs.size(); j < FUN; ++j)

383            OS << "  const unsigned " << FUs[j]->getName()

384               << " = 1 << " << j << ";\n";

385     

386          OS << "}\n";

387     

388          std::vector<Record*> BPs = PI->ItinsDef->getValueAsListOfDefs("BP");

389          if (!BPs.empty()) {

390            OS << "\n// Pipeline forwarding pathes for itineraries \"" << Name

391               << "\"\n" << "namespace " << Name << "Bypass {\n";

392     

393            OS << "  const unsigned NoBypass = 0;\n";

394            for (unsigned j = 0, BPN = BPs.size(); j < BPN; ++j)

395              OS << "  const unsigned " << BPs[j]->getName()

396                 << " = 1 << " << j << ";\n";

397     

398            OS << "}\n";

399          }

400        }

In the X86 family only Atom uses the itinerary mechanism. Atom's ProcessorItineraries definition, AtomItineraries, defines no BP (bypass) and only two Port resources, so we get the following output:

#ifdef DBGFIELD

#error "<target>GenSubtargetInfo.inc requires a DBGFIELD macro"

#endif

#ifndef NDEBUG

#define DBGFIELD(x) x,

#else

#define DBGFIELD(x)

#endif

 

// Functional units for "AtomItineraries"

namespace AtomItinerariesFU {

  const unsigned Port0 = 1 << 0;

  const unsigned Port1 = 1 << 1;

}
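
The unit numbering is easy to reproduce outside TableGen. The following is a minimal sketch, not the emitter itself: emitFunctionalUnits is a hypothetical helper that takes plain strings instead of Record objects and mirrors the loop on lines 382 to 384, giving each unit one bit in list order.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Writes one named bit per functional unit, in list order. The names are
// plain strings here; the real emitter reads them from TableGen records.
std::string emitFunctionalUnits(const std::string &ItinName,
                                const std::vector<std::string> &Units) {
  std::ostringstream OS;
  OS << "namespace " << ItinName << "FU {\n";
  for (std::size_t J = 0; J < Units.size(); ++J)
    OS << "  const unsigned " << Units[J] << " = 1 << " << J << ";\n";
  OS << "}\n";
  return OS.str();
}
```

Calling emitFunctionalUnits("AtomItineraries", {"Port0", "Port1"}) yields the namespace shown above. Because every unit is a single bit of an unsigned value, a stage can name a choice of units by OR-ing the constants together, and the number of units per itinerary is bounded by the width of unsigned (typically 32 bits).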

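Next, three tables are emitted. The first is the stage array, of type InstrStage; the second is the array of operand cycles; the third is the array of bypasses. The emitter builds each of them as a string (StageTable, OperandCycleTable, BypassTable) before writing it out, and the last two are emitted as arrays of unsigned. The first entry of every array is reserved for the NoItineraries definition.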

SubtargetEmitter::EmitStageAndOperandCycleData (continued)

402        // Begin stages table

403        std::string StageTable = "\nextern const llvm::InstrStage " + Target +

404                                 "Stages[] = {\n";

405        StageTable += "  { 0, 0, 0, llvm::InstrStage::Required }, // No itinerary\n";

406     

407        // Begin operand cycle table

408        std::string OperandCycleTable = "extern const unsigned " + Target +

409          "OperandCycles[] = {\n";

410        OperandCycleTable += "  0, // No itinerary\n";

411     

412        // Begin pipeline bypass table

413        std::string BypassTable = "extern const unsigned " + Target +

414          "ForwardingPaths[] = {\n";

415        BypassTable += " 0, // No itinerary\n";

416     

417        // For each Itinerary across all processors, add a unique entry to the stages,

418        // operand cycles, and pipepine bypess tables. Then add the new Itinerary

419        // object with computed offsets to the ProcItinLists result.

420        unsigned StageCount = 1, OperandCycleCount = 1;

421        std::map<std::string, unsigned> ItinStageMap, ItinOperandMap;

422        for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

423               PE = SchedModels.procModelEnd(); PI != PE; ++PI) {

424          const CodeGenProcModel &ProcModel = *PI;

425     

426          // Add process itinerary to the list.

427          ProcItinLists.resize(ProcItinLists.size()+1);

428     

429          // If this processor defines no itineraries, then leave the itinerary list

430          // empty.

431          std::vector<InstrItinerary> &ItinList = ProcItinLists.back();

432          if (!ProcModel.hasItineraries())

433            continue;

434     

435          const std::string &Name = ProcModel.ItinsDef->getName();

436     

437          ItinList.resize(SchedModels.numInstrSchedClasses());

438          assert(ProcModel.ItinDefList.size() == ItinList.size() && "bad Itins");

439     

440          for (unsigned SchedClassIdx = 0, SchedClassEnd = ItinList.size();

441               SchedClassIdx < SchedClassEnd; ++SchedClassIdx) {

442     

443            // Next itinerary data

444            Record *ItinData = ProcModel.ItinDefList[SchedClassIdx];

445     

446            // Get string and stage count

447            std::string ItinStageString;

448            unsigned NStages = 0;

449            if (ItinData)

450              FormItineraryStageString(Name, ItinData, ItinStageString, NStages);

451     

452            // Get string and operand cycle count

453            std::string ItinOperandCycleString;

454            unsigned NOperandCycles = 0;

455            std::string ItinBypassString;

456            if (ItinData) {

457              FormItineraryOperandCycleString(ItinData, ItinOperandCycleString,

458                                              NOperandCycles);

459     

460              FormItineraryBypassString(Name, ItinData, ItinBypassString,

461                                        NOperandCycles);

462            }

463     

464            // Check to see if stage already exists and create if it doesn't

465            unsigned FindStage = 0;

466            if (NStages > 0) {

467              FindStage = ItinStageMap[ItinStageString];

468              if (FindStage == 0) {

469                // Emit as { cycles, u1 | u2 | ... | un, timeinc }, // indices

470                StageTable += ItinStageString + ", // " + itostr(StageCount);

471                if (NStages > 1)

472                  StageTable += "-" + itostr(StageCount + NStages - 1);

473                StageTable += "\n";

474                // Record Itin class number.

475                ItinStageMap[ItinStageString] = FindStage = StageCount;

476                StageCount += NStages;

477              }

478            }

479     

480            // Check to see if operand cycle already exists and create if it doesn't

481            unsigned FindOperandCycle = 0;

482            if (NOperandCycles > 0) {

483              std::string ItinOperandString = ItinOperandCycleString+ItinBypassString;

484              FindOperandCycle = ItinOperandMap[ItinOperandString];

485              if (FindOperandCycle == 0) {

486                // Emit as  cycle, // index

487                OperandCycleTable += ItinOperandCycleString + ", // ";

488                std::string OperandIdxComment = itostr(OperandCycleCount);

489                if (NOperandCycles > 1)

490                  OperandIdxComment += "-"

491                    + itostr(OperandCycleCount + NOperandCycles - 1);

492                OperandCycleTable += OperandIdxComment + "\n";

493                // Record Itin class number.

494                ItinOperandMap[ItinOperandCycleString] =

495                  FindOperandCycle = OperandCycleCount;

496                // Emit as bypass, // index

497                BypassTable += ItinBypassString + ", // " + OperandIdxComment + "\n";

498                OperandCycleCount += NOperandCycles;

499              }

500            }

501     

502            // Set up itinerary as location and location + stage count

503            int NumUOps = ItinData ? ItinData->getValueAsInt("NumMicroOps") : 0;

504            InstrItinerary Intinerary = { NumUOps, FindStage, FindStage + NStages,

505                                          FindOperandCycle,

506                                          FindOperandCycle + NOperandCycles};

507     

508            // Inject - empty slots will be 0, 0

509            ItinList[SchedClassIdx] = Intinerary;

510          }

511        }

512     

513        // Closing stage

514        StageTable += "  { 0, 0, 0, llvm::InstrStage::Required } // End stages\n";

515        StageTable += "};\n";

516     

517        // Closing operand cycles

518        OperandCycleTable += "  0 // End operand cycles\n";

519        OperandCycleTable += "};\n";

520     

521        BypassTable += " 0 // End bypass tables\n";

522        BypassTable += "};\n";

523     

524        // Emit tables.

525        OS << StageTable;

526        OS << OperandCycleTable;

527        OS << BypassTable;

528      }

3.6.2.2.2. Itinerary data

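For each processor that uses itineraries to assist instruction scheduling, the ItinDefList container of its CodeGenProcModel instance holds entries from the IID list (of type list<InstrItinData>) of the relevant ProcessorItineraries definition; the container associates each scheduling class with the InstrItinData definition that references the same InstrItinClass definition. The assertion on line 438 above must hold because line 784 of collectProcItins resizes ProcModel.ItinDefList to NumInstrSchedClasses.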

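For a given CodeGenProcModel object, the loop at line 440 in effect visits every non-inferred CodeGenSchedClass object, so line 444 obtains the Record of the InstrItinData definition that matches the given scheduling class, which is then passed as the second argument to FormItineraryStageString on line 450.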

274      void SubtargetEmitter::FormItineraryStageString(const std::string &Name,

275                                                      Record *ItinData,

276                                                      std::string &ItinString,

277                                                      unsigned &NStages) {

278        // Get states list

279        const std::vector<Record*> &StageList =

280          ItinData->getValueAsListOfDefs("Stages");

281     

282        // For each stage

283        unsigned N = NStages = StageList.size();

284        for (unsigned i = 0; i < N;) {

285          // Next stage

286          const Record *Stage = StageList[i];

287     

288          // Form string as ,{ cycles, u1 | u2 | ... | un, timeinc, kind }

289          int Cycles = Stage->getValueAsInt("Cycles");

290          ItinString += "  { " + itostr(Cycles) + ", ";

291     

292          // Get unit list

293          const std::vector<Record*> &UnitList = Stage->getValueAsListOfDefs("Units");

294     

295          // For each unit

296          for (unsigned j = 0, M = UnitList.size(); j < M;) {

297            // Add name and bitwise or

298            ItinString += Name + "FU::" + UnitList[j]->getName();

299            if (++j < M) ItinString += " | ";

300          }

301     

302          int TimeInc = Stage->getValueAsInt("TimeInc");

303          ItinString += ", " + itostr(TimeInc);

304     

305          int Kind = Stage->getValueAsInt("Kind");

306          ItinString += ", (llvm::InstrStage::ReservationKinds)" + itostr(Kind);

307     

308          // Close off stage

309          ItinString += " }";

310          if (++i < N) ItinString += ", ";

311        }

312      }

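For the format of the emitted strings, see the examples given above for the InstrStage class. An InstrItinData definition also has an OperandCycles field, which gives, for each operand, the number of cycles after issue at which the read or write of that operand's value completes.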

319      void SubtargetEmitter::FormItineraryOperandCycleString(Record *ItinData,

320                               std::string &ItinString, unsigned &NOperandCycles) {

321        // Get operand cycle list

322        const std::vector<int64_t> &OperandCycleList =

323          ItinData->getValueAsListOfInts("OperandCycles");

324     

325        // For each operand cycle

326        unsigned N = NOperandCycles = OperandCycleList.size();

327        for (unsigned i = 0; i < N;) {

328          // Next operand cycle

329          const int OCycle = OperandCycleList[i];

330     

331          ItinString += "  " + itostr(OCycle);

332          if (++i < N) ItinString += ", ";

333        }

334      }

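Finally, an array describing bypasses is emitted. The InstrItinData definitions in the .td files are thus split across three arrays, because stages, operand cycles, and bypasses are three fairly independent dimensions of an InstrItinData definition. Each dimension may also contain many repeated entries, so building three arrays and locating each InstrItinData definition by array index yields a more compact data structure.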

336      void SubtargetEmitter::FormItineraryBypassString(const std::string &Name,

337                                                       Record *ItinData,

338                                                       std::string &ItinString,

339                                                       unsigned NOperandCycles) {

340        const std::vector<Record*> &BypassList =

341          ItinData->getValueAsListOfDefs("Bypasses");

342        unsigned N = BypassList.size();

343        unsigned i = 0;

344        for (; i < N;) {

345          ItinString += Name + "Bypass::" + BypassList[i]->getName();

346          if (++i < NOperandCycles) ItinString += ", ";

347        }

348        for (; i < NOperandCycles;) {

349          ItinString += " 0";

350          if (++i < NOperandCycles) ItinString += ", ";

351        }

352      }

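Note that in FormItineraryOperandCycleString the parameter NOperandCycles is a reference; line 326 sets it to the size of OperandCycles in the InstrItinData definition. It is then passed to FormItineraryBypassString, where it controls the number of bypass entries.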
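
The padding behaviour of FormItineraryBypassString can be sketched as follows. padBypasses is a hypothetical helper that returns the entries as a vector rather than one formatted string, so it illustrates the sizing rule only and does not reproduce the emitter's exact text.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns one table entry per operand cycle: the named bypasses first, then
// "0" (the value of NoBypass) as padding up to NOperandCycles entries.
std::vector<std::string> padBypasses(const std::string &ItinName,
                                     const std::vector<std::string> &Bypasses,
                                     unsigned NOperandCycles) {
  std::vector<std::string> Entries;
  for (const std::string &BP : Bypasses)
    Entries.push_back(ItinName + "Bypass::" + BP);
  while (Entries.size() < NOperandCycles)
    Entries.push_back("0");
  return Entries;
}
```

Padding keeps the bypass table aligned with the operand cycle table, which is why a single pair of indices (FirstOperandCycle, LastOperandCycle) in InstrItinerary can address both.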

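On line 466 of EmitStageAndOperandCycleData, NStages is the number of entries in the Stages list of the InstrItinData definition, as set by FormItineraryStageString. The container ItinStageMap (std::map<std::string, unsigned>) keeps the generated InstrStage entries unique: lines 468 to 477 ensure that each distinct stage string is emitted only once. The container ItinOperandMap is meant to play a similar role for the operand cycle entries.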
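
The de-duplication on lines 465 to 478 amounts to interning strings into a flat table. Here is a minimal sketch of that idea; TableInterner and its members are hypothetical names, and the class is a simplification rather than the emitter's code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: assigns each distinct initializer string a starting
// row in a flat table, appending the text only the first time it is seen.
class TableInterner {
  std::map<std::string, unsigned> FirstIndex; // text -> row of its first entry
  unsigned NextFree = 1;                      // row 0 is the "no itinerary" sentinel
  std::string Table;                          // accumulated table body

public:
  // NumRows is the number of table rows that Text expands to.
  unsigned intern(const std::string &Text, unsigned NumRows) {
    if (NumRows == 0)
      return 0; // nothing to store; refer to the sentinel row
    unsigned &Slot = FirstIndex[Text]; // value-initialized to 0 on first sight
    if (Slot == 0) {
      Table += Text + ",\n";
      Slot = NextFree;
      NextFree += NumRows;
    }
    return Slot;
  }

  unsigned rowCount() const { return NextFree; }
  const std::string &body() const { return Table; }
};
```

Using 0 as the "not found" value works only because row 0 is reserved for the sentinel, so no real entry can ever start there. That is the same reason StageCount and OperandCycleCount start at 1 on line 420.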

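Line 504 creates an InstrItinerary instance, which is stored at the corresponding position in ProcItinLists. Lines 514 to 527 then close the three tables and write them out. For the X86 target, the result is: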

extern const llvm::InstrStage X86Stages[] = {

  { 0, 0, 0, llvm::InstrStage::Required }, // No itinerary

  { 13, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 1

  { 7, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 2

  { 21, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 3

  { 1, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 4

   …

  { 202, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 92

  { 0, 0, 0, llvm::InstrStage::Required } // End stages

};

extern const unsigned X86OperandCycles[] = {

  0, // No itinerary

  0 // End operand cycles

};

extern const unsigned X86ForwardingPaths[] = {

 0, // No itinerary

 0 // End bypass tables

};

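These three arrays are tied together by the InstrItinerary arrays generated next. The ProcItinLists parameter of EmitItineraries was prepared earlier by EmitStageAndOperandCycleData. Note that line 546 traverses the ProcModels container of SchedModels in the same order in which EmitStageAndOperandCycleData traversed it when preparing the InstrItinerary data, and that ProcItinLists always has the same size as ProcModels (line 427 of EmitStageAndOperandCycleData). In addition, line 432 shows that the ProcItinLists entry of a processor that does not use itineraries is left empty, while line 509 shows that for a processor that does, a new InstrItinerary instance is always stored in that processor's entry, whether or not an identical instance already exists. The processors visited below therefore always correspond one-to-one with the entries of ProcItinLists (the condition on line 562 filters out processors that do not use itineraries).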

536      void SubtargetEmitter::

537      EmitItineraries(raw_ostream &OS,

538                      std::vector<std::vector<InstrItinerary> > &ProcItinLists) {

539     

540        // Multiple processor models may share an itinerary record. Emit it once.

541        SmallPtrSet<Record*, 8> ItinsDefSet;

542     

543        // For each processor's machine model

544        std::vector<std::vector<InstrItinerary> >::iterator

545            ProcItinListsIter = ProcItinLists.begin();

546        for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

547               PE = SchedModels.procModelEnd(); PI != PE; ++PI, ++ProcItinListsIter) {

548     

549          Record *ItinsDef = PI->ItinsDef;

550          if (!ItinsDefSet.insert(ItinsDef).second)

551            continue;

552     

553          // Get processor itinerary name

554          const std::string &Name = ItinsDef->getName();

555     

556          // Get the itinerary list for the processor.

557          assert(ProcItinListsIter != ProcItinLists.end() && "bad iterator");

558          std::vector<InstrItinerary> &ItinList = *ProcItinListsIter;

559     

560          // Empty itineraries aren't referenced anywhere in the tablegen output

561          // so don't emit them.

562          if (ItinList.empty())

563            continue;

564     

565          OS << "\n";

566          OS << "static const llvm::InstrItinerary ";

567     

568          // Begin processor itinerary table

569          OS << Name << "[] = {\n";

570     

571          // For each itinerary class in CodeGenSchedClass::Index order.

572          for (unsigned j = 0, M = ItinList.size(); j < M; ++j) {

573            InstrItinerary &Intinerary = ItinList[j];

574     

575            // Emit Itinerary in the form of

576            // { firstStage, lastStage, firstCycle, lastCycle } // index

577            OS << "  { " <<

578              Intinerary.NumMicroOps << ", " <<

579              Intinerary.FirstStage << ", " <<

580              Intinerary.LastStage << ", " <<

581              Intinerary.FirstOperandCycle << ", " <<

582              Intinerary.LastOperandCycle << " }" <<

583              ", // " << j << " " << SchedModels.getSchedClass(j).Name << "\n";

584          }

585          // End processor itinerary table

586          OS << "  { 0, ~0U, ~0U, ~0U, ~0U } // end marker\n";

587          OS << "};\n";

588        }

589      }
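
The lockstep traversal in EmitItineraries can be illustrated with a reduced sketch. ProcSketch and tablesToEmit are hypothetical names, and the per-processor itinerary lists are plain vectors of int; the point is only that both cursors advance in the loop header, so an early continue never lets them drift apart.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for CodeGenProcModel: only the itinerary name.
struct ProcSketch {
  std::string ItinsName;
};

// Returns the itinerary names for which a table would be emitted.
std::vector<std::string>
tablesToEmit(const std::vector<ProcSketch> &Procs,
             const std::vector<std::vector<int>> &ProcItinLists) {
  assert(Procs.size() == ProcItinLists.size() && "lists must be parallel");
  std::set<std::string> Seen;
  std::vector<std::string> Emitted;
  auto ListIt = ProcItinLists.begin();
  // Both cursors advance in the loop header, so `continue` keeps them aligned.
  for (auto PI = Procs.begin(); PI != Procs.end(); ++PI, ++ListIt) {
    if (!Seen.insert(PI->ItinsName).second)
      continue; // a shared itinerary record was already handled
    if (ListIt->empty())
      continue; // this processor has no itineraries
    Emitted.push_back(PI->ItinsName);
  }
  return Emitted;
}
```

For four processors where two share "AtomItineraries" and two have empty lists, exactly one table name is returned.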

For the X86 target only the Atom processor uses itineraries, so the following array is emitted (950 entries):

static const llvm::InstrItinerary AtomItineraries[] = {

  { 0, 0, 0, 0, 0 }, // 0 NoInstrModel

  { 1, 1, 2, 0, 0 }, // 1 IIC_AAA_WriteMicrocoded

  { 1, 2, 3, 0, 0 }, // 2 IIC_AAD_WriteMicrocoded

  { 1, 3, 4, 0, 0 }, // 3 IIC_AAM_WriteMicrocoded

  { 1, 1, 2, 0, 0 }, // 4 IIC_AAS_WriteMicrocoded

  { 1, 4, 5, 0, 0 }, // 5 IIC_BIN_CARRY_NONMEM_WriteALU

  …

  { 1, 43, 44, 0, 0 }, // 948 LDMXCSR_VLDMXCSR

  { 1, 17, 18, 0, 0 }, // 949 STMXCSR_VSTMXCSR

  { 0, ~0U, ~0U, ~0U, ~0U } // end marker

};

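The comments give the so-called scheduling classes. Note that the entries are emitted in exactly the same order as the enumerators for the scheduling classes in the Sched namespace of X86GenInstrInfo.inc. This consistency is what lets us use those enumerators as indices to retrieve the parameters of the corresponding scheduling class.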
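
To see how the tables fit together, the sketch below walks from a scheduling class index to its stages. StageRow, ItinRow and occupiedCycles are hypothetical, reduced types and helpers; the table values are copied from the first few rows of the X86Stages and AtomItineraries listings above, and the operand cycle fields are left out.

```cpp
#include <cassert>

// Hypothetical reduced copies of the emitted tables.
struct StageRow { unsigned Cycles; unsigned Units; int NextCycles; };
struct ItinRow { int NumMicroOps; unsigned FirstStage; unsigned LastStage; };

const unsigned Port0 = 1 << 0, Port1 = 1 << 1;

const StageRow Stages[] = {
  {0, 0, 0},                // 0: no itinerary
  {13, Port0 | Port1, -1},  // 1
  {7, Port0 | Port1, -1},   // 2
  {21, Port0 | Port1, -1},  // 3
  {1, Port0 | Port1, -1},   // 4
};

const ItinRow Itineraries[] = {
  {0, 0, 0},  // 0 NoInstrModel
  {1, 1, 2},  // 1 IIC_AAA_WriteMicrocoded
  {1, 2, 3},  // 2 IIC_AAD_WriteMicrocoded
  {1, 3, 4},  // 3 IIC_AAM_WriteMicrocoded
  {1, 1, 2},  // 4 IIC_AAS_WriteMicrocoded
  {1, 4, 5},  // 5 IIC_BIN_CARRY_NONMEM_WriteALU
};

// Sums the Cycles of the stages in the half-open range
// [FirstStage, LastStage) for one scheduling class index.
unsigned occupiedCycles(unsigned SchedClass) {
  const ItinRow &Row = Itineraries[SchedClass];
  unsigned Total = 0;
  for (unsigned I = Row.FirstStage; I != Row.LastStage; ++I)
    Total += Stages[I].Cycles;
  return Total;
}
```

Classes 1 and 4 both refer to stage row 1, which is the sharing that the de-duplication step makes possible, and class 0 has an empty range, so it contributes no cycles.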


Reposted from blog.csdn.net/wuhui_gdnt/article/details/82700669