LLVM Study Notes (41)

3.6.2.2. Describing Resources and Their Usage

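We already know there are two ways to describe how instructions execute. One is the itinerary: a series of InstrItinData definitions, each containing a group of InstrStage definitions; InstrItinClass, which associates InstrItinData with instruction definitions; and a ProcessorItineraries definition that ties these definitions together. The other describes resource usage directly, through a set of interrelated definitions derived from SchedReadWrite.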

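Both approaches model the processor as a set of resources and describe how each instruction uses them. It is now time to emit the corresponding data structures.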

1250 void SubtargetEmitter::EmitSchedModel(raw_ostream &OS) {

1251 OS << "#ifdef DBGFIELD\n"

1252 << "#error \"<target>GenSubtargetInfo.inc requires a DBGFIELD macro\"\n"

1253 << "#endif\n"

1254 << "#ifndef NDEBUG\n"

1255 << "#define DBGFIELD(x) x,\n"

1256 << "#else\n"

1257 << "#define DBGFIELD(x)\n"

1258 << "#endif\n";

1259

1260 if (SchedModels.hasItineraries()) {

1261 std::vector<std::vector<InstrItinerary> > ProcItinLists;

1262 // Emit the stage data

1263 EmitStageAndOperandCycleData(OS, ProcItinLists);

1264 EmitItineraries(OS, ProcItinLists);

1265 }

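As in earlier sections, the ProcModels container of the SchedModels object holds the CodeGenProcModel objects for every processor in the family. If any processor is described with itineraries, the condition on line 1260 holds and the stage data for those processors is emitted. Corresponding to the InstrStage definition used in TD files, LLVM has a C++ type with the same name and a similar role.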

59 struct InstrStage {

60 enum ReservationKinds {

61 Required = 0,

62 Reserved = 1

63 };

65 unsigned Cycles_; ///< Length of stage in machine cycles

66 unsigned Units_; ///< Choice of functional units

67 int NextCycles_; ///< Number of machine cycles to next stage

68 ReservationKinds Kind_; ///< Kind of the FU reservation

70 /// \brief Returns the number of cycles the stage is occupied.

71 unsigned getCycles() const {

72 return Cycles_;

73 }

75 /// \brief Returns the choice of FUs.

76 unsigned getUnits() const {

77 return Units_;

78 }

80 ReservationKinds getReservationKind() const {

81 return Kind_;

82 }

84 /// \brief Returns the number of cycles from the start of this stage to the

85 /// start of the next stage in the itinerary

86 unsigned getNextCycles() const {

87 return (NextCycles_ >= 0) ? (unsigned)NextCycles_ : Cycles_;

88 }

89 };

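InstrStage represents one non-pipelined step in the execution of an instruction. Cycles is the number of cycles needed to complete the stage, and Units is the set of functional units that may be chosen to complete it, for example IntUnit1 or IntUnit2. NextCycles is the number of cycles that should elapse from the start of this stage to the start of the next one. A value of -1 means the next stage starts immediately after the current one. For example: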

{ 1, x, -1 }: the stage occupies FU x for one cycle, and the next stage starts immediately after this one.

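{ 2, x|y, 1 }: the stage occupies either FU x or FU y for two consecutive cycles, and the next stage starts one cycle after this stage starts. That is, the stages overlap in time.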

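{ 1, x, 0 }: the stage occupies FU x for one cycle, and the next stage starts in the same cycle. This can express an instruction that needs several stages at the same time.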
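The three examples above can be worked through with a small sketch. The Stage struct and the stageStartCycles function are hypothetical simplifications written for these notes, not LLVM types; only the NextCycles rule is taken from InstrStage::getNextCycles() in the listing.

```cpp
#include <vector>

// Hypothetical, simplified stand-in for llvm::InstrStage (illustration only).
struct Stage {
  unsigned Cycles;  // how long the stage occupies its unit
  unsigned Units;   // bitmask of candidate functional units
  int NextCycles;   // cycles until the next stage starts, -1 = "after this one"

  // Same rule as InstrStage::getNextCycles() in the listing above.
  unsigned nextCycles() const {
    return NextCycles >= 0 ? static_cast<unsigned>(NextCycles) : Cycles;
  }
};

// Returns the cycle, relative to issue, at which each stage starts.
std::vector<unsigned> stageStartCycles(const std::vector<Stage> &Stages) {
  std::vector<unsigned> Starts;
  unsigned Cycle = 0;
  for (const Stage &S : Stages) {
    Starts.push_back(Cycle);
    Cycle += S.nextCycles();
  }
  return Starts;
}
```

With x = 1 and y = 2 as unit masks, the sequence { 1, x, -1 }, { 2, x|y, 1 }, { 1, x, 0 }, { 1, y, -1 } starts its stages at cycles 0, 1, 2 and 2: the second stage is still running when the third begins, and the last two begin together.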

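There are two kinds of FU reservation: FUs the instruction actually requires, and FUs the instruction merely reserves. A reserved unit is not available for executing other instructions, but several instructions may reserve the same unit several times. The two kinds are used to model instruction domain change stalls, FUs that share a resource (for example, the same register file), and so on.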
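One way to read the two reservation kinds is as two bitmasks per cycle. The sketch below is an assumption about how a hazard checker could apply the rules just stated; CycleScoreboard, freeUnits and claim are made-up names, and this is not code taken from LLVM.

```cpp
// Hypothetical sketch of the Required/Reserved rules (illustration only).
enum class ReservationKind { Required, Reserved };

struct CycleScoreboard {
  unsigned RequiredUnits = 0;  // units some instruction actually executes on
  unsigned ReservedUnits = 0;  // units some instruction only reserves
};

// Returns the subset of Wanted that can still be claimed in this cycle.
unsigned freeUnits(const CycleScoreboard &SB, unsigned Wanted,
                   ReservationKind Kind) {
  // A unit that is executing something is unavailable to everyone.
  unsigned Free = Wanted & ~SB.RequiredUnits;
  // A reserved unit blocks execution, but may be reserved again.
  if (Kind == ReservationKind::Required)
    Free &= ~SB.ReservedUnits;
  return Free;
}

// Records that Unit has been claimed with the given kind.
void claim(CycleScoreboard &SB, unsigned Unit, ReservationKind Kind) {
  if (Kind == ReservationKind::Required)
    SB.RequiredUnits |= Unit;
  else
    SB.ReservedUnits |= Unit;
}
```

Under these assumptions, once Port0 has been reserved, a second Reserved request for Port0 still succeeds, while a Required request for Port0 | Port1 is left with Port1 only.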

97 struct InstrItinerary {

98 int NumMicroOps; ///< # of micro-ops, -1 means it's variable

99 unsigned FirstStage; ///< Index of first stage in itinerary

100 unsigned LastStage; ///< Index of last + 1 stage in itinerary

101 unsigned FirstOperandCycle; ///< Index of first operand rd/wr

102 unsigned LastOperandCycle; ///< Index of last + 1 operand rd/wr

103 };

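InstrItinerary represents the scheduling information of an instruction: the set of stages it occupies and the pipeline cycles in which its operands are read or written. It is the LLVM counterpart of the InstrItinData definition. One level up is InstrItineraryData, which wraps this data for a subtarget; its data members and constructors are listed below.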

109 class InstrItineraryData {

110 public:

111 MCSchedModel SchedModel; ///< Basic machine properties.

112 const InstrStage *Stages; ///< Array of stages selected

113 const unsigned *OperandCycles; ///< Array of operand cycles selected

114 const unsigned *Forwardings; ///< Array of pipeline forwarding pathes

115 const InstrItinerary *Itineraries; ///< Array of itineraries selected

116

117 /// Ctors.

118 InstrItineraryData() : SchedModel(MCSchedModel::GetDefaultSchedModel()),

119 Stages(nullptr), OperandCycles(nullptr),

120 Forwardings(nullptr), Itineraries(nullptr) {}

121

122 InstrItineraryData(const MCSchedModel &SM, const InstrStage *S,

123 const unsigned *OS, const unsigned *F)

124 : SchedModel(SM), Stages(S), OperandCycles(OS), Forwardings(F),

125 Itineraries(SchedModel.InstrItineraries) {}
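FirstStage and LastStage form a half-open index range [FirstStage, LastStage) into the shared Stages array. The following sketch shows how such a range could be walked to get the total latency of the stages; the struct and function names are hypothetical simplifications for illustration, not the LLVM declarations.

```cpp
#include <algorithm>

// Hypothetical, reduced versions of InstrStage and InstrItinerary.
struct Stage {
  unsigned Cycles;
  unsigned Units;
  int NextCycles;
};

struct Itinerary {
  int NumMicroOps;
  unsigned FirstStage;  // index of the first stage
  unsigned LastStage;   // index of the last stage + 1
};

// Walks [FirstStage, LastStage) and returns the cycle in which the
// last-finishing stage completes, counted from issue.
unsigned stageLatency(const Stage *Stages, const Itinerary &Itin) {
  unsigned Latency = 0, Start = 0;
  for (unsigned I = Itin.FirstStage; I != Itin.LastStage; ++I) {
    const Stage &S = Stages[I];
    Latency = std::max(Latency, Start + S.Cycles);
    // A negative NextCycles means "start the next stage when this one ends".
    Start += S.NextCycles >= 0 ? static_cast<unsigned>(S.NextCycles)
                               : S.Cycles;
  }
  return Latency;
}
```

An itinerary whose FirstStage equals its LastStage covers no stages, so the loop body never runs and the result is 0.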

3.6.2.2.1. Functional Unit and Bypass Definitions

We already know that the ItinsDef member of a processor's CodeGenProcModel object is the Record of the ProcessorItineraries definition actually used by its Processor-derived definition (Processor→ProcItin or Processor→SchedModel→Itineraries).

359 void SubtargetEmitter::

360 EmitStageAndOperandCycleData(raw_ostream &OS,

361 std::vector<std::vector<InstrItinerary> >

362 &ProcItinLists) {

363

364 // Multiple processor models may share an itinerary record. Emit it once.

365 SmallPtrSet<Record*, 8> ItinsDefSet;

366

367 // Emit functional units for all the itineraries.

368 for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

369 PE = SchedModels.procModelEnd(); PI != PE; ++PI) {

370

371 if (!ItinsDefSet.insert(PI->ItinsDef).second)

372 continue;

373

374 std::vector<Record*> FUs = PI->ItinsDef->getValueAsListOfDefs("FU");

375 if (FUs.empty())

376 continue;

377

378 const std::string &Name = PI->ItinsDef->getName();

379 OS << "\n// Functional units for \"" << Name << "\"\n"

380 << "namespace " << Name << "FU {\n";

381

382 for (unsigned j = 0, FUN = FUs.size(); j < FUN; ++j)

383 OS << " const unsigned " << FUs[j]->getName()

384 << " = 1 << " << j << ";\n";

385

386 OS << "}\n";

387

388 std::vector<Record*> BPs = PI->ItinsDef->getValueAsListOfDefs("BP");

389 if (!BPs.empty()) {

390 OS << "\n// Pipeline forwarding pathes for itineraries \"" << Name

391 << "\"\n" << "namespace " << Name << "Bypass {\n";

392

393 OS << " const unsigned NoBypass = 0;\n";

394 for (unsigned j = 0, BPN = BPs.size(); j < BPN; ++j)

395 OS << " const unsigned " << BPs[j]->getName()

396 << " = 1 << " << j << ";\n";

397

398 OS << "}\n";

399 }

400 }

In the X86 family only Atom uses the itinerary mechanism. Atom's ProcessorItineraries definition, AtomItineraries, defines no BP (bypass) and only two Port resources, so we get the following output:

#ifdef DBGFIELD

#error "<target>GenSubtargetInfo.inc requires a DBGFIELD macro"

#endif

#ifndef NDEBUG

#define DBGFIELD(x) x,

#else

#define DBGFIELD(x)

#endif

// Functional units for "AtomItineraries"

namespace AtomItinerariesFU {

const unsigned Port0 = 1 << 0;

const unsigned Port1 = 1 << 1;

}
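The bit-per-unit numbering in this output can be reproduced without TableGen. The sketch below is a standalone approximation of the loop on lines 382 to 384; emitUnitNamespace is a hypothetical helper, and its exact whitespace is an assumption.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: emits one constant per unit, each with its own bit,
// inside a namespace named <Name><Suffix>.
std::string emitUnitNamespace(const std::string &Name,
                              const std::string &Suffix,
                              const std::vector<std::string> &Units) {
  std::ostringstream OS;
  OS << "namespace " << Name << Suffix << " {\n";
  for (std::size_t J = 0; J < Units.size(); ++J)
    OS << "  const unsigned " << Units[J] << " = 1 << " << J << ";\n";
  OS << "}\n";
  return OS.str();
}
```

Because every unit gets its own bit, a stage's choice of units such as Port0 | Port1 is simply the mask 3.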

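Next, three tables are emitted. The first is the stage array of type InstrStage, the second is the array of operand cycles, and the third is the array of bypasses. Each table is first accumulated as a string and written out at the end. The first entry of each array is reserved for the NoItineraries definition.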

SubtargetEmitter::EmitStageAndOperandCycleData (continued)

402 // Begin stages table

403 std::string StageTable = "\nextern const llvm::InstrStage " + Target +

404 "Stages[] = {\n";

405 StageTable += " { 0, 0, 0, llvm::InstrStage::Required }, // No itinerary\n";

406

407 // Begin operand cycle table

408 std::string OperandCycleTable = "extern const unsigned " + Target +

409 "OperandCycles[] = {\n";

410 OperandCycleTable += " 0, // No itinerary\n";

411

412 // Begin pipeline bypass table

413 std::string BypassTable = "extern const unsigned " + Target +

414 "ForwardingPaths[] = {\n";

415 BypassTable += " 0, // No itinerary\n";

416

417 // For each Itinerary across all processors, add a unique entry to the stages,

418 // operand cycles, and pipepine bypess tables. Then add the new Itinerary

419 // object with computed offsets to the ProcItinLists result.

420 unsigned StageCount = 1, OperandCycleCount = 1;

421 std::map<std::string, unsigned> ItinStageMap, ItinOperandMap;

422 for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

423 PE = SchedModels.procModelEnd(); PI != PE; ++PI) {

424 const CodeGenProcModel &ProcModel = *PI;

425

426 // Add process itinerary to the list.

427 ProcItinLists.resize(ProcItinLists.size()+1);

428

429 // If this processor defines no itineraries, then leave the itinerary list

430 // empty.

431 std::vector<InstrItinerary> &ItinList = ProcItinLists.back();

432 if (!ProcModel.hasItineraries())

433 continue;

434

435 const std::string &Name = ProcModel.ItinsDef->getName();

436

437 ItinList.resize(SchedModels.numInstrSchedClasses());

438 assert(ProcModel.ItinDefList.size() == ItinList.size() && "bad Itins");

439

440 for (unsigned SchedClassIdx = 0, SchedClassEnd = ItinList.size();

441 SchedClassIdx < SchedClassEnd; ++SchedClassIdx) {

442

443 // Next itinerary data

444 Record *ItinData = ProcModel.ItinDefList[SchedClassIdx];

445

446 // Get string and stage count

447 std::string ItinStageString;

448 unsigned NStages = 0;

449 if (ItinData)

450 FormItineraryStageString(Name, ItinData, ItinStageString, NStages);

451

452 // Get string and operand cycle count

453 std::string ItinOperandCycleString;

454 unsigned NOperandCycles = 0;

455 std::string ItinBypassString;

456 if (ItinData) {

457 FormItineraryOperandCycleString(ItinData, ItinOperandCycleString,

458 NOperandCycles);

459

460 FormItineraryBypassString(Name, ItinData, ItinBypassString,

461 NOperandCycles);

462 }

463

464 // Check to see if stage already exists and create if it doesn't

465 unsigned FindStage = 0;

466 if (NStages > 0) {

467 FindStage = ItinStageMap[ItinStageString];

468 if (FindStage == 0) {

469 // Emit as { cycles, u1 | u2 | ... | un, timeinc }, // indices

470 StageTable += ItinStageString + ", // " + itostr(StageCount);

471 if (NStages > 1)

472 StageTable += "-" + itostr(StageCount + NStages - 1);

473 StageTable += "\n";

474 // Record Itin class number.

475 ItinStageMap[ItinStageString] = FindStage = StageCount;

476 StageCount += NStages;

477 }

478 }

479

480 // Check to see if operand cycle already exists and create if it doesn't

481 unsigned FindOperandCycle = 0;

482 if (NOperandCycles > 0) {

483 std::string ItinOperandString = ItinOperandCycleString+ItinBypassString;

484 FindOperandCycle = ItinOperandMap[ItinOperandString];

485 if (FindOperandCycle == 0) {

486 // Emit as cycle, // index

487 OperandCycleTable += ItinOperandCycleString + ", // ";

488 std::string OperandIdxComment = itostr(OperandCycleCount);

489 if (NOperandCycles > 1)

490 OperandIdxComment += "-"

491 + itostr(OperandCycleCount + NOperandCycles - 1);

492 OperandCycleTable += OperandIdxComment + "\n";

493 // Record Itin class number.

494 ItinOperandMap[ItinOperandCycleString] =

495 FindOperandCycle = OperandCycleCount;

496 // Emit as bypass, // index

497 BypassTable += ItinBypassString + ", // " + OperandIdxComment + "\n";

498 OperandCycleCount += NOperandCycles;

499 }

500 }

501

502 // Set up itinerary as location and location + stage count

503 int NumUOps = ItinData ? ItinData->getValueAsInt("NumMicroOps") : 0;

504 InstrItinerary Intinerary = { NumUOps, FindStage, FindStage + NStages,

505 FindOperandCycle,

506 FindOperandCycle + NOperandCycles};

507

508 // Inject - empty slots will be 0, 0

509 ItinList[SchedClassIdx] = Intinerary;

510 }

511 }

512

513 // Closing stage

514 StageTable += " { 0, 0, 0, llvm::InstrStage::Required } // End stages\n";

515 StageTable += "};\n";

516

517 // Closing operand cycles

518 OperandCycleTable += " 0 // End operand cycles\n";

519 OperandCycleTable += "};\n";

520

521 BypassTable += " 0 // End bypass tables\n";

522 BypassTable += "};\n";

523

524 // Emit tables.

525 OS << StageTable;

526 OS << OperandCycleTable;

527 OS << BypassTable;

528 }

3.6.2.2.2. Itinerary Data

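For each processor that uses itineraries to assist instruction scheduling, the ItinDefList container of its CodeGenProcModel instance holds entries from the IID list (of type list<InstrItinData>) of the relevant ProcessorItineraries definition. The container associates each scheduling class with the InstrItinData definition that refers to the same InstrItinClass definition. The assertion on line 438 above must hold because, on line 784 of collectProcItins, ProcModel.ItinDefList is resized to NumInstrSchedClasses.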

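For a given processor's CodeGenProcModel object, the loop on line 440 effectively visits every non-inferred CodeGenSchedClass object. Line 444 therefore fetches the Record of the InstrItinData definition that matches the scheduling class (or a null pointer if there is none), and passes it as the second argument to FormItineraryStageString on line 450.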

274 void SubtargetEmitter::FormItineraryStageString(const std::string &Name,

275 Record *ItinData,

276 std::string &ItinString,

277 unsigned &NStages) {

278 // Get states list

279 const std::vector<Record*> &StageList =

280 ItinData->getValueAsListOfDefs("Stages");

281

282 // For each stage

283 unsigned N = NStages = StageList.size();

284 for (unsigned i = 0; i < N;) {

285 // Next stage

286 const Record *Stage = StageList[i];

287

288 // Form string as ,{ cycles, u1 | u2 | ... | un, timeinc, kind }

289 int Cycles = Stage->getValueAsInt("Cycles");

290 ItinString += " { " + itostr(Cycles) + ", ";

291

292 // Get unit list

293 const std::vector<Record*> &UnitList = Stage->getValueAsListOfDefs("Units");

294

295 // For each unit

296 for (unsigned j = 0, M = UnitList.size(); j < M;) {

297 // Add name and bitwise or

298 ItinString += Name + "FU::" + UnitList[j]->getName();

299 if (++j < M) ItinString += " | ";

300 }

301

302 int TimeInc = Stage->getValueAsInt("TimeInc");

303 ItinString += ", " + itostr(TimeInc);

304

305 int Kind = Stage->getValueAsInt("Kind");

306 ItinString += ", (llvm::InstrStage::ReservationKinds)" + itostr(Kind);

307

308 // Close off stage

309 ItinString += " }";

310 if (++i < N) ItinString += ", ";

311 }

312 }
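The string built by FormItineraryStageString can be approximated outside TableGen as follows. StageDesc and formStageString are hypothetical stand-ins for the Record-based code above, and the exact spacing is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plain-data stand-in for one InstrStage record in a .td file.
struct StageDesc {
  int Cycles;
  std::vector<std::string> Units;
  int TimeInc;
  int Kind;
};

// Builds "{ cycles, u1 | u2, timeinc, kind }, { ... }" for a list of stages.
std::string formStageString(const std::string &Name,
                            const std::vector<StageDesc> &Stages) {
  std::string Out;
  for (std::size_t I = 0; I < Stages.size(); ++I) {
    const StageDesc &S = Stages[I];
    Out += "{ " + std::to_string(S.Cycles) + ", ";
    for (std::size_t J = 0; J < S.Units.size(); ++J) {
      if (J != 0)
        Out += " | ";  // candidate units are OR-ed into one mask
      Out += Name + "FU::" + S.Units[J];
    }
    Out += ", " + std::to_string(S.TimeInc);
    Out += ", (llvm::InstrStage::ReservationKinds)" + std::to_string(S.Kind);
    Out += " }";
    if (I + 1 < Stages.size())
      Out += ", ";  // separator between stages, none after the last
  }
  return Out;
}
```

The whole string, not an individual stage, is what later serves as the lookup key, so two itinerary classes share table rows only when their complete stage sequences are identical.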

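For the emitted string, see the examples in the description of InstrStage above. An InstrItinData definition also has an OperandCycles field, which gives the number of cycles after issue at which the read or write of each operand completes.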

319 void SubtargetEmitter::FormItineraryOperandCycleString(Record *ItinData,

320 std::string &ItinString, unsigned &NOperandCycles) {

321 // Get operand cycle list

322 const std::vector<int64_t> &OperandCycleList =

323 ItinData->getValueAsListOfInts("OperandCycles");

324

325 // For each operand cycle

326 unsigned N = NOperandCycles = OperandCycleList.size();

327 for (unsigned i = 0; i < N;) {

328 // Next operand cycle

329 const int OCycle = OperandCycleList[i];

330

331 ItinString += " " + itostr(OCycle);

332 if (++i < N) ItinString += ", ";

333 }

334 }

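Finally, an array describing the bypasses is emitted. Notice that the InstrItinData definitions in the .td files are split into these three arrays, because they are three fairly independent dimensions of an InstrItinData definition. Each dimension may itself contain many repeated definitions, so building three arrays and identifying an InstrItinData definition by array indices yields a more compact data structure.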

336 void SubtargetEmitter::FormItineraryBypassString(const std::string &Name,

337 Record *ItinData,

338 std::string &ItinString,

339 unsigned NOperandCycles) {

340 const std::vector<Record*> &BypassList =

341 ItinData->getValueAsListOfDefs("Bypasses");

342 unsigned N = BypassList.size();

343 unsigned i = 0;

344 for (; i < N;) {

345 ItinString += Name + "Bypass::" + BypassList[i]->getName();

346 if (++i < NOperandCycles) ItinString += ", ";

347 }

348 for (; i < NOperandCycles;) {

349 ItinString += " 0";

350 if (++i < NOperandCycles) ItinString += ", ";

351 }

352 }

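Note that in FormItineraryOperandCycleString the parameter NOperandCycles is a reference; line 326 sets it to the size of OperandCycles in the InstrItinData definition. It is then passed to FormItineraryBypassString to control the size of the bypass array.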
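The padding behaviour of FormItineraryBypassString is easy to miss: the bypass row always gets exactly NOperandCycles entries, so that it stays aligned with the operand-cycle row. A standalone sketch, with a hypothetical formBypassString working on plain strings instead of Records:

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for FormItineraryBypassString: lists the named
// bypasses first, then pads with 0 (no bypass) up to NOperandCycles entries.
std::string formBypassString(const std::string &Name,
                             const std::vector<std::string> &Bypasses,
                             unsigned NOperandCycles) {
  std::string Out;
  unsigned I = 0;
  for (; I < Bypasses.size();) {
    Out += Name + "Bypass::" + Bypasses[I];
    if (++I < NOperandCycles)
      Out += ", ";
  }
  for (; I < NOperandCycles;) {
    Out += "0";
    if (++I < NOperandCycles)
      Out += ", ";
  }
  return Out;
}
```

For example, one bypass named B0 and three operand cycles under the name X give "XBypass::B0, 0, 0".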

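On line 466 of EmitStageAndOperandCycleData, NStages is the number of entries in the Stages list of the InstrItinData definition, as set by FormItineraryStageString. The container ItinStageMap (std::map<std::string, unsigned>) ensures that each distinct stage sequence is emitted only once (lines 468~477). The container ItinOperandMap plays the same role for operand cycles.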
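The deduplication in lines 465 to 478 amounts to interning each run of stages by its string form and remembering the index of its first row. A minimal sketch of that idea, where RunTable is a hypothetical class and index 0 is treated as reserved in the same way as the "No itinerary" entry:

```cpp
#include <map>
#include <string>

// Hypothetical sketch of the ItinStageMap technique: each distinct run of
// table rows is appended once and later lookups reuse its first index.
class RunTable {
  std::map<std::string, unsigned> FirstIndex;
  unsigned NextIndex = 1;  // index 0 is reserved for "no itinerary"
  std::string Rows;

public:
  // Returns the first index of Run, appending it if it has not been seen.
  unsigned intern(const std::string &Run, unsigned Length) {
    if (Length == 0)
      return 0;
    unsigned &Slot = FirstIndex[Run];  // value-initialized to 0 when new
    if (Slot == 0) {
      Rows += Run + ", // " + std::to_string(NextIndex) + "\n";
      Slot = NextIndex;
      NextIndex += Length;
    }
    return Slot;
  }

  const std::string &rows() const { return Rows; }
};
```

Using 0 both as the reserved index and as the "not found" value of the map is what lets a single lookup decide whether a new row has to be appended.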

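Line 504 creates an InstrItinerary instance and stores it at the corresponding position in ProcItinLists. From line 514 on, the three arrays are closed and emitted. For the X86 target, for example, this gives: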

extern const llvm::InstrStage X86Stages[] = {

{ 0, 0, 0, llvm::InstrStage::Required }, // No itinerary

{ 13, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 1

{ 7, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 2

{ 21, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 3

{ 1, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 4

…

{ 202, AtomItinerariesFU::Port0 | AtomItinerariesFU::Port1, -1, (llvm::InstrStage::ReservationKinds)0 }, // 92

{ 0, 0, 0, llvm::InstrStage::Required } // End stages

};

extern const unsigned X86OperandCycles[] = {

0, // No itinerary

0 // End operand cycles

};

extern const unsigned X86ForwardingPaths[] = {

0, // No itinerary

0 // End bypass tables

};

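These three arrays are tied together by the InstrItinerary arrays generated next. The ProcItinLists parameter of EmitItineraries was prepared by EmitStageAndOperandCycleData above. Note that the loop on line 546 walks the ProcModels container of SchedModels in the same order as EmitStageAndOperandCycleData did when preparing the InstrItinerary data, and that ProcItinLists always has the same size as ProcModels (line 427 of EmitStageAndOperandCycleData). Line 432 shows that for a processor that does not use itineraries the ProcItinLists entry is empty, while line 509 shows that for a processor that does, a new InstrItinerary instance is always stored in that processor's ProcItinLists entry, whether or not an instance with the same contents already exists. The processors visited below therefore always correspond one-to-one with the entries of ProcItinLists (the condition on line 562 filters out processors that do not use itineraries).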

536 void SubtargetEmitter::

537 EmitItineraries(raw_ostream &OS,

538 std::vector<std::vector<InstrItinerary> > &ProcItinLists) {

539

540 // Multiple processor models may share an itinerary record. Emit it once.

541 SmallPtrSet<Record*, 8> ItinsDefSet;

542

543 // For each processor's machine model

544 std::vector<std::vector<InstrItinerary> >::iterator

545 ProcItinListsIter = ProcItinLists.begin();

546 for (CodeGenSchedModels::ProcIter PI = SchedModels.procModelBegin(),

547 PE = SchedModels.procModelEnd(); PI != PE; ++PI, ++ProcItinListsIter) {

548

549 Record *ItinsDef = PI->ItinsDef;

550 if (!ItinsDefSet.insert(ItinsDef).second)

551 continue;

552

553 // Get processor itinerary name

554 const std::string &Name = ItinsDef->getName();

555

556 // Get the itinerary list for the processor.

557 assert(ProcItinListsIter != ProcItinLists.end() && "bad iterator");

558 std::vector<InstrItinerary> &ItinList = *ProcItinListsIter;

559

560 // Empty itineraries aren't referenced anywhere in the tablegen output

561 // so don't emit them.

562 if (ItinList.empty())

563 continue;

564

565 OS << "\n";

566 OS << "static const llvm::InstrItinerary ";

567

568 // Begin processor itinerary table

569 OS << Name << "[] = {\n";

570

571 // For each itinerary class in CodeGenSchedClass::Index order.

572 for (unsigned j = 0, M = ItinList.size(); j < M; ++j) {

573 InstrItinerary &Intinerary = ItinList[j];

574

575 // Emit Itinerary in the form of

576 // { firstStage, lastStage, firstCycle, lastCycle } // index

577 OS << " { " <<

578 Intinerary.NumMicroOps << ", " <<

579 Intinerary.FirstStage << ", " <<

580 Intinerary.LastStage << ", " <<

581 Intinerary.FirstOperandCycle << ", " <<

582 Intinerary.LastOperandCycle << " }" <<

583 ", // " << j << " " << SchedModels.getSchedClass(j).Name << "\n";

584 }

585 // End processor itinerary table

586 OS << " { 0, ~0U, ~0U, ~0U, ~0U } // end marker\n";

587 OS << "};\n";

588 }

589 }
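The control flow of EmitItineraries can be reduced to a lockstep walk over two containers of equal size. In the sketch below, ProcModel, tablesToEmit and the processor names used with it are hypothetical; only the loop shape follows the listing.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical reduced processor model: only the itinerary record name.
struct ProcModel {
  std::string Name;
  std::string ItinsDef;
};

// Returns the names of the itinerary tables that would be emitted.
std::vector<std::string>
tablesToEmit(const std::vector<ProcModel> &Procs,
             const std::vector<std::vector<int>> &ItinLists) {
  assert(Procs.size() == ItinLists.size() && "lists must stay parallel");
  std::vector<std::string> Emitted;
  std::set<std::string> Seen;
  auto ListIt = ItinLists.begin();
  // Both iterators advance in the loop header, so 'continue' keeps them
  // in step with each other.
  for (auto PI = Procs.begin(); PI != Procs.end(); ++PI, ++ListIt) {
    if (!Seen.insert(PI->ItinsDef).second)
      continue;  // record shared with an earlier processor
    if (ListIt->empty())
      continue;  // processor without itineraries
    Emitted.push_back(PI->ItinsDef);
  }
  return Emitted;
}
```

Advancing the second iterator in the for header rather than in the loop body is what makes the early continue statements safe.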

For the X86 target only the Atom processor uses itineraries, so the following array is emitted (with 950 entries):

static const llvm::InstrItinerary AtomItineraries[] = {

{ 0, 0, 0, 0, 0 }, // 0 NoInstrModel

{ 1, 1, 2, 0, 0 }, // 1 IIC_AAA_WriteMicrocoded

{ 1, 2, 3, 0, 0 }, // 2 IIC_AAD_WriteMicrocoded

{ 1, 3, 4, 0, 0 }, // 3 IIC_AAM_WriteMicrocoded

{ 1, 1, 2, 0, 0 }, // 4 IIC_AAS_WriteMicrocoded

{ 1, 4, 5, 0, 0 }, // 5 IIC_BIN_CARRY_NONMEM_WriteALU

…

{ 1, 43, 44, 0, 0 }, // 948 LDMXCSR_VLDMXCSR

{ 1, 17, 18, 0, 0 }, // 949 STMXCSR_VSTMXCSR

{ 0, ~0U, ~0U, ~0U, ~0U } // end marker

};

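The comments give the scheduling class names. Note that the order of the entries is exactly the order of the scheduling class enumerators in the Sched namespace of X86GenInstrInfo.inc. Thanks to this consistency, an enumerator can be used directly as an index to obtain the parameters of the corresponding scheduling class.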
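To make the indexing concrete, the sketch below copies the first few rows of the two tables shown in these notes and looks up a class through its enumerator. The enumerator values are assumed from the index comments in the output, the unit mask 3 stands for Port0 | Port1, and firstStageCycles is a hypothetical helper.

```cpp
// Assumed excerpt of the generated enumeration; values follow the index
// comments in the emitted table.
namespace Sched {
enum {
  NoInstrModel = 0,
  IIC_AAA_WriteMicrocoded = 1,
  IIC_AAD_WriteMicrocoded = 2,
  IIC_AAM_WriteMicrocoded = 3
};
}

// Hypothetical reduced versions of InstrStage and InstrItinerary.
struct Stage {
  unsigned Cycles;
  unsigned Units;
  int NextCycles;
};

struct Itinerary {
  int NumMicroOps;
  unsigned FirstStage, LastStage, FirstOperandCycle, LastOperandCycle;
};

// First rows of the tables shown above, plus the closing stage entry.
static const Stage Stages[] = {
    {0, 0, 0}, {13, 3, -1}, {7, 3, -1}, {21, 3, -1}, {0, 0, 0}};
static const Itinerary AtomItins[] = {
    {0, 0, 0, 0, 0}, {1, 1, 2, 0, 0}, {1, 2, 3, 0, 0}, {1, 3, 4, 0, 0}};

// Returns the cycle count of the first stage of a class, or 0 if the class
// has no stages.
unsigned firstStageCycles(unsigned SchedClass) {
  const Itinerary &I = AtomItins[SchedClass];
  if (I.FirstStage == I.LastStage)
    return 0;
  return Stages[I.FirstStage].Cycles;
}
```

With these rows, Sched::IIC_AAM_WriteMicrocoded selects { 1, 3, 4, 0, 0 }, whose only stage is entry 3 of the stage table, a 21-cycle stage.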
