Multi-Bit FF Exploration in Chip Design

In modern chip design, engineers go to great lengths to optimize power consumption and area. The multi-bit flip-flop (FF) discussed here is one technique for doing so.

MBIT FF vs. single-bit FF

As the name suggests, a multi-bit FF packs several ordinary single-bit FFs into one cell. Let's look at the similarities and differences between them:

  • Single-bit async-clear scan FF
    [Figure: single-bit async-clear scan FF schematic]

For this single-bit async-clear scan FF, the vendor provides several multi-bit async-clear scan FFs:

  • 2-bit async-clear scan FF
    [Figure: 2-bit async-clear scan FF schematic]

  • 4-bit async-clear scan FF
    [Figure: 4-bit async-clear scan FF schematic]

  • 6-bit async-clear scan FF
    [Figure: 6-bit async-clear scan FF schematic]

  • 8-bit async-clear scan FF
    [Figure: 8-bit async-clear scan FF schematic]

As the cell schematics show, the difference between multi-bit and single-bit cells is very small. A multi-bit cell can be understood simply as several single-bit FFs placed side by side, with their scan chain connected internally in bit order. A brief summary follows:

| Part | Single-bit | Multi-bit |
|---|---|---|
| clock | Connects to the CP pin of the single-bit FF | Connects to the single CP pin of the multi-bit FF: the clocks of all bits share the same driver |
| data | Connects to the data pin of the single-bit FF | Connects to the data pin of each bit of the multi-bit FF: the data inputs of the bits are independent of each other |
| clear/reset | Connects to the clear/reset pin of the single-bit FF | Connects to the single clear/reset pin of the multi-bit FF: the clear/reset of all bits shares the same driver |
| output | Connects to the Q pin of the single-bit FF | Connects to the Q pin of each bit of the multi-bit FF: the Q outputs of the bits are independent of each other |
| scan-enable | Connects to the SE pin of the single-bit FF | Connects to the single SE pin of the multi-bit FF: the SE of all bits shares the same driver |
| scan-in | Connects to the SI pin of the single-bit FF | Connects to the single SI pin of the multi-bit FF; in scan mode the bits are chained end to end: SI -> bit1.SI -> bit1.Q -> bit2.SI (internal connection) -> bit2.Q -> bit3.SI (internal connection) … -> $lastbit.Q |

As you can see, three types of pins are shared:

  • clock pin
  • clear/reset pin
  • SI/SE pins

Therefore, since scan is inserted later and is not sensitive to multi-bit packing, a group of FFs can be packed into a multi-bit FF if and only if they share the same driver for the functional clock and for clear/reset.
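The packing condition above can be illustrated with a minimal Python sketch. It is not how any synthesis tool implements banking; the flop records, instance names, and net names are all hypothetical, and the only rule applied is the one stated above (same functional clock driver and same clear/reset driver).

```python
from collections import defaultdict

# Hypothetical flop records: (instance name, functional clock net, clear/reset net).
flops = [
    ("u_a_reg[0]", "clk_core", "rst_n"),
    ("u_a_reg[1]", "clk_core", "rst_n"),
    ("u_a_reg[2]", "clk_core", "rst_n"),
    ("u_b_reg[0]", "clk_core", "por_n"),
    ("u_c_reg[0]", "clk_io", "rst_n"),
]


def bankable_groups(flop_records):
    """Group flops that share both the clock driver and the clear/reset driver."""
    groups = defaultdict(list)
    for name, clock, reset in flop_records:
        groups[(clock, reset)].append(name)
    # A group of one has no partner to share pins with, so it stays single-bit.
    return {key: names for key, names in groups.items() if len(names) > 1}


candidates = bankable_groups(flops)
for (clock, reset), names in candidates.items():
    print(f"{clock}/{reset}: {names}")
```

In this sketch only the three u_a_reg flops form a candidate group; the other two differ in either clock or reset and remain single-bit.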

Advantages of MBITs

Because MBIT cells share some common pins, they offer the following advantages:

  • Transistor-level area optimization based on the sharing mechanism
  • Shared pins reduce routing overhead in the layout
  • Fewer leaf pins on the clock tree, which reduces clock tree wire length and power consumption

At the cell level, taking the T12 process as an example, the comparison between single-bit and MBIT cells with the same function (scan D flip-flop with async clear, drive strength X1) is as follows. (Note: the baseline for each area and power comparison is a multi-bit structure built directly from multiple single-bit cells.)

| Cell | Area (n * row_height) | Improvement ratio |
|---|---|---|
| single bit | 2.43 | N/A |
| MBIT by 2 | 4.5 | 7.4% |
| MBIT by 4 | 8.46 | 12.96% |
| MBIT by 8 | 16.92 | 12.96% |

| Cell | Leakage power (typical) | Improvement ratio |
|---|---|---|
| single bit | 0.53188 | N/A |
| MBIT by 2 | 1.05134 | 1.17% |
| MBIT by 4 | 2.21103 | -3.93% |
| MBIT by 8 | 3.9436 | 7.32% |

Compared with instantiating the single-bit cell multiple times, MBIT generally saves 7.4% to 12.96% in area, while the change in leakage power ranges from about -3.93% (worse) to 7.32% (better).
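The improvement ratios can be reproduced with a short Python sketch. The formula is inferred from the note above (the baseline is N single-bit cells placed side by side), and the function and variable names are illustrative only.

```python
def improve_ratio(single_value, mbit_value, bits):
    """Saving of one MBIT cell versus `bits` single-bit cells, in percent."""
    baseline = single_value * bits
    return (baseline - mbit_value) / baseline * 100.0


# Values copied from the tables above (T12, scan DFF with async clear, X1).
single_area = 2.43
mbit_area = {2: 4.5, 4: 8.46, 8: 16.92}
single_leak = 0.53188
mbit_leak = {2: 1.05134, 4: 2.21103, 8: 3.9436}

area_ratios = {b: round(improve_ratio(single_area, v, b), 2) for b, v in mbit_area.items()}
leak_ratios = {b: round(improve_ratio(single_leak, v, b), 2) for b, v in mbit_leak.items()}

print("area improvement (%):", area_ratios)
print("leakage improvement (%):", leak_ratios)
```

A negative result, as for the 4-bit leakage figure, means the MBIT cell is worse than the equivalent number of single-bit cells on that metric.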

Having understood the mechanism of multi-bit cells, we can now walk through the multi-bit flow.

The MBIT flow

MBIT handling in the RTL stage

Before reading the RTL, DC uses the hdlin variable hdlin_infer_multibit to control multi-bit inference at the RTL stage:

| hdlin_infer_multibit value | Explanation | Comment |
|---|---|---|
| never | No multi-bit inference is performed | Not recommended |
| default_none (the DC default) | Multi-bit components are inferred and packed only where the RTL directives infer_multibit/dont_infer_multibit request it, following the RTL designer's intent | Recommended |
| default_all | DC decides on its own where to use multi-bit across the design, except where an infer_multibit/dont_infer_multibit directive controls it | Not the optimal solution |

These three settings only affect the identification and creation of multi-bit components in the RTL mapping stage. In other words, they only affect the result of the first compile_ultra (mapping).
The recommended setting here is default_none.

  • If you use never: the front-end designer's intent is ignored completely, and the information carried by the directives may be lost.

  • If you use default_all: DC applies its own heuristics and performs multi-bit replacement based on them. The designer's directives are not lost, but some buses or two-dimensional arrays may be replaced. This can cause two types of problems:

    • Timing: in the first round of compile_ultra, timing information is incomplete. Multi-bit replacement at this point creates obstacles for later optimization, and premature packing may have to be unpacked again later.

    • Register naming behavior: suppose the RTL defines a 2D array like this:

      reg [7:0] mem[255:0];

      Under normal circumstances, DC maps this kind of two-dimensional array into:

      mem_reg[255][7]
      mem_reg[255][6]
      ......
      mem_reg[255][0]
      ......
      mem_reg[0][7]
      mem_reg[0][6]
      ......
      mem_reg[0][0]
      

      If default_all is used, DC performs multi-bit packing on this kind of array during analysis, and the instance names generated by compile_ultra end up looking unusual, as follows:

      # use 4bit register bank
      mem_reg[255][7:4]
      ......
      mem_reg[255][3:0]
      ......
      mem_reg[0][7:4]
      ......
      mem_reg[0][3:0]
      

This naming style creates obstacles for formal verification and may also introduce hidden timing risks.
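To make the difference between the two naming patterns concrete, here is a small Python sketch that generates both lists for the mem array above. It only mirrors the patterns shown in the listings; real instance names depend on the tool version and naming settings, and the helper names are made up for illustration.

```python
def single_bit_names(array, depth, width):
    """One instance per bit, e.g. mem_reg[255][7], listed from high to low."""
    return [
        f"{array}_reg[{row}][{bit}]"
        for row in range(depth - 1, -1, -1)
        for bit in range(width - 1, -1, -1)
    ]


def banked_names(array, depth, width, bank):
    """One instance per `bank`-bit slice, e.g. mem_reg[255][7:4]."""
    names = []
    for row in range(depth - 1, -1, -1):
        for msb in range(width - 1, -1, -bank):
            lsb = max(msb - bank + 1, 0)
            names.append(f"{array}_reg[{row}][{msb}:{lsb}]")
    return names


singles = single_bit_names("mem", 256, 8)
banks = banked_names("mem", 256, 8, 4)
print(singles[:2], len(singles))
print(banks[:2], len(banks))
```

Any script or formal setup that matches registers by the single-bit pattern would need to account for the sliced names, which is the obstacle described above.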

Summary: In the RTL parsing stage, combining RTL directives with hdlin_infer_multibit = default_none both respects the designer's original intent and yields a more controllable multi-bit replacement. Even if the designer is not sure which registers must or must not be replaced with multi-bit cells, hdlin_infer_multibit = default_none is still the setting to use: in the first RTL step, multi-bit binding is performed during RTL analysis according to the RTL designer's directives, but a binding does not necessarily result in a replacement. Replacement happens only if both the timing and the control-signal requirements are met.

The MBIT flow in R2G

As the description above shows, MBIT replacement is mainly for area and power gains, and it should not degrade timing (no new violations). Therefore, the point right after physically aware DCT completes is a more suitable time for replacement:

  • Mapping and logic optimization are basically complete: the effect of ICG insertion is already included, so it is relatively clear which control signals the MBIT candidates share
  • Because DCT is physically aware, the timing information is basically clear. A global MBIT replacement at this point makes the most of the MBIT advantages. If timing pressure appears later (in APR), de-banking can split the MBIT cells back into single bits, which leaves room for a second round of adjustment

compile_ultra can perform MBIT replacement on demand, subject to the following rules:
[Figure: MBIT replacement rules for compile_ultra]
Based on these rules, users can apply the following simple flow to perform MBIT replacement in synthesis:

[Figure: MBIT replacement flow in synthesis]

The core command for MBIT operation is identify_register_banks. After the first compile_ultra step completes, this command can replace FFs with MBIT cells in DCT/DCG. In addition to requiring the same clock and control signals across cells, identify_register_banks picks FFs with similar physical locations for MBIT replacement. Therefore, to carry the placement over properly, users need to use the DCG flow plus the ICC/ICC2 DEF flow (read_def + place_opt -skip_initial_placement) when applying MBIT replacement. Only then are the physical benefits of replacing FFs in DCG inherited downstream.

Users can of course also perform MBIT replacement in ICC/ICC2. Because the replacement strategy is similar, it can only be done after the initial placement of the cells. The basic flow is as follows:
[Figure: MBIT replacement flow in ICC/ICC2]

This flow can be viewed roughly as splitting place_opt: after the first coarse placement step, MBIT replacement is added. The user sources a script (similar to identify_register_banks) to perform the replacement and then continues with the remaining place_opt steps.

Whether it is in the synthesis or layout stage, the MBIT replacement method is mainly based on two points:

  • timing
  • physical location

The tool triggers MBIT replacement only when there is timing margin and the registers are physically close to each other. This minimizes the impact on the current database while still capturing the area and power advantages of MBIT.
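The two criteria can be sketched as a simple filter in Python. This is only an illustration of the idea, not the algorithm used by DC or ICC2. The coordinates, slack values, and both thresholds are hypothetical, and the flops are assumed to already share the same clock and clear/reset.

```python
from itertools import combinations

# Hypothetical flops: name -> (x in um, y in um, worst slack in ns).
flops = {
    "ff_a": (10.0, 10.0, 0.30),
    "ff_b": (11.0, 10.5, 0.25),
    "ff_c": (80.0, 75.0, 0.40),
    "ff_d": (10.5, 11.0, -0.05),
}

MIN_SLACK_NS = 0.10     # assumed timing margin required before banking
MAX_DISTANCE_UM = 5.0   # assumed limit on how far apart two flops may be


def manhattan(p, q):
    """Manhattan distance between two (x, y) points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def bank_candidates(flop_table):
    """Return pairs that have timing margin and are physically close."""
    pairs = []
    for (a, (ax, ay, a_slack)), (b, (bx, by, b_slack)) in combinations(flop_table.items(), 2):
        if min(a_slack, b_slack) < MIN_SLACK_NS:
            continue  # not enough timing margin on at least one flop
        if manhattan((ax, ay), (bx, by)) > MAX_DISTANCE_UM:
            continue  # too far apart; merging would move a cell too much
        pairs.append((a, b))
    return pairs


pairs = bank_candidates(flops)
print(pairs)
```

Here ff_c is excluded because of distance and ff_d because of negative slack, leaving ff_a and ff_b as the only pair that satisfies both conditions.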

Example of MBIT replacement in DC

Taking DCG as an example, after the first compile_ultra step completes, use identify_register_banks to perform MBIT replacement:

  • Before replacement:
    [Figure: layout before replacement]
  • After replacement: the newly created MBIT cell sits between the locations of the original two single-bit cells
    [Figure: layout after replacement]

Command run log:
[Figure: identify_register_banks run log]

The log prints:

  • single bit cell delete information
  • MBIT pin connection information

As the log shows, control signals such as CLK/RB must come from the same source. The tool also has a built-in safeguard: if the control pins of the target single-bit cells are driven differently, it prints a PSYN-1203 warning and ensures that functionality is not affected.

Note: users can control the behavior of compile_ultra through set_multibit_options, so that during the incremental optimization step of synthesis the tool performs banking and de-banking according to the set_multibit_options configuration.

MBIT naming and pin mapping

identify_register_banks generates a replacement script for MBIT. For buses, bits are normally named in ascending order. Since this is a post-processing script, users can modify it themselves, but it is usually unnecessary to change the default behavior, and leaving it alone avoids affecting later formal verification. A simple command is as follows:
[Figure: example identify_register_banks command]

For the synthesized MBIT cell, the corresponding Q outputs are also in ascending order:

A[0].Q -> MBIT_A[0]__A[1]__A[2]__A[3].Q1
A[1].Q -> MBIT_A[0]__A[1]__A[2]__A[3].Q2
A[2].Q -> MBIT_A[0]__A[1]__A[2]__A[3].Q3
A[3].Q -> MBIT_A[0]__A[1]__A[2]__A[3].Q4

This naming style helps subsequent work such as formal mapping and gate-level simulation.
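The ascending-order mapping above can be expressed as a short Python sketch, for example as a starting point for a cross-reference table used in gate-level simulation. The MBIT_ prefix, the double-underscore separator, and the Q1..Qn pin names are taken from the listing above; the function names are hypothetical, and a real flow should read the names from the tool-generated script rather than reconstruct them.

```python
import re


def bit_index(name):
    """Extract the trailing bus index from a name such as 'A[3]'."""
    return int(re.search(r"\[(\d+)\]$", name).group(1))


def mbit_mapping(bit_names):
    """Build the MBIT instance name and the single-bit Q -> MBIT Qn pin map."""
    ordered = sorted(bit_names, key=bit_index)  # ascending bus order
    mbit_name = "MBIT_" + "__".join(ordered)
    pin_map = {
        f"{name}.Q": f"{mbit_name}.Q{position}"
        for position, name in enumerate(ordered, start=1)
    }
    return mbit_name, pin_map


mbit_name, pin_map = mbit_mapping(["A[2]", "A[0]", "A[3]", "A[1]"])
for single_pin, mbit_pin in pin_map.items():
    print(f"{single_pin} -> {mbit_pin}")
```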

Vocabulary in this chapter

| Term | Explanation |
|---|---|
| Multi-Bit FF | multi-bit-wide register |

[Key takeaways]

Multi-bit cells have basically become standard in modern designs. Understanding their principles and rules of use helps users improve design PPA by optimizing the multi-bit flow.

References

TSMC, N12FF Standard Cell Library Datasheet
Synopsys, Multibit Register Synthesis and Physical Implementation Application Note

Origin blog.csdn.net/i_chip_backend/article/details/124972693