In modern chip design, engineers go to great lengths to optimize power consumption and area. The multi-bit flip-flop (FF) discussed here is one such technique.
MBIT FF vs single-bit FF
As the name suggests, a multi-bit FF packages several ordinary single-bit FFs into one cell. Let's look at the similarities and differences between the two:
- Single-bit async-clear scan FF
For this single-bit async-clear scan FF, the vendor provides several multi-bit async-clear scan FFs:
- 2-bit async-clear scan FF
- 4-bit async-clear scan FF
- 6-bit async-clear scan FF
- 8-bit async-clear scan FF
Judging from the cell schematics, the difference between multi-bit and single-bit is very small. A multi-bit cell can be understood simply as several single-bit FFs placed side by side, with the scan chain connected internally in bit order. A brief summary follows:
Pin | single-bit | multi-bit |
---|---|---|
clock | Connects to the CP pin of the single-bit FF | Connects to the single CP pin of the multi-bit cell: the clocks of all bits share one driver |
data | Connects to the data pin of the single-bit FF | Connects to the data pin of each bit of the multi-bit cell: the data inputs of the bits are independent of each other |
clear/reset | Connects to the clear/reset pin of the single-bit FF | Connects to the single clear/reset pin of the multi-bit cell: the clear/reset of all bits shares one driver |
output | Connects to the Q pin of the single-bit FF | Connects to the Q pin of each bit of the multi-bit cell: the Q outputs of the bits are independent of each other |
scan-enable | Connects to the SE pin of the single-bit FF | Connects to the single SE pin of the multi-bit cell: the SE of all bits shares one driver |
scan-in | Connects to the SI pin of the single-bit FF | Connects to the single SI pin of the multi-bit cell; in scan mode the bits are chained end to end: SI -> bit1.SI -> bit1.Q -> bit2.SI (internal connection) -> bit2.Q -> bit3.SI (internal connection) … -> $lastbit.Q |
As you can see, three kinds of pins are shared:
- clock pin
- clear/reset pin
- SI/SE pins
Therefore, since scan is inserted later, scan connectivity does not constrain multi-bit packaging. A group of FFs can be repackaged into a multi-bit FF if and only if they share the same driver for both the functional clock and the clear/reset.
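The sharing rule above can be illustrated with a small Python sketch. It is not how any synthesis tool implements banking; the register names, net names, and the `group_bankable` helper are all hypothetical. It only shows the idea of bucketing FFs by the nets that drive their functional clock and clear/reset pins.

```python
from collections import defaultdict

def group_bankable(ffs):
    """Group FFs by the (clock net, clear/reset net) pair that drives them.

    `ffs` is a list of (instance_name, clock_net, reset_net) tuples.
    Only groups with at least two members can form a multi-bit cell.
    """
    groups = defaultdict(list)
    for name, clock_net, reset_net in ffs:
        groups[(clock_net, reset_net)].append(name)
    return {key: names for key, names in groups.items() if len(names) >= 2}

# Hypothetical registers: only a_reg[0] and a_reg[1] share both drivers.
ffs = [
    ("a_reg[0]", "clk_core", "rst_n"),
    ("a_reg[1]", "clk_core", "rst_n"),
    ("b_reg", "clk_core", "rst_aux_n"),  # different reset driver
    ("c_reg", "clk_io", "rst_n"),        # different clock driver
]
bankable = group_bankable(ffs)
print(bankable)
```

In this example only the two `a_reg` bits end up in a group; `b_reg` and `c_reg` each differ in one control net and stay single-bit.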
Advantages of MBITs
Because MBIT cells share some common pins, they bring the following advantages:
- Transistor-level area optimization enabled by the sharing
- Shared pins reduce routing overhead in the layout
- Fewer clock-tree leaves, which reduces clock-tree wirelength and power consumption
At the cell level, taking the T12 process as an example, the comparison between single-bit and MBIT cells with the same function (Scan D Flip-Flop with Async Clear, drive strength: X1) is as follows. (PS: the baseline for the area and power comparison is a multi-bit structure built directly from multiple single-bit cells.)
cell | area (n * row_height) | improvement ratio |
---|---|---|
single bit | 2.43 | N/A |
MBIT by 2 | 4.5 | 7.4% |
MBIT by 4 | 8.46 | 12.96% |
MBIT by 8 | 16.92 | 12.96% |

cell | leakage power (typical) | improvement ratio |
---|---|---|
single bit | 0.53188 | N/A |
MBIT by 2 | 1.05134 | 1.17% |
MBIT by 4 | 2.21103 | -3.93% |
MBIT by 8 | 3.9436 | 7.32% |
Compared with instantiating the single-bit cell the same number of times, MBIT cells save about 7.4% to 12.96% in area, while leakage power changes by about -3.93% to 7.32% (a negative value means the MBIT cell leaks more than the equivalent single-bit cells).
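The improvement ratios in the two tables can be checked with a few lines of Python. This sketch assumes the ratio is defined as (n × single-bit value − MBIT value) / (n × single-bit value), which matches the tabulated numbers to within rounding; the function name `improvement_ratio` is only illustrative.

```python
def improvement_ratio(single, mbit, bits):
    """Percentage saved by one MBIT cell versus `bits` single-bit cells."""
    baseline = single * bits
    return round((baseline - mbit) / baseline * 100, 2)

AREA_SINGLE = 2.43     # area of the single-bit cell, from the table
LEAK_SINGLE = 0.53188  # typical leakage of the single-bit cell, from the table

# bits -> (MBIT area, MBIT leakage), copied from the tables
cells = {2: (4.5, 1.05134), 4: (8.46, 2.21103), 8: (16.92, 3.9436)}

for bits, (area, leak) in cells.items():
    print(
        f"MBIT by {bits}: "
        f"area {improvement_ratio(AREA_SINGLE, area, bits)}%, "
        f"leakage {improvement_ratio(LEAK_SINGLE, leak, bits)}%"
    )
```

The negative leakage figure for the 4-bit cell falls out of the same formula: its leakage is slightly higher than that of four single-bit cells.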
With the mechanism understood, let's walk through the MBIT flow.
The MBIT flow
MBIT handling in the RTL stage
Before reading the RTL, set the DC hdlin variable hdlin_infer_multibit to control how multi-bit is handled in the RTL stage:
hdlin_infer_multibit value | Explanation | Comment |
---|---|---|
never | No multi-bit inference at all | Not recommended |
default_none (DC default) | Multi-bit is inferred and packaged only where the RTL directives infer_multibit/dont_infer_multibit request it, following the RTL designer's intent | Recommended |
default_all | DC decides on its own where to use multi-bit, unless the infer_multibit/dont_infer_multibit directives say otherwise | Not optimal |
These three settings only affect how multi-bit is identified and created in the RTL mapping stage. In other words, they only affect the result of the first compile_ultra (mapping).
The recommended setting here is default_none.
- If you use never: this completely ignores the front-end designer's intent, and the information carried by the directives may be lost
- If you use default_all: DC applies its own heuristics and replaces registers with multi-bit accordingly. The designer's directives are not lost, but some buses or two-dimensional arrays may be replaced as well, which can cause two kinds of problems:
  - Timing: in the first round of compile_ultra, timing information is incomplete. Multi-bit replacement at this point tends to create obstacles for later optimization, and premature packing may have to be unpacked again
  - Register naming: suppose the RTL declares a 2D array like this
    reg [7:0] mem[255:0]
    Normally, DC maps this kind of two-dimensional array to:
    mem_reg[255][7] mem_reg[255][6] ...... mem_reg[255][0] ...... mem_reg[0][7] mem_reg[0][6] ...... mem_reg[0][0]
    With default_all, DC packs this kind of array into multi-bit during analyze, and the instance names finally generated by compile_ultra look odd:
    # use 4bit register bank mem_reg[255][7:4] ...... mem_reg[255][3:0] ...... mem_reg[0][7:4] ...... mem_reg[0][3:0]
    This naming scheme creates obstacles for formal verification and may also introduce hidden timing risks
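To make the naming difference concrete, the following Python sketch generates both name patterns for the `reg [7:0] mem[255:0]` example. It only mimics the name shapes quoted above; the helper names are hypothetical and the sketch says nothing about how DC derives names internally.

```python
def single_bit_names(base, words, width):
    """Names when every bit maps to its own register: base_reg[word][bit]."""
    return [
        f"{base}_reg[{w}][{b}]"
        for w in range(words - 1, -1, -1)
        for b in range(width - 1, -1, -1)
    ]

def banked_names(base, words, width, bank):
    """Names when each word is packed into `bank`-bit register banks."""
    if width % bank != 0:
        raise ValueError("word width must be a multiple of the bank size")
    names = []
    for w in range(words - 1, -1, -1):
        for hi in range(width - 1, -1, -bank):
            lo = hi - bank + 1
            names.append(f"{base}_reg[{w}][{hi}:{lo}]")
    return names

plain = single_bit_names("mem", 256, 8)
banked = banked_names("mem", 256, 8, 4)
print(plain[:2], len(plain))
print(banked[:2], len(banked))
```

The banked list has a quarter as many instances, and none of its names match the per-bit names, which is the kind of mismatch that complicates name-based matching in formal verification.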
Summary: in the RTL parsing stage, combining RTL directives with hdlin_infer_multibit = default_none both respects the designer's original intent and gives more controllable multi-bit replacement. Even if the designer is not sure which registers must or must not be replaced, default_none is still the setting to use: in this first step, multi-bit grouping is bound during RTL analysis according to the RTL designer's needs. Binding does not necessarily lead to replacement; replacement happens only when both timing and control requirements are met.
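The interaction between the three hdlin_infer_multibit values and the RTL directives, as described in the table above, can be summarized as a small decision function. This is a simplified Python model of the description in this article, not of the tool itself; the function name `infers_multibit` is made up for illustration.

```python
def infers_multibit(mode, directive=None):
    """Return True if a register bus would be inferred as multi-bit.

    mode:      value of hdlin_infer_multibit
    directive: "infer_multibit", "dont_infer_multibit", or None (no directive)
    """
    if mode == "never":
        return False  # directives are ignored entirely
    if mode == "default_none":
        return directive == "infer_multibit"  # only where the designer asks
    if mode == "default_all":
        return directive != "dont_infer_multibit"  # everywhere unless vetoed
    raise ValueError(f"unknown mode: {mode}")

for mode in ("never", "default_none", "default_all"):
    row = [infers_multibit(mode, d) for d in (None, "infer_multibit", "dont_infer_multibit")]
    print(mode, row)
```

The only row where an undirected bus gets packed is default_all, which is why that setting can surprise designers with renamed register banks.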
The MBIT flow in R2G
As the description above shows, MBIT replacement is mainly about area and power gains, and it must not degrade timing (no new violations). The point right after physically aware DCT completes is therefore a good time to do the replacement:
- Mapping and logic optimization are essentially complete: the effect of ICGs is already included, so the control sharing of MBIT is relatively clear
- Because DCT is physically aware, timing information is essentially clear, so a global MBIT replacement here gets the most out of MBIT. If timing pressure appears later (in APR), de-banking can break MBITs back into smaller cells, which leaves room for a second round of adjustment
compile_ultra can perform MBIT replacement as required, but the following rules need to be followed:
Based on the above principles, users can use the following simple flow to perform MBIT replacement in synthesis:
The core command for MBIT operations is identify_register_banks. After the first compile_ultra completes, this command can replace FFs with MBITs in DCT/DCG. Besides requiring the same clock and control signals across cells, identify_register_banks picks FFs that are physically close to each other for MBIT replacement. To carry this over cleanly, users therefore need the DCG flow plus the ICC/ICC2 DEF flow (read_def + place_opt -skip_initial_placement) to apply the MBIT replacement. Only then are the physical benefits of the FF replacement done in DCG inherited.
Of course, users can also do the MBIT replacement in ICC/ICC2. Because the replacement strategy is similar, it can only be performed after the initial placement of the cells. The basic flow is as follows:
This flow can be viewed roughly as splitting place_opt: after the first coarse-placement step, the MBIT replacement is added. The user sources a script (similar to identify_register_banks) to do the replacement and then continues with the remaining place_opt steps.
Whether in synthesis or in placement, MBIT replacement rests mainly on two criteria:
- timing
- physical location
The tool only triggers MBIT replacement when there is timing margin and the registers are physically close to each other. This minimizes the impact on the current database while still capturing the area and power advantages of MBIT.
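The two criteria can be sketched as a simple filter. The Python below is a toy greedy pairing, not the algorithm used by DC or ICC2: the `FF` record, the `pick_pairs` function, and the slack and distance thresholds are all assumptions made for illustration, and every FF passed in is assumed to already share clock and clear/reset drivers.

```python
from math import hypot
from typing import NamedTuple

class FF(NamedTuple):
    name: str
    x: float      # placement coordinates, arbitrary units
    y: float
    slack: float  # worst timing slack through this register, ns

def pick_pairs(ffs, min_slack, max_dist):
    """Greedily pair FFs that have enough slack and sit close together."""
    candidates = [f for f in ffs if f.slack >= min_slack]
    used = set()
    pairs = []
    for i, a in enumerate(candidates):
        if a.name in used:
            continue
        for b in candidates[i + 1:]:
            if b.name in used:
                continue
            if hypot(a.x - b.x, a.y - b.y) <= max_dist:
                pairs.append((a.name, b.name))
                used.update((a.name, b.name))
                break
    return pairs

ffs = [
    FF("r0", 0.0, 0.0, 0.20),
    FF("r1", 1.0, 0.0, 0.30),
    FF("r2", 50.0, 50.0, 0.40),  # enough slack, but too far away
    FF("r3", 1.5, 0.0, -0.05),   # close, but timing-critical
]
pairs = pick_pairs(ffs, min_slack=0.1, max_dist=2.0)
print(pairs)
```

Only `r0` and `r1` satisfy both conditions, so they are the only pair proposed; `r2` and `r3` are left as single-bit cells.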
Example of MBIT replacement in DC
Taking DCG as an example, after the first compile_ultra completes, use identify_register_banks to do the MBIT replacement
- Before replacement:
- After replacement: the newly created MBIT is placed midway between the two original single-bit cells
Command run log:
The log prints:
- deletion information for the single-bit cells
- pin connection information for the MBIT
As can be seen, control signals such as CLK/RB must come from the same source. The tool also has a built-in safeguard: if the control pins of the target single-bit cells differ, it prints a PSYN-1203 warning to ensure that functionality is not affected:
Note: users can control the behavior of compile_ultra through set_multibit_options, so that during the incremental synthesis optimization step the tool performs banking and de-banking according to that configuration.
MBIT naming and pin mapping
identify_register_banks generates the replacement script for MBIT. For a bus, the bits are normally named in ascending order. Since this is a post-processing script, users can also edit it themselves, but there is usually no need to change the default behavior, so as not to affect downstream formal verification. The basic command is as follows:
For the resulting MBIT cell, the corresponding Q outputs are also in ascending order:
A[0].Q -> MBIT_A[0]__A[1]__A[2]__A[3].Q1
A[1].Q -> MBIT_A[0]__A[1]__A[2]__A[3].Q2
A[2].Q -> MBIT_A[0]__A[1]__A[2]__A[3].Q3
A[3].Q -> MBIT_A[0]__A[1]__A[2]__A[3].Q4
This naming convention helps downstream work such as formal mapping and gate-level simulation.
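The mapping listed above follows a regular pattern, which a short Python sketch can reproduce. It assumes the cell name is `MBIT_` followed by the original bit names in ascending index order joined with `__`, and that outputs are numbered Q1, Q2, and so on, exactly as in the example; the `mbit_mapping` helper itself is hypothetical.

```python
import re

def bit_index(name):
    """Extract the trailing bus index from a name such as 'A[3]'."""
    match = re.search(r"\[(\d+)\]$", name)
    if match is None:
        raise ValueError(f"no bus index in {name!r}")
    return int(match.group(1))

def mbit_mapping(bit_names):
    """Return the MBIT cell name and a map from each old Q pin to the new one."""
    ordered = sorted(bit_names, key=bit_index)
    cell = "MBIT_" + "__".join(ordered)
    pins = {f"{n}.Q": f"{cell}.Q{i}" for i, n in enumerate(ordered, start=1)}
    return cell, pins

cell, pins = mbit_mapping(["A[2]", "A[0]", "A[3]", "A[1]"])
print(cell)
for old, new in pins.items():
    print(old, "->", new)
```

A lookup table like `pins` is the kind of information a formal or gate-level simulation setup needs in order to relate the original single-bit outputs to the banked cell.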
Vocabulary in this chapter
Term | Explanation |
---|---|
Multi-Bit FF | multi-bit wide register |
[Key takeaways]
Multi-bit has essentially become standard in modern designs. Understanding how it works and the rules for applying it helps users improve design PPA by tuning the multi-bit flow.
References
TSMC, N12FF Standard Cell Library Datasheet
Synopsys, Multibit Register Synthesis and Physical Implementation Application Note