Reading notes: FPGA design guidelines

The main contents are as follows:

  Balancing and trading off area and speed

  Hardware principle

  System Principles

  Synchronous design principles

  Ping-pong operation

  Serial to parallel conversion

  Pipelining

  Data interface synchronization methods

  RAM

  FIFO

1. Balancing and trading off area and speed

  Here, area refers to the amount of FPGA/CPLD logic resources a design consumes. For an FPGA it can be measured by the number of FFs (flip-flops) and LUTs (lookup tables) consumed; a more general measure is the equivalent logic gate count of the design.

  Speed refers to the highest frequency at which the design runs stably on the chip. This frequency is determined by the design's timing and is closely related to many timing characteristics, such as the clock period the design must meet, PAD to PAD Time, Clock Setup Time, Clock Hold Time, and Clock-to-Output Delay.

  Area and speed run through the whole of FPGA/CPLD design and are the ultimate criteria for evaluating design quality.

  Area and speed are a pair of opposing constraints. Demanding both the smallest area and the highest operating frequency from one design is unrealistic. A more sensible goal is to occupy the smallest chip area while meeting the design's timing requirements (including the required frequency), or, within a specified area, to achieve a larger timing margin and a higher operating frequency. These two goals fully reflect the idea of balancing area and speed.

  As the two sides of this conflict, area and speed do not carry equal weight. Meeting the timing and operating-frequency requirements is the more important of the two; when they conflict, speed takes priority.

  In theory, if a design has a large timing margin and can run much faster than required, function modules can be reused to reduce the chip area consumed by the whole design, which trades the speed advantage for area savings. Conversely, if a design has demanding timing requirements that conventional methods cannot reach, the data stream can generally be converted from serial to parallel and processed by several replicated modules in parallel, applying the ideas of ping-pong operation and serial-to-parallel conversion to the whole design, which trades area for speed.
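
  The area-for-speed trade described above can be sketched behaviorally. The Python model below is only an illustration, not HDL: the names serial_to_parallel and required_lane_clock_mhz and the 400 MHz figure are hypothetical. It groups a serial sample stream into words for several replicated modules and shows that each copy then only needs to run at a fraction of the input rate.

```python
def serial_to_parallel(stream, lanes):
    """Group a serial sample stream into words of `lanes` samples each.

    An incomplete trailing word is dropped, as a hardware deserializer
    would simply still be waiting for the remaining samples.
    """
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    words = []
    for start in range(0, len(stream) - lanes + 1, lanes):
        words.append(tuple(stream[start:start + lanes]))
    return words


def required_lane_clock_mhz(input_rate_mhz, lanes):
    """Clock each replicated module needs when it handles one sample per word."""
    return input_rate_mhz / lanes


samples = list(range(8))
print(serial_to_parallel(samples, 4))     # [(0, 1, 2, 3), (4, 5, 6, 7)]
print(required_lane_clock_mhz(400.0, 4))  # 100.0
```

  Four copies of the processing logic cost roughly four times the area, but each copy has four times as long to finish its work.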

2. Hardware principle

  The hardware principle mainly concerns how HDL code is written.

  Verilog describes hardware using C-like syntax, but what it describes is by nature hardware! The end result is an actual circuit inside the chip. So the final standard for judging a piece of HDL code is the performance of the hardware implementation it describes, including both area and speed. Rating a design's code highly only says that the conversion from the intended hardware to its HDL form is smooth and reasonable. The final performance of a design depends to a greater degree on the efficiency and soundness of the hardware implementation scheme the design engineer has in mind. (HDL code is just one form of expressing a hardware design.)

  Beginners who one-sidedly pursue clean, short code are making a mistake; this runs contrary to the standard for judging HDL. The correct way to code is first to have a clear picture of the hardware circuit to be implemented, with its structure and connections well understood, and then to express it with appropriate HDL statements.

  In addition, Verilog HDL is a hierarchical language: system level - algorithm level - register transfer level - logic level - gate level - switch level.

  Building a priority tree consumes a large amount of combinational logic, so wherever case can be used, prefer case to if ... else ...

3. System Principles

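  The system principle has two levels of meaning. At the higher level, the hardware is a system: how the modules on a board are partitioned and assigned tasks, which algorithms and functions are suited to implementation in the FPGA, which are suited to a DSP or CPU, and how to estimate the FPGA's size, its data interface design, and so on. At the level of the FPGA design itself, the overall design needs macro-level planning of matters such as clock domains, module reuse, constraints, area, and speed. What matters most is optimizing the modules from the point of view of the whole system.

  In general, function modules with demanding real-time requirements and high frequencies are suited to FPGA implementation. Compared with a CPLD, an FPGA is better suited to designs that are larger, run at higher frequencies, and use more registers. When designing with an FPGA/CPLD, you should have a solid understanding of the low-level hardware resources inside the chip and of the design resources available. For example, an FPGA is generally rich in flip-flops, while a CPLD is richer in combinational logic. An FPGA/CPLD generally consists of basic programmable hardware units, BRAM, routing resources, configurable IO units, clock resources, and so on. A basic programmable hardware unit generally consists of flip-flops and lookup tables. Xilinx's basic programmable hardware resource is called a SLICE and consists of 2 FFs and 2 LUTs. Altera's is called an LE and consists of 1 FF and 1 LUT. (The exact makeup varies by device family.)

  On-chip RAM can be used to implement common building blocks such as single-port RAM, dual-port RAM, synchronous and asynchronous FIFOs, ROM, and CAM.

  [Figure: simplified flow of typical FPGA system planning]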

 

 

 

 

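4. Synchronous design principles

  Asynchronous circuits

    The logic core of the circuit is implemented with combinational logic, for example asynchronous FIFO/RAM read and write signals, address decoding, and similar circuits. The circuit's main signals and output signals do not depend on any clock signal; they are not produced by clock-driven FFs.

    The biggest drawback of asynchronous sequential circuits is that they easily produce glitches. These glitches are especially noticeable in post-place-and-route simulation and when observing real signals with a logic analyzer.

  Synchronous sequential circuits

    The core logic of the circuit is implemented with flip-flops of various kinds.

    The circuit's main signals and output signals are all produced by flip-flops driven by some clock edge.

    Synchronous sequential circuits avoid glitches well. Neither post-place-and-route simulation nor sampling the real working signals with a logic analyzer shows glitches.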

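  Do synchronous sequential circuits necessarily use more resources than asynchronous circuits?

    From a pure ASIC design perspective, about 7 gates are needed to implement a D flip-flop, while a single gate implements a 2-input NAND, so in general a synchronous sequential circuit occupies more area than an asynchronous one. (This is different in an FPGA/CPLD, mainly because of how resources are counted in unit blocks.)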

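  How is delay implemented in a synchronous sequential circuit?

    The usual way to produce a delay in an asynchronous circuit is to insert a buffer, two stages of NAND gates, and so on. This kind of delay adjustment does not fit the synchronous design approach. First, be clear that the delay control syntax in HDL is a behavioral-level description. It is commonly used for simulation test stimulus, but it is ignored during synthesis and produces no actual delay.

    Delay in a synchronous sequential circuit is generally achieved through timing control. In other words, the delay is designed as a piece of circuit logic. For larger delays and delays with special timing requirements, a counter is generally built on a high-speed clock and the delay is controlled by its count. For smaller delays, the signal can be registered once with a D flip-flop. This not only delays the signal by one clock cycle but also performs a first synchronization of the signal to the clock; it is used when sampling input signals and when adding timing constraint margin.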
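
    The two delay techniques above can be sketched as a cycle-based Python model. This is a behavioral illustration, not synthesizable HDL; the function names register_delay and counter_delay are hypothetical. Each list element is the value of a signal at one clock edge.

```python
def register_delay(samples, reset_value=0):
    """Pass a signal through one D flip-flop: the output lags by one clock."""
    q = reset_value
    out = []
    for d in samples:
        out.append(q)  # the value visible now was captured on the previous edge
        q = d          # capture the current input on this edge
    return out


def counter_delay(trigger, delay_cycles):
    """Emit a one-cycle pulse `delay_cycles` clocks after a trigger pulse.

    Triggers that arrive while the counter is running are ignored in this
    simplified model; delay_cycles must be at least 1.
    """
    count = 0
    busy = False
    out = []
    for t in trigger:
        done = 0
        if busy:
            count += 1
            if count == delay_cycles:
                done = 1
                busy = False
        elif t:
            busy = True
            count = 0
        out.append(done)
    return out


print(register_delay([1, 0, 1, 1]))            # [0, 1, 0, 1]
print(counter_delay([1, 0, 0, 0, 0, 0], 3))    # [0, 0, 0, 1, 0, 0]
```

    In both cases the delay is an integer number of clock cycles, which is what makes it predictable after synthesis.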

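  How is the clock of a synchronous sequential circuit generated?

    The quality and stability of the clock directly determine the performance of a synchronous sequential circuit.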

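  Synchronizing input signals

    A synchronous sequential circuit requires its input signals to be synchronized. If the input data rate matches the frequency of this chip's processing clock and the setup and hold times are met, the input data can be sampled directly into a register with the chip's main clock to synchronize it. If the input data is asynchronous to the processing clock, especially when the frequencies do not match, the input data must be sampled by registers twice with the processing clock before it is synchronized.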
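
    The double register sampling described above can be sketched as follows. This Python model only shows the structure and the two-cycle latency of a two-stage synchronizer; it does not model metastability, which is the physical effect the second register guards against. The function name two_stage_synchronizer is hypothetical.

```python
def two_stage_synchronizer(async_samples, reset_value=0):
    """Sample an asynchronous input through two back-to-back registers.

    async_samples[i] is the input level seen at the i-th edge of the
    local processing clock. The returned list is the second register's
    output during each cycle.
    """
    stage1 = reset_value
    stage2 = reset_value
    out = []
    for d in async_samples:
        out.append(stage2)
        # Both registers update on the same clock edge, so stage2 takes
        # the old value of stage1, not the freshly sampled input.
        stage1, stage2 = d, stage1
    return out


print(two_stage_synchronizer([1, 1, 1, 0]))  # [0, 0, 1, 1]
```

    Downstream logic should use only the second register's output, which changes two local clock cycles after the input does.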

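  Is a signal declared as reg always synthesized into a register, and is the result always a synchronous sequential circuit?

    The answer is no. The two most commonly used data types in Verilog are wire and reg. In general, wire designates data and nets implemented with combinational logic, whereas data declared as reg is not necessarily implemented with a register.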

5. Ping-pong operation

  Ping-pong operation is a processing technique often applied to data flow control.

  

 

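   The data buffer module can be any storage module; commonly used ones are dual-port RAM (DPRAM), single-port RAM (SPRAM), and FIFO. In the first buffer period, the input data stream is buffered into data buffer module 1. In the second buffer period, the input data selection unit switches, and the input data stream is buffered into data buffer module 2. The defining characteristic of ping-pong operation is that the input data selection unit and the output data selection unit switch in step with each other, so that the buffered data is passed on for computation and processing. Treating the ping-pong module as a whole and looking at the data from both ends, the input and output data streams are both continuous, with no pauses, which makes the technique well suited to pipelined processing of a data stream. Ping-pong operation is therefore often used in pipelined algorithms to buffer and process data seamlessly.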
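
   The buffer switching described above can be sketched as a cycle-based Python model. It is a behavioral illustration under simplifying assumptions (one sample written and one read per clock, both buffers the same size); the function name ping_pong is hypothetical.

```python
def ping_pong(stream, block_size):
    """Model two buffers that swap roles every `block_size` cycles.

    In each cycle one sample is written into the buffer being filled,
    while one sample is read from the buffer filled in the previous period.
    """
    buffers = [[None] * block_size, [None] * block_size]
    write_sel = 0  # index of the buffer currently being written
    output = []
    for cycle, sample in enumerate(stream):
        addr = cycle % block_size
        read_sel = write_sel ^ 1
        if cycle >= block_size:
            # After the first period the other buffer always holds valid data.
            output.append(buffers[read_sel][addr])
        buffers[write_sel][addr] = sample
        if addr == block_size - 1:
            write_sel ^= 1  # input and output selectors switch together
    return output


print(ping_pong(list(range(12)), 4))  # [0, 1, 2, 3, 4, 5, 6, 7]
```

   After the initial fill period the output is the input delayed by one buffer period, with one sample leaving in every cycle.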

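  The second advantage of ping-pong operation is that it saves buffer space. For example, in a WCDMA baseband application, one frame consists of 15 slots, and sometimes a whole frame of data must be delayed by one slot before processing. The straightforward approach is to buffer the whole frame, delay one slot, and then process it; the buffer then has to be one frame long. Assuming a data rate of 3.84 Mb/s and a 10 ms frame, the buffer must hold 38400 bits. With ping-pong operation, only two RAMs each buffering one slot are needed: while data is written to one RAM, data is read from the other and sent to the processing unit. Each RAM then needs a capacity of only 2560 bits, or 5120 bits for the two together.

    Used cleverly, ping-pong operation can also let low-speed modules process a high-speed data stream.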
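
  The buffer sizes quoted above follow directly from the stated rate, frame length, and slot count. A short Python check of the arithmetic (the variable names are only illustrative):

```python
DATA_RATE_BPS = 3_840_000   # 3.84 Mb/s, as stated in the text
FRAME_MS = 10               # one frame lasts 10 ms
SLOTS_PER_FRAME = 15

frame_bits = DATA_RATE_BPS * FRAME_MS // 1000   # bits in a whole frame
slot_bits = frame_bits // SLOTS_PER_FRAME       # bits in one slot
ping_pong_bits = 2 * slot_bits                  # two one-slot RAMs

print(frame_bits)                   # 38400
print(slot_bits)                    # 2560
print(ping_pong_bits)               # 5120
print(frame_bits - ping_pong_bits)  # 33280 bits saved
```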

 

 

 

6. Serial to parallel conversion

 

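7. Pipelining

  Pipelining is a common technique in high-speed design. If a design's processing flow can be divided into several steps and the data flows through the whole process in one direction only, pipelining can be considered as a way to raise the system's operating frequency.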

 

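   Its basic structure is n suitably partitioned operation steps connected in series in a single direction. The defining characteristic and requirement of pipelining is that the processing of data across the steps is continuous in time. If each step is simplified to passing through a D flip-flop (that is, registering the data for one clock beat), a pipeline resembles a shift register chain: the data stream passes through the D flip-flops in turn, completing one step at each.

   [Figure: timing diagram of a pipelined design]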

 

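  The key to pipelined design is arranging the timing of the whole design sensibly, which requires each operation step to be partitioned sensibly. If the operation time of a stage exactly equals that of the next stage, the design is simplest: the output of the earlier stage feeds directly into the input of the later stage. If the earlier stage's operation time is greater than the later stage's, appropriate buffering is needed. If the earlier stage's operation time is less than the later stage's, the data stream must be split by replicating logic, or the data must be stored and post-processed at the earlier stage; otherwise data will overflow at the later stage.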
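
  The simplest case above, where every stage takes exactly one clock, can be sketched as a register chain in Python. This is a behavioral model only; run_pipeline and the three example stage functions are hypothetical, and None is used to mark a pipeline slot holding no valid data (so the inputs themselves must not be None).

```python
def run_pipeline(inputs, stages):
    """Simulate a pipeline with one register after each stage.

    stages is a non-empty list of one-argument functions. The returned
    list holds the last register's value after each clock edge.
    """
    depth = len(stages)
    regs = [None] * depth
    outputs = []
    # Extra cycles with no input let the last items drain out.
    for x in list(inputs) + [None] * (depth - 1):
        # Every stage reads the value its upstream register held
        # before this clock edge.
        sources = [x] + regs[:-1]
        regs = [None if s is None else f(s) for f, s in zip(stages, sources)]
        outputs.append(regs[-1])
    return outputs


stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v - 3]
print(run_pipeline([1, 2, 3], stages))  # [None, None, 1, 3, 5]
```

  The first result appears only after all three stages have been clocked, but from then on one result leaves the pipeline in every cycle.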

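8. Data interface synchronization methods

  Data interface synchronization is a common problem in FPGA/CPLD design. Many designs that work unstably do so because of data interface synchronization problems.

  1. The input-to-output delay cannot be measured or may vary; how is the data synchronized?

    When the data delay cannot be measured or may vary, a synchronization mechanism must be established, using either a synchronization enable or a synchronization indication signal. Passing the data through a RAM or FIFO also achieves data synchronization.

    Storing the data in a RAM or FIFO works as follows: use the source-synchronous clock supplied with the data by the upstream chip as the write signal to write the data into the RAM or FIFO, then read the data out with this stage's sampling clock (usually the main data-processing clock). The key to this approach is that writing the data into the RAM or FIFO must be reliable. If a synchronous RAM or FIFO is used, an accompanying indication signal with a fixed timing relationship to the data is required.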

 

 


Origin www.cnblogs.com/xzp-006/p/11613778.html