Analysis of Cross-Clock Domain Processing (1) (Clock Domain Crossing (CDC) Design & Verification Techniques Using SystemVerilog)

Preface

         Reference text: "Clock Domain Crossing (CDC) Design & Verification Techniques Using SystemVerilog" by Clifford E. Cummings.

        The paper mainly describes how to design clock domain crossings. In this post, black text is content from the paper and light blue text is my own commentary. If you would like the original English paper, leave a comment with your email address.

        Series of articles:
        Analysis of Cross-Clock Domain Processing (1) (Clock Domain Crossing (CDC) Design & Verification Techniques Using SystemVerilog)

        Analysis of Cross-Clock Domain Processing (2) (Clock Domain Crossing (CDC) Design & Verification Techniques Using SystemVerilog)

        Analysis of Cross-Clock Domain Processing (3) (Clock Domain Crossing (CDC) Design & Verification Techniques Using SystemVerilog)


1. Introduction

        In 2001, I published my first paper on the design of multiple asynchronous clocks. At the time, I had not found any good sources describing the design and synthesis techniques required for a proper multi-clock design. The 2001 paper was a collection of techniques I had gathered over the years from practical ASIC and FPGA design experience. At the end of the 2001 conference talk, dozens of engineers and colleagues came forward to share enough additional interesting ideas and techniques to justify a sequel on the topic. Over the past eight years, I have added instruction on multi-clock design techniques to my Advanced and Expert Verilog and SystemVerilog training courses, and over the same period many more colleagues and students have shared further interesting multi-clock design techniques with me. Since the publication of the first multi-clock paper in 2001, the industry has largely settled on calling these design methods Clock Domain Crossing (CDC) techniques. I will use this common nomenclature in this paper.

        This paper includes the best techniques described in the 2001 paper as well as an updated collection of interesting and efficient multi-clock design techniques shared with me over the past decade. The actual conference presentation slides will be primarily a collection of the new techniques gathered since the original presentation in 2001, with only enough of the original slides remaining to introduce basic CDC design concepts and issues.

        There is nothing special to add about the introduction.

2. Metastability

        Metastability is when a signal does not assume a stable 0 or 1 state for a period of time at some point during normal operation of the design. In a multi-clock design, metastability cannot be avoided, but the adverse effects of metastability can be reduced.

        To quote Dally and Poulton's book on metastability [9]:

                "When a changing data signal is sampled with a clock...the sequence of events determines the outcome. The smaller the time difference between events, the longer it takes to determine which occurred first. When two events occur very close together, the decision-making process may take longer than allotted and a synchronization failure will occur."

        Figure 1 shows a synchronization failure that occurs when a signal generated in one clock domain is sampled too close to the rising edge of a clock from a second clock domain. The synchronization failure is caused by the output going metastable and not converging to a legal stable state by the time the output must be sampled again.

        Metastability is produced when register sampling violates the setup time or hold time requirement. It cannot be avoided entirely; what we can do is reduce how often it occurs. In a design with clock domain crossings, metastability is very likely to be introduced if no countermeasures are taken.

2.1. Why is metastability a problem?

        So why is metastability a problem? Figure 2 shows that a metastable output passing through additional logic in the receive clock domain can cause illegal signal values to propagate throughout the rest of the design. Because the CDC signal may fluctuate for some period of time, different pieces of input logic in the receive clock domain may interpret the fluctuating signal's logic level as different values, propagating erroneous signals into the receive clock domain.

        Every flip-flop used in any design has a specified setup and hold time, the window of time before and after the rising clock edge during which the data input is not legally permitted to change. This time window is specified as a design parameter precisely to keep a data signal from changing too close to another synchronizing signal, which could cause the output to go metastable.

        The greatest danger of metastability is that it can put the system into an unknown state, which is fatal for many designs.

3. Synchronizer

        An important question to ask when passing signals between clock domains is, do I need to sample every value of the signal passing from one clock domain to another?

3.1. Two synchronization scenarios

        Two situations can arise when passing signals across CDC boundaries, and it is important to determine which situation applies to your design:

  1. It is permitted to miss samples that are passed between clock domains.
  2. Every signal passed between clock domains must be sampled.

        Case 1: Sometimes it is not necessary to sample every value, but it is important that the sampled values are accurate. One example is the set of Gray code counters used in a standard asynchronous FIFO design. In a properly designed asynchronous FIFO, the synchronized Gray code counters do not need to capture every legal value from the opposite clock domain, but it is critical that the sampled values are accurate so that the full and empty conditions can be recognized.

        Case 2: The CDC signal must be properly recognized, or recognized and acknowledged (i.e. a handshake; translator's note), before a change to the CDC signal is allowed.

        In both cases, the CDC signal needs some form of synchronization to the receive clock domain.

        Not every design needs to sample every value. The asynchronous FIFO, for example, can tolerate missed samples, because what matters most in an asynchronous FIFO is judging the empty and full conditions correctly (or at least never wrongly). Missing some counter values does not corrupt the full/empty judgment and, in a sense, makes the design more conservative and therefore safer.
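        To illustrate why Gray code counters can be sampled safely even when values are missed, here is a minimal Python sketch (not from the paper; the names `bin_to_gray` and `bits_changed` are made up for this example). It shows that consecutive Gray code values differ in exactly one bit, so a sample taken during a transition can only resolve to the old or the new value, whereas a plain binary counter can change several bits at once.

```python
def bin_to_gray(n: int) -> int:
    """Convert a binary count to its reflected Gray code."""
    return n ^ (n >> 1)


def bits_changed(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


WIDTH = 3  # assumed counter width for the illustration
COUNT = 2 ** WIDTH

gray = [bin_to_gray(i) for i in range(COUNT)]

# Bits that change on each increment, including the wrap-around step.
gray_steps = [bits_changed(gray[i], gray[(i + 1) % COUNT]) for i in range(COUNT)]
binary_steps = [bits_changed(i, (i + 1) % COUNT) for i in range(COUNT)]

print("gray codes:  ", [format(g, "03b") for g in gray])
print("gray steps:  ", gray_steps)
print("binary steps:", binary_steps)
```

        With a binary counter, the step from 011 to 100 changes three bits, so a sample taken mid-transition could land on any 3-bit value; with the Gray sequence only one bit is ever in flight.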

3.2. Two-flip-flop synchronizer

        Quoting Dally and Poulton [9] again about synchronizers:

                "A synchronizer is a device that samples an asynchronous signal and outputs a version of the signal that is synchronized to a local clock or sampling clock."

        The simplest and most common synchronizer used by digital designers is the two-flip-flop synchronizer, shown in Figure 3. The first flip-flop samples the asynchronous input signal into the new clock domain, and a full clock cycle is then allowed for any metastability on the first-stage output to decay. The first-stage signal is then sampled by the same clock into the second-stage flip-flop, with the intended goal that the second-stage output is a stable, valid signal, synchronized and ready for distribution within the new clock domain.

        In theory, it is still possible for the first-stage signal to be metastable when it is clocked into the second stage, causing the second-stage output to go metastable as well. The calculated mean time between synchronization failures (MTBF) is a function of several variables, including the clock frequencies used to generate the input signal and to clock the synchronizing flip-flops. A description of the MTBF calculation can be found in Dally and Poulton [9].

        For most synchronization applications, the two-flip-flop synchronizer is sufficient to remove all likely metastability.

        The two-flip-flop synchronizer is the classic "register it twice" approach, and it works for most common designs.
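        The structure can be sketched as a small cycle-based Python model. This is only a behavioral illustration under simplifying assumptions: the class name `TwoFlopSynchronizer` and its `clock` method are invented here, and the model does not reproduce metastability itself, only the two-stage register chain and its latency.

```python
class TwoFlopSynchronizer:
    """Behavioral model of two back-to-back flip-flops in the receive domain."""

    def __init__(self) -> None:
        self.q1 = 0  # first stage: may go metastable in real hardware
        self.q2 = 0  # second stage: the value the receive domain actually uses

    def clock(self, async_in: int) -> int:
        """Apply one rising edge of the receive clock and return q2.

        Both flops update on the same edge: q2 takes the old q1,
        q1 takes the asynchronous input.
        """
        self.q2, self.q1 = self.q1, async_in
        return self.q2


sync = TwoFlopSynchronizer()
stimulus = [1, 1, 1, 0, 0]  # value of the CDC signal at each receive clock edge
outputs = [sync.clock(d) for d in stimulus]
print(outputs)
```

        The synchronized output follows the input one receive clock cycle after the first stage captures it; that extra cycle is the time given to the first stage to resolve.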

3.3. MTBF - mean time between failures

        For most applications, it is important to run a mean time between failures (MTBF) calculation for any signal that crosses a CDC boundary. In this context, a failure means that a signal passed to the synchronizing flip-flops goes metastable on the first-stage flip-flop and is still metastable one cycle later when it is sampled into the second-stage flip-flop. Because the signal did not settle to a known value within one clock cycle, it may still be metastable when it is sampled and passed into the receive clock domain, potentially causing the corresponding logic to fail.

        When calculating MTBF, larger numbers are preferred over smaller ones. A larger MTBF indicates a longer time between potential failures, while a smaller MTBF indicates that metastability may occur frequently and cause failures in the design.

        Dally and Poulton [9] give a good equation and a very thorough analysis for calculating the MTBF of a synchronizer circuit. Without repeating the equations and analysis, it should be pointed out that the two most important factors directly affecting the MTBF of a synchronizer circuit are the sampling clock frequency (how fast signals are being sampled into the receive clock domain) and the data change frequency (how fast the data crossing the CDC boundary is changing).

        As the abbreviated form of the equation indicates, failures occur more frequently (shorter MTBF) in higher-speed designs, or when the sampled data changes more frequently.

        As mentioned earlier, metastability cannot be eliminated; it can only be made as unlikely as possible. The MTBF parameter was therefore introduced to characterize the time between two failures, that is, how often a synchronization failure occurs. If the MTBF can be pushed to several decades (which is achievable), the design can be regarded as having essentially no metastability problem, since our digital systems are not expected to last that long anyway (military grade excepted).
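        The dependence described here can be written compactly. The following is only a sketch of the proportionality stated in the text, not a reproduction of the full equation in [9]; the symbol X is a placeholder introduced here for all remaining device-dependent and timing-dependent factors.

```latex
\mathrm{MTBF} \;=\; \frac{1}{f_{\mathrm{clk}} \cdot f_{\mathrm{data}} \cdot X}
```

        Here f_clk is the sampling clock frequency and f_data is the data change frequency. With X held constant, doubling either frequency halves the MTBF.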

3.4. Three-flip-flop synchronizer

        For some very high-speed designs the MTBF of the two-flip-flop synchronizer is too short, so a third flip-flop is added to increase the MTBF to a satisfactory duration. What counts as satisfactory is, of course, up to the designer.

        For military-grade or equivalent designs, ultra-high-speed designs, or designs with higher reliability requirements, three or more synchronizer stages may be required, depending on design requirements or company rules.
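        To get a feel for how much an extra stage helps, the sketch below uses a commonly quoted textbook form, MTBF = exp(t_r / tau) / (T0 * f_clk * f_data), which is not necessarily the exact equation in [9]. Everything numeric here is an assumption chosen for illustration: the resolution time constant `tau`, the metastability window `t0`, the setup time `t_su`, and the simplification that each extra stage adds one clock period (minus setup) of resolution time. Real values come from the flip-flop characterization of a given process.

```python
import math


def mtbf_seconds(f_clk, f_data, stages=2, tau=50e-12, t0=100e-12, t_su=100e-12):
    """Estimate synchronizer MTBF in seconds (illustrative model only).

    f_clk  : sampling (receive) clock frequency in Hz
    f_data : rate at which the CDC signal changes, in Hz
    stages : number of synchronizer flip-flops (2 or more)
    tau, t0, t_su : assumed flip-flop parameters, NOT from any datasheet
    """
    t_clk = 1.0 / f_clk
    # Time available for metastability to resolve before the last stage samples.
    t_resolve = (stages - 1) * (t_clk - t_su)
    return math.exp(t_resolve / tau) / (t0 * f_clk * f_data)


two_ff = mtbf_seconds(f_clk=500e6, f_data=50e6, stages=2)
three_ff = mtbf_seconds(f_clk=500e6, f_data=50e6, stages=3)
slower_clk = mtbf_seconds(f_clk=100e6, f_data=50e6, stages=2)

seconds_per_year = 365.25 * 24 * 3600
print(f"2 flops @ 500 MHz: {two_ff / seconds_per_year:.3g} years")
print(f"3 flops @ 500 MHz: {three_ff / seconds_per_year:.3g} years")
print(f"2 flops @ 100 MHz: {slower_clk / seconds_per_year:.3g} years")
```

        Because the resolution time sits in the exponent, one additional stage multiplies the MTBF by a very large factor, which is why a third flip-flop is usually enough even for fast designs.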

3.5. Synchronizing the signal from the transmit clock domain

        A frequently asked question about CDC design is: is it a good idea to register signals in the transmit clock domain before passing them to the receive clock domain? The assumption behind the question is that the CDC signals will be synchronized in the receive clock domain anyway, so they do not need to be registered in the transmit clock domain. This reasoning is incorrect; registering signals in the transmit clock domain should generally be required.

        Consider an example where a signal in the transmit clock domain is unregistered before being passed to the receive clock domain, as shown in Figure 6.

        In this example, the combinational output from the transmit clock domain may glitch while it settles at the CDC boundary. This combinational settling effectively increases the data change frequency, potentially producing small bursts of oscillating data, increasing the number of edges that can be sampled while changing, and correspondingly increasing the likelihood of sampling changing data and generating metastable signals.

3.6. Synchronizing the signal into the receive clock domain

        Signals in the transmit clock domain should be registered in that domain before being passed across the CDC boundary. Registering the signal in the transmit clock domain reduces the number of edges that can be sampled in the receive clock domain, effectively reducing the data change frequency in the MTBF equation and thereby increasing the calculated time between failures (see Section 3.3 for the effect of data change frequency on MTBF).

        In Figure 7, the aclk logic settles and is set up to the adat flip-flop before being passed to the bclk domain. The adat flip-flop filters out the combinational glitches on its input (a) and passes a clean signal to the bclk logic.

        Clearly, before data is synchronized into an asynchronous clock domain, it should first be registered once in its own clock domain to remove the glitches produced by combinational logic. This keeps glitches from propagating into the other clock domain, where the extra edges seen at sampling time would cause sampling failures and metastability.
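        The effect of the adat register can be sketched with a tiny waveform model in Python. The signal names `a` and `adat` follow the figure description; the inputs `x` and `y`, their arrival times, and the clock period are assumptions made up for this example. Two inputs that change after the same aclk edge arrive with different delays, so the combinational output `a` glitches, while the registered version only changes at clock edges.

```python
T_ACLK = 10  # assumed aclk period, in arbitrary time units


def x(t: int) -> int:
    """Input that rises at t = 1 (short path delay after the aclk edge at t = 0)."""
    return 1 if t >= 1 else 0


def y(t: int) -> int:
    """Input that falls at t = 2 (slightly longer path delay)."""
    return 0 if t >= 2 else 1


def a(t: int) -> int:
    """Combinational output: high only while x has risen and y has not yet fallen."""
    return x(t) & y(t)


def count_edges(samples) -> int:
    """Number of value changes in a sampled waveform."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev != cur)


raw = [a(t) for t in range(0, 30)]            # what bclk could see unregistered
adat = [a(t) for t in range(0, 30, T_ACLK)]   # a captured at aclk rising edges

print("edges on a:   ", count_edges(raw))
print("edges on adat:", count_edges(adat))
```

        The unregistered signal presents two edges to the receive domain even though its settled value never changes; the registered signal presents none, which is exactly the reduction in data change frequency the text refers to.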

4. Synchronizing fast signals into slow clock domains

        As discussed in Section 3.1, if samples of a CDC signal must not be missed when passing between clock domains, it is important to consider signal width or a synchronization technique for the crossing.

        One problem associated with synchronizers is that the signal from the transmit clock domain may change value twice before it can be sampled, or may change too close to the sampling edge of the slower clock domain. This possibility must be considered any time a signal is sent from one clock domain to another, and it must be determined whether a missed signal is a problem for the design in question.

        When missed samples are not allowed, there are two general approaches to the problem:

  1. An open-loop solution that ensures the signal is captured without acknowledgement.
  2. A closed-loop solution that requires acknowledgement of receipt of the signal that crosses the CDC boundary.

        Both solutions will be discussed in this section.

4.1. Requirements for reliable signaling between clock domains

       Generally speaking, sampling a slow-domain signal in the fast clock domain is not a problem and basically always works; sampling a fast-domain signal in the slow clock domain, however, has to be examined case by case.

        If the frequency of the faster clock domain is 1.5 times (or more) the frequency of the slower clock domain, synchronizing a slower control signal into the faster clock domain is generally not a problem, because the faster clock will sample the slower CDC signal one or more times. Recognizing that sampling slower signals into faster clock domains causes fewer potential problems than sampling faster signals into slower clock domains, a designer might take advantage of this fact by using a simple two-flip-flop synchronizer to pass single CDC signals between clock domains.

4.1.1. "Three edge" requirement

        The essence of the "three edge" requirement is to ensure that the signal lasts long enough to be captured by the receive clock domain.

        Mark Litterick [4] pointed out that when a CDC signal is passed between clock domains through a two-flip-flop synchronizer, the CDC signal must be wider than 1.5 times the period of the receive domain clock. Litterick describes this requirement as "the input data value must be stable for three destination clock edges". For exceptionally slow source and destination clocks, this requirement can probably be relaxed safely to 1.25 times the receive clock period or less, but the "three edge" guideline is the safest initial design condition, and it is easier to check with SystemVerilog assertions than to dynamically measure the fractional width of the CDC signal during simulation. The "three edge" requirement actually applies to both open-loop and closed-loop solutions, but the implementation of the closed-loop solution automatically ensures that at least three edges are detected for every CDC signal.

4.2. Problem - Delivering fast CDC pulses

        Consider the severely flawed case where the transmit clock domain runs at a higher frequency than the receive clock domain and the CDC pulse is only one cycle wide in the transmit clock domain. If the CDC signal is pulsed for only one fast clock cycle, it may go high and low again between two rising edges of the slower clock and never be captured into the slower clock domain, as shown in Figure 8.

        This is the case where a slow clock samples a fast signal and may fail to capture it at all.
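        The missed pulse is easy to reproduce with a few lines of Python. This is a minimal sketch under assumed numbers (transmit period 10, receive period 30, integer time units) and with an invented helper name, `sampling_edges`; it simply lists which receive clock rising edges fall inside the pulse.

```python
def sampling_edges(pulse_start, pulse_width, rx_period, horizon=300):
    """Receive clock rising edges (at 0, rx_period, 2*rx_period, ...)
    that fall inside the pulse interval [pulse_start, pulse_start + pulse_width)."""
    pulse_end = pulse_start + pulse_width
    return [t for t in range(0, horizon, rx_period) if pulse_start <= t < pulse_end]


TX_PERIOD = 10  # assumed fast transmit clock period
RX_PERIOD = 30  # assumed slow receive clock period

# A one-cycle transmit pulse that starts and ends between two receive edges.
missed = sampling_edges(pulse_start=5, pulse_width=TX_PERIOD, rx_period=RX_PERIOD)

# The same pulse, launched at a luckier moment, happens to straddle an edge.
caught = sampling_edges(pulse_start=25, pulse_width=TX_PERIOD, rx_period=RX_PERIOD)

print("edges inside first pulse: ", missed)
print("edges inside second pulse:", caught)
```

        Whether the pulse is seen depends purely on its phase relative to the receive clock, which is why a one-cycle fast pulse cannot be relied on to cross into a slower domain.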

4.3. Problem - Sample a long CDC pulse - but not long enough!

        This is again a slow clock sampling a fast signal: although the signal is no longer a single-cycle pulse, the slow clock domain may still fail to capture it.

        Consider the somewhat unintuitive and flawed case where the transmit clock domain sends a pulse to the receive clock domain that is slightly wider than the period of the receive clock. Under most conditions the signal will be sampled and passed, but there is a small yet real chance that the CDC pulse changes too close to two successive rising edges of the receive clock, violating the setup time on the first clock edge and the hold time on the second clock edge, so that the expected pulse is not formed. This possible failure is shown in Figure 9.
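        The corner case can be checked numerically. The sketch below is an illustration with assumed values (receive period 30, setup and hold of 2 time units each) and an invented helper, `clean_samples`, which lists the receive clock edges at which the pulse is stable for the whole setup and hold window.

```python
def clean_samples(start, width, rx_period, setup, hold, horizon=300):
    """Receive clock rising edges at which the pulse [start, start + width)
    is high and stable from (edge - setup) through (edge + hold)."""
    end = start + width
    return [e for e in range(0, horizon, rx_period)
            if start + setup <= e and e + hold <= end]


RX_PERIOD = 30  # assumed receive clock period
SETUP = 2       # assumed setup time
HOLD = 2        # assumed hold time

# Pulse slightly wider than one receive period, rising just before the edge
# at t = 30 and falling just after the edge at t = 60.
risky = clean_samples(start=29, width=32, rx_period=RX_PERIOD, setup=SETUP, hold=HOLD)

# Pulse of 1.5 receive periods, tried at every possible starting phase.
safe_at_all_phases = all(
    clean_samples(start=s, width=45, rx_period=RX_PERIOD, setup=SETUP, hold=HOLD)
    for s in range(RX_PERIOD)
)

print("clean samples of the 32-unit pulse:", risky)
print("45-unit pulse always cleanly sampled:", safe_at_all_phases)
```

        The 32-unit pulse violates setup at one edge and hold at the next, so no edge captures it cleanly, while the 1.5-period pulse always leaves at least one edge with full setup and hold margin.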

4.4. Open Loop Solution - Sample Signal Using Synchronizer

        A possible solution to this problem is to assert the CDC signal for a period of time that exceeds the sample clock cycle time, as shown in Figure 10. As described in Section 4.1.1, the minimum pulse width is 1.5 times the period of the sample clock. It is assumed that the CDC signal will be sampled at least once and possibly twice by the receiver clock.

        Open-loop sampling can be used when the relative clock frequencies are fixed and have been properly analyzed.

        Advantage: The open-loop solution is the fastest way to pass a signal across the CDC boundary, and it does not require acknowledgement of the received signal.

        Disadvantage: The biggest potential problem with an open-loop solution is that another engineer may mistake it for a general-purpose solution, or that the design requirements may change and engineers may fail to re-analyze the original open-loop solution. This problem can be minimized by adding a SystemVerilog assertion to the model that detects when the input pulse fails to meet the "three edge" design requirement.

        In essence, this method stretches the pulse, which turns the fast signal into one that is slower than the receive clock domain; in other words, it lowers the signal's frequency. The disadvantage is that the method is not general: if the slow clock becomes even slower, the stretched pulse may no longer be sampled.
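        How far the pulse has to be stretched follows directly from the 1.5x rule in Section 4.1.1. The helper below is a small sketch (the function name `min_stretch_cycles` and the example periods are assumptions); it uses integer time units so the comparison with 1.5 receive periods is exact.

```python
def min_stretch_cycles(tx_period: int, rx_period: int) -> int:
    """Smallest number of transmit clock cycles whose total width strictly
    exceeds 1.5 receive clock periods (both periods in the same integer unit)."""
    # n * tx_period > 1.5 * rx_period  <=>  2 * n * tx_period > 3 * rx_period
    return (3 * rx_period) // (2 * tx_period) + 1


cycles = min_stretch_cycles(tx_period=10, rx_period=30)
print("hold the pulse for", cycles, "transmit cycles")

# If the receive clock is later slowed down, the old stretch is no longer enough.
cycles_slower_rx = min_stretch_cycles(tx_period=10, rx_period=40)
print("with a slower receive clock:", cycles_slower_rx, "transmit cycles")
```

        The second call shows the weakness of the open-loop approach: the stretch count is tied to one particular clock ratio and has to be recomputed whenever either clock changes.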

4.5. Closed Loop Solution - Sample Signal Using Synchronizer

        A second possible solution to this problem is to send an enable control signal, synchronize it into the new clock domain, and then pass the synchronized signal back through another synchronizer to the transmit clock domain as an acknowledge signal.

        Advantage: Synchronizing feedback signals is a very safe technique to confirm that the first control signal has been identified and sampled into a new clock domain.

        Disadvantage: There may be a considerable delay in synchronizing the control signals in both directions before allowing the control signals to change. 

        In essence, this method also stretches the signal (lowers its frequency) and adds a handshake. How long the signal is stretched is determined by the acknowledgement fed back from the receive clock domain.
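        The round trip can be sketched as a small event-driven Python model. This is an illustration only: the function name `handshake_latency`, the clock periods, and the phase offset are assumptions, each direction uses a plain two-flip-flop register chain, and metastability is not modeled. The request is raised at time 0 and held until the acknowledge has come back through the second synchronizer.

```python
def handshake_latency(t_a: int, t_b: int, b_phase: int = 1, max_time: int = 10_000):
    """Time at which the sender (aclk domain, period t_a) first sees the
    acknowledge for a request it raised at t = 0.

    The receiver runs on bclk (period t_b, first edge at b_phase).
    Returns None if no acknowledge arrives before max_time.
    """
    req = 1          # request level, held high by the sender
    b1 = b2 = 0      # bclk synchronizer for req; b2 doubles as the ack
    a1 = a2 = 0      # aclk synchronizer for the returning ack

    events = sorted(
        [(t, "a") for t in range(t_a, max_time, t_a)]
        + [(t, "b") for t in range(b_phase, max_time, t_b)]
    )
    for t, domain in events:
        if domain == "b":
            b2, b1 = b1, req      # req crosses into the bclk domain
        else:
            a2, a1 = a1, b2       # ack crosses back into the aclk domain
            if a2:
                return t          # sender may now release or change req
    return None


latency = handshake_latency(t_a=10, t_b=30)
print("request acknowledged after", latency, "time units")
```

        With these assumed clocks the sender waits several of its own cycles before it may change the request, which is the latency cost the text lists as the disadvantage of the closed-loop solution.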


Origin blog.csdn.net/wuzhikaidetb/article/details/122874278