Get Started with the Aurora 8B/10B IP Core in One Day (2): Aurora Overview and Data Interfaces (Framing Interface, Streaming Interface)

Foreword

        Series Summary: Getting Started with Aurora 8B/10B IP Core in One Day----Summary (Direct Link)


1. Aurora 8B/10B protocol

The Aurora protocol is a scalable, lightweight link-layer protocol developed by Xilinx for moving data across point-to-point serial links. It provides a transparent interface to the physical layer, so that proprietary or industry-standard upper-layer protocols can easily use high-speed transceivers. Xilinx FPGAs offer two implementations of Aurora: 8B/10B and 64B/66B. The two are largely the same; the main difference is the line encoding:

  • Aurora 8B/10B: encodes every 8 bits of data into a 10-bit symbol and balances the number of "0"s and "1"s on the line to achieve DC balance. Two of every ten transmitted bits are coding overhead, so the overhead is 20% of the line rate and the efficiency is 80%
  • Aurora 64B/66B: encodes every 64 bits of data into a 66-bit block. The first two bits of the block are the synchronization header, which the receiver uses for data alignment and for synchronizing the received bit stream. There are two synchronization headers, "01" and "10": "01" indicates that the following 64 bits are data, and "10" indicates that the following 64 bits are control information. The 64 payload bits are not necessarily balanced between "0" and "1", so scrambling is required. The overhead is small: 2 bits in every 66, or about 3%
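The efficiency figures for the two encodings follow directly from the ratio of payload bits to transmitted bits. A minimal sketch of that arithmetic (the function name `coding_efficiency` is made up for this illustration and is not part of any Xilinx tool):

```python
from fractions import Fraction


def coding_efficiency(payload_bits: int, coded_bits: int) -> Fraction:
    """Fraction of the transmitted bits that carry user payload."""
    return Fraction(payload_bits, coded_bits)


# 8 payload bits travel in a 10-bit symbol; 64 payload bits in a 66-bit block.
eff_8b10b = coding_efficiency(8, 10)
eff_64b66b = coding_efficiency(64, 66)

print(f"8B/10B : {float(eff_8b10b):.2%} efficient")
print(f"64B/66B: {float(eff_64b66b):.2%} efficient")
```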

Aurora 8B/10B is often used for chip-to-chip (FPGA-to-FPGA) communication, moving data between devices over one or more transceivers. Connections can be full-duplex (data in both directions) or simplex. A core can use up to 16 transceivers (GTX, GTP or GTH), with throughput scalable from 480 Mb/s to 84.48 Gb/s. The throughput of the Aurora core depends on the number of transceivers and on the line rate of the chosen transceivers. Taking the 20% overhead of 8B/10B encoding and a line rate range of 0.5 Gb/s to 6.6 Gb/s, the throughput ranges from 0.4 Gb/s for a single-lane design to 84.48 Gb/s for a 16-lane design.

The figure below shows a typical Aurora 8B/10B communication system: two cores connected in full-duplex mode over multiple lanes.

The figure shows that:

  • The user application interacts with the Aurora 8B/10B IP core through the user interface
  • The Aurora 8B/10B IP core works in full-duplex mode, and its data path consists of multiple lanes
  • The Aurora IP core 8B/10B-encodes the data to be sent and transmits it over the lanes to the other Aurora IP core, which passes the received data to its user through the user interface

2. Aurora 8B/10B IP core

2.1, IP core composition

Next, let's take a look at the composition of the Aurora 8B/10B IP core:

The main components are as follows:

  • Lane Logic: each GT transceiver is driven by one instance of the lane logic module, which initializes that transceiver and handles the encoding and decoding of control characters as well as error detection.
  • Global Logic: the global logic module performs the bonding and verification phases of channel initialization. During operation, it generates the random idle characters required by the Aurora protocol and monitors all lane logic modules for errors.
  • RX User Interface: the AXI4-Stream RX interface moves data from the channel to the application and performs the flow control functions.
  • TX User Interface: the AXI4-Stream TX interface moves data from the application to the channel and performs the TX flow control functions. A standard clock compensation module is embedded in the core; it controls the periodic transmission of Clock Compensation (CC) characters.

At this point the picture is fairly clear: Aurora 8B/10B is a full-duplex, point-to-point protocol built on GT high-speed transceivers (the physical layer), and each GT transceiver channel is one lane of the Aurora link.

The following figure is a schematic diagram of the top-level structure of the IP core, which better illustrates the relationship between the IP core and the GT high-speed transceiver.

2.2. Latency

Because of the logic inside the Aurora 8B/10B IP core (pipelining, encoding and decoding, and so on), data handed to the IP core by the user takes some time to pass through it. The approximate latency is 37 user_clk cycles for a 2-byte lane width and 41 user_clk cycles for a 4-byte lane width, as shown in the following figure:

2.3. Throughput

The throughput of the Aurora 8B/10B IP core depends on the number of GT transceivers and on their line rate. It ranges from 0.4 Gb/s for a single-lane design to 84.48 Gb/s for a 16-lane design. These figures are calculated from the 20% overhead of Aurora 8B/10B encoding and a line rate range of 0.5 Gb/s to 6.6 Gb/s.

In other words, the more GT transceiver channels are used and the higher the line rate they support, the higher the throughput of the whole Aurora 8B/10B IP core. Remember to multiply by 80%, because 8B/10B encoding has a 20% overhead.
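The calculation described above can be sketched in a few lines. This is only an illustration of the arithmetic in the text; the function name and the range check on the lane count are assumptions of this sketch, not something taken from the product guide.

```python
ENCODING_EFFICIENCY = 0.8  # 8 payload bits per 10 line bits


def aurora_8b10b_throughput_gbps(lanes: int, line_rate_gbps: float) -> float:
    """Payload throughput in Gb/s for a given lane count and per-lane line rate."""
    if not 1 <= lanes <= 16:
        raise ValueError("this sketch assumes 1 to 16 lanes")
    return lanes * line_rate_gbps * ENCODING_EFFICIENCY


# The two end points quoted in the text.
print(round(aurora_8b10b_throughput_gbps(1, 0.5), 2))   # 0.4
print(round(aurora_8b10b_throughput_gbps(16, 6.6), 2))  # 84.48
```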

2.4. Endianness

When customizing the IP core, you can choose between big-endian and little-endian. Little-endian here is the most common way of declaring multi-bit data, [n:0], with the most significant bit on the left and the least significant bit on the right, which matches the usual Verilog style. Big-endian is the opposite, [0:n].

2.5. Data sending and receiving interface

The Aurora 8B/10B IP core supports the AXI4-Stream protocol and offers two data interfaces, depending on whether AXI4-Stream is wrapped with framing: the Framing interface and the Streaming interface.

  • Framing interface: adds frame control on top of AXI4-Stream, such as frame header and frame end markers, so that data is delivered in well-delimited frames, at the cost of lower transmission efficiency and more resources
  • Streaming interface: essentially a stripped-down AXI4-Stream interface with only the valid, handshake and data signals. Transmission efficiency is high, but without frame delimiters it cannot give the per-frame accuracy of the Framing interface

For the AXI4-Stream protocol, please refer to: Quick Start AXI4 Bus - Summary (Direct Link)

2.5.1. AXI4-Stream Bit Ordering

The Aurora 8B/10B IP core uses ascending ordering: the most significant bit of the most significant byte is transmitted and received first. The figure below shows an example of the AXI4-Stream data interface for an n-byte Aurora 8B/10B IP core.

2.5.2, Framing interface

The schematic diagram of Framing interface is as follows:

The Framing interface has the concept of a frame, so its signals are somewhat more complicated than those of the Streaming interface. The main signals are as follows:

Sender (relative to the user):

Name | Direction | Clock domain | Description
s_axi_tx_tdata[(8n-1):0] | Input | user_clk | Data to be sent by the user; the bit width is determined by the lane width and the number of lanes
s_axi_tx_tready | Output | user_clk | High indicates that the IP core is ready to accept data
s_axi_tx_tlast | Input | user_clk | Marks the last data word of a frame, active high
s_axi_tx_tkeep[(n-1):0] | Input | user_clk | Indicates which bytes of the last data word are valid
s_axi_tx_tvalid | Input | user_clk | High indicates that the data currently driven by the user is valid

Receiver (relative to the user):

Name | Direction | Clock domain | Description
m_axi_rx_tdata[(8n-1):0] | Output | user_clk | Received data; the bit width is determined by the lane width and the number of lanes
m_axi_rx_tlast | Output | user_clk | Marks the last data word of a received frame, active high
m_axi_rx_tkeep[(n-1):0] | Output | user_clk | Indicates which bytes of the last received data word are valid
m_axi_rx_tvalid | Output | user_clk | High indicates that the currently received data is valid

If you are familiar with the AXI4-Stream protocol, you can basically start receiving and sending data immediately.

Sending data

  • As the sender signals show, a data word is transferred when the handshake between s_axi_tx_tready and s_axi_tx_tvalid succeeds, that is, when both are high
  • Assert s_axi_tx_tlast together with the last data word of the frame
  • Use s_axi_tx_tkeep to mark the valid bytes of the last data word. For example, when a frame has an odd number of bytes, the IP core automatically appends a pad, so the last word contains an invalid byte that must be flagged. This usage is slightly different from standard AXI4-Stream

Receiving data

  • No handshake is needed to receive data
  • When m_axi_rx_tvalid is high, the data on the bus is valid and can be used
  • m_axi_rx_tkeep and m_axi_rx_tlast are used in the same way as the corresponding sender signals

Frame structure

The TX sub-module converts each user frame received through the TX interface into an Aurora 8B/10B frame. The start of frame (SOF) is indicated by adding a 2-byte SCP code group at the beginning of the frame. The end of frame (EOF) is indicated by adding a 2-byte End of Channel Protocol (ECP) code group at the end of the frame. Idle code groups are inserted whenever no data is available. A code group is a pair of 8B/10B-encoded bytes, and all data is sent as code groups, so a user frame with an odd number of bytes has a control character called PAD appended to the end to fill the final code group. The image below shows a typical Aurora 8B/10B frame with an even number of data bytes.

Four sending examples

The manual (PG046) lists four transmission examples that help in understanding the sending process:
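The framing rule in the paragraph above can be modelled as a toy encoder. This is a sketch only: the strings "SCP", "ECP" and "PAD" are symbolic placeholders rather than the real 8B/10B control codes, and `build_frame` is a name invented for this illustration.

```python
def build_frame(payload: bytes) -> list:
    """Wrap a user frame the way the text describes: SCP, data, optional PAD, ECP."""
    symbols = ["SCP", "SCP"]        # 2-byte start-of-frame code group
    symbols.extend(payload)         # user data bytes (as integers)
    if len(payload) % 2 == 1:
        symbols.append("PAD")       # fill the final byte pair for odd lengths
    symbols.extend(["ECP", "ECP"])  # 2-byte end-of-frame code group
    return symbols


print(build_frame(b"\x01\x02"))      # even length: no PAD
print(build_frame(b"\x01\x02\x03"))  # odd length: PAD before ECP
```

Whatever the payload length, the resulting symbol list always has an even length, which is what allows it to be sent as byte pairs.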

Example A: Simple Data Transfer

Data is transferred on every cycle in which the valid and ready signals are both high. When the last data word, DATA2, is sent, tlast is pulled high to mark it as the end of the frame, and tkeep indicates which bytes of that last word are valid.

Example B: Data Transfer with Pad

Data is transferred on every cycle in which the valid and ready signals are both high. When the last data word, DATA2, is sent, tlast is pulled high to mark it as the end of the frame, and tkeep indicates which bytes of that last word are valid. Because an odd number of bytes is sent in this example, the last word contains an invalid byte, so tkeep takes the value N-1.
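To see where the N-1 in Example B comes from, it helps to split a frame into bus-wide data words. The sketch below is a simplified model: `plan_tx_frame` is a hypothetical helper, and it reports the number of valid bytes in the last word as a plain count rather than the actual bit pattern driven on s_axi_tx_tkeep.

```python
def plan_tx_frame(frame_bytes: int, bus_bytes: int):
    """Return (data_words, valid_bytes_in_last_word, pad_needed) for one frame."""
    if frame_bytes <= 0 or bus_bytes <= 0:
        raise ValueError("sizes must be positive")
    data_words = -(-frame_bytes // bus_bytes)             # ceiling division
    valid_in_last = frame_bytes - (data_words - 1) * bus_bytes
    pad_needed = frame_bytes % 2 == 1                     # odd length gets a PAD
    return data_words, valid_in_last, pad_needed


# A 4-byte bus (n = 4) carrying 3n - 1 = 11 bytes: three words, the last
# one holding n - 1 = 3 valid bytes, and the core has to append a PAD.
print(plan_tx_frame(11, 4))  # (3, 3, True)
print(plan_tx_frame(12, 4))  # (3, 4, False)
```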

Example C: Data Transfer with Pause

Data is transferred on every cycle in which the valid and ready signals are both high. When the last data word, DATA2, is sent, tlast is pulled high to mark it as the end of the frame, and tkeep indicates which bytes of that last word are valid.

In the middle of the transfer, the user breaks the handshake by pulling the valid signal low, which pauses data transmission (flow control).
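The pause in Example C can be reproduced with a small cycle-by-cycle model of the handshake. This is a behavioural sketch, not RTL: `simulate_tx` and the per-cycle signal lists are assumptions made for the illustration.

```python
def simulate_tx(words, tvalid, tready):
    """Return (cycle, word) for every cycle in which a word is accepted.

    A word is transferred only on cycles where tvalid and tready are both high.
    """
    accepted = []
    next_word = 0
    for cycle, (valid, ready) in enumerate(zip(tvalid, tready)):
        if next_word == len(words):
            break                     # nothing left to send
        if valid and ready:
            accepted.append((cycle, words[next_word]))
            next_word += 1
    return accepted


words = ["DATA0", "DATA1", "DATA2"]
tvalid = [1, 0, 1, 1]  # the user pauses on cycle 1 by dropping tvalid
tready = [1, 1, 1, 1]  # the core is ready throughout
print(simulate_tx(words, tvalid, tready))
# [(0, 'DATA0'), (2, 'DATA1'), (3, 'DATA2')]
```

The same model also covers the opposite case, where tready goes low for a cycle: no word is accepted on that cycle either.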

Example D: Data Transfer with Clock Compensation

Data transfer is interrupted automatically while the Aurora 8B/10B IP core sends a clock compensation sequence. Clock compensation adds 12 bytes of overhead per lane every 10,000 bytes. Everything else is the same as above.
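As a rough feel for how small this overhead is, the figure can be turned into a percentage. The sketch assumes the 12 bytes are counted within each 10,000-byte window, which is one reading of the sentence above; the constant and function names are made up for the illustration.

```python
CC_BYTES = 12             # clock compensation bytes per lane
CC_PERIOD_BYTES = 10_000  # window over which they are sent (assumed reading)


def cc_overhead_fraction(cc_bytes: int = CC_BYTES,
                         period_bytes: int = CC_PERIOD_BYTES) -> float:
    """Share of each lane's byte stream spent on clock compensation."""
    return cc_bytes / period_bytes


print(f"{cc_overhead_fraction():.2%}")  # 0.12%
```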

Receiving example

Unlike sending, which uses a handshake, receiving is very simple: the received data is valid only while m_axi_rx_tvalid is high, and m_axi_rx_tkeep and m_axi_rx_tlast qualify the last data word of a frame in the same way as on the sender side. A typical transfer looks like this:

When m_axi_rx_tvalid is high the received data is valid; otherwise it is invalid.
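The receive side can be modelled the same way. In this sketch each cycle is a tuple of (tvalid, tdata, tlast, valid_bytes); `receive_frames` is a hypothetical helper, and `valid_bytes` is a simplified stand-in for m_axi_rx_tkeep, given as a byte count rather than the real signal encoding.

```python
def receive_frames(cycles):
    """Rebuild user frames from per-cycle RX samples.

    cycles: iterable of (tvalid, tdata, tlast, valid_bytes), where tdata is a
    bytes object as wide as the bus.
    """
    frames = []
    current = bytearray()
    for tvalid, tdata, tlast, valid_bytes in cycles:
        if not tvalid:
            continue                          # bus content is not valid this cycle
        if tlast:
            current += tdata[:valid_bytes]    # keep only the valid bytes
            frames.append(bytes(current))
            current = bytearray()
        else:
            current += tdata                  # every byte of a middle word is valid
    return frames


rx = [
    (1, b"\x01\x02", 0, 2),
    (0, b"\x00\x00", 0, 0),  # tvalid low: ignored
    (1, b"\x03\x00", 1, 1),  # last word, only one valid byte
]
print(receive_frames(rx))  # [b'\x01\x02\x03']
```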

Framing interface summary:

  • The Framing interface is essentially a repackaged AXI4-Stream interface. The IP core automatically adds the frame header and frame trailer, and performs clock compensation at fixed intervals.
  • On the sending side, the user only needs to send data once the handshake succeeds, and either side of the handshake can hold off the other (the user by deasserting tvalid, the core by deasserting tready); on the receiving side, the user only needs to take data from the bus while the valid signal is high
  • Because data is framed, a signal is needed to mark the frame length: tlast. Because data is sent in byte pairs, the last data word may contain invalid bytes, so a signal is needed to mark the number of valid bytes in the last word: tkeep

2.5.3, Streaming interface

The schematic diagram of the Streaming interface is as follows:

It looks much cleaner than the Framing interface, because the sender and the receiver both drop the keep and last signals (4 signals in total). As mentioned earlier, the frame structure of the Framing interface requires the keep and last signals to delimit each frame, hence the extra signals. The Streaming interface has no frame structure. It behaves like a continuously flowing pipe, so no keep or last signals are needed to control the length.

It is also very simple to use. Data is sent whenever the handshake between tvalid and tready succeeds, and receiving is even simpler: whenever tvalid is high, the received data is valid.

Look directly at the picture for a better understanding:

Example A: TX Streaming Data Transfer

Simple and straightforward: data is sent only when s_axi_tx_tready and s_axi_tx_tvalid are both high (a successful handshake).

Example B: RX Streaming Data Transfer

Simple and straightforward: the received data is valid only when m_axi_rx_tvalid is high.

Streaming interface summary:

  • The Streaming interface is a plain AXI4-Stream interface. There is no concept of a frame, and the length of data on the bus is unlimited.
  • On the sending side, the user only needs to send data once the handshake succeeds, and either side of the handshake can hold off the other (the user by deasserting tvalid, the core by deasserting tready); on the receiving side, the user only needs to take data from the bus while the valid signal is high

3. Other

  • In the next section, we will learn about the clock architecture, reset and indicator signals of the Aurora IP core.

References

Aurora 8B/10B Protocol Specification

Aurora 8B/10B v11.1 LogiCORE IP Product Guide (PG046)

FPGA design experience (3) Theoretical learning record of Aurora IP core

Introduction to Aurora IP on Xilinx Platform (Summary)

Origin blog.csdn.net/wuzhikaidetb/article/details/122421609