FPGA UltraScale GTY explained in detail: Aurora 8b/10b codec, board-to-board video transmission, with 2 sets of project source code and technical support



1. Introduction

As one CSDN veteran put it, anyone who has never worked with GT resources should be embarrassed to say they have worked with FPGAs, and I firmly agree. GT resources are an important selling point of Xilinx FPGAs and the basis of their high-speed interfaces: PCIe, SATA, Ethernet MAC and similar interfaces all rely on GT resources for high-speed serialization and deserialization. Different Xilinx FPGA families carry different GT types: the low-end A7 has GTP, the K7 has GTX, the V7 has GTH, and the higher-end U+ series has GTY. Line rates rise with each step, and the application scenarios become more demanding. UltraScale GTY is found in Xilinx UltraScale and UltraScale+ devices, including the Kintex UltraScale+ part used in this article. Compared with GTH, GTY offers a higher line rate and higher bandwidth.

This article uses the UltraScale GTY resources of a Xilinx Kintex UltraScale+ FPGA, model xcku5p-ffvb676-1-i, for a board-to-board video transmission experiment. There are two video sources, depending on whether your development board has an HDMI input interface. In the first case a laptop provides the HDMI video, and the ADV7611 chip decodes the incoming HDMI into RGB for the FPGA. If your development board has no HDMI input interface, or its HDMI decoder chip is not an ADV7611, you can use the dynamic color bar generated inside the code to stand in for camera video. The video source is selected with the `define COLOR_TEST macro at the top level of the code; HDMI input is the default.
Two sets of Vivado 2022.2 project source code are provided. The first is the GTY sending project: after the FPGA captures the video, the data packing module packs it and adds frame headers and trailers built on the K character BC (K28.5) and other identifiers; the official Xilinx UltraScale GTY IP core, configured for 8b/10b encoding and decoding at a 5 Gb/s line rate, then sends the 8b/10b-encoded video out through the onboard SFP optical port. The second is the GTY receiving project: the SFP optical port receives the 8b/10b-encoded video, UltraScale GTY decodes it, the data alignment module aligns the data, and the data unpacking module strips the frame headers and trailers and restores the video timing. For convenience the video is not frame-buffered here; instead the three official Xilinx IPs Video In to AXI4-Stream, Video Timing Controller and AXI4-Stream to Video Out are called directly to buffer the video lightly before passing it on. Finally, an HDMI sending module written in pure Verilog converts the RGB video to HDMI and drives the monitor;

This blog provides two sets of Vivado project source code. The two projects differ in whether GTY is used for sending or receiving; the details are as follows:

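Vivado project 1: HDMI or dynamic color bar input, UltraScale GTY encoding, sent out through SFP optical port 1;
Vivado project 2: SFP optical port 2 input, UltraScale GTY decoding, HDMI output to the monitor;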

This blog describes in detail the design of a board-to-board video transmission experiment using UltraScale GTY resources. The projects can be ported directly, which makes them suitable for student and graduate projects as well as for working engineers who want to improve their skills, and they can be applied to high-speed interfaces or image processing in medical, military and other industries;
Complete, fully working project source code and technical support are provided;
How to obtain the project source code and technical support is described at the end of the article, so please read on to the end;

Disclaimer

This project and its source code include parts I wrote myself and parts obtained from public channels on the Internet (including CSDN, the Xilinx official website, the Altera official website, and so on). If you feel your rights have been infringed, please send me a private message. For this reason, the project and its source code are provided to readers and followers for personal study and research only, and commercial use is prohibited. If legal issues arise from commercial use by readers or followers, this blog and the blogger bear no responsibility, so please use the material with care.

2. The GT high-speed interface solution I already have here

My homepage has an FPGA GT high-speed interface column with video transmission and PCIe transmission examples for GTP, GTX, GTH and GTY resources. The GTP examples are built on A7 series FPGA development boards, the GTX examples on K7 or ZYNQ series boards, the GTH examples on KU or V7 series boards, and the GTY examples on KU+ series boards.

3. Detailed design plan


Design block diagram

The block diagram of the detailed design is as follows:
[Figure: design block diagram]
Block diagram explanation: the arrows show the direction of data flow, the text inside each arrow gives the data format, and the numbers outside the arrows give the order of the steps. Development board 1 in the figure corresponds to Vivado project 1, the GTY sending project; development board 2 corresponds to Vivado project 2, the GTY receiving project;

Video source selection

This belongs to the first Vivado project. There are two video sources, depending on whether your development board has an HDMI input interface. In the first case a laptop provides the HDMI video and the ADV7611 chip decodes the incoming HDMI into RGB for the FPGA. If your development board has no HDMI input interface, or its HDMI decoder chip is not an ADV7611, you can use the dynamic color bar generated inside the code to stand in for camera video. The video source is selected with a `define macro at the top level of the code, and HDMI input is the default, as follows:
The code is in the top-level file system_wrapper.v;
[Figure: COLOR_TEST macro definition in system_wrapper.v]
The selection logic code is as follows:
[Figure: video source selection logic]
The selection logic works as follows:
When `define COLOR_TEST is commented out, the input video source is HDMI;
When `define COLOR_TEST is not commented out, the input video source is the dynamic color bar;

ADV7611 decoding chip configuration and collection

This belongs to the first Vivado project. The ADV7611 decodes the incoming HDMI video, so this path suits FPGA development boards that carry an ADV7611 decoder chip. The ADV7611 must be configured over I2C before use; both the configuration and the video capture are implemented as Verilog modules, and the resolution in the code is configured as 1920x1080. The code location is as follows:
[Figure: ADV7611 configuration and capture code location]

Dynamic color bar

This belongs to the first Vivado project. The dynamic color bar can be configured for different video resolutions, and the border width, the size of the moving square, its speed and so on are all parameters. I configured the resolution as 1920x1080 here. The code location, top-level interface and instantiation of the dynamic color bar module are as follows:
[Figure: dynamic color bar code location]
[Figure: dynamic color bar top-level interface and instantiation]
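
To make the parameters concrete, here is a small Python behavioral model of a color bar generator with a border and a moving square. It is only a sketch, not the project's Verilog module: the function name `color_bar_pixel`, the eight-bar palette, the default parameter values and the wrap-around motion of the square are all assumptions made for illustration.

```python
# Eight vertical bars (assumed palette): white, yellow, cyan, green, magenta, red, blue, black.
BARS = ((255, 255, 255), (255, 255, 0), (0, 255, 255), (0, 255, 0),
        (255, 0, 255), (255, 0, 0), (0, 0, 255), (0, 0, 0))


def color_bar_pixel(x, y, frame, width=1920, height=1080,
                    border=8, square=100, speed=4):
    """Return the (r, g, b) colour of pixel (x, y) in frame number `frame`."""
    # White border around the picture, `border` pixels wide.
    if x < border or y < border or x >= width - border or y >= height - border:
        return (255, 255, 255)
    # Black square that moves `speed` pixels per frame and wraps around
    # (wrapping instead of bouncing is an assumption of this sketch).
    sx = border + (frame * speed) % (width - 2 * border - square)
    sy = border + (frame * speed) % (height - 2 * border - square)
    if sx <= x < sx + square and sy <= y < sy + square:
        return (0, 0, 0)
    # Everything else shows one of the eight vertical bars.
    return BARS[x * 8 // width]
```

Changing `width` and `height` retargets the pattern to another resolution, which mirrors how the real module is parameterized.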

Video data packing

This belongs to the first Vivado project. Because the video is sent and received through the Aurora 8b/10b protocol in UltraScale GTY, the data must be packed to suit that protocol. The code location of the video data packing module is as follows:
[Figure: video data packing module code location]
First, the 16-bit video is written into a FIFO; once a full line is stored, it is read back out and handed to the GTY for transmission. Before that, each frame is marked up with commands: the sender packs the data according to a fixed set of commands, and the receiver uses the same commands when unpacking to restore the vertical sync (field sync) signal and the video valid signal. On the rising edge of the vertical sync signal, frame start command 0 is sent; on the falling edge, frame start command 1 is sent. During the blanking period, invalid data 0 and invalid data 1 are sent. When the video valid signal arrives, each line is numbered: a line start command is sent first, then the current line number, then the line of video, and finally a line end command. After the whole frame has been sent, frame end command 0 and then frame end command 1 are sent, which completes the frame. This module is not easy to follow, so the code carries detailed Chinese comments; to keep those comments from displaying as garbled text, open the code in the Notepad++ editor. The commands are defined as follows:

32'h55_00_00_bc    frame start command 0
32'h55_00_01_bc    frame start command 1
32'h55_00_02_bc    invalid data 0
32'h55_00_03_bc    invalid data 1
32'h55_00_04_bc    line start command
32'h55_00_05_bc    line end command
32'h55_00_06_bc    frame end command 0
32'h55_00_07_bc    frame end command 1

The commands can be changed freely, but the lowest byte must be bc;
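
The command sequence can be sketched as a Python behavioral model. This is not the project's Verilog: the function name `packetize_frame`, the packing of two 16-bit pixels into one 32-bit word, the line number being sent as a plain data word, and the number of idle words per blanking interval are assumptions for illustration. For brevity the two frame start commands are emitted back to back, whereas the real module separates them by the width of the vertical sync pulse.

```python
# Command words from the definitions above; the low byte 0xBC is the K28.5 comma.
FRAME_START_0 = 0x550000BC
FRAME_START_1 = 0x550001BC
IDLE_0 = 0x550002BC
IDLE_1 = 0x550003BC
LINE_START = 0x550004BC
LINE_END = 0x550005BC
FRAME_END_0 = 0x550006BC
FRAME_END_1 = 0x550007BC

K_LOW_BYTE = 0b0001  # control flags: byte [7:0] is a K character
DATA = 0b0000        # control flags: all four bytes are ordinary data


def packetize_frame(lines, idle_pairs=1):
    """Return the list of (word32, ctrl4) pairs for one frame.

    `lines` is a list of lines, each an even-length list of 16-bit pixels.
    """
    words = [(FRAME_START_0, K_LOW_BYTE), (FRAME_START_1, K_LOW_BYTE)]
    for line_no, pixels in enumerate(lines):
        # Blanking before each line is filled with the two invalid-data words.
        for _ in range(idle_pairs):
            words += [(IDLE_0, K_LOW_BYTE), (IDLE_1, K_LOW_BYTE)]
        words.append((LINE_START, K_LOW_BYTE))
        words.append((line_no, DATA))  # current line number
        for i in range(0, len(pixels), 2):
            # First pixel in the low half-word, second pixel in the high half-word.
            words.append(((pixels[i + 1] << 16) | pixels[i], DATA))
        words.append((LINE_END, K_LOW_BYTE))
    words += [(FRAME_END_0, K_LOW_BYTE), (FRAME_END_1, K_LOW_BYTE)]
    return words
```

Because pixel words carry control flags of 0, a pixel value that happens to contain the byte 0xBC cannot be mistaken for a command on the receiving side.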

UltraScale GTY: a detailed interpretation

The most detailed introduction to UltraScale GTY is without doubt Xilinx's official "ug578-UltraScale Architecture GTY Transceivers", so let's use it as the basis. I have included the PDF of "ug578-UltraScale Architecture GTY Transceivers" in the resource package; how to obtain it is described at the end of the article;
The FPGA on my development board is the xcku5p-ffvb676-1-i of the Kintex UltraScale+ series. UltraScale GTY transceivers run at line rates from 500 Mb/s to 30.5 Gb/s, roughly twice that of UltraScale GTH. They support a range of serial interfaces and protocols, such as PCIe 1.1/2.0, 10GbE XAUI, OC-48, Serial RapidIO, SATA (Serial ATA) and the serial digital interface (SDI);
The project calls UltraScale GTY to encode and decode data with the Aurora 8b/10b protocol. The code location is as follows:
[Figure: UltraScale GTY code location]
The basic configuration of UltraScale GTY is as follows: the onboard differential crystal oscillator is 125 MHz, the line rate is 5 Gb/s, and the protocol type is Aurora 8b/10b;
[Figure: UltraScale GTY basic configuration]
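
A quick back-of-the-envelope check shows why a 5 Gb/s line rate is enough for this video stream. The short Python calculation below is a sketch under stated assumptions: a 32-bit user data path (inferred from the 32-bit command words and the 4-bit rx_ctrl used in this article) and 16 bits per pixel (the width written into the packing FIFO). It counts active pixels only and ignores the small overhead of the command words.

```python
LINE_RATE_BPS = 5_000_000_000  # GTY line rate configured in the IP
USER_WIDTH_BITS = 32           # assumed user data width (4 bytes)

# 8b/10b puts 10 bits on the line for every 8 payload bits.
payload_bps = LINE_RATE_BPS * 8 // 10

# One 32-bit user word therefore occupies 40 line bits,
# which fixes the rate at which user words are consumed.
user_clk_hz = LINE_RATE_BPS // (USER_WIDTH_BITS * 10 // 8)

# 1920x1080 at 60 Hz with 16 bits per pixel, active pixels only.
video_bps = 1920 * 1080 * 60 * 16

print(f"payload capacity : {payload_bps / 1e9:.3f} Gb/s")
print(f"user clock       : {user_clk_hz / 1e6:.1f} MHz")
print(f"video payload    : {video_bps / 1e9:.3f} Gb/s")
print(f"link utilisation : {video_bps / payload_bps:.1%}")
```

Under these assumptions the active video needs about half of the usable payload bandwidth, which leaves room for the command words and blanking filler.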

UltraScale GTY basic structure

In UltraScale/UltraScale+ FPGAs, GTY high-speed transceivers are grouped into Quads. A Quad consists of four GTYE3/4_CHANNEL primitives and one GTYE3/4_COMMON primitive. Each GTYE3/4_COMMON contains two LC-tank PLLs (QPLL0 and QPLL1), and GTYE3/4_COMMON only needs to be instantiated when the application uses a QPLL. The schematic of the UltraScale GTY transceiver is shown below, from "ug578-UltraScale Architecture GTY Transceivers" page 15;
[Figure: UltraScale GTY Quad schematic]
Each GTYE3/4_CHANNEL consists of a channel PLL (CPLL), a transmitter and a receiver. A reference clock can be connected directly to a GTYE3/4_CHANNEL primitive without instantiating GTYE3/4_COMMON, as shown below, from "ug578-UltraScale Architecture GTY Transceivers" page 22;
[Figure: GTYE3/4_CHANNEL reference clock connection]

The transmitter and receiver of the UltraScale GTY transceiver are independent of each other, and each is made up of a Physical Media Attachment (PMA) layer and a Physical Coding Sublayer (PCS). The PMA integrates parallel-to-serial and serial-to-parallel conversion (PISO/SIPO), pre-emphasis, receive equalization, the clock generator and clock recovery; the PCS integrates the 8b/10b codec, elastic buffer, channel bonding and clock correction. The logic of each GTYE3/4_CHANNEL primitive is shown below, from "ug578-UltraScale Architecture GTY Transceivers" page 17;
[Figure: GTYE3/4_CHANNEL primitive logic]
There is little point in going deeper here: until you have a few large projects behind you, this level of detail will not mean much. First-time users, or those who want to get going quickly, should put their energy into instantiating and using the IP core, which is also what I focus on below;

UltraScale GTY Reference Clock Selection and Distribution

GTY transceivers in UltraScale devices provide several reference clock input options, and the reference clock selection architecture supports QPLL0, QPLL1 and the CPLL. Architecturally, each Quad contains four GTYE3/4_CHANNEL primitives, one GTYE3/4_COMMON primitive, two dedicated external reference clock pin pairs and dedicated reference clock routing. If the high-performance QPLL is used, GTYE3/4_COMMON must be instantiated. As the detailed view of the GTYE3/4_COMMON clock multiplexer below shows, a Quad can choose among six reference clock sources: two local reference clock pin pairs, GTREFCLK0 and GTREFCLK1; two reference clocks from the Quads above, GTSOUTHREFCLK0 and GTSOUTHREFCLK1; and two from the Quads below, GTNORTHREFCLK0 and GTNORTHREFCLK1. See "ug578-UltraScale Architecture GTY Transceivers" page 31;
[Figure: GTYE3/4_COMMON reference clock multiplexer]

UltraScale GTY send and receive processing flow

On the transmit side, user logic data is first 8b/10b encoded and then enters a transmit buffer (phase adjust FIFO). This buffer isolates the PMA and PCS clock domains and absorbs the rate and phase differences between them. The high-speed SerDes then performs parallel-to-serial conversion (PISO), with pre-emphasis (TX Pre-emphasis) and post-emphasis applied if needed. It is worth mentioning that if the TXP and TXN differential pins are accidentally swapped in the PCB design, polarity control (Polarity) can compensate for the mistake. The receive side mirrors the transmit side and has much in common with it, so I won't go into detail; the point to note is the RX elastic buffer, which provides clock correction and channel bonding. Each of these functions could fill a paper or even a book, so it is enough to know the concept and apply it in a specific project. Once again: first-time users, or those who want to get going quickly, should put their energy into instantiating and using the IP core.

UltraScale GTY send interface

Pages 101 to 181 of "ug578-UltraScale Architecture GTY Transceivers" describe the transmit processing in detail. Most of it is not something the user needs to study closely, because the manual is mostly describing Xilinx's own design and leaves few interfaces for the user to operate. With that in mind, we focus on the transmit-side interfaces that are needed when instantiating UltraScale GTY;
[Figure: UltraScale GTY transmitter block diagram]
Users only need to care about the clock and data of the transmit interface. This part of the UltraScale GTY instantiation is in the file gty_aurora_example_wrapper.v, which is generated automatically by the official tool after the IP is instantiated;
[Figure: transmit interface in gty_aurora_example_wrapper.v]
[Figure: transmit interface in gty_aurora_example_wrapper.v, continued]
In the code I have already wrapped it again into a module top level for you. The file is gty_aurora_example_top.v, which instantiates the official gty_aurora_example_wrapper.v, and the code is as follows:
[Figure: transmit interface in gty_aurora_example_top.v]

UltraScale GTY receiving interface

Pages 183 to 316 of "ug578-UltraScale Architecture GTY Transceivers" describe the receive processing in detail. Most of it is not something the user needs to study closely, because the manual is mostly describing Xilinx's own design and leaves few interfaces for the user to operate. With that in mind, we focus on the receive-side interfaces that are needed when instantiating UltraScale GTY;
[Figure: UltraScale GTY receiver block diagram]
Users only need to care about the clock and data of the receive interface. This part of the UltraScale GTY instantiation is in the file gty_aurora_example_wrapper.v, which is generated automatically by the official tool after the IP is instantiated;
[Figure: receive interface in gty_aurora_example_wrapper.v]
[Figure: receive interface in gty_aurora_example_wrapper.v, continued]
In the code I have already wrapped it again into a module top level for you. The file is gty_aurora_example_top.v, which instantiates the official gty_aurora_example_wrapper.v, and the code is as follows:
[Figure: receive interface in gty_aurora_example_top.v]

UltraScale GTY IP core calling and usage

[Figure: UltraScale GTY IP core in the IP catalog]
The basic configuration of UltraScale GTY is as follows: the onboard differential crystal oscillator is 125 MHz, the line rate is 5 Gb/s, and the protocol type is Aurora 8b/10b;
[Figure: UltraScale GTY IP core configuration]
For the specific configuration, refer to the Vivado project. After configuring the IP, you need to open the example project and copy its files into your own project; this has already been done in my project. The example project is opened as follows:
[Figure: opening the IP example project]

Data alignment

This belongs to the second Vivado project. Data received over Aurora 8b/10b through GT resources is inherently misaligned, so the decoded data must be realigned. The code location of the data alignment module is as follows:
[Figure: data alignment module code location]
The K code control character format I defined is XX_XX_XX_BC, so I use rx_ctrl to indicate which byte of the data, if any, is the K code COM symbol;

rx_ctrl = 4'b0000 means none of the 4 data bytes is a COM code;
rx_ctrl = 4'b0001 means bits [ 7: 0] of the 4-byte data are the COM code;
rx_ctrl = 4'b0010 means bits [15: 8] of the 4-byte data are the COM code;
rx_ctrl = 4'b0100 means bits [23:16] of the 4-byte data are the COM code;
rx_ctrl = 4'b1000 means bits [31:24] of the 4-byte data are the COM code;

Based on this, when a K code is received the data is aligned: the data is delayed by one clock cycle and spliced with the newly arriving data. This is a basic FPGA operation and is not covered further here;
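
The delay-and-splice step can be sketched as a Python behavioral model. It is an illustration rather than the project's Verilog module: the function name `align_words` is hypothetical, and the model assumes byte [7:0] is the first byte received and that only one-hot rx_ctrl values mark a command word.

```python
def align_words(words):
    """Realign (data32, ctrl4) pairs so that each K byte ends up in bits [7:0]."""
    shift = None  # byte position of the COM code; unknown until the first K byte arrives
    prev = None   # previous pair, i.e. the copy delayed by one clock
    aligned = []
    for data, ctrl in words:
        if prev is not None and shift is not None:
            prev_data, prev_ctrl = prev
            # The upper bytes of the delayed word become the low bytes of the output,
            # and the low bytes of the newly arrived word fill the top.
            out_data = ((prev_data >> (8 * shift)) | (data << (32 - 8 * shift))) & 0xFFFFFFFF
            out_ctrl = ((prev_ctrl >> shift) | (ctrl << (4 - shift))) & 0xF
            aligned.append((out_data, out_ctrl))
        if ctrl in (0b0001, 0b0010, 0b0100, 0b1000):
            shift = ctrl.bit_length() - 1  # one-hot rx_ctrl -> byte offset 0..3
        prev = (data, ctrl)
    return aligned
```

For example, a stream that arrives two bytes late shows the command 32'h55_00_00_bc split across two words, with rx_ctrl = 4'b0100 on the first; after alignment it reappears whole with rx_ctrl = 4'b0001. Output starts one word after the first COM code is seen, which corresponds to the one-clock delay described above.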

Video data unpacking

This belongs to the second Vivado project. Data unpacking is the reverse of data packing; the code location is as follows:
[Figure: video data unpacking module code location]
When unpacking, the module uses the fixed commands to restore the vertical sync signal and the video valid signal, which the subsequent image buffering depends on. That completes the path of data into and out of the GTY;
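
As a rough picture of the unpacking logic, the Python behavioral model below rebuilds the video lines from an aligned stream of (word, rx_ctrl) pairs. It is a sketch, not the project's Verilog: the function name `unpack_frame`, the two-pixels-per-word layout and the line number arriving as the first data word after the line start command are assumptions. The "pixels" state plays the role of the restored video valid signal.

```python
LINE_START = 0x550004BC  # line start command
LINE_END = 0x550005BC    # line end command


def unpack_frame(words):
    """Rebuild the lines of one frame from aligned (word32, ctrl4) pairs."""
    lines = {}
    state = "idle"
    line_no = None
    for word, ctrl in words:
        if ctrl:
            # A K word is always a command, never pixel data.
            if word == LINE_START:
                state = "line_no"
            elif word == LINE_END:
                state = "idle"
            continue  # frame start/end and invalid-data words carry no pixels
        if state == "line_no":
            line_no = word  # first data word after the line start command
            lines[line_no] = []
            state = "pixels"
        elif state == "pixels":
            # Low half-word is the first pixel, high half-word the second.
            lines[line_no] += [word & 0xFFFF, word >> 16]
    return [lines[n] for n in sorted(lines)]
```

In the same spirit, the vertical sync signal could be rebuilt by asserting it at frame start command 0 and releasing it at frame start command 1.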

SFP optical port loopback selection

There are two SFP optical ports on the board. You can loop back through one SFP optical port or across two, selected with the `define SFP_0_LOOP macro in the code; by default a single SFP optical port is used for loopback after power-on. The code is as follows:
The code is in uiAurora_8b10b_vid.v;
[Figure: SFP_0_LOOP macro definition]
[Figure: SFP loopback selection logic]
The selection logic is as follows:
When `define SFP_0_LOOP is commented out, loopback across 2 SFP optical ports is selected;
When `define SFP_0_LOOP is not commented out, loopback through 1 SFP optical port is selected;
To explain:
Project 1 only uses the transmit side of SFP optical port 1, so SFP_0_LOOP is not commented out;
Project 2 only uses the receive side of SFP optical port 2, so SFP_0_LOOP is commented out;

Image output architecture

This belongs to the second Vivado project. For convenience the video is not frame-buffered here; instead the three official Xilinx IPs Video In to AXI4-Stream, Video Timing Controller and AXI4-Stream to Video Out are called directly to buffer the video lightly before it goes to the HDMI sending module. That module is written in pure Verilog, converts the RGB video to HDMI and drives the monitor. Note that this HDMI sending module is written for UltraScale+ series FPGAs, because the primitives it uses differ from those of the 7 series FPGAs;

4. Vivado project 1: GTY sending project

Development board FPGA model: Xilinx Kintex UltraScale+ xcku5p-ffvb676-1-i;
Development environment: Vivado 2022.2;
Input: HDMI or dynamic color bar, resolution 1920x1080@60Hz;
Output: TX port of the SFP optical port;
Application: UltraScale GTY Aurora 8b/10b codec, board-to-board video transmission;
The project code structure is as follows:
[Figure: project 1 code structure]
The FPGA resource usage and power estimate after synthesis and implementation are as follows:
[Figure: project 1 resource usage and power estimate]

5. Vivado project 2: GTY receiving project

Development board FPGA model: Xilinx Kintex UltraScale+ xcku5p-ffvb676-1-i;
Development environment: Vivado 2022.2;
Input: RX port of the SFP optical port;
Output: HDMI;
Application: UltraScale GTY Aurora 8b/10b codec, board-to-board video transmission;
The project Block Design is as follows:
[Figure: project 2 Block Design]
The project code structure is as follows:
[Figure: project 2 code structure]
The FPGA resource usage and power estimate after synthesis and implementation are as follows:
[Figure: project 2 resource usage and power estimate]

6. Project transplantation instructions

Vivado version inconsistency handling

1: If your Vivado version matches the Vivado version of this project, open the project directly;
2: If your Vivado version is lower than that of this project, open the project and then click File -> Save As. This method is not reliable, however; the safest approach is to upgrade your Vivado to the version of this project or higher;
[Figure: File -> Save As]
3: If your Vivado version is higher than that of this project, proceed as follows:
[Figure: opening the project in a newer Vivado version]
After opening the project you will find that the IPs are locked, as follows:
[Figure: locked IPs]
The IPs then need to be upgraded, as follows:
[Figure: upgrading the IPs, step 1]
[Figure: upgrading the IPs, step 2]

FPGA model inconsistency handling

If your FPGA model differs from mine, you need to change the FPGA model in the project, as follows:
[Figure: changing the FPGA model, step 1]
[Figure: changing the FPGA model, step 2]
[Figure: changing the FPGA model, step 3]
After changing the FPGA model you also need to upgrade the IPs, using the method described above;

Other things to note

1: Because the DDR on each board is not necessarily the same, the MIG IP must be configured according to your own schematic; you can even delete the MIG from my original project, add the IP again and reconfigure it;
2: Modify the pin constraints in the xdc file according to your own schematic;
3: Porting a pure-FPGA project to Zynq requires adding the Zynq processor core to the project;

7. Board debugging and verification

Preparation

Two FPGA development boards;
A laptop as the HDMI source; if your board has no HDMI input interface, use the dynamic color bar instead;
SFP optical modules and optical fiber;
Connect the optical fiber, power on the boards, and download the bitstreams;
The fiber connection between the two boards is as follows:
[Figure: fiber connection between the two boards]

Static demonstration

HDMI input: with UltraScale GTY running at a 5G line rate, the output is as follows:
[Figure: output with HDMI input]
Dynamic color bar input: with UltraScale GTY running at a 5G line rate, the output is as follows:
[Figure: output with dynamic color bar input]

Dynamic presentation

A short video of dynamic color bar output was recorded. The output dynamic demonstration is as follows:

[Video: V7-GTH-COLOR]

8. Benefits: obtaining the project code

The code is too large to send by email, so it is shared through a network disk link.
How to obtain the materials: send a private message, or use the V business card at the end of the article.
The network disk contents are as follows:
[Figure: network disk contents]


Origin blog.csdn.net/qq_41667729/article/details/134947297