FPGA high-end project: UltraScale GTH + SDI video codec, SDI to SFP optical port loopback output, providing 2 sets of engineering source code and technical support


1. Introduction

There are currently two ways to implement SDI video encoding and decoding with Xilinx FPGAs.
The first uses dedicated codec chips, typically the GS2971 receiver and the GS2972 transmitter. The advantage is simplicity: the GS2971 decodes SDI directly into parallel YCbCr, and the GS2972 encodes parallel YCbCr directly into SDI. The disadvantage is the higher cost; you can look up the prices of the GS2971 and GS2972 on Baidu. The second implements the codec in FPGA logic, using the GTP/GTX/GTH/UltraScale GTH transceivers for serialization and deserialization. The advantage is that FPGA resources are put to good use; the disadvantage is that it is harder to get working and demands more FPGA experience. UltraScale GTH is found on Xilinx UltraScale-series FPGAs, including Virtex UltraScale, Kintex UltraScale, and Zynq UltraScale+ devices. Compared with the 7-series GTH, UltraScale GTH offers a higher line rate, supports more protocol types, consumes less power, and provides more bandwidth. Xilinx also provides a dedicated IP core for SDI video, SMPTE UHD-SDI, which supports 3G-SDI, 6G-SDI, 12G-SDI, and other rates;

This article uses a Xilinx Zynq UltraScale+ MPSoC, model xczu7ev-ffvc1156-2-i, to implement UltraScale GTH + SDI video encoding and decoding. The video source is a standard 3G-SDI camera. The development board carries an LMH0384 chip, which equalizes (EQ) the incoming SDI signal and can also be thought of as converting it from single-ended to differential. The official Xilinx UltraScale GTH IP core then deserializes the SDI video (and serializes it on the output side); the IP is configured in GTH-3G-SDI mode, which is dedicated to the SDI video protocol. Next, the official Xilinx SMPTE UHD-SDI IP core decodes and encodes the SDI video. This IP supports 3G-SDI, 6G-SDI, 12G-SDI, and other rates; this design configures it for 3G-SDI. At this point the receive path has turned the single-ended video carried on coaxial cable into parallel video data, which can serve as an input source for image processing such as buffering, color conversion, and scaling.

Because this design loops the video through the SFP optical port, a VGA timing recovery module restores the valid SDI video data to standard VGA timing: horizontal sync (hs), vertical sync (vs), data valid (de), and pixel data (data). The video then goes to the GTH transmit packing module, which packs it and adds frame-header, frame-tail, and other control identifiers based on the K character BC. A second UltraScale GTH IP core, configured in Aurora 8b/10b mode with a 5 Gb/s line rate, performs the 8b/10b encoding. The encoded video is looped back through the onboard SFP optical port and received again by the board, where UltraScale GTH performs the 8b/10b decoding. The data then passes through the GTH receive data alignment module and on to the GTH receive unpacking module, which strips the frame header and frame tail and restores the video timing. This completes the 8b/10b loopback of the video data over the SFP optical port.

My usual FDMA image buffering architecture then writes the video to DDR4 and reads it back. Three frames are buffered in DDR4; if the latency is too high, you can buffer 2 frames instead. The video is read out of DDR4 with VGA timing, so an SDI timing generation module converts it back to SDI video. The buffered video is sent to SMPTE UHD-SDI for SDI encoding and then to UltraScale GTH for serialization. This is the reverse of the receive path and turns the video back into high-speed differential data. The board also carries an LMH0302SQ chip, which acts as a cable driver for the high-speed differential SDI signal and can be thought of as converting differential to single-ended. Finally, I connect the SDI output to an SDI-to-HDMI converter box and view the result on a monitor;
Attention! !
This project and solution apply only to Xilinx UltraScale and UltraScale+ FPGA devices, because they rely on the UltraScale GTH IP core. Other FPGA families do not have UltraScale GTH: Xilinx's A7, K7, V7, and Zynq-7000 series cannot be used, let alone FPGAs from other vendors. Either DDR4 or DDR3 can be used for the frame buffer. The project supports two SFP optical port loopback methods: loopback through a single onboard optical port, or loopback between two onboard optical ports. The method is selected by a define macro in the code, and the first method is the default. The project uses two UltraScale GTH IP cores for different purposes: one is configured in GTH-3G-SDI mode for the SDI path, and the other in Aurora 8b/10b mode for 8b/10b encoding and decoding. The project therefore makes extensive use of UltraScale GTH and is an advanced, practical FPGA project;

Two sets of Vivado 2022.2 FPGA project source code are provided. They differ in the number of SDI cameras: the first project loops back 1 SDI camera, and the second loops back 4 SDI cameras. Details are as follows:

Vivado project 1: 1 SDI input, three-frame buffering in DDR4, loopback, then 1 SDI output;
Vivado project 2: 4 SDI inputs, three-frame buffering in DDR4, loopback, then 4 SDI outputs;

This blog describes in detail how to implement UltraScale GTH + SDI video encoding and decoding on the Xilinx Zynq UltraScale+ MPSoC xczu7ev-ffvc1156-2-i. The project code synthesizes, implements, and runs on the board, and can be ported directly to other projects. It is suitable for students working on coursework or graduation projects and for working engineers who want to build their skills, and it can be applied to high-speed interface or image processing work in the medical, military, and other industries;
Complete, working project source code and technical support are provided;
How to obtain the source code and technical support is described at the end of the article; please read through to the end;

Disclaimer

This project and its source code include parts written by me and parts obtained from public sources on the Internet (including CSDN, the Xilinx website, the Altera website, and others). If you feel your rights have been infringed, please send me a private message. Accordingly, this project and its source code are for readers' personal study and research only and must not be used commercially. The blog and the blogger accept no responsibility for legal issues arising from commercial use by readers, so please use the material with care.

2. Recommendation of relevant solutions

My existing GT high-speed interface solutions

My homepage has an FPGA GT high-speed interface column with video transmission and PCIe transmission examples for the GTP, GTX, GTH, and GTY transceivers. The GTP examples are built on A7-series development boards, the GTX examples on K7 or Zynq boards, the GTH examples on KU or V7 boards, and the GTY examples on KU+ boards. The column address is:
Click to go directly

My existing SDI codec solutions

My blog homepage has an SDI video column with FPGA SDI codec project source code and accompanying blog posts. It covers SDI codecs based on the GS2971/GS2972 as well as SDI codecs based on GTP/GTX resources. Column address: Click to go directly

3. Detailed design plan

This article uses a Xilinx Zynq UltraScale+ MPSoC, model xczu7ev-ffvc1156-2-i, to implement UltraScale GTH + SDI video encoding and decoding. The video source is a standard 3G-SDI camera. The development board carries an LMH0384 chip, which equalizes (EQ) the incoming SDI signal and can also be thought of as converting it from single-ended to differential. The official Xilinx UltraScale GTH IP core then deserializes the SDI video (and serializes it on the output side); the IP is configured in GTH-3G-SDI mode, which is dedicated to the SDI video protocol. Next, the official Xilinx SMPTE UHD-SDI IP core decodes and encodes the SDI video. This IP supports 3G-SDI, 6G-SDI, 12G-SDI, and other rates; this design configures it for 3G-SDI. At this point the receive path has turned the single-ended video carried on coaxial cable into parallel video data, which can serve as an input source for image processing such as buffering, color conversion, and scaling.

Because this design loops the video through the SFP optical port, a VGA timing recovery module restores the valid SDI video data to standard VGA timing: horizontal sync (hs), vertical sync (vs), data valid (de), and pixel data (data). The video then goes to the GTH transmit packing module, which packs it and adds frame-header, frame-tail, and other control identifiers based on the K character BC. A second UltraScale GTH IP core, configured in Aurora 8b/10b mode with a 5 Gb/s line rate, performs the 8b/10b encoding. The encoded video is looped back through the onboard SFP optical port and received again by the board, where UltraScale GTH performs the 8b/10b decoding. The data then passes through the GTH receive data alignment module and on to the GTH receive unpacking module, which strips the frame header and frame tail and restores the video timing. This completes the 8b/10b loopback of the video data over the SFP optical port.

My usual FDMA image buffering architecture then writes the video to DDR4 and reads it back. Three frames are buffered in DDR4; if the latency is too high, you can buffer 2 frames instead. The video is read out of DDR4 with VGA timing, so an SDI timing generation module converts it back to SDI video. The buffered video is sent to SMPTE UHD-SDI for SDI encoding and then to UltraScale GTH for serialization. This is the reverse of the receive path and turns the video back into high-speed differential data. The board also carries an LMH0302SQ chip, which acts as a cable driver for the high-speed differential SDI signal and can be thought of as converting differential to single-ended. Finally, I connect the SDI output to an SDI-to-HDMI converter box and view the result on a monitor;

Two sets of Vivado 2022.2 FPGA project source code are provided. They differ in the number of SDI cameras: the first project loops back 1 SDI camera, and the second loops back 4 SDI cameras. Details are as follows:

Vivado project 1: 1 SDI input, three-frame buffering in DDR4, loopback, then 1 SDI output;
Vivado project 2: 4 SDI inputs, three-frame buffering in DDR4, loopback, then 4 SDI outputs;

Design block diagram

This design refers to Xilinx official design documents. The official reference design block diagram is as follows:
Insert image description here
The detailed design plan block diagram of this project is as follows:
Insert image description here
Block diagram explanation: the arrows show the direction of data flow, the text inside each arrow gives the data format, and the numbers outside the arrows give the order of the steps;

3G-SDI camera

This is roughly the camera:
Insert image description here
Resolution: 1920x1080@60Hz;
Video format: YUV422;
Data rate: 2.97Gbps;
Output mode: BNC connector, coaxial cable output;

LMH0384 equalizer (EQ)

The development board carries an LMH0384 chip, which equalizes (EQ) the incoming SDI signal and can also be thought of as converting it from single-ended to differential. The relevant part of the schematic is as follows:
Insert image description here

SDI mode application of UltraScale GTH

The most detailed description of UltraScale GTH is Xilinx's official "ug576-ultrascale-gth-transceivers"; this section walks through the key points.
I have included the "ug576-ultrascale-gth-transceivers" PDF in the resource package; see the end of the article for how to obtain it;
The FPGA on the development board I used is a Kintex UltraScale xcku060-ffva1156-2-i. The UltraScale GTH line rate ranges from 500 Mb/s to 16.375 Gb/s, roughly 3 Gb/s higher than the 7-series GTH. The UltraScale GTH transceiver supports many serial interfaces and protocols, such as PCIe 1.1/2.0, 10G Ethernet XAUI, OC-48, Serial RapidIO, SATA (Serial ATA), and the serial digital interface (SDI);

This project uses two UltraScale GTH IP cores for different purposes: one is configured in GTH-3G-SDI mode for the SDI path, and the other in Aurora 8b/10b mode for 8b/10b encoding and decoding. This section covers the GTH-3G-SDI mode, which is dedicated to deserializing and serializing the SDI video protocol. The code location is as follows:
Insert image description here
The basic UltraScale GTH configuration is: onboard 148.5 MHz differential oscillator as the reference clock, 2.97 Gb/s line rate, and protocol type GTH-3G-SDI;
Insert image description here
Other options are also set, such as enabling the DRP dynamic reconfiguration interface; refer to the project for details;

Aurora 8b/10b codec mode application for UltraScale GTH

This project uses two UltraScale GTH IP cores for different purposes: one is configured in GTH-3G-SDI mode for the SDI path, and the other in Aurora 8b/10b mode for 8b/10b encoding and decoding. This section covers the Aurora 8b/10b mode, which is dedicated to deserializing and serializing the Aurora 8b/10b protocol. The code location is as follows:
Insert image description here
The basic UltraScale GTH configuration is: onboard 156.25 MHz differential oscillator as the reference clock, 5 Gb/s line rate, and protocol type Aurora 8b/10b;
Insert image description here
Other configuration options are also set; refer to the project for details;
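As a rough sanity check on the two line rates, the sketch below compares the parallel clock and payload capacity of each link with the video it has to carry. It assumes a 1080p60 pixel clock of 148.5 MHz and 16 bits per pixel, ignores the overhead of the packing commands, and assumes a 40-bit internal width for the Aurora link; the function names are illustrative and do not come from the project.

```python
def parallel_clock_mhz(line_rate_gbps, bits_per_clock):
    """Parallel-side clock for a serial link moving bits_per_clock bits each cycle."""
    return line_rate_gbps * 1000.0 / bits_per_clock


def payload_rate_gbps(line_rate_gbps):
    """Usable payload of an 8b/10b link: 8 data bits for every 10 line bits."""
    return line_rate_gbps * 8.0 / 10.0


def video_rate_gbps(pixel_clock_mhz, bits_per_pixel):
    """Raw video bandwidth, blanking included, at the given pixel clock."""
    return pixel_clock_mhz * bits_per_pixel / 1000.0


if __name__ == "__main__":
    # 3G-SDI link: 2.97 Gb/s with 20 bits per clock
    print("SDI parallel clock (MHz):", parallel_clock_mhz(2.97, 20))
    # Aurora link: 5 Gb/s, assumed 40 encoded bits (32 data bits) per clock
    print("Aurora parallel clock (MHz):", parallel_clock_mhz(5.0, 40))
    print("Aurora payload (Gb/s):", payload_rate_gbps(5.0))
    print("Video bandwidth (Gb/s):", video_rate_gbps(148.5, 16))
```

Under these assumptions the 5 Gb/s Aurora link has headroom over the video bandwidth, which leaves room for the command words added during packing.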

UltraScale GTH basic structure

Xilinx groups its high-speed serial transceivers into Quads. Four transceivers and one COMMON block (QPLL) form a Quad, and each transceiver is called a Channel. The figure below shows the UltraScale GTH transceivers in a Kintex UltraScale FPGA ("ug576-ultrascale-gth-transceivers", page 19);
Insert image description here
In UltraScale/UltraScale+ FPGAs, the GTH transceivers are organized in Quads. A Quad consists of four GTHE3/4_CHANNEL primitives and one GTHE3/4_COMMON primitive. Each GTHE3/4_COMMON contains two LC-tank PLLs (QPLL0 and QPLL1) and only needs to be instantiated when the application uses a QPLL. Each GTHE3/4_CHANNEL consists of a channel PLL (CPLL), a transmitter, and a receiver. A reference clock can be connected directly to a GTHE3/4_CHANNEL primitive without instantiating GTHE3/4_COMMON;

The transmitter and receiver of an UltraScale GTH transceiver are independent of each other. Each consists of a Physical Medium Attachment (PMA) sublayer and a Physical Coding Sublayer (PCS). The PMA integrates the serializer/deserializer (PISO/SIPO), pre-emphasis, receive equalization, the clock generator, and clock recovery. The PCS integrates 8b/10b encoding and decoding, the elastic buffer, channel bonding, and clock correction. The logic of each GTHE3/4_CHANNEL primitive is shown in the figure below ("ug576-ultrascale-gth-transceivers", page 20);
Insert image description here
There is little point in going deeper here: without several large projects behind you, the internals are hard to appreciate. First-time users, or readers who want to get going quickly, should focus on instantiating and using the IP core, which is also what the rest of this article concentrates on;

Reference clock selection and distribution

GTH transceivers in UltraScale devices offer several reference clock input options, and the reference clock selection architecture supports QPLL0, QPLL1, and CPLL. Architecturally, each Quad contains four GTHE3/4_CHANNEL primitives, one GTHE3/4_COMMON primitive, two dedicated external reference clock pin pairs, and dedicated reference clock routing. If a high-performance QPLL is used, GTHE3/4_COMMON must be instantiated. The figure below (page 33 of "ug576-ultrascale-gth-transceivers") shows the GTHE3/4_COMMON clock multiplexer in detail. A Quad can choose among 6 reference clock pin pairs: the two local pairs, GTREFCLK0 and GTREFCLK1; two pairs from the Quads above, GTSOUTHREFCLK0 and GTSOUTHREFCLK1; and two pairs from the Quads below, GTNORTHREFCLK0 and GTNORTHREFCLK1.
Insert image description here
As the figure above shows, GTHE3/4_COMMON acts as a reference clock selector that picks the transceiver reference clock from several sources; it can select among 7 reference clock sources. In general, the best-performing sources are the Quad's own local reference clocks, GTREFCLK00 and GTREFCLK10. In high-definition video, the United States, Canada, and other NTSC regions use base clocks of 148.35 MHz and 74.176 MHz, while China, Germany, and other PAL regions use reference clocks of 148.5 MHz and 74.25 MHz. For high-definition video the only difference between the two is the base clock; the video timing is identical. This is why devices commonly show two frame rates, such as 60 Hz and 59.94 Hz. The GTH reference clock in this design is a differential 148.5 MHz clock. The schematic is as follows:
Insert image description here
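The relationship between the two clock families and the two frame rates can be checked with a few lines of arithmetic. This is a minimal sketch that assumes the standard 1080p raster of 2200 x 1125 total samples per frame (blanking included) and the usual 1/1.001 ratio between integer and fractional rates; the constant and function names are illustrative.

```python
TOTAL_PIXELS_PER_LINE = 2200   # 1920 active + blanking (1080p raster, assumed)
TOTAL_LINES_PER_FRAME = 1125   # 1080 active + blanking (1080p raster, assumed)

INTEGER_CLOCK_HZ = 148.5e6                        # clock used in this design
FRACTIONAL_CLOCK_HZ = INTEGER_CLOCK_HZ / 1.001    # about 148.35 MHz


def frame_rate_hz(pixel_clock_hz):
    """Frame rate obtained when one full raster is clocked at pixel_clock_hz."""
    return pixel_clock_hz / (TOTAL_PIXELS_PER_LINE * TOTAL_LINES_PER_FRAME)


if __name__ == "__main__":
    print("integer:    %.3f MHz -> %.2f Hz"
          % (INTEGER_CLOCK_HZ / 1e6, frame_rate_hz(INTEGER_CLOCK_HZ)))
    print("fractional: %.3f MHz -> %.2f Hz"
          % (FRACTIONAL_CLOCK_HZ / 1e6, frame_rate_hz(FRACTIONAL_CLOCK_HZ)))
```

The same timing counters therefore produce 60 Hz or 59.94 Hz depending only on which base clock drives them.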

UltraScale GTH send and receive processing flow

On the transmit side, user data first passes through the PCS encoding stage (8b/10b in the Aurora configuration) and then enters the transmit buffer (phase adjust FIFO). This buffer isolates the PMA and PCS clock domains and absorbs rate and phase differences between them. The high-speed SerDes then performs parallel-to-serial conversion (PISO), with optional pre-emphasis (TX pre-emphasis) and post-emphasis. If the TXP and TXN differential pins were accidentally swapped in the PCB design, the polarity control can compensate for the error. The receive path mirrors the transmit path and has much in common with it, so I will not go through it in detail. The part to note is the RX elastic buffer, which provides clock correction and channel bonding. Each of these features could fill a paper or even a book, so it is enough to understand the concept and apply it in a concrete project. Once again: first-time users, or readers who want to get going quickly, should focus on instantiating and using the IP core.

UltraScale GTH send interface

Pages 104 to 179 of "ug576-ultrascale-gth-transceivers" describe the transmit path in detail. Users do not need to study most of it, because the manual mainly explains Xilinx's own design and leaves few interfaces for the user to operate. With that in mind, we focus on the transmit-side interfaces needed when instantiating UltraScale GTH;
Insert image description here
Users only need to care about the clock and data of the transmit interface. The corresponding ports of the UltraScale GTH instantiation module are shown below; the file is the instantiation template generated by the tool after the GTH IP is configured;
Insert image description here

UltraScale GTH receiving interface

Pages 181 to 314 of "ug576-ultrascale-gth-transceivers" describe the receive path in detail. Users do not need to study most of it, because the manual mainly explains Xilinx's own design and leaves few interfaces for the user to operate. With that in mind, we focus on the receive-side interfaces needed when instantiating UltraScale GTH;
Insert image description here
Users only need to care about the clock and data of the receive interface. The corresponding ports of the UltraScale GTH instantiation module are shown below; the file is the instantiation template generated by the tool after the GTH IP is configured;
Insert image description here

UltraScale GTH IP core calling and usage

This project uses two UltraScale GTH IP cores for different purposes: one is configured in GTH-3G-SDI mode for the SDI path, and the other in Aurora 8b/10b mode for 8b/10b encoding and decoding;
Insert image description here
The configuration of both UltraScale GTH IP cores, the one in GTH-3G-SDI mode and the one in Aurora 8b/10b mode, is shown below. To support the three modes SD-SDI, HD-SDI, and 3G-SDI, the GTH-3G-SDI core must change its rate and be reset at run time, so its DRP interface needs to be enabled. For more GTH configuration details, refer to the Vivado project;
The basic configuration of the GTH-3G-SDI core is: onboard 148.5 MHz differential oscillator, 2.97 Gb/s line rate, and protocol type GTH-3G-SDI;
Insert image description here

Insert image description here

Insert image description here

UltraScale GTH Control Instructions

This module applies only to the UltraScale GTH core in GTH-3G-SDI mode. To support the three modes SD-SDI, HD-SDI, and 3G-SDI, the GTH must change its rate and be reset, which is done mainly through the DRP interface. The official Xilinx reference code is used for this purpose. The UltraScale GTH control code is located as follows:
Insert image description here
The UltraScale GTH control module provides the following functions:
1. Control of the GTH transceiver reset logic;
2. Dynamic switching of the RX and TX serial dividers to support the SD-SDI, HD-SDI, and 3G-SDI modes;
3. Dynamic switching of the TX reference clock to support the two bit rates of the HD-SDI and 3G-SDI standards: 1.485 Gb/s and 1.485/1.001 Gb/s in HD-SDI mode, and 2.97 Gb/s and 2.97/1.001 Gb/s in 3G-SDI mode;
4. A data recovery unit (DRU) for recovering data in SD-SDI mode;
5. RX bit rate detection, to determine whether the receiver is getting an integer or a fractional frame rate signal.
Please refer to the code for details;
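The bit rates the control module has to switch between can be tabulated in a few lines. This is only an illustrative model: the 270 Mb/s SD-SDI rate is the standard value rather than something stated above, and the function names are hypothetical and unrelated to the Xilinx reference code.

```python
# Nominal (integer frame rate) bit rates in Mb/s for each SDI mode
NOMINAL_RATE_MBPS = {"SD-SDI": 270.0, "HD-SDI": 1485.0, "3G-SDI": 2970.0}


def tx_bit_rate_mbps(mode, fractional=False):
    """Bit rate for a mode; HD and 3G also exist in a /1.001 fractional variant."""
    rate = NOMINAL_RATE_MBPS[mode]
    if fractional:
        if mode == "SD-SDI":
            raise ValueError("SD-SDI has no fractional rate")
        rate /= 1.001
    return rate


def sd_oversampling_factor(line_rate_mbps=2970.0):
    """SD-SDI is received by oversampling while the transceiver stays at a high rate."""
    return round(line_rate_mbps / NOMINAL_RATE_MBPS["SD-SDI"])


if __name__ == "__main__":
    for mode in NOMINAL_RATE_MBPS:
        print(mode, tx_bit_rate_mbps(mode))
    print("HD-SDI fractional:", round(tx_bit_rate_mbps("HD-SDI", True), 3))
    print("SD oversampling factor:", sd_oversampling_factor())
```

The factor of 11 is why SD-SDI needs a data recovery unit: the transceiver keeps running at 2.97 Gb/s and the DRU picks the real bits out of the oversampled stream.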

Detailed explanation of SMPTE UHD-SDI

The official Xilinx SMPTE UHD-SDI IP core implements SDI video decoding and encoding. This IP supports 3G-SDI, 6G-SDI, 12G-SDI, and other rates; this design configures it for 3G-SDI. According to the official manual, the SMPTE UHD-SDI transmit and receive architecture is as follows:
Insert image description here

SMPTE UHD-SDI receive

The block diagram of the SMPTE UHD-SDI receiver is as follows:
Insert image description here
Data from the serial transceiver RX enters the SMPTE UHD-SDI receiver through the rx_data_in port: 20 bits per clock cycle in SD, HD, and 3G modes, and 40 bits per clock cycle in 6G and 12G modes. In SD mode, the 20-bit data on rx_data_in goes to the DRU (data recovery unit), which recovers 10-bit data from the 11x oversampled input. The data is descrambled by the SDI descrambler and then word-aligned by the SDI framer. Next comes sync bit restoration, which restores the 3FF and 000 values that the transmitter modified to reduce run length in 6G-SDI and 12G-SDI modes. These three modules run at the full rx_clk rate and process 40, 20, or 10 bits per clock cycle depending on the SDI mode. The data then enters a stream demux, which determines how many streams are interleaved and separates each onto its own data path, supporting up to 16 streams. Each data stream enters a processing unit that performs CRC error checking, line number capture, and ST 352 packet capture. Video timing is also extracted after the stream demux to generate the rx_eav, rx_sav, and rx_trs timing signals, which are used by the SDI mode detection and transport detection modules.

SMPTE UHD-SDI send

The block diagram of the SMPTE UHD-SDI transmitter is as follows:
Insert image description here
SMPTE UHD-SDI supports up to 16 SDI data streams. The streams first pass through the ST 352 insertion module, which can optionally insert ST 352 payload ID packets. The streams output by this module are named tx_ds1_st352_out to tx_ds16_st352_out; exposing them lets users insert ancillary data after the ST 352 packets. The rest of the transmitter can either use the streams from the ST 352 insertion module directly or use the 16 streams tx_ds1_anc_in to tx_ds16_anc_in. Note that if the tx_dsn_anc_in streams are used, they must be complete SDI streams, not just ancillary data. Normally, ST 352 packets are inserted only into the Y stream of each Y/C stream pair. In 3G-SDI level A mode only, ST 352 packets must be inserted into both data stream 1 and data stream 2. Each Y/C stream pair then passes through a stream processing module that inserts line numbers and generates and inserts CRCs. After stream processing, the streams are interleaved by a mux into a 40-, 20-, or 10-bit wide multiplexed SDI stream, which is then scrambled by the SDI scrambler. Finally, the data is output on the tx_txdata port to the serial transceiver.

SMPTE UHD-SDI IP core calling and usage

The SMPTE UHD-SDI configuration interface is very simple. This design configures it for 3G-SDI mode, as follows:
Insert image description here
For how SMPTE UHD-SDI is used, please refer to the project code; there are too many interfaces to list here;

VGA timing recovery

Because this design buffers and loops back the video, a VGA timing recovery module restores the valid SDI video data to standard VGA timing: horizontal sync (hs), vertical sync (vs), data valid (de), and pixel data (data). The code location is as follows:
Insert image description here
The top-level interface of the VGA timing recovery module is as follows:
Insert image description here
The module's input comes from the output of the SMPTE UHD-SDI IP core, which provides the rx_eav, rx_sav, and rx_trs signals that carry the sync word information. Their timing relationship is as follows:
Insert image description here
As the figure above shows, rx_trs is asserted during the four sync words, and rx_eav goes high for one clock cycle on XYZ (the fourth word of the embedded sync sequence) to mark the end of active pixels. rx_sav is not shown in the figure, but it likewise goes high for one clock cycle on XYZ to mark the start of active pixels. The embedded_synchronous module extracts the image area from these signals. As for the data, the SDI signal uses 20 bits per pixel: 10 bits of luma (Y) and 10 bits of chroma (C). On the receive side these correspond to rx_ds1a and rx_ds2a, and the two 10-bit values are concatenated into one 20-bit word. Since the timing signals are already available, there is no need to search the data port for the sync sequence. It is enough to sample the control word when rx_trs arrives and decode the H, V, and F flags from it. These flags yield the active pixel region de_p, and with these signals the image data can be handled in the familiar hsync/vsync manner.
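The H, V, and F flags sit at fixed bit positions in the 10-bit XYZ word, so decoding them is a matter of bit slicing. The sketch below follows the standard SDI XYZ layout (bit 9 fixed at 1, then F, V, H, four protection bits, and two zero bits). It is a software illustration only; whether the project's module also checks the protection bits is an assumption, and the function names are hypothetical.

```python
def decode_xyz(word):
    """Return (F, V, H) from a 10-bit XYZ word; raise ValueError if malformed."""
    if word >> 9 != 1 or word & 0x3:
        raise ValueError("not an XYZ word")
    f = (word >> 8) & 1   # field flag
    v = (word >> 7) & 1   # 1 during vertical blanking
    h = (word >> 6) & 1   # 1 for EAV, 0 for SAV
    # Protection bits P3..P0 occupy bits 5..2
    expected = (((v ^ h) << 5) | ((f ^ h) << 4)
                | ((f ^ v) << 3) | ((f ^ v ^ h) << 2))
    if word & 0x3C != expected:
        raise ValueError("protection bits do not match")
    return f, v, h


def starts_active_video(word):
    """True when the XYZ word is an SAV on an active (non-blanking) line."""
    _, v, h = decode_xyz(word)
    return v == 0 and h == 0


if __name__ == "__main__":
    for xyz in (0x200, 0x274, 0x2AC, 0x2D8):
        print(hex(xyz), decode_xyz(xyz), starts_active_video(xyz))
```

A data valid signal in the style of de_p can then be raised after an SAV for which `starts_active_video` is true and dropped at the next EAV.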

GTH transmit data packing

This module applies only to the UltraScale GTH core in Aurora 8b/10b mode. Because the video is sent and received through GTH using the Aurora 8b/10b protocol, the data must be packed to suit that protocol. The code location of the video packing module is as follows:
Insert image description here
The 16-bit video is first written into a FIFO. Once a full line has been stored, it is read from the FIFO and sent to the GTH for transmission. Before that, each frame is tagged with control words, referred to here as commands. When packing, data is sent according to these fixed commands; when unpacking, the vertical sync and data valid signals are restored from them. On the rising edge of the vertical sync signal, frame start command 0 is sent; on its falling edge, frame start command 1 is sent. During video blanking, invalid data 0 and invalid data 1 are sent. When the data valid signal arrives, each line is numbered: a line start command is sent first, followed by the current line number, and a line end command is sent when the line is finished. After a full frame has been sent, frame end command 0 is sent, followed by frame end command 1, which completes the frame. This module is not easy to follow, so I added detailed Chinese comments to the code. To avoid the comments displaying as garbled text, open the code in the Notepad++ editor. The commands are defined as follows:

32'h55_00_00_bc    frame start command 0;
32'h55_00_01_bc    frame start command 1;
32'h55_00_02_bc    invalid data 0;
32'h55_00_03_bc    invalid data 1;
32'h55_00_04_bc    line start command;
32'h55_00_05_bc    line end command;
32'h55_00_06_bc    frame end command 0;
32'h55_00_07_bc    frame end command 1;

Part of the code screenshot is as follows:
Insert image description here
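The command values can be changed freely, but the lowest byte must remain bc (0xBC is the K28.5 comma character in 8b/10b coding), because the receiver uses it for data alignment;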
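The packing order described in this section can be sketched as a behavioural model. This is a Python illustration only, not the project's Verilog. The command values are the ones listed above; everything else is an assumption made for the sketch: the function name, the `ctrl` flag name, two 16-bit pixels per 32-bit word with the first pixel in the low half, and one pair of invalid-data words per blanking interval.

```python
FRAME_START_0 = 0x550000BC
FRAME_START_1 = 0x550001BC
INVALID_0 = 0x550002BC
INVALID_1 = 0x550003BC
LINE_START = 0x550004BC
LINE_END = 0x550005BC
FRAME_END_0 = 0x550006BC
FRAME_END_1 = 0x550007BC

K_LOW_BYTE = 0b0001  # lowest byte is a K character (the 0xBC comma)
DATA_WORD = 0b0000   # plain data, no K character


def pack_frame(lines, blank_pairs=1):
    """Return a list of (word32, ctrl) tuples for one frame.

    lines: list of lines, each a list with an even number of 16-bit pixels.
    blank_pairs: hypothetical count of invalid-data pairs sent before each line.
    """
    words = [(FRAME_START_0, K_LOW_BYTE), (FRAME_START_1, K_LOW_BYTE)]
    for line_no, pixels in enumerate(lines):
        # Blanking filler before the line.
        words += [(INVALID_0, K_LOW_BYTE), (INVALID_1, K_LOW_BYTE)] * blank_pairs
        words.append((LINE_START, K_LOW_BYTE))
        words.append((line_no, DATA_WORD))  # current line number
        # Assumed payload layout: two pixels per word, first pixel in bits [15:0].
        for i in range(0, len(pixels), 2):
            words.append(((pixels[i + 1] << 16) | pixels[i], DATA_WORD))
        words.append((LINE_END, K_LOW_BYTE))
    words += [(FRAME_END_0, K_LOW_BYTE), (FRAME_END_1, K_LOW_BYTE)]
    return words
```

Carrying a separate `ctrl` flag next to each word mirrors how 8b/10b links distinguish K characters from payload, so a pixel value that happens to equal a command word is not mistaken for a command.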

GTH receive data alignment

This module is used only by the UltraScale GTH configured in Aurora 8b/10b mode. Data received over the GT's 8b/10b link is not guaranteed to arrive aligned to the 32-bit word boundary, so the decoded receive data has to be realigned. The code location of the data alignment module is as follows:
Insert image description here
The control word format I defined is XX_XX_XX_BC, with the K code (COM character) in the lowest byte, so rx_ctrl is used to indicate which byte of the received data is the K-code COM symbol. The details are as follows:

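rx_ctrl = 4'b0000 means none of the 4 data bytes is a COM character;
rx_ctrl = 4'b0001 means bits [ 7: 0] of the 4 data bytes are the COM character;
rx_ctrl = 4'b0010 means bits [15: 8] of the 4 data bytes are the COM character;
rx_ctrl = 4'b0100 means bits [23:16] of the 4 data bytes are the COM character;
rx_ctrl = 4'b1000 means bits [31:24] of the 4 data bytes are the COM character;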

Based on this, when a K code is received, its position gives the byte offset, and the data is realigned by delaying the received word by one clock and combining it with the newly arriving word. This is a basic FPGA operation and will not be elaborated here;
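The "delay one beat and combine" operation can be sketched as follows. This is a Python behavioural model, not the project's Verilog. It assumes that byte 0 (bits [7:0]) is the first byte on the wire, and that the comma position reported by rx_ctrl stays valid until the next K code; the function name is hypothetical.

```python
# Byte offset of the COM character for each one-hot rx_ctrl value.
COMMA_OFFSET = {0b0001: 0, 0b0010: 1, 0b0100: 2, 0b1000: 3}


def align_words(words):
    """Realign a stream of (data32, rx_ctrl4) so that COM lands in bits [7:0].

    Output is delayed by one word, like a registered datapath; the first
    outputs, before any comma has been seen, are not meaningful.
    """
    shift = 0  # current byte offset of the comma
    prev = 0   # previously received word (the one-beat delay register)
    out = []
    for data, ctrl in words:
        if ctrl in COMMA_OFFSET:
            shift = COMMA_OFFSET[ctrl]
        # Concatenate {current, previous} and take 4 bytes starting at `shift`.
        combined = (data << 32) | prev
        out.append((combined >> (8 * shift)) & 0xFFFFFFFF)
        prev = data
    return out
```

For example, if the link delivers the command 32'h55_00_04_bc shifted up by one byte, rx_ctrl reads 4'b0010, and the model rebuilds the word from the upper three bytes of the delayed word plus the lowest byte of the next one.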

GTH receives data unpacking

This module is used only by the UltraScale GTH configured in Aurora 8b/10b mode. Data unpacking is the reverse of the GTH transmit data packing. The code location is as follows:
Insert image description here
During UltraScale GTH unpacking, the video's field sync signal and video valid signal are restored according to the fixed commands. These signals are essential for the subsequent image cache. This completes the description of the GTH data transmit and receive path. I have described the block diagram of the entire process in the code;
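How the field sync and video valid signals can be rebuilt from the commands is sketched below. This is a Python behavioural model, not the project's Verilog. The command values come from the definitions earlier in this article; the state names, the function name, and the choice to raise vs between frame start command 0 and frame start command 1 (mirroring the rising and falling edge on the transmit side) are assumptions of this sketch.

```python
FRAME_START_0 = 0x550000BC
FRAME_START_1 = 0x550001BC
LINE_START = 0x550004BC
LINE_END = 0x550005BC

K_LOW_BYTE = 0b0001  # rx_ctrl value for an aligned command word


def unpack_stream(words):
    """Turn aligned (word32, rx_ctrl4) pairs into (vs, de, word32) tuples."""
    vs = 0
    state = "idle"  # idle -> line_no -> payload -> idle
    out = []
    for word, ctrl in words:
        de = 0
        if ctrl == K_LOW_BYTE:
            if word == FRAME_START_0:
                vs = 1            # rising edge of field sync
            elif word == FRAME_START_1:
                vs = 0            # falling edge of field sync
            elif word == LINE_START:
                state = "line_no"
            elif word == LINE_END:
                state = "idle"
            # invalid-data and frame-end commands need no action here
        elif state == "line_no":
            state = "payload"     # this word carries the line number
        elif state == "payload":
            de = 1                # payload word: video valid
        out.append((vs, de, word))
    return out
```

Only the words between the line number and the line end command are flagged as valid video, which is what a downstream frame buffer needs in order to write pixels to the correct address.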

Overview of GTH Aurora 8b/10b codec architecture

There are two ways to implement the SFP optical port loopback in this project. One is to loop back through a single optical port on the board; the other is to loop back through two optical ports on the board. The choice is made through a define macro in the code, and the first loopback method is selected by default. This project uses two UltraScale GTH IP cores for different purposes: one is configured in GTH-3G-SDI mode for the SDI codec, and the other is configured in Aurora 8b/10b mode for 8b/10b encoding and decoding. The project therefore makes extensive use of the UltraScale GTH and is an advanced, practical FPGA project. The GTH Aurora 8b/10b codec architecture code is as follows:
Insert image description here
The define macro that selects the SFP optical port loopback method is located in uiAurora_8b10b_vid.v, as follows:
Insert image description here
Insert image description here

image cache

Regular readers of my blog will know that my usual image caching scheme is FDMA. It writes the image into DDR with a 3-frame buffer and then reads it out for display. The purpose is to absorb the clock difference between input and output and to improve the output video quality. For details on FDMA, please refer to my previous blog posts.
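The idea behind a 3-frame buffer can be sketched generically. This is a Python illustration of one common frame-rotation policy, not a description of how FDMA is actually implemented; the class and method names are hypothetical.

```python
class TripleBuffer:
    """Generic frame-slot rotation: the writer cycles through the slots and
    the reader always takes the most recently completed slot, so it never
    reads the slot that is currently being written."""

    def __init__(self, slots=3):
        self.slots = slots
        self.write_idx = 0  # slot currently being written

    def frame_written(self):
        """Call when the writer finishes a frame; move on to the next slot."""
        self.write_idx = (self.write_idx + 1) % self.slots

    def read_idx(self):
        """Slot holding the most recently completed frame."""
        return (self.write_idx - 1) % self.slots
```

Because the read and write slots always differ, the input and output sides can run from different clocks without the display tearing on a half-written frame.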

SDI timing generation

The video read out from DDR4 also uses VGA timing, so the SDI timing generation module is needed to convert the VGA-timing video into SDI-format video. The converted video is then sent to the SMPTE UHD-SDI IP core for SDI video encoding. The code location is as follows:
Insert image description here
The SDI timing generation module outputs data to the SMPTE UHD-SDI IP core. The IP's timing requirements for the input transmit data are as follows:
Insert image description here
In the SDI receiving part above, we directly used the timing reference information provided by the IP core. In the sending part, the SDI timing generation module has to generate the corresponding data stream format itself. The output format is selected through the hd_sdn, is_720p, and select_std signals, which carry the SDI transmit format information; the clock and reset signals are provided by the SDI IP core. The module supports a variety of formats and can be used flexibly. Readers can also look up the specifications of other video formats and add them as needed. The figure above shows the composition of the data stream sent over SDI. The transmit and receive data streams have the same form: both are data streams with embedded synchronization words. In the sending part, the stream with embedded synchronization words has to be generated according to the EAV/SAV data format. Dout is the data stream with the embedded synchronization words added, and it is connected directly to the transmit side of the SDI IP core. The three interfaces vout, data_req, and data of the SDI timing generation module are the video data and control interfaces used to read video data from the DDR cache, that is, from FDMA;
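Generating the embedded synchronization words in the EAV/SAV format can be sketched as follows. This is a Python behavioural model, not the project's timing generator. It assumes the standard SMPTE four-word sequence 3FF, 000, 000, XYZ with 10-bit words, and the function names are hypothetical. Line numbers and CRC words that follow EAV in HD formats are left out of the sketch.

```python
def make_xyz(f, v, h):
    """Build the 10-bit XYZ word from the F, V, H flags.

    Bit 9 is fixed at 1; bits 5..2 are protection bits derived from F, V, H;
    bits 1..0 are 0.
    """
    return ((1 << 9) | (f << 8) | (v << 7) | (h << 6)
            | ((v ^ h) << 5) | ((f ^ h) << 4)
            | ((f ^ v) << 3) | ((f ^ v ^ h) << 2))


def make_trs(f, v, h):
    """Four-word timing reference sequence: h = 1 gives EAV, h = 0 gives SAV."""
    return [0x3FF, 0x000, 0x000, make_xyz(f, v, h)]
```

For a progressive format F stays 0, so each active line uses make_trs(0, 0, 1) as its EAV and make_trs(0, 0, 0) as its SAV, while lines in vertical blanking use V = 1.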

LMH0302SQ enhanced driver

The development board has an onboard LMH0302SQ cable driver chip. The LMH0302SQ boosts the drive strength of the high-speed differential SDI video, which can also be understood as a differential to single-ended conversion. The LMH0302SQ schematic is as follows:
Insert image description here

video output

After the previous steps, the SDI input video has been decoded and then re-encoded, and at this point it is once again a high-speed serial SDI signal. An SDI to HDMI box converts the output SDI video to HDMI so that it can be shown on a monitor. SDI to HDMI boxes can be bought online for about one to two hundred dollars, and look like this:
Insert image description here

4. vivado project 1: 1-channel SDI video codec

Development board FPGA model: Xilinx–Zynq UltraScale+MPSoCs–xczu7ev-ffvc1156-2-i;
Development environment: Vivado2022.2;
Input: 1 channel 3G-SDI camera, resolution 1920x1080@60Hz;
Output: SDI; resolution 1920x1080@60Hz;
Application: UltraScale GTH + SDI video codec, SDI to SFP optical port loopback output;
Block Design is as follows:
Insert image description here
Project code structure is as follows:
Insert image description here
After synthesis and implementation, the estimated FPGA resource consumption and power consumption are as follows:
Insert image description here

5. vivado project 2: 4-channel SDI video codec

Development board FPGA model: Xilinx–Zynq UltraScale+MPSoCs–xczu7ev-ffvc1156-2-i;
Development environment: Vivado2022.2;
Input: 4-channel 3G-SDI camera, resolution 1920x1080@60Hz;
Output: 4-channel SDI; resolution 1920x1080@60Hz;
Application: UltraScale GTH + SDI video codec, SDI to SFP optical port loopback output;
Block Design is as follows:
Insert image description here
Project code structure is as follows:
Insert image description here
After synthesis and implementation, the estimated FPGA resource consumption and power consumption are as follows:
Insert image description here

6. Project transplantation instructions

Vivado version inconsistency handling

1: If your vivado version is consistent with the vivado version of this project, open the project directly;
2: If your vivado version is lower than the vivado version of this project, open the project and then click File -> Save As. However, this method is not safe. The safest way is to upgrade your vivado to the version of this project or a higher version;
Insert image description here
3: If your vivado version is higher than the vivado version of this project, the solution is as follows:
Insert image description here
After opening the project, you will find that the IPs are locked, as follows:
Insert image description here
In this case the IPs need to be upgraded, as follows:
Insert image description here
Insert image description here

FPGA model inconsistency handling

If your FPGA model is inconsistent with mine, you need to change the FPGA model. The operation is as follows:
Insert image description here
Insert image description here
Insert image description here
After changing the FPGA model, you also need to upgrade the IPs. The method for upgrading the IPs has been described above;

Other things to note

1: Since the DDR differs from board to board, the MIG IP needs to be configured according to your own schematic. You can even delete the MIG from my original project, add the IP again, and reconfigure it;
2: Modify the pin constraints according to your own schematic, in the xdc file;
3: When porting from a pure FPGA to Zynq, the Zynq processing system core needs to be added to the project;

7. Board debugging and verification

Preparation

FPGA development board;
3G-SDI camera;
BNC to SMA coaxial cable;
SDI to HDMI box;
Monitor that supports 1080P;

Output demo

The output demonstration is as follows:
Insert image description here

8. Benefits: Obtaining engineering codes

The code is too large to be sent by email, so it is shared through a network disk link.
How to obtain the materials: send me a private message, or use the WeChat business card at the end of the article.
The network disk information is as follows:
Insert image description here


Origin blog.csdn.net/qq_41667729/article/details/134918026