FPGA Multi-Channel Video Processing: Image Scaling + Spliced Video Display, HDMI Capture, with 2 Sets of Project Source Code and Technical Support



1. Preface

"Even if I've never played with image scaling and video splicing, I'd be embarrassed to say I've played with FPGAs." A CSDN expert said this, and I firmly believe it...

This article presents an image-scaling plus multi-channel video-splicing solution on a Xilinx Kintex7 FPGA. There are two video sources, corresponding to whether the developer has a camera. One uses the onboard HDMI input interface (a laptop simulates the HDMI input): the IT6802 decoder chip decodes the TMDS differential HDMI video into 24-bit RGB video data for the FPGA. If your FPGA development board has no HDMI input interface, you can use the dynamic color bar generated internally by the code to simulate the camera video. The video source is selected through a `define macro at the top level of the code; after power-on, the HDMI input is the default video source.

Two sets of Vivado 2019.1 project source code are provided. The two projects differ in the resolution after image scaling and in the way the videos are spliced. Project 1 reduces the input 1920x1080 video to 960x1080 in the image scaling module, copies the video into two streams to simulate two video inputs, and splices the two streams into a 2-split-screen display on a 1920x1080 output. Project 2 reduces the input 1920x1080 video to 960x540 in the image scaling module, copies the video into four streams to simulate 4-channel video input, and splices the four streams into a 4-split-screen display on a 1920x1080 output. The scaled video is cached with my commonly used FDMA routine, with DDR3 as the cache medium; the video is then read out, a standard 1920x1080 VGA timing is generated, and a pure-verilog RGB-to-HDMI module outputs the video to the monitor;

This blog describes in detail the design of FPGA image scaling with multi-channel video splicing. The project code compiles, runs on the board, and can be transplanted directly into your own projects. It is suitable for students working on graduation projects, for working engineers looking to learn and improve, and for high-speed interface and image processing applications in fields such as the medical and military industries;
complete, working project source code and technical support are provided;
the way to obtain the source code and technical support is given at the end of the article, so please be patient and read to the end;

Version update instructions

This is the second version. Based on readers' suggestions, the following improvements and updates were made to the first version of the project:
1: Added an internally generated color-bar video source option. Some readers reported that their FPGA development board has no HDMI input interface, which made porting very difficult. The color bar is generated inside the FPGA and can be used without an external camera; its usage is explained later;
2: Optimized FDMA. The previous FDMA used an AXI4 read/write burst length of 256, which caused insufficient bandwidth on low-end FPGAs and therefore poor image quality. The AXI4 read/write burst length in FDMA has been changed to 128;
3: Optimized the HDMI output module. It previously used a custom IP that some readers could not update; although it worked normally, viewing its source code was inconvenient. The HDMI output module has been changed to a pure verilog implementation, which is more straightforward;

Disclaimer

This project and its source code include parts written by myself and parts collected from public channels on the Internet (including CSDN, the Xilinx website, the Altera website, etc.). If you feel offended, please send a private message. Accordingly, this project and its source code are limited to personal study and research by readers and fans; commercial use is prohibited. If legal issues arise from commercial use by readers or fans, this blog and the blogger bear no responsibility, so please use it with caution...

2. Recommendation of relevant solutions

This project integrates image scaling and video splicing. Before this, I published a separate FPGA image scaling solution and separate FPGA video splicing solutions, recommended as follows:

FPGA image scaling solution recommendation

This solution implements image scaling of arbitrary size in pure verilog code. For details, please refer to my previous blog, linked below:
Click directly to go

FPGA video splicing solution recommendation

This solution implements multi-channel video splicing in pure verilog code. For details, please refer to my previous blogs:
4-channel video splicing solution: click directly to go
8-channel video splicing solution: click directly to go
16-channel video splicing solution: click directly to go

3. Design idea and framework

There are two video sources, corresponding to whether the developer has a camera. One uses the onboard HDMI input interface (a laptop simulates the HDMI input): the IT6802 decoder chip decodes the TMDS differential HDMI video into 24-bit RGB video data for the FPGA. If your FPGA development board has no HDMI input interface, you can use the dynamic color bar generated internally in the code to simulate the camera video. The video source is selected through a `define macro at the top level of the code; HDMI input is the default source after power-on. Two sets of Vivado 2019.1 project source code are provided, differing in the resolution after image scaling and in the splicing method. Project 1 reduces the input 1920x1080 video to 960x1080 in the image scaling module, copies it into two streams to simulate two video inputs, and splices them into a 2-split-screen display on the 1920x1080 output. Project 2 reduces the input 1920x1080 video to 960x540 in the image scaling module, copies it into four streams to simulate 4-channel video input, and splices them into a 4-split-screen display on the 1920x1080 output. The scaled video is cached with my commonly used FDMA routine, with DDR3 as the cache medium; the video is then read out, a standard 1920x1080 VGA timing is generated, and a pure-verilog RGB-to-HDMI module outputs the video to the monitor;
The design block diagram of Project 1 is as follows:
Insert image description here
The design block diagram of Project 2 is as follows:
Insert image description here

Video source selection

There are two video sources, corresponding to whether the developer has a camera. One uses the onboard HDMI input interface (a laptop simulates the HDMI input): the IT6802 decoder chip decodes the TMDS differential HDMI video into 24-bit RGB video data for the FPGA. If your FPGA development board has no HDMI input interface, you can use the dynamic color bar generated internally in the code to simulate the camera video. The video source is selected through the `define macro definition at the top level of the code, with HDMI input as the default source after power-on, as follows:
Insert image description here
The source-selection logic in the code is as follows:
Insert image description here
The selection logic is:
with `define USE_SENSOR commented out, the input source video is the dynamic color bar;
with `define USE_SENSOR uncommented, the input source video is HDMI;
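The selection above can be sketched as follows. This is a hypothetical illustration: only the `define USE_SENSOR macro and its commented/uncommented behavior come from the text; the module and signal names are my own assumptions, not the project's actual code.

```verilog
`define USE_SENSOR   // defined --> HDMI input; commented out --> dynamic color bar

// Hypothetical top-level source mux; names are assumptions.
module video_src_sel (
    input  wire        clk,
    input  wire [23:0] hdmi_rgb,   // 24-bit RGB from the IT6802 decoder
    input  wire        hdmi_de,
    input  wire [23:0] cb_rgb,     // internally generated dynamic color bar
    input  wire        cb_de,
    output reg  [23:0] src_rgb,
    output reg         src_de
);
    always @(posedge clk) begin
`ifdef USE_SENSOR
        src_rgb <= hdmi_rgb;       // HDMI input is the video source
        src_de  <= hdmi_de;
`else
        src_rgb <= cb_rgb;         // color bar simulates the camera video
        src_de  <= cb_de;
`endif
    end
endmodule
```

Because `ifdef is resolved at compile time, switching sources requires only commenting or uncommenting one line and rebuilding; no mux logic is synthesized for the unused source.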

IT6802 decoder chip configuration and acquisition

The IT6802 decoder chip must be configured over I2C before it can be used. For the configuration and use of the IT6802, please refer to my previous blog. Blog address: click to go directly.
Both the IT6802 configuration and the video acquisition are implemented as verilog code modules. The code location is as follows:
Insert image description here
The code is configured as 1920x1080 resolution;

Dynamic color bar

The dynamic color bar can be configured for different video resolutions; the video's border width, the size of the moving block, and its moving speed can all be set through parameters. Here I configure the resolution as 1920x1080. The code location of the dynamic color bar module and its top-level interface are shown below:
Insert image description here
Insert image description here

Buffer FIFO

The buffer FIFO solves the cross-clock-domain problem. When the video is not scaled there is no clock-domain crossing, but the problem appears when the video is reduced or enlarged. Buffering through the FIFO ensures that every read made by the image scaling module gets valid input data. Note that the original video input timing is no longer preserved past this point;
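The cross-clock-domain buffering described above is the classic dual-clock FIFO pattern. Below is a minimal Gray-code-pointer sketch of the idea; the module, parameter, and signal names are my own assumptions, not the project's actual code.

```verilog
// Minimal dual-clock FIFO sketch (Gray-coded pointers) for clock-domain
// crossing; depth and names are assumptions for illustration only.
module cdc_fifo #(
    parameter DW = 24,   // pixel width
    parameter AW = 4     // depth = 2**AW
)(
    input  wire          wclk, wrst_n, wen,
    input  wire [DW-1:0] wdata,
    input  wire          rclk, rrst_n, ren,
    output reg  [DW-1:0] rdata,
    output wire          full, empty
);
    reg [DW-1:0] mem [0:(1<<AW)-1];
    reg [AW:0] wptr, rptr;            // binary pointers, one extra wrap bit
    reg [AW:0] wptr_g_r1, wptr_g_r2;  // write pointer synced into rclk
    reg [AW:0] rptr_g_w1, rptr_g_w2;  // read pointer synced into wclk

    wire [AW:0] wptr_g = wptr ^ (wptr >> 1);  // Gray encode
    wire [AW:0] rptr_g = rptr ^ (rptr >> 1);

    always @(posedge wclk or negedge wrst_n)
        if (!wrst_n) wptr <= 0;
        else if (wen && !full) begin
            mem[wptr[AW-1:0]] <= wdata;
            wptr <= wptr + 1'b1;
        end

    always @(posedge rclk or negedge rrst_n)
        if (!rrst_n) rptr <= 0;
        else if (ren && !empty) begin
            rdata <= mem[rptr[AW-1:0]];
            rptr <= rptr + 1'b1;
        end

    // two-stage synchronizers for the Gray-coded pointers
    always @(posedge rclk) {wptr_g_r2, wptr_g_r1} <= {wptr_g_r1, wptr_g};
    always @(posedge wclk) {rptr_g_w2, rptr_g_w1} <= {rptr_g_w1, rptr_g};

    assign empty = (rptr_g == wptr_g_r2);
    assign full  = (wptr_g == {~rptr_g_w2[AW:AW-1], rptr_g_w2[AW-2:0]});
endmodule
```

Gray coding guarantees that only one pointer bit changes per increment, so the two-stage synchronizers can never capture an inconsistent multi-bit value, which is what makes the full/empty comparisons safe across clock domains.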

Detailed explanation of image scaling module

Design block diagram

This design integrates the commonly used bilinear interpolation and nearest-neighbor interpolation algorithms into one module, and selects the algorithm through an input parameter. The code is implemented in pure verilog, uses no IP, and can be ported freely among Xilinx, Intel, and domestic FPGAs. It uses RAM and FIFO as the core of data caching and interpolation. The design architecture is as follows:
Insert image description here
The video input timing requirements are as follows:
Insert image description here
The input pixel data may change only when dInValid and nextDin are both high;
The video output timing requirements are as follows:
Insert image description here
The output pixel data is emitted only when dOutValid and nextdOut are both high;
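The two handshakes above are valid/ready-style interfaces. Below is a loose sketch of the producer side feeding the scaler, assuming a first-word-fall-through buffer FIFO; the fifo_* names are assumptions, while dIn, dInValid, and nextDin come from the timing description.

```verilog
// Producer side of the scaler input handshake (sketch, names assumed).
// A pixel transfers only in cycles where dInValid and nextDin are both high.
wire transfer = dInValid && nextDin;   // one pixel accepted this cycle

assign dIn      = fifo_dout;           // data straight from the buffer FIFO
assign dInValid = !fifo_empty;         // valid whenever the FIFO holds data
assign fifo_ren = transfer;            // pop the FIFO only on a transfer
```

The output side mirrors this: the downstream consumer raises nextdOut when ready, and the scaler presents a pixel with dOutValid high until both are high in the same cycle.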

Code block diagram

The code is implemented in pure verilog, uses no IP, and can be ported freely among Xilinx, Intel, and domestic FPGAs.
There are many ways to implement image scaling. The simplest is Xilinx's HLS flow, where the opencv library and a few lines of C++ code are enough; for HLS-based image scaling, please refer to my earlier article on implementing image scaling with HLS. Other image scaling example codes exist online, but most of them use IP, which makes them hard to port to other FPGA devices and hurts versatility. By comparison, this design's code is universal; the code structure is as shown in the figure:
Insert image description here
the top-level interface part is as follows:
Insert image description here

Integration and selection of 2 interpolation algorithms

This design integrates the commonly used bilinear interpolation and nearest-neighbor interpolation algorithms into one module, and selects the algorithm through an input parameter. The specific selection parameter is as follows:

input  wire i_scaler_type //0-->bilinear;1-->neighbor

Select the algorithm by setting the value of i_scaler_type:

Set 0 to select the bilinear interpolation algorithm;
set 1 to select the nearest-neighbor interpolation algorithm;

For the mathematical differences between these two algorithms, please refer to my previous article on implementing image scaling with HLS
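A hypothetical instantiation sketch tying the pieces together: only i_scaler_type, its 0/1 meaning, and the handshake signal names (dInValid, nextDin, dOutValid, nextdOut) come from the text; the module name and the remaining connections are assumptions.

```verilog
// Hypothetical instantiation of the scaler (port names partly assumed).
video_scaler u_scaler (
    .clk           (pix_clk),
    .rst_n         (rst_n),
    .i_scaler_type (1'b0),       // 0 --> bilinear; 1 --> nearest-neighbor
    .dIn           (fifo_dout),  // pixel in, from the buffer FIFO
    .dInValid      (dInValid),
    .nextDin       (nextDin),    // scaler requests the next input pixel
    .dOut          (scaled_pix), // scaled pixel out
    .dOutValid     (dOutValid),
    .nextdOut      (nextdOut)    // downstream ready for the next output pixel
);
```

Because i_scaler_type is an input port rather than a parameter, the algorithm could in principle be switched at run time without rebuilding the design.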

Video splicing algorithm

The video splicing scheme is as follows, taking the 4-channel splicing in Project 2 as an example:
Insert image description here
The output screen resolution is 1920x1080;
the input resolution of each channel is 960x540;
the 4 input channels exactly fill the entire screen;
the splicing display principle of multi-channel video is as follows:
Insert image description here
Take two cameras, CAM0 and CAM1, output to the same monitor as an example. To display two images on one monitor, we must first understand the following quantities:
hsize: the effective space each image row actually occupies in memory; with 32 bits per pixel, a row occupies hsize*4 bytes;
hstride: sets the address of the first pixel of each image row; with 32 bits per pixel, the first pixel of row v_cnt is at v_cnt*hstride*4;
vsize: the number of valid rows;
From this it is easy to see that the address of the first pixel of row v_cnt of CAM0 is v_cnt*hstride*4; similarly, to display CAM1 anywhere within the hsize and vsize space, we only need the address of the first pixel of each of its rows, given by v_cnt*hstride*4 + offset;
uifdma_dbuf supports a stride parameter, which sets the address interval from the first pixel of one row of the input data to the first pixel of the next row in the X (hsize) direction; with it, the input video can be arranged into memory very conveniently.
Regarding uifdma_dbuf, you can refer to the article I wrote before and click to view: FDMA implements three-frame buffering of video data
Based on the above, the cache base address of each camera is as follows:
CAM0: ADDR_BASE=0x80000000;
CAM1: ADDR_BASE=0x80000000+(1920-960)*4;
CAM2: ADDR_BASE=0x80000000+(1080-540)*1920*4;
CAM3: ADDR_BASE=0x80000000+(1080-540)*1920*4+(1920-960)*4;
Once the addresses are set, the job is basically done;
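As a sketch, the four base addresses above could be expressed as verilog localparams; the computed values (shown in the comments) match the FDMA write addresses listed in the image cache section. The parameter names are assumptions.

```verilog
// Cache base addresses for the 4-way splice (32-bit pixels, 1920x1080
// frame, 960x540 tiles). Names are assumptions; values follow the
// formulas in the text.
localparam [31:0] ADDR_BASE = 32'h8000_0000;
localparam [31:0] CAM0_BASE = ADDR_BASE;                     // top-left
localparam [31:0] CAM1_BASE = ADDR_BASE
                            + (1920-960)*4;                  // 32'h8000_0F00, top-right
localparam [31:0] CAM2_BASE = ADDR_BASE
                            + (1080-540)*1920*4;             // 32'h803F_4800, bottom-left
localparam [31:0] CAM3_BASE = ADDR_BASE
                            + (1080-540)*1920*4
                            + (1920-960)*4;                  // 32'h803F_5700, bottom-right
```

Each offset is just (rows skipped)*1920*4 + (columns skipped)*4 bytes into the shared 1920x1080 frame buffer, so the four 960x540 tiles interleave row by row in the same DDR3 region.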

image cache

Longtime readers of my blog will know that my go-to routine for image caching is FDMA. It writes the image into DDR for 3-frame buffering and reads it back out for display, in order to absorb the input/output clock difference and improve output video quality. For FDMA, please refer to my previous blog. Blog address: click to go directly.
When splicing multiple video channels, one FDMA channel is called per video channel. Taking 4-channel video splicing as an example, four FDMA channels are called. Three are configured in write mode, because those three videos only need to be written into DDR3 here, with reading handled by the remaining FDMA; the configuration is as follows:
Insert image description here
The remaining FDMA channel is configured in read-write mode, because the 4 spliced videos must be read out together; the configuration is as follows:
Insert image description here
The key to video splicing is that the 4 videos have different cache addresses in DDR3. Taking 4-channel video splicing as an example, the FDMA write addresses are as follows:
Write base address of the 1st video cache: 0x80000000;
write base address of the 2nd video cache: 0x80000f00;
write base address of the 3rd video cache: 0x803f4800;
write base address of the 4th video cache: 0x803f5700;
read base address of the video cache: 0x80000000;

video output

After the video is read out of FDMA, it passes through the VGA timing module and the HDMI transmit module and is output to the display. The code location is as follows:
Insert image description here
The VGA timing is configured for 1920x1080. The HDMI transmit module is handwritten in verilog code and can be reused for HDMI transmit applications in FPGAs; for this module, please refer to my previous blog. Blog address: click to go directly

4. Vivado Project 1: 2-way video scaling and splicing

Development board FPGA model: Xilinx–Kintex7–xc7k325tffg676-2;
Development environment: Vivado2019.1;
Input: HDMI (IT6802 decoding) or dynamic color bar, resolution 1920x1080;
Output: HDMI, displaying the 2-way spliced video at 1080P resolution;
Engineering application: FPGA image scaling, multi-channel video splicing;
The project BD is as follows:
Insert image description here
The project code structure is as follows:
Insert image description here
The resource consumption and power consumption of the project are as follows:
Insert image description here

5. Vivado Project 2: 4-way video scaling and splicing

Development board FPGA model: Xilinx–Kintex7–xc7k325tffg676-2;
Development environment: Vivado2019.1;
Input: HDMI (IT6802 decoding) or dynamic color bar, resolution 1920x1080;
Output: HDMI, displaying the 4-channel spliced video at 1080P resolution;
Engineering application: FPGA image scaling, multi-channel video splicing;
The project BD is as follows:
Insert image description here
The project code structure is as follows:
Insert image description here
The resource consumption and power consumption of the project are as follows:
Insert image description here

6. Project transplantation instructions

Vivado version inconsistency handling

1: If your vivado version is consistent with the vivado version of this project, open the project directly;
2: If your vivado version is lower than this project's vivado version, you need to open the project and click File --> Save As; but this method is not safe. The safest way is to upgrade your vivado to this project's version or higher;
Insert image description here
3: If your vivado version is higher than the vivado version of this project, the solution is as follows:
Insert image description here
After opening the project, you will find that the IP is locked, as follows:
Insert image description here
At this time, the IP needs to be upgraded, and the operation is as follows:
Insert image description here
Insert image description here

FPGA model inconsistency handling

If your FPGA model is inconsistent with mine, you need to change the FPGA model. The operation is as follows:
Insert image description here
Insert image description here
Insert image description here
After changing the FPGA model, you also need to upgrade the IP. The method of upgrading the IP has been described previously;

Other things to note

1: Since the DDR on every board is not necessarily identical, the MIG IP must be configured according to your own schematic; you can even delete the MIG from my original project outright, then re-add and reconfigure the IP;
2: Modify the pin constraints according to your own schematic; just change them in the xdc file;
3: When porting from a pure FPGA to a Zynq, you need to add the zynq soft core to the project;

7. Board debugging, verification and demonstration

Preparation

You need the following equipment to port and test the project code:
1: An FPGA development board;
2: An onboard HDMI input interface; if absent, just choose the dynamic color bar;
3: An HDMI cable;
4: An HDMI display whose resolution supports 1920x1080;

static presentation

Project 1: HDMI (IT6802 decoding) 1920x1080 input scaled to 960x1080, 2-channel spliced 2-split-screen output:
Insert image description here
Project 1: dynamic color bar 1920x1080 input scaled to 960x1080, 2-channel spliced 2-split-screen output:
Insert image description here
Project 2: HDMI (IT6802 decoding) 1920x1080 input scaled to 960x540, 4-channel spliced 4-split-screen output:
Insert image description here
Project 2: dynamic color bar 1920x1080 input scaled to 960x540, 4-channel spliced 4-split-screen output:
Insert image description here

Dynamic presentation

The dynamic video demonstration is as follows:

FPGA-Video scaling and splicing-HDMI-2023

8. Bonus: obtaining the project source code

Bonus: obtaining the project code.
The code is too large to send by email; it will be sent via a network-disk link.
To obtain it: send a private message, or use the V business card at the end of the article.
The network disk information is as follows:
Insert image description here


Origin blog.csdn.net/qq_41667729/article/details/133338991