[Information Technology] [2012.10] FPGA-Based Hardware Implementation of Image Processing Algorithms for Real-Time Vehicle Detection

This is a master's thesis from the University of Minnesota, USA (author: Peng Li), 57 pages in total.

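It is well known that vehicle tracking is very computationally intensive. Traditionally, vehicle tracking algorithms have been implemented in software, but software approaches incur a large computational delay, which results in a low tracking frame rate. In Intelligent Transportation System (ITS) applications such as security monitoring and hazard warning, however, real-time vehicle tracking is needed to improve both tracking accuracy and response time. To this end, this thesis attempts to design a hardware-based real-time vehicle detection system, which is typically required in a complete tracking system. The system captures images with a camera in real time and then processes them for vehicle detection using several image processing algorithms: Fixed Block Size Motion Estimation (FBSME), Reconfigurable Block Size Motion Estimation (RBSME), Variable Block Size Motion Estimation (VBSME), and Mixture of Gaussians.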

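We first propose a VLSI implementation of the RBSME algorithm that supports motion estimation with arbitrary block sizes. Experimental results show that, compared with the traditional design, the architecture achieves adjustable block size at a cost of only 5% hardware overhead. We then propose a low-power VLSI implementation of the VBSME algorithm, which uses a fast full-search block matching algorithm to reduce power consumption while preserving the optimal solution. The fast full-search algorithm compares the current minimum Sum of Absolute Differences (SAD) with a conservative lower bound, so that unnecessary SAD calculations are eliminated. We first determine the specific conservative lower bound experimentally and then implement the fast full-search algorithm on a Field-Programmable Gate Array (FPGA). To the best of our knowledge, this is the first time a fast full-search block matching algorithm has been explored to reduce the power consumption of a VBSME implementation and realized in hardware. Experimental results show that the hardware implementation reduces power consumption by 45% compared with conventional VBSME designs based on non-fast full-search algorithms.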
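The pruning idea behind a fast full search can be sketched in software. The example below is a minimal illustration, not the thesis's hardware design: the thesis determines its own conservative lower bound experimentally and does not state it here, so this sketch substitutes the well-known block-sum bound from the successive elimination approach, which follows from the triangle inequality (the absolute difference of two block sums never exceeds their SAD). The function names `block_sum`, `sad`, and `full_search`, and the frame layout (lists of rows of integer pixel values), are assumptions made for illustration.

```python
def block_sum(img, x, y, n):
    """Sum of the n x n block of img whose top-left corner is (x, y)."""
    return sum(img[y + j][x + i] for j in range(n) for i in range(n))


def sad(cur, ref, bx, by, rx, ry, n):
    """SAD between the block of cur at (bx, by) and the block of ref at (rx, ry)."""
    return sum(
        abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, bx, by, n, search, use_bound=True):
    """Search a +/-search window exhaustively.

    Returns (best_sad, (dx, dy), sad_count), where sad_count is the number
    of SAD evaluations actually performed.
    """
    height, width = len(ref), len(ref[0])
    cur_sum = block_sum(cur, bx, by, n)
    best_sad, best_mv, sad_count = None, None, 0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue  # candidate block falls outside the reference frame
            if use_bound and best_sad is not None:
                # Assumed bound (not the thesis's): |sum(cur) - sum(ref)| <= SAD
                lower_bound = abs(cur_sum - block_sum(ref, rx, ry, n))
                if lower_bound >= best_sad:
                    continue  # cannot beat the current minimum, skip the SAD
            candidate = sad(cur, ref, bx, by, rx, ry, n)
            sad_count += 1
            if best_sad is None or candidate < best_sad:
                best_sad, best_mv = candidate, (dx, dy)
    return best_sad, best_mv, sad_count
```

Because a candidate is skipped only when its lower bound already rules out a strictly smaller SAD, calling `full_search` with `use_bound=True` returns the same minimum SAD and motion vector as with `use_bound=False`, with fewer SAD evaluations whenever any candidate is pruned. In this software sketch each block sum costs about as much as a SAD; a hardware design would typically compute such sums incrementally, which is where the savings would come from. The block size `n` is an ordinary parameter here, which only loosely mirrors the adjustable block size discussed above.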

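Finally, we propose a System-on-a-Chip (SoC) architecture for a Mixture of Gaussians (MoG) based image segmentation algorithm. Because the MoG algorithm is computationally intensive for video segmentation, we present a hardware implementation of it to meet the real-time requirements of high-frame-rate, high-resolution video segmentation. In addition, we integrate the hardware IP into an SoC architecture so that key parameters, such as the learning rate and threshold, can be configured online, which lets the system adapt flexibly to different environments. The system has been implemented and tested on the Xilinx XtremeDSP Video Starter Kit, Spartan-3A DSP 3400A Edition. Experimental results show that, at a clock frequency of 25 MHz, the design meets the real-time requirement of 30 frames per second (fps) at Video Graphics Array (VGA) resolution (640 × 480).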
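To make the roles of the learning rate and threshold concrete, here is a minimal software sketch of a per-pixel MoG background model in the style of the widely used Stauffer-Grimson formulation. It is an assumption-laden illustration, not the thesis's hardware architecture: the class name `PixelMoG`, the default parameter values, the grayscale single-pixel scope, and the simplification of using the learning rate directly for the mean and variance updates are all choices made here.

```python
import math
from dataclasses import dataclass


@dataclass
class Mode:
    weight: float
    mean: float
    var: float


class PixelMoG:
    """Hypothetical per-pixel grayscale mixture model (illustrative only)."""

    def __init__(self, k=3, learning_rate=0.05, bg_threshold=0.7,
                 match_sigmas=2.5, init_var=36.0, min_var=4.0):
        self.k = k
        self.learning_rate = learning_rate  # how fast the model adapts
        self.bg_threshold = bg_threshold    # weight mass treated as background
        self.match_sigmas = match_sigmas
        self.init_var = init_var
        self.min_var = min_var
        self.modes = []

    @staticmethod
    def _fitness(mode):
        # Heavy, narrow Gaussians are the most background-like.
        return mode.weight / math.sqrt(mode.var)

    def update(self, x):
        """Feed one pixel sample; return True if it is classified as foreground."""
        a = self.learning_rate
        self.modes.sort(key=self._fitness, reverse=True)
        matched = None
        for m in self.modes:
            if abs(x - m.mean) <= self.match_sigmas * math.sqrt(m.var):
                matched = m
                break
        if matched is None:
            if len(self.modes) == self.k:
                self.modes.pop()  # replace the least fit Gaussian
            self.modes.append(Mode(a, float(x), self.init_var))
        else:
            for m in self.modes:
                m.weight = (1 - a) * m.weight + (a if m is matched else 0.0)
            d = x - matched.mean
            matched.mean += a * d
            matched.var = max(self.min_var, matched.var + a * (d * d - matched.var))
        total = sum(m.weight for m in self.modes)
        for m in self.modes:
            m.weight /= total
        if matched is None:
            return True  # no Gaussian explains the sample
        # Background = fittest modes whose cumulative weight first exceeds the threshold.
        self.modes.sort(key=self._fitness, reverse=True)
        cumulative = 0.0
        for m in self.modes:
            cumulative += m.weight
            if m is matched:
                return False
            if cumulative > self.bg_threshold:
                return True
        return True
```

In this sketch a pixel that has long been stable is reported as background, a sudden new value is reported as foreground, and the same new value is absorbed into the background once it persists. A larger `learning_rate` absorbs changes faster, and a larger `bg_threshold` lets more Gaussians count as background, which is the kind of trade-off that online-configurable parameters would let an operator tune per scene.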

It is well known that vehicle tracking processes are very computationally intensive. Traditionally, vehicle tracking algorithms have been implemented using software approaches. The software approaches have a large computational delay, which causes a low vehicle tracking frame rate. However, in some Intelligent Transportation System (ITS) applications, such as security monitoring and hazard warning, real-time vehicle tracking is highly desirable to improve not only tracking accuracy but also response time. For this purpose, this thesis makes an attempt to design a hardware-based system for real-time vehicle detection, which is typically required in the complete tracking system. The vehicle detection system captures pictures using a camera in real time, and we then apply several image processing algorithms, such as Fixed Block Size Motion Estimation (FBSME), Reconfigurable Block Size Motion Estimation (RBSME), Variable Block Size Motion Estimation (VBSME), and Mixture of Gaussians, to process these images in real time for vehicle detection.

We first propose the Very-Large-Scale Integration (VLSI) implementation for the RBSME algorithm, which supports arbitrary block size motion estimation. Experimental results show that the proposed architecture achieves the flexibility of adjustable block size at the expense of only 5% hardware overhead compared to the traditional design. We then propose a low-power VLSI implementation for the VBSME algorithm, which employs a fast full-search block matching algorithm to reduce power consumption while preserving the optimal solutions. The fast full-search algorithm is based on the comparison of the current minimum Sum of Absolute Differences (SAD) to a conservative lower bound, so that unnecessary SAD calculations can be eliminated. We first experimentally decide on the specific conservative lower bound and then implement the fast full-search algorithm in a Field-Programmable Gate Array (FPGA). To the best of our knowledge, this is the first time that a fast full-search block matching algorithm has been explored to reduce power consumption in the context of VBSME and designed in hardware. Experimental results show that the proposed hardware implementation can reduce power consumption by 45% compared to conventional VBSME designs based on non-fast full-search algorithms.

Finally, we propose a System-on-a-Chip (SoC) architecture for a Mixture of Gaussians (MoG) based image segmentation algorithm. The MoG algorithm for video segmentation applications is computationally intensive. To meet the real-time requirement of high-frame-rate, high-resolution video segmentation tasks, we present a hardware implementation of the MoG algorithm. Moreover, we integrate the hardware IP into an SoC architecture, so that some key parameters, such as the learning rate and threshold, can be configured online, which makes the system extremely flexible in adapting to different environments. The proposed system has been implemented and tested on the Xilinx XtremeDSP Video Starter Kit, Spartan-3A DSP 3400A Edition. Experimental results show that, under a clock frequency of 25 MHz, this design meets the real-time requirement for Video Graphics Array (VGA) resolution (640 × 480) at 30 frames per second (fps).
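The real-time figure quoted in the abstract implies a tight per-pixel cycle budget, which a few lines of arithmetic make visible. The helper names `pixel_rate` and `cycles_per_pixel` are hypothetical, and the calculation assumes that the 25 MHz clock is the processing clock and ignores video blanking intervals, neither of which the abstract specifies.

```python
def pixel_rate(width, height, fps):
    """Active pixels that must be processed per second."""
    return width * height * fps


def cycles_per_pixel(clock_hz, width, height, fps):
    """Average clock cycles available per pixel (blanking ignored)."""
    return clock_hz / pixel_rate(width, height, fps)


# VGA at 30 fps on a 25 MHz clock: 9,216,000 pixels/s, about 2.71 cycles per pixel.
budget = cycles_per_pixel(25_000_000, 640, 480, 30)
```

Under these assumptions the design has fewer than three clock cycles per pixel on average, which suggests why a pipelined hardware datapath, rather than a sequential software loop, is the natural fit for the MoG update at this resolution and frame rate.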

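1 Introduction

2 Reconfigurable Block Size Motion Estimation

3 Variable Block Size Motion Estimation

4 Mixture of Gaussians Algorithm

5 Conclusion and Discussion

Appendix A Summary of Acronyms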

Download link for the English original:

http://page2.dfpan.com/fs/1l3c7ja252616299166/

Reposted from blog.csdn.net/weixin_42825609/article/details/88545605