Histogram Equalization of Images Based on FPGA

Histogram equalization, also known as grayscale equalization, uses a grayscale mapping to transform an input image into an output image that has approximately the same number of pixels at each gray level (that is, the output histogram is uniform). In an equalized image, the pixels occupy as many gray levels as possible and are evenly distributed, so the image has high contrast and a large dynamic range. Histogram equalization is a good remedy for camera overexposure or underexposure.
1. MATLAB implementation

%--------------------------------------------------------------------------
%                         Histogram equalization
%--------------------------------------------------------------------------
close all
clear all;
clc;

I = rgb2gray(imread('car.bmp'));
Ieq=histeq(I);

subplot(221),imshow(I);title('Original image');
subplot(222),imhist(I);
subplot(223),imshow(Ieq);title('Histogram equalized');
subplot(224),imhist(Ieq);

Click Run to get the following results:
(Figure: original image and its histogram on top, equalized image and its histogram below.)
The results show that the contrast of the image is significantly improved and the histogram becomes more uniform.
2. FPGA implementation
1. Theoretical analysis

In the histogram equalization formula, H(i) is the number of pixels at gray level i, A0 is the area of the image (that is, its resolution in pixels), and Dmax is the maximum gray value, which is 255.
(Figure: histogram equalization formula.)
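The formula image is not reproduced in this text. Assuming it is the standard cumulative-histogram mapping, it can be written as follows, where D_A (input gray level) and D_B (output gray level) are symbol names chosen here for illustration and do not come from the article:

```latex
D_B = f(D_A) = \frac{D_{max}}{A_0} \sum_{i=0}^{D_A} H(i)
```

For a 640x480 image, A0 = 307200, so the cumulative sum at D_A = 255 equals A0 and the brightest input level maps to Dmax = 255.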

2. Implementation steps

As with histogram stretching, histogram equalization can be done as true equalization or pseudo equalization. This design uses pseudo equalization: the previous frame is used to collect the statistics, the cumulative sum and normalization are computed in the frame gap, and the current frame is mapped through the normalized result for output.

The statistics are not complete until the whole previous frame has flowed through. This limitation makes it difficult to both collect the statistics and output the final result within the same frame. The statistics of the previous frame must therefore be cached, then accumulated and normalized. Before the next round of statistics, the cached statistics and the cumulative sum need to be cleared (for a single still picture this is not necessary), while the normalized result is kept for the output of the current frame. Here I use 2 rams to implement the whole histogram equalization process. Taking a still picture as an example, pseudo equalization is achieved by sending the picture as two frames.

The overall idea is as follows:
(Figure: overall design idea.)
We can implement it in the following steps:

(1) Previous frame: compute the histogram H(i) of the image and write the statistics into ram1 in real time. Note that each new count must be added to the value already stored for that gray level, which is the tricky part.

//==========================================================================
//==    Previous frame: gray-level histogram statistics
//==========================================================================
//Compare the current data with the data of the previous beat
//---------------------------------------------------
assign hist_cnt_yes = gray_data_vld && gray_data_r == gray_data;    //equal: keep accumulating
assign hist_cnt_not = gray_data_vld && gray_data_r != gray_data;    //not equal: only one so far

//Gray-level counter
//---------------------------------------------------
always @(posedge clk or negedge rst_n) begin
    if(!rst_n) begin
        hist_cnt <= 32'b0;
    end
    else if(hist_cnt_not) begin
        hist_cnt <= 32'b1;
    end
    else if(hist_cnt_yes) begin
        hist_cnt <= hist_cnt + 1'b1;
    end
    else begin
        hist_cnt <= 32'b0;
    end
end

//Write the statistics into the statistics ram1
//---------------------------------------------------
assign wr_en_1   = hist_cnt_not;
assign wr_addr_1 = gray_data_r;
assign wr_data_1 = rd_data_1 + hist_cnt;
assign rd_addr_1 = gray_vsync ? addr_cnt : gray_data;    //frame gap: read in address order; previous frame: read at the pixel address

//Dual-port ram that stores the statistics
//---------------------------------------------------
ram_32x256 u_ram_1
(
    .clock                  (clk                    ),
    .wren                   (wr_en_1                ),
    .wraddress              (wr_addr_1              ),
    .data                   (wr_data_1              ),
    .rdaddress              (rd_addr_1              ),
    .q                      (rd_data_1              )
);

(2) The gap between the previous frame and the current frame: design a counter addr_cnt that counts 0-255, use it as the address to read the statistics out of ram1, and accumulate them into a cumulative sum. Pay attention to timing alignment.

//After addr_cnt is given, rd_data_1 comes out 1 beat later (costs 1 clk); addr_flag_r1 is aligned with it
//Computing the cumulative sum costs another 1 clk
//---------------------------------------------------
always @(posedge clk or negedge rst_n) begin
    if(!rst_n) begin
        sum <= 32'b0;
    end
    else if(addr_flag_r1) begin
        sum <= sum + rd_data_1;
    end
    else begin
        sum <= 32'b0;
    end
end
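
The counter addr_cnt itself is not listed in the article. A minimal sketch of how it might look is given below, assuming gray_vsync is high during the frame gap (as the rd_addr_1 multiplexer suggests) and that the read-out starts on its rising edge. The module name, addr_flag, gray_vsync_r and vsync_pos are illustrative assumptions, not the author's code.

```verilog
// Hypothetical sketch: generate addr_cnt = 0..255 once per frame gap.
module addr_cnt_sketch
(
    input  wire             clk,
    input  wire             rst_n,
    input  wire             gray_vsync,     // assumed high during the frame gap
    output reg              addr_flag,      // high for the 256 beats of the read-out
    output reg  [7:0]       addr_cnt        // read address for ram1
);

reg  gray_vsync_r;                          // gray_vsync delayed by 1 beat
wire vsync_pos = gray_vsync && !gray_vsync_r;   // rising edge = start of the gap

always @(posedge clk or negedge rst_n) begin
    if(!rst_n) begin
        gray_vsync_r <= 1'b0;
    end
    else begin
        gray_vsync_r <= gray_vsync;
    end
end

// addr_flag goes high after the rising edge and drops after address 255
always @(posedge clk or negedge rst_n) begin
    if(!rst_n) begin
        addr_flag <= 1'b0;
    end
    else if(vsync_pos) begin
        addr_flag <= 1'b1;
    end
    else if(addr_cnt == 8'd255) begin
        addr_flag <= 1'b0;
    end
end

// addr_cnt counts only while addr_flag is high, otherwise it is held at 0
always @(posedge clk or negedge rst_n) begin
    if(!rst_n) begin
        addr_cnt <= 8'd0;
    end
    else if(addr_flag) begin
        addr_cnt <= addr_cnt + 1'b1;
    end
    else begin
        addr_cnt <= 8'd0;
    end
end

endmodule
```

Delaying addr_flag by one beat then gives the addr_flag_r1 used by the sum register above, so sum is cleared automatically whenever no read-out is in progress.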

(3) The gap between the previous frame and the current frame: while step (2) is running, equalization must be performed at the same time, that is, the result of the equalization formula is written into ram2. The calculation of the formula needs a few tricks to avoid multiplication and division. Pay attention to timing alignment.
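
The multiplier-free trick can be checked with a little arithmetic. As a worked example (assuming a 16-bit fractional scale, which is what the 2^16 in the code comment of this step suggests), the factor Dmax/A0 is written as an integer over 2^16 and that integer is split into powers of two:

```latex
\frac{255}{640 \times 480} = \frac{255 \cdot 2^{16}}{307200} \cdot \frac{1}{2^{16}}
  = \frac{54.4}{2^{16}} \approx \frac{54}{2^{16}}
  = \frac{(2^5 + 2^4) + (2^2 + 2^1)}{2^{16}}

\frac{255}{640 \times 512} = \frac{255}{5 \cdot 2^{16}} = \frac{51}{2^{16}}
  = \frac{2^5 + 2^4 + 2^1 + 2^0}{2^{16}}
```

With the first constant, a full 640x480 frame (cumulative sum 307200) maps to floor(307200 x 54 / 2^16) = 253, slightly below 255 because 54.4 is rounded down. With the second constant the same frame maps to 239, because only 480 of the assumed 512 lines actually contain pixels.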



//==========================================================================
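//==    Frame gap: equalization calculation after the summation
//==    Image resolution is 640*480; 640*512 is used to avoid multiplication and division
//==    [(2^5+2^4)+(2^2+2^1)] / 2^16, pipelined over 2 beats to ease timing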
//==========================================================================
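
The code for this step is not listed in the article. As a hedged sketch of what a two-beat shift-and-add pipeline for the factor in the comment above could look like, assuming a 32-bit sum input and the step_2 output that feeds ram2 (the module name and the step_1a/step_1b registers are made up for illustration):

```verilog
// Hypothetical sketch: step_2 = sum * 54 / 2^16, built from shifts and adds only.
module hist_scale_sketch
(
    input  wire             clk,
    input  wire             rst_n,
    input  wire [31:0]      sum,            // cumulative sum, at most 640*480
    output reg  [31:0]      step_2          // equalized gray level
);

reg [31:0] step_1a;                         // sum * (2^5 + 2^4)
reg [31:0] step_1b;                         // sum * (2^2 + 2^1)

// Beat 1: two partial products, each the sum of two shifted copies
always @(posedge clk or negedge rst_n) begin
    if(!rst_n) begin
        step_1a <= 32'd0;
        step_1b <= 32'd0;
    end
    else begin
        step_1a <= (sum << 5) + (sum << 4);
        step_1b <= (sum << 2) + (sum << 1);
    end
end

// Beat 2: add the partial products and divide by 2^16 with a right shift
always @(posedge clk or negedge rst_n) begin
    if(!rst_n) begin
        step_2 <= 32'd0;
    end
    else begin
        step_2 <= (step_1a + step_1b) >> 16;
    end
end

endmodule
```

For sum = 307200 (a full 640x480 frame) the output is (307200 x 54) >> 16 = 253, and no intermediate value exceeds 32 bits. Together with one beat for the ram1 read and one beat for the cumulative sum, these two beats would account for the four-beat delay implied by the addr_flag_r4 and addr_cnt_r4 signals on the ram2 write port.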

(4) Current frame: output the equalized data. Each pixel is used as the ram2 read address, the equalized mapping result is read from ram2 in real time, and it is output in place of the original pixel, which achieves the equalization.

//==========================================================================
//==    Current frame: mapped output after histogram equalization
//==========================================================================
ram_32x256 u_ram_2
(
    .clock                   (clk                    ),
    .wren                    (addr_flag_r4           ),    //write enable
    .wraddress               (addr_cnt_r4            ),    //sequential address
    .data                    (step_2                 ),    //normalized result
    .rdaddress               (gray_data              ),
    .q                       (hist_data              )
);

//ram read data lags the read address by one beat, so the other signals are also delayed one beat for alignment
//---------------------------------------------------
always @(posedge clk) begin
    hist_vsync <= gray_vsync;
    hist_hsync <= gray_hsync;
    hist_data_vld <= gray_data_vld;
end

3. Points to note

(1) Formula calculation simplification
Taking 640x480 as an example, computing [255/(640x480)] directly is hard for an FPGA. It can be converted into shifts, and it is recommended to calculate with 640x512 instead:
(Figure: shift-based form of the calculation.)

(2) ram

A single-port ram does not seem to work, so instantiate a dual-port ram IP; one set of ports (one write port and one read port) is enough.

(3) Timing alignment

After the read address is presented to the ram, the data comes out one or two beats later (depending on the IP settings). The cumulative sum and the formula calculation also consume a certain number of beats, so timing alignment must be kept in mind throughout the design. It is recommended to simulate while designing.
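
A common way to keep control signals aligned with a pipelined data path is a small delay chain. The sketch below is an assumption about how addr_flag_r1, addr_flag_r4 and addr_cnt_r4 could be derived from an undelayed addr_flag and addr_cnt; the module name and the internal flag_dly/cnt_dly registers are hypothetical and do not come from the article.

```verilog
// Hypothetical sketch: delay addr_flag and addr_cnt by 1 to 4 beats.
module align_delay_sketch
(
    input  wire             clk,
    input  wire             rst_n,
    input  wire             addr_flag,      // high while addr_cnt is valid
    input  wire [7:0]       addr_cnt,       // read address sent to ram1
    output wire             addr_flag_r1,   // 1 beat later: aligned with rd_data_1
    output wire             addr_flag_r4,   // 4 beats later: aligned with step_2
    output wire [7:0]       addr_cnt_r4     // 4 beats later: write address for ram2
);

reg [3:0]  flag_dly;                        // flag_dly[n-1] = addr_flag delayed n beats
reg [31:0] cnt_dly;                         // four 8-bit stages packed into one vector

always @(posedge clk or negedge rst_n) begin
    if(!rst_n) begin
        flag_dly <= 4'b0;
        cnt_dly  <= 32'b0;
    end
    else begin
        flag_dly <= {flag_dly[2:0], addr_flag};     // shift in the newest flag
        cnt_dly  <= {cnt_dly[23:0], addr_cnt};      // shift in the newest address
    end
end

assign addr_flag_r1 = flag_dly[0];
assign addr_flag_r4 = flag_dly[3];
assign addr_cnt_r4  = cnt_dly[31:24];

endmodule
```

If the ram IP is configured with two beats of read latency instead of one, every tap after the ram read has to move one stage later, which is why checking the alignment in simulation is worthwhile.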
3. Results display

The board is broken, so this time the results come from simulation. Matlab generates a pre_img.txt file from the picture, and a Verilog testbench imitates the OV7725 timing and reads the image data from pre_img.txt. The processed data is written into a post_img.txt file, and finally Matlab reads post_img.txt and restores the result to a picture. The whole process only needs Modelsim and Matlab, which is quite convenient.
(Figure: simulation result.)

Origin blog.csdn.net/weixin_45104510/article/details/128344100