FPGA-based LDPC min-sum decoding algorithm: Verilog implementation, including testbench and MATLAB auxiliary verification program

Table of contents

1. Algorithm simulation effect

2. Overview of the theory behind the algorithm

3. MATLAB/Verilog core program

4. Complete algorithm code file


1. Algorithm simulation effect

The simulation results from MATLAB 2022a and Vivado 2019.2 are as follows:

MATLAB simulation:

The code rate is 0.5, and H is a 4608×9216 matrix.

FPGA simulation:

The comparison is as follows:

2. Overview of the theory behind the algorithm

LDPC decoding is divided into hard-decision decoding and soft-decision decoding.

Hard-decision decoding, also known as algebraic decoding, is mainly represented by the bit-flipping (BF) decoding algorithm, which is relatively simple to implement but has poor decoding performance. Its basic assumption is that an unsatisfied parity-check equation indicates at least one bit error, and that among all candidate error bits, the bit involved in the largest number of unsatisfied parity-check equations is the most likely to be wrong. In each iteration, the decoder flips the bit with the highest error probability and re-checks the updated codeword.
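The bit-flipping idea described above can be sketched in a few lines. This is only an illustration, not the author's code: the function names are hypothetical, and the small (7,4) Hamming parity-check matrix is an assumption chosen to keep the example short, not the 4608×9216 matrix used in the project.

```python
def syndrome(H, c):
    """Parity of each check equation; 1 marks an unsatisfied check."""
    return [sum(h * b for h, b in zip(row, c)) % 2 for row in H]

def bit_flip_decode(H, received, max_iter=20):
    """Hard-decision bit flipping: flip the bit involved in most failed checks."""
    c = list(received)
    for _ in range(max_iter):
        s = syndrome(H, c)
        if not any(s):
            break  # every parity-check equation holds
        # For every bit, count the unsatisfied checks it takes part in.
        fails = [sum(s[m] for m in range(len(H)) if H[m][n])
                 for n in range(len(c))]
        c[fails.index(max(fails))] ^= 1
    return c

# Small (7,4) Hamming parity-check matrix, used only as a toy example.
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
received = [0, 0, 0, 0, 0, 0, 1]   # all-zero codeword with the last bit in error
decoded = bit_flip_decode(H, received)
```

In this toy case the erroneous bit takes part in all three failed checks, so it is the one that gets flipped.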

Soft-decision decoding is based on probability theory. It is usually combined with iterative decoding to realize its performance advantage. The basic algorithm is belief propagation (BP) decoding. Its complexity is much higher than that of hard-decision decoding, but its decoding performance is very good.

To reduce the implementation difficulty of the BP decoding algorithm, a series of optimized algorithms has been proposed in academia, including the log-domain belief propagation (LLR BP) algorithm, the Min-Sum decoding algorithm, the Normalized Min-Sum decoding algorithm, and the Offset Min-Sum decoding algorithm.

Iterative decoding can use two message-scheduling methods: flooding scheduling and layered scheduling. In flooding scheduling, each decoding iteration first computes all soft information from variable nodes to check nodes and then computes all soft information from check nodes to variable nodes. In layered scheduling, the node information updated while processing one layer is used immediately, within the same iteration, for the soft-information calculation of the next layer.
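To make the layered variant concrete, here is a minimal sketch in which every check row is treated as one layer and the posterior LLRs are refreshed as soon as a layer has been processed. The function name, the toy (7,4) Hamming matrix and the sign convention (a positive LLR means bit 0 is more likely) are assumptions for illustration; the min-sum rule is used for the check-node step.

```python
def layered_min_sum(H, llr, max_iter=50):
    """Layered min-sum sketch; each check row of H is treated as one layer."""
    rows, cols = len(H), len(H[0])
    neighbours = [[n for n in range(cols) if H[m][n]] for m in range(rows)]
    post = list(llr)                      # running posterior LLR of every bit
    c2v = {(m, n): 0.0 for m in range(rows) for n in neighbours[m]}
    hard = [1 if p < 0 else 0 for p in post]
    for _ in range(max_iter):
        for m in range(rows):
            idx = neighbours[m]
            # Take this layer's previous contribution out of the posteriors.
            v2c = {n: post[n] - c2v[(m, n)] for n in idx}
            for n in idx:
                others = [v2c[k] for k in idx if k != n]
                sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
                c2v[(m, n)] = sign * min(abs(v) for v in others)
                # The refreshed posterior is visible to the next layer at once.
                post[n] = v2c[n] + c2v[(m, n)]
        hard = [1 if p < 0 else 0 for p in post]
        if all(sum(hard[n] for n in neighbours[m]) % 2 == 0
               for m in range(rows)):
            break
    return hard

# Toy (7,4) Hamming parity-check matrix, not the matrix used in the project.
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
channel_llr = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, -0.5]   # last bit received wrongly
decoded = layered_min_sum(H, channel_llr)
```

A flooding decoder would instead compute all check-to-variable messages from the previous iteration's values before touching any posterior.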

The Min-Sum (MS) decoding algorithm is derived from the LLR BP algorithm. It simplifies the check-node update expression; the remaining steps are the same as in LLR BP decoding.
Comparing the check-node update of the two algorithms shows the main difference: the tanh(·) and addition operations of LLR BP are replaced in Min-Sum by minimum and sign operations. Min-Sum therefore simplifies LLR BP and reduces decoding complexity.
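The difference between the two check-node updates can be seen side by side in a short sketch. The function names and the sample LLR values are hypothetical; the BP rule is written in its common tanh-product form, and the min-sum rule keeps only the product of signs and the smallest magnitude of the other inputs.

```python
import math

def check_update_bp(llrs):
    """LLR BP rule: output j combines tanh(L/2) of all other inputs."""
    out = []
    for j in range(len(llrs)):
        prod = 1.0
        for k, v in enumerate(llrs):
            if k != j:
                prod *= math.tanh(v / 2.0)
        out.append(2.0 * math.atanh(prod))
    return out

def check_update_min_sum(llrs):
    """Min-sum rule: sign product times the minimum magnitude of the others."""
    out = []
    for j in range(len(llrs)):
        others = [v for k, v in enumerate(llrs) if k != j]
        sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
        out.append(sign * min(abs(v) for v in others))
    return out

incoming = [1.5, -0.8, 2.3, 0.4]   # example variable-to-check LLRs
bp = check_update_bp(incoming)
ms = check_update_min_sum(incoming)
```

Both rules give the same signs, while the min-sum magnitudes are never smaller than the BP ones. That overestimate is what the normalized and offset variants mentioned earlier are meant to compensate.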


It is divided evenly into 256 sub-matrices, denoted H0, H1, ..., H255, each of size 18×36; the rest of H is divided in the same way. The result of the division is shown in Figure 1.
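As a quick sanity check of these sizes: dividing the 4608×9216 matrix quoted in section 1 into 18×36 blocks gives 256 blocks along each dimension, which matches the index range H0 to H255. How the author maps that index onto the block grid is not stated, so the helper below (a hypothetical name) only illustrates the arithmetic.

```python
ROWS, COLS = 4608, 9216        # size of H quoted in the simulation section
BLK_ROWS, BLK_COLS = 18, 36    # sub-matrix size quoted in the text

block_rows = ROWS // BLK_ROWS  # number of blocks along the row dimension
block_cols = COLS // BLK_COLS  # number of blocks along the column dimension

def block_of(row, col):
    """Map an entry of H to (block row, block column, row in block, column in block)."""
    return row // BLK_ROWS, col // BLK_COLS, row % BLK_ROWS, col % BLK_COLS

last_entry = block_of(ROWS - 1, COLS - 1)
```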

The decoding process of the minimum sum algorithm is as follows:

The basic idea of a decoder design based on the min-sum algorithm is to optimize the parameters of the quantized decoder according to density evolution theory, so that the quantized decoder reaches the highest possible threshold.
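The parameters of a quantized decoder are typically the word length and the scaling of the LLR values. The sketch below shows a plain uniform quantizer with saturation. The function name and the choice of 6 total bits with 2 fractional bits are purely hypothetical; the source does not state the word length actually used, and the density-evolution optimization itself is not shown.

```python
def quantize_llr(x, total_bits=6, frac_bits=2):
    """Uniform symmetric quantizer with saturation; returns a signed integer code."""
    scale = 1 << frac_bits                      # codes per unit of LLR
    max_code = (1 << (total_bits - 1)) - 1      # largest representable magnitude
    code = int(round(x * scale))
    return max(-max_code, min(max_code, code))

codes = [quantize_llr(v) for v in (0.4, -1.25, 100.0, -100.0)]
```

Large LLRs saturate at the word-length limit, so the bit width and scaling trade resolution for small values against clipping of large ones.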

The whole algorithm proceeds as follows:

First: Initialize each variable node and assign its initial value;

Second: Check whether the number of iterations has exceeded the preset maximum; if so, stop iterating;

Third: In each iteration, update the variable-node information;

Fourth: Compute the L value of each variable node Vn;

Fifth: For each variable node Vn, make a decision on the L value, output the sequence Vk, and end decoding.

The min-sum algorithm is essentially similar to the BP decoding algorithm, and the whole algorithm is carried out in the logarithmic domain.

The above is the basic flow of the whole decoding algorithm.
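One possible reading of this flow, written as a flooding min-sum decoder, is sketched below. It is not the author's MATLAB or Verilog code: the function name, the toy (7,4) Hamming matrix and the convention that a positive LLR means bit 0 are assumptions, and every check is assumed to connect at least two bits.

```python
def min_sum_decode(H, llr, max_iter=50):
    """Flooding min-sum decoder. llr[n] > 0 means bit n is more likely 0."""
    rows, cols = len(H), len(H[0])
    neighbours = [[n for n in range(cols) if H[m][n]] for m in range(rows)]
    # Step 1: initialise variable-to-check messages with the channel LLRs.
    v2c = {(m, n): llr[n] for m in range(rows) for n in neighbours[m]}
    hard = [1 if x < 0 else 0 for x in llr]
    for _ in range(max_iter):              # Step 2: bounded iteration count
        # Step 3a: check-node update (sign product and minimum magnitude).
        c2v = {}
        for m in range(rows):
            for n in neighbours[m]:
                others = [v2c[(m, k)] for k in neighbours[m] if k != n]
                sign = -1.0 if sum(v < 0 for v in others) % 2 else 1.0
                c2v[(m, n)] = sign * min(abs(v) for v in others)
        # Step 4: total LLR (the L value) of every variable node.
        total = [llr[n] + sum(c2v[(m, n)] for m in range(rows) if H[m][n])
                 for n in range(cols)]
        # Step 3b: variable-node update, excluding the target check's message.
        for (m, n) in v2c:
            v2c[(m, n)] = total[n] - c2v[(m, n)]
        # Step 5: hard decision; stop early once all checks are satisfied.
        hard = [1 if t < 0 else 0 for t in total]
        if all(sum(hard[n] for n in neighbours[m]) % 2 == 0
               for m in range(rows)):
            break
    return hard

# Toy (7,4) Hamming parity-check matrix, not the matrix used in the project.
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]
channel_llr = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, -0.5]   # last bit received wrongly
decoded = min_sum_decode(H, channel_llr)
```

Here the three checks that contain the unreliable last bit each vote +2.0 for it, so its total LLR becomes positive after one iteration and the all-zero codeword is recovered.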

The min-sum decoding algorithm is similar to the BP decoding algorithm but simplifies the original exponential operations, which reduces the decoder's computational load. Each min-sum iteration consists of two steps: the check-node update and the variable-node update. The overall structure of the entire system is as follows:

3. MATLAB/Verilog core program

......................................................................
for iteration=1:50
    iteration
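    %horizontal step: from the prior probabilities of the information nodes,
    %belief propagation gives the posterior probability of each check node.
    for i=1:h1num %compute the probability difference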
        newh(h1i(i),h1j(i)).dqmn=newh(h1i(i),h1j(i)).qmn0-newh(h1i(i),h1j(i)).qmn1;
    end

    for i=1:rows
        colind=find(h1i==i);%indices of the data bits in row i that are connected to this check node
        colnum=length(colind);
        for j=1:colnum
            drmn=1;
            for k=1:colnum
                if k~=j
                    drmn=drmn*newh(i,h1j(colind(k))).dqmn;
                end
            end
            newh(i,h1j(colind(j))).rmn0=(1+drmn)/2;
            newh(i,h1j(colind(j))).rmn1=(1-drmn)/2;
        end
    end

    %vertical step: from the posterior probabilities of the check nodes,
    %derive the posterior probabilities of the information nodes.
    for j=1:cols
        rowind=find(h1j==j);
        rownum=length(rowind);
        for i=1:rownum
            prod_rmn0=1;
            prod_rmn1=1;
            for k=1:rownum
                if k~=i %exclude the check node whose message is being updated
                    prod_rmn0=prod_rmn0*newh(h1i(rowind(k)),j).rmn0;
                    prod_rmn1=prod_rmn1*newh(h1i(rowind(k)),j).rmn1;
                end
            end
            const1=pl0(j)*prod_rmn0;
            const2=pl1(j)*prod_rmn1;
            newh(h1i(rowind(i)),j).alphamn=1/( const1 + const2 ) ;
            newh(h1i(rowind(i)),j).qmn0=newh(h1i(rowind(i)),j).alphamn*const1;
            newh(h1i(rowind(i)),j).qmn1=newh(h1i(rowind(i)),j).alphamn*const2;
            %update pseudo posterior probability
            const3=const1*newh(h1i(rowind(i)),j).rmn0;
            const4=const2*newh(h1i(rowind(i)),j).rmn1;
            alpha_n=1/(const3+const4);
            newh(h1i(rowind(i)),j).qn0=alpha_n*const3;
            newh(h1i(rowind(i)),j).qn1=alpha_n*const4;
            %tentative decoding: hard decision on the posterior probability of the information node
            if newh(h1i(rowind(i)),j).qn1>0.5
                vhat(j)=1;
            else
                vhat(j)=0;
            end
        end
    end
    
    if mul_GF2(vhat,H.')==zero
    %if the parity check is satisfied, decoding ends; otherwise keep iterating
        break;
    end
end

uhat = extract_mesg(vhat,rearranged_cols);

end


function [C]=mul_GF2(A,B)

C=A*B;
C=mod(C, 2);

end
............................................................................
reg[12:0]cnt;
reg[35:0]o_read_select;
reg      o_read_enable;
reg[7:0] o_address;

reg[35:0]dout_tmp_r;


assign Max_lens = i_code_rate ? 13'd6911: 13'd4607;
assign finishs = (cnt == Max_lens);

always @(posedge i_sys_clk or negedge i_sys_rst)
begin
     if(!i_sys_rst)
          cnt <= 13'h0;
     else if(i_state[3])
     begin
          if(finishs)
               cnt <= 13'h0;
          else
               cnt <= cnt + 1'b1;
     end
end

llr_values llr_values_u(
     .i_clk    (i_sys_clk),
     .i_address(cnt),
     .o_values (dout_tmp)
);

always @(posedge i_sys_clk or negedge i_sys_rst)
begin
     if(!i_sys_rst)
          o_address <= 8'd0;
     else
          o_address <= dout_tmp[13:6];
end

always @(posedge i_sys_clk or negedge i_sys_rst)
begin
     if(!i_sys_rst)
          o_read_select <= 36'd0;
     else if(i_state[3])
          o_read_select <= dout_tmp_r;
     else
          o_read_select <= 36'd0;
end

always @(posedge i_sys_clk or negedge i_sys_rst)
begin
     if(!i_sys_rst)
          o_read_enable <= 1'b0;
     else
          o_read_enable <= i_state[3];
end


// One-hot decode: the low 6 bits of dout_tmp select one of 36 read lines
always @(*)
case(dout_tmp[5:0])
	6'd0  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0000_0000_0001;	
	6'd1  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0000_0000_0010;	
	6'd2  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0000_0000_0100;	
	6'd3  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0000_0000_1000;	
	
	6'd4  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0000_0001_0000;		
	6'd5  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0000_0010_0000;	
	6'd6  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0000_0100_0000;		
	6'd7  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0000_1000_0000;	
	
	6'd8  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0001_0000_0000;	
	6'd9  : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0010_0000_0000;	
	6'd10 : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0100_0000_0000;	
	6'd11 : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_1000_0000_0000;	
	
	6'd12 : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0001_0000_0000_0000;	
	6'd13 : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0010_0000_0000_0000;		
	6'd14 : dout_tmp_r = 36'b0000_0000_0000_0000_0000_0100_0000_0000_0000;	
	6'd15 : dout_tmp_r = 36'b0000_0000_0000_0000_0000_1000_0000_0000_0000;	
	
	6'd16 : dout_tmp_r = 36'b0000_0000_0000_0000_0001_0000_0000_0000_0000;	
	6'd17 : dout_tmp_r = 36'b0000_0000_0000_0000_0010_0000_0000_0000_0000;	
	6'd18 : dout_tmp_r = 36'b0000_0000_0000_0000_0100_0000_0000_0000_0000;	
	6'd19 : dout_tmp_r = 36'b0000_0000_0000_0000_1000_0000_0000_0000_0000;	
	
	6'd20 : dout_tmp_r = 36'b0000_0000_0000_0001_0000_0000_0000_0000_0000;	
	6'd21 : dout_tmp_r = 36'b0000_0000_0000_0010_0000_0000_0000_0000_0000;	
	6'd22 : dout_tmp_r = 36'b0000_0000_0000_0100_0000_0000_0000_0000_0000;	
	6'd23 : dout_tmp_r = 36'b0000_0000_0000_1000_0000_0000_0000_0000_0000;	
	
	6'd24 : dout_tmp_r = 36'b0000_0000_0001_0000_0000_0000_0000_0000_0000;	
	6'd25 : dout_tmp_r = 36'b0000_0000_0010_0000_0000_0000_0000_0000_0000;	
	6'd26 : dout_tmp_r = 36'b0000_0000_0100_0000_0000_0000_0000_0000_0000;	
	6'd27 : dout_tmp_r = 36'b0000_0000_1000_0000_0000_0000_0000_0000_0000;	
	
	6'd28 : dout_tmp_r = 36'b0000_0001_0000_0000_0000_0000_0000_0000_0000;	
	6'd29 : dout_tmp_r = 36'b0000_0010_0000_0000_0000_0000_0000_0000_0000;	
	6'd30 : dout_tmp_r = 36'b0000_0100_0000_0000_0000_0000_0000_0000_0000;	
	6'd31 : dout_tmp_r = 36'b0000_1000_0000_0000_0000_0000_0000_0000_0000;	
	
	6'd32 : dout_tmp_r = 36'b0001_0000_0000_0000_0000_0000_0000_0000_0000;	
	6'd33 : dout_tmp_r = 36'b0010_0000_0000_0000_0000_0000_0000_0000_0000;	
	6'd34 : dout_tmp_r = 36'b0100_0000_0000_0000_0000_0000_0000_0000_0000;	
	6'd35 : dout_tmp_r = 36'b1000_0000_0000_0000_0000_0000_0000_0000_0000;	
	
	default:dout_tmp_r = 36'b0000_0000_0000_0000_0000_0000_0000_0000_0000;	
endcase


endmodule
14_033_m
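To make the address logic of the Verilog excerpt easier to follow, here is a small software model of how one word read from llr_values is split. It mirrors only the combinational part (bits [13:6] become the 8-bit address and bits [5:0] are decoded one-hot into 36 select lines) and ignores the clocked registers and the i_state[3] gating. The function name is hypothetical, and dout_tmp is assumed to be at least 14 bits wide, since its declaration is not shown in the excerpt.

```python
def split_word(word):
    """Model of the address/select split applied to one dout_tmp word."""
    address = (word >> 6) & 0xFF                 # dout_tmp[13:6] -> o_address
    column = word & 0x3F                         # dout_tmp[5:0]
    select = (1 << column) if column < 36 else 0 # one-hot, default case gives 0
    return address, select

# Example word: address 5, column 35 (the highest valid select line).
address, select = split_word((5 << 6) | 35)
```

Column values 36 to 63 fall into the default branch of the case statement and produce an all-zero select word, which the model reproduces.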

4. Complete algorithm code file



Origin blog.csdn.net/hlayumi1234567/article/details/129625620