Implementation of QSPI underlying driver code based on FPGA

Introduction to QSPI

I believe that all outstanding engineers are already very familiar with the SPI protocol. The full name of SPI is Serial Peripheral Interface. It is a high-speed, full-duplex synchronous communication bus that is widely used in communication transmission between devices. .
The QSPI we are going to talk about in this article is an extension of the SPI interface. Q stands for quad, which means 4 times the transmission. It is also called four-wire SPI. Therefore, the transmission rate of this interface will be much faster than the standard SPI. It is widely used in SPI Flash storage medium. The following article will use a Flash chip Datasheet to describe in detail how to use FPGA to implement QSPI communication.

write timing

QSPI write timing diagram
As can be seen from the timing diagram, there are a total of 6 signals in the diagram, which are CE (chip select signal), CLK (clock signal), and SIO0–SIO34 data lines from top to bottom. Similar to the SPI interface, the chip select and clock signals remain unchanged. When reading and writing data, the chip select signal is low level, and when sampling or sending data, it is on the rising edge or falling edge of the clock. The only difference is that the data lines have changed from the original MOSI and MISO to 4 data lines. So how do we apply these four data lines? As can be seen from the figure, SIO0 will send commands, addresses and data, while SIO1–SIO3 only send commands and data. Further observation can see that the write command is 0x38, which is sent by 8 CLKs, and the address signal totals 24 bits. The transmission is completed simultaneously by 4 data lines within 6 CLKs, and the starting bits sent by each data line are different. Finally, the data is sent. In the same way, 4 data lines are sent at the same time, 1 CLK Send 4bit data, the size sent can be set by the user. To sum up, the QSPI communication writing process can be summarized as sending a one-byte command word first (this command word is different for different chips), then sending a 3-byte address (same principle), and finally send data. Therefore, there are ideas in the design of FPGA. The simplest method is to use a state machine to describe this process. The specific code will be shown below.

Read timing

QSPI read timing diagram
As can be seen from the figure, the operation flow of the read timing is similar to that of the write timing, except that the command word has changed from 0x38 to 0xEB. The rest of the operation flow is the same as the write timing, so it will not be explained in detail.
But it should be noted that when SPI is expanded to QSPI, it is no longer full-duplex communication, but becomes half-duplex. The 4 lines SIO0–SIO3 will become three-state gates, which are the inout interfaces in FPGA. They need to meet specific conditions to input or output data.
The following will give the underlying driver code for QSPI communication. In actual engineering applications, it is necessary to write the application layer program in conjunction with the chip's data manual, and then combine it with the underlying logic to implement specific functions, such as using the QSPI or SPI interface to A Flash chip is read and written.

Verilog code for QSPI implementation

module QSPI_DRIVE #(
		parameter DIV = 3
)(
	input wire clk,
	input wire rst,
	//--------应用层传输进该模块的命令、地址、数据等--------//
	input wire [3:0] i_cmd_mode,
    input wire [7:0] i_flash_cmd,
    input wire [23:0] i_addr,
    input wire [7:0] i_data,
    input wire [15:0] i_data_num,
    input wire i_wr,
 	output reg [7:0] o_data,
 	//---------QSPI 接口---------//
	output reg qspi_cs,
	output reg qspi_csk,
	inout   reg qspi_sio0,
	inout   reg qspi_sio1,
	inout   reg qspi_sio2,
	inout   reg qspi_sio3
);

reg [7:0] div_cnt;
reg [7:0] cmd_cnt;
reg [7:0] addr_cnt;
reg [7:0] data_cnt;
reg [15:0] num_cnt;
reg [3:0] cmd_mode_lock;
reg [7:0] flash_cmd_lock;
reg [23:0] addr_lock;
reg [7:0] r_data_temp;
reg  qspi_sckd0;
wire qspi_sck_p,qspi_sck_n;

//---------------FSM---------------//
reg [7:0] state,n_state;
localparam  IDLE  = 8'h00,
			START = 8'h01,
			CMD   = 8'h02,
			ADDR  = 8'h04,
			DATA  = 8'h08,
			STOP  = 8'h10;
always@(posedge clk)begin
	if(rst)
		state <= IDLE;
	else
		state <= n_state;
end

always@(*)begin
	if(rst)begin
		n_state = IDLE;
	end else begin
		case(state)
			IDLE : begin
				if(i_cmd_mode[3])
					 n_state = START;
				else
					 n_state = IDLE;
		   	end
		   	START : begin
				 n_state = CMD;
			end
			CMD : begin
			 if(cmd_cnt == 8'd15)
				 if(cmd_mode_lock[1])begin
					  n_state = ADDR;
			 	 end else if(cmd_mode_lock[0])begin
					  n_state = DATA;		
				 end else begin
					  n_state = STOP;
				 end
			 else
			 	n_state = CMD;
			end
			ADDR : begin
				if(addr_cnt == 8'd12)
					if(cmd_mode_lock[0])begin
						n_state = DATA;
					end else begin
						n_state =STOP;
					end
				else
					n_state = ADDR;
			end
			DATA : begin
			 if(data_cnt == 8'd4)
				 if(cmd_mode_lock[2] && (num_cnt == 16'b0))begin
					  n_state = STOP;
			 	 end else if(!cmd_mode_lock[2])begin
					  n_state = STOP;		
				 end else begin
					  n_state = DATA;
				 end
			 else
			 	n_state = DATA;
			end	
			STOP : begin
				n_state = IDLE;
			end
			default : begin
				n_state = IDLE;
			end
		endcase
	end
end

//----------锁数据-----------//
always@(posedge clk)begin
	if(rst)begin
		cmd_mode_lock <= 4'b0;
		flash_cmd_lock <= 8'b0;
		addr_lock <= 24'b0;
	end else if(i_cmd_mode[3] && (state == IDLE))begin
		cmd_mode_lock <= i_cmd_mode;
		flash_cmd_lock <= i_flash_cmd;
		addr_lock <= i_addr;		
	end else begin
		cmd_mode_lock <= cmd_mode_lock ;
		flash_cmd_lock <= flash_cmd_lock ;
		addr_lock <= addr_lock ;
	end
end

//-----------各个功能计数器计数---------//
always@(posedge clk)begin//时钟分频,DIV为分频系数
	if(rst)
		div_cnt <= 8'h00;
	else if(div_cnt == DIV)
		div_cnt <= 8'h00;
	else if((state == CMD) || (state == ADDR) || (state == DATA ))
		div_cnt <= div_cnt + 1'b1;
	else
		div_cnt <= 8'h00;
end

always@(posedge clk)begin//命令字计数
	if(rst)
		cmd_cnt <= 8'h00;
	else if((state == CMD) && (div_cnt == DIV))
		cmd_cnt <= cmd_cnt + 1'b1;
	else if(state == CMD)
		cmd_cnt <= cmd_cnt;
	else
		cmd_cnt <= 8'h00;
end

always@(posedge clk)begin//地址计数
	if(rst)
		addr_cnt <= 8'h00;
	else if((state == ADDR) && (div_cnt == DIV))
		addr_cnt  <= addr_cnt + 1'b1;
	else if(state == ADDR)
		addr_cnt <= addr_cnt ;
	else
		addr_cnt <= 8'h00;
end

always@(posedge clk)begin//数据计数,在sck上升沿和下降沿均会加1
	if(rst)
		data_cnt <= 8'h00;
	else if((state == DATA) && cmd_mode_lock[1] &&  (data_cnt == 8'd4))
		data_cnt <= 8'h00;
	else if((state == DATA) && (qspi_sck_p || qspi_sck_n))
		data_cnt <= data_cnt + 1'b1;
	else if(state == DATA)
		data_cnt <=data_cnt;
	else
		data_cnt <= 8'h00; 
end

always@(posedge clk)begin//传输的数据长度计数,传输完成后num为0
	if(rst)
		num_cnt <= 16'h00;
	else if((state == IDLE) && i_cmd_mode[3])
		num_cnt <= i_data_num;
	else if((cmd_mode_lock[3] && (div_cnt == DIV) &&  (data_cnt == 8'd3))
		num_cnt <= num_cnt - 1'b1;
	else 
		num_cnt <=num_cnt ;
end

//-------------QSPI数据采样及发送--------------//
always@(posedge clk)begin//产生片选信号
	if(rst)
		qspi_cs <=1'b1;
	else if(state == START)
		qspi_cs <=1'b0;
	else if(state == STOP)
		qspi_cs <=1'b1;
	else
		qspi_cs <=qspi_cs ;		
end

always@(posedge clk)begin//产生qspi采样时钟
	if(rst)
		qspi_sck <=1'b0;
	else if((state == CMD) || (state == ADDR) || (state == DATA) && (div_cnt == DIV))
		qspi_sck <=!qspi_sck ;
	else if((state == CMD) || (state == ADDR) || (state == DATA))
		qspi_sck <=qspi_sck ;
	else
		qspi_sck <=1'b0;		
end

always@(posedge clk)begin
	if(rst)
		qspi_sckd0 <= 1'b1;
	else
		qspi_sckd0  <= qspi_sck;
end
assign qspi_sck_n = (qspi_sckd0 && (!qspi_sck)) ? 1'b1 : 1'b0;//取sck下降沿
assign qspi_sck_p = ((!qspi_sckd0) && qspi_sck) ? 1'b1 : 1'b0;//取sck上升沿

always@(posedge clk)begin//sio0数据线传输命令字、地址以及数据
	if(rst)
		qspi_sio0_temp <=1'b0;
	else if((state == START) || (state == ADDR) || (state == DATA) && (div_cnt == DIV))
		qspi_sio0_temp <=i_flash_cmd[7];
	else if(qspi_sck_n)begin
		if(state == CMD)
			qspi_sio0_temp <=flash_cmd_lock[7 - (cmd_cnt>>1)];
		else if(state == ADDR)
			qspi_sio0_temp <= addr_lock[20 - (addr_cnt<<1)];
		else if(state == DATA)
			qspi_sio0_temp <= i_data[4 - (data_cnt<<1)]; 
		else
			qspi_sio0_temp <= qspi_sio0_temp ;
	end else
		qspi_sio0_temp <= qspi_sio0_temp ;
end

always@(posedge clk)begin//sio1数据线传输地址以及数据
	if(rst)
		qspi_sio1_temp <=1'b0;
	else if(qspi_sck_n)begin
	    if(state == ADDR)
			qspi_sio1_temp <= addr_lock[21 - (addr_cnt<<1)];
		else if(state == DATA)
			qspi_sio1_temp <= i_data[5 - (data_cnt<<1)]; 
		else
			qspi_sio1_temp <= qspi_sio1_temp ;
	end else
		qspi_sio1_temp <= qspi_sio1_temp ;
end

always@(posedge clk)begin//sio2数据线传输地址以及数据
	if(rst)
		qspi_sio2_temp <=1'b0;
	else if(qspi_sck_n)begin
	    if(state == ADDR)
			qspi_sio2_temp <= addr_lock[22 - (addr_cnt<<1)];
		else if(state == DATA)
			qspi_sio2_temp <= i_data[6 - (data_cnt<<1)]; 
		else
			qspi_sio2_temp <= qspi_sio2_temp ;
	end else
		qspi_sio2_temp <= qspi_sio2_temp ;
end

always@(posedge clk)begin//sio3数据线传输地址以及数据
	if(rst)
		qspi_sio3_temp <=1'b0;
	else if(qspi_sck_n)begin
	    if(state == ADDR)
			qspi_sio3_temp <= addr_lock[23 - (addr_cnt<<1)];
		else if(state == DATA)
			qspi_sio3_temp <= i_data[7 - (data_cnt<<1)]; 
		else
			qspi_sio3_temp <= qspi_sio3_temp ;
	end else
		qspi_sio3_temp <= qspi_sio3_temp ;
end

reg qspi_sio0_temp;//由于是三态门,需要定义中间变量
reg qspi_sio1_temp;
reg qspi_sio2_temp;
reg qspi_sio3_temp;
//在各状态下赋相对应的值,在写数据的时候i_wr信号为高,读时为低
assign qspi_sio0 = (state == CMD || state == ADDR) ? qspi_sio0_temp : (i_wr) ? qspi_sio0_temp : 1'bz;
assign qspi_sio1 = (state == ADDR) ? qspi_sio1_temp: (i_wr) ? qspi_sio1_temp: 1'bz;
assign qspi_sio2 = (state == ADDR) ? qspi_sio2_temp: (i_wr) ? qspi_sio2_temp: 1'bz;
assign qspi_sio3 = (state == ADDR) ? qspi_sio3_temp: (i_wr) ? qspi_sio3_temp: 1'bz;

always@(posedge clk)begin//QSPI发送数据,将数据线上的数据移位至r_data_temp寄存器
	if(rst)begin
		r_data_temp <= 8'b0;
	end else if(qspi_sck_p && (state == DATA))begin
			r_data_temp[7 - (data_cnt-1)<<1] <= qspi_sio3 ;
			r_data_temp[6 - (data_cnt-1)<<1] <= qspi_sio2 ;
			r_data_temp[5 - (data_cnt-1)<<1] <= qspi_sio1 ;
			r_data_temp[4 - (data_cnt-1)<<1] <= qspi_sio0 ;
	end else begin
		r_data_temp <= r_data_temp;
	end
end

always@(posedge clk)begin//将移位寄存器中的数据输出
	if(rst)
		o_data <= 8'b0;
	else if(data_cnt == 8'd4)
		o_data <= r_data_temp;
	else
		o_data  <= o_data;
end

endmodule

Simulation waveform diagram

Please add image description
Please add image description
In the waveform diagram, the command word 0xEB represents the read operation, 0x38 represents the write operation, and 0x06 represents the write enable command. The data written this time is 0x01, 0x12, 0x23 in sequence to 0xF0. It can be seen from the figure that the written The data is consistent with the read data, indicating that the QSPI communication function is normal. After actual testing, the QSPI rate of this article can reach 75MHz. (FPGA clock frequency is 150MHz).

Summarize

To sum up, it can be seen that the underlying QSPI code and SPI code have similar writing ideas. The main differences are writing command words, writing addresses, and data collection of 4 data lines. The most important thing is to write it in conjunction with actual applications to achieve specific functions!

Guess you like

Origin blog.csdn.net/m0_51575600/article/details/132693943