STM32 HAL library: a serial DMA transceiver mechanism with a ping-pong buffer and idle interrupt that runs comfortably at a 2 Mbps baud rate

Foreword

Direct memory access (DMA) allows some devices to move data on their own, without CPU intervention, so using DMA saves considerable CPU time when large amounts of data are transferred. DMA transfers in STM32 generally go in one of three directions: memory -> memory, peripheral -> memory, and memory -> peripheral. The peripherals here can be data transceivers such as UART and SPI.

The Universal Asynchronous Receiver/Transmitter (UART), usually just called a serial port in embedded development, is typically used for low- and medium-speed communication, with baud rates from as low as 6400 bps up to 4~5 Mbps. When the baud rate is lower than 115200 bps and the amount of data is small, DMA is generally not used for sending and receiving: STM32 chips run at tens to hundreds of megahertz, so servicing the interrupts of a low-speed serial port is trivial. However, when the amount of data is large or the baud rate rises to the order of Mbps, DMA becomes necessary, because blocking or interrupt-driven transfers would then take up too much CPU time and interfere with the execution of other tasks.
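
To put rough numbers on this, the sketch below estimates how many bytes per second a saturated link delivers and how many CPU cycles fit between two consecutive bytes if every byte raised its own interrupt. It assumes an 8N1 frame (10 bits on the wire per byte) and a 168 MHz core clock; the function names are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second on a saturated link. bits_per_frame is 10 for 8N1
 * (1 start bit + 8 data bits + 1 stop bit). */
uint32_t bytes_per_second(uint32_t baud, uint32_t bits_per_frame)
{
    assert(bits_per_frame > 0);
    return baud / bits_per_frame;
}

/* CPU cycles available between two consecutive bytes, i.e. the budget for
 * one per-byte interrupt when DMA is not used. */
uint32_t cycles_per_byte(uint32_t cpu_hz, uint32_t baud, uint32_t bits_per_frame)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)cpu_hz * bits_per_frame) / baud);
}
```

Under these assumptions, 115200 bps leaves about 14583 cycles per byte, while 2 Mbps leaves only 840, and interrupt entry, the handler and interrupt exit all have to fit into that budget.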

There are plenty of examples and blog posts online about sending and receiving data with DMA on STM32, and they are fine for learning how to use DMA. But most of them cover only basic usage, and data errors are likely in high-speed, high-volume scenarios. A fast and reliable serial transceiver needs DMA, but a double buffer, the idle interrupt and a FIFO data buffer are just as important. That is the problem this article sets out to solve.

STM32CubeMX configuration

The development platform used in this article:

  • STM32F407 (RoboMaster Type C board)
  • STM32CubeMX 6.3.0
  • STM32Cube FW_F4 V1.26.2
  • CLion
  • GNU C/C++ Compiler

First, enable the high-speed external (HSE) clock.

Then set up the clock tree. In the first field, enter the frequency of the external crystal oscillator you actually use; in the second, enter the system clock frequency, which is usually the maximum frequency of your chip (168 MHz for the F407 used here). Press Enter and the remaining values are calculated automatically, which is very convenient.
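
What CubeMX works out here is, in essence, the main PLL equation of the STM32F4: the crystal frequency is divided by M, multiplied by N and divided by P. The sketch below reproduces that arithmetic; the 8 MHz and 12 MHz crystal values and the divider settings are only example assumptions, so check them against your own board.

```c
#include <assert.h>
#include <stdint.h>

/* SYSCLK = HSE / PLLM * PLLN / PLLP for the main PLL of the STM32F4. */
uint32_t pll_sysclk_hz(uint32_t hse_hz, uint32_t pllm, uint32_t plln, uint32_t pllp)
{
    assert(pllm > 0 && pllp > 0);
    uint32_t vco_in  = hse_hz / pllm;   /* PLL input after the /M divider */
    uint32_t vco_out = vco_in * plln;   /* VCO output after the xN multiplier */
    return vco_out / pllp;              /* system clock after the /P divider */
}
```

For example, an 8 MHz crystal with M = 8, N = 336, P = 2 and a 12 MHz crystal with M = 6, N = 168, P = 2 both give 168 MHz.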

Next configure the serial port:

  1. Select a serial port;

  2. Set the mode to Asynchronous;

  3. Set the baud rate, frame length, parity and stop bit length;

  4. Click Add to add DMA requests for both receiving and sending. Be sure to change the DMA mode of RX to Circular: DMA reception then only has to be started once, and the DMA automatically returns to the start of the buffer when the buffer is full, so there is no need to re-enable DMA after every reception;

  5. Enable the serial port's global interrupt;

  6. Select the correct GPIO pins. Because most of the default pins chosen by CubeMX happen to be correct, this step is easily overlooked, and when a bug shows up later it is hard to trace it back to here, so be sure to check.

Other settings such as the debug interface, operating system and project management are not described in detail; once they are set as usual, click GENERATE CODE.

Serial DMA reception

After the serial port receives data, the DMA transfers it byte by byte into RX_Buf. Once a certain amount has been transferred, an interrupt is generated (idle interrupt, half-full interrupt or full interrupt) and the program enters the callback function to process the data. In this article, processing means writing the data into a FIFO for the application to read, which is introduced later. Let's first look at the flow chart of data reception.

The full interrupt and the half-full interrupt are easy to understand: they are generated when the serial DMA buffer is completely filled and half filled, respectively. The idle interrupt is generated when, after the last byte has been received, the line stays quiet for the duration of one more byte, that is, when the bus has entered the idle state. This is very convenient for receiving variable-length data.

Most tutorials online receive data with the full interrupt plus the idle interrupt, but this carries a risk: the DMA moves data independently of the CPU, so the CPU and the DMA may access the buffer at the same time. While the CPU is still processing the data, the DMA keeps transferring and overwrites the part of the buffer that has not been read yet, and data is lost. A more reasonable approach is to use the half-full interrupt to implement a ping-pong buffer.

A ping-pong buffer built from a single buffer

With a ping-pong buffer, while one buffer is being written, the device reads and processes the data in the other; once the write is finished, the two sides swap buffers and carry on. This leaves the device enough time to process the data and prevents old data from being overwritten by new data before it has been read completely. There is one small problem, though: on most STM32 models the serial DMA has only one buffer. How can a ping-pong buffer be built?

That's right: with the half-full interrupt. One buffer can now be split in two.

With this picture in mind, let's go over the three interrupts mentioned above: the half-full interrupt is triggered once the first half of the receive buffer has been filled, and the full interrupt once the second half has been filled; the idle interrupt is triggered when neither of those two fires but the data packet has ended and no more data follows. For example, sending a 25-byte data packet to this program with a buffer size of 20 generates three interrupts, as shown in the figure below.
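
The 25-byte example can be replayed on a PC with a small model of a circular buffer. This is only a simplified sketch, not HAL code: the type and function names are invented for the illustration, and an idle event that lands exactly on the half or full boundary is left out because it would carry no new data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { EV_HALF, EV_FULL, EV_IDLE } rx_event_kind;

typedef struct {
    rx_event_kind kind;
    uint16_t      size;   /* bytes filled in the buffer when the event fires */
} rx_event;

/* Feed packet_len bytes into a circular buffer of buf_size bytes, starting at
 * write position *pos. Stores the generated events in out[] (at most
 * max_events) and returns how many were stored. */
size_t simulate_packet(uint16_t buf_size, uint16_t *pos, uint16_t packet_len,
                       rx_event *out, size_t max_events)
{
    assert(buf_size >= 2 && buf_size % 2 == 0 && *pos < buf_size);
    size_t n = 0;
    for (uint16_t i = 0; i < packet_len; i++) {
        (*pos)++;                                    /* DMA stores one byte */
        if (*pos == buf_size / 2) {                  /* first half complete */
            if (n < max_events) out[n++] = (rx_event){ EV_HALF, *pos };
        } else if (*pos == buf_size) {               /* second half complete */
            if (n < max_events) out[n++] = (rx_event){ EV_FULL, *pos };
            *pos = 0;                                /* circular mode wraps */
        }
    }
    /* The line goes idle after the packet. Report it only when the write
     * position is strictly inside one half of the buffer. */
    if (packet_len > 0 && *pos != 0 && *pos != buf_size / 2 && n < max_events) {
        out[n++] = (rx_event){ EV_IDLE, *pos };
    }
    return n;
}
```

Starting from an empty 20-byte buffer, a 25-byte packet yields a half-full event at 10 bytes, a full event at 20 bytes and an idle event at position 5 after the wrap.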

Program implementation

That completes the principles. Thanks to the HAL library provided by ST, implementing them in C is very simple.

First, start serial DMA reception.

#define RX_BUF_SIZE 20
uint8_t USART1_Rx_buf[RX_BUF_SIZE];
HAL_UARTEx_ReceiveToIdle_DMA(&huart1, USART1_Rx_buf, RX_BUF_SIZE);

Then write the callback function, which moves the data from USART1_Rx_buf into the FIFO.

void HAL_UARTEx_RxEventCallback(UART_HandleTypeDef *huart, uint16_t Size)
{
    static uint16_t Rx_buf_pos;	// start of the data received by this callback within the buffer
    static uint16_t Rx_length;	// length of the data received by this callback
    Rx_length = Size - Rx_buf_pos;
    fifo_s_puts(&uart_rx_fifo, &USART1_Rx_buf[Rx_buf_pos], Rx_length);	// write the data into the FIFO
    Rx_buf_pos += Rx_length;
    if (Rx_buf_pos >= RX_BUF_SIZE) Rx_buf_pos = 0;	// buffer used up: start again from 0
}

This callback is declared as a weak function in the HAL library, so you need to override it yourself. It takes two parameters. The first needs no explanation; the second, Size, is the number of bytes of the whole buffer that have been filled so far. The neat part is that all three interrupts mentioned above end up here, so only a few lines of code are needed.

This raises a question, though: how are the three interrupts told apart? The answer is that there is no need to. To complete the reception, you only have to work out the start position and the length of the data received each time. That is why I defined two static variables: the length of the data received this time = the amount of the buffer used so far (Size) - the start position of this callback's data in the buffer; the start position begins at 0 and is advanced by the length just received after each callback.
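
The bookkeeping can be checked on a PC without any hardware. The sketch below uses the same arithmetic as the callback, but returns the (start, length) pair instead of writing to the FIFO; rx_chunk and on_rx_event are names invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define RX_BUF_SIZE 20

typedef struct {
    uint16_t start;    /* offset of the new data inside the receive buffer */
    uint16_t length;   /* number of new bytes reported by this callback */
} rx_chunk;

static uint16_t rx_buf_pos = 0;   /* where the next unread byte starts */

/* Same arithmetic as the callback, with the FIFO write replaced by a
 * returned (start, length) pair so that it can be tested off-target. */
rx_chunk on_rx_event(uint16_t size)
{
    assert(size >= rx_buf_pos && size <= RX_BUF_SIZE);
    rx_chunk chunk = { rx_buf_pos, (uint16_t)(size - rx_buf_pos) };
    rx_buf_pos = size;
    if (rx_buf_pos >= RX_BUF_SIZE) {
        rx_buf_pos = 0;           /* buffer used up: start again at offset 0 */
    }
    return chunk;
}
```

Feeding it the Size values 10, 20 and 5 from the 25-byte example gives the chunks (0, 10), (10, 10) and (0, 5). A repeated Size value simply produces a chunk of length 0, so no event needs special treatment.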

Serial DMA send

Sending over serial DMA is much simpler than receiving. You only need to copy the data from the transmit FIFO into the transmit buffer and then call the HAL library's send function:

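#define TX_FIFO_SIZE 100
static uint8_t buf[TX_FIFO_SIZE];                   // transmit buffer; must stay untouched until the DMA transfer ends
uint16_t len = fifo_s_used(&uart_tx_fifo);          // length of the data waiting to be sent
if (len > TX_FIFO_SIZE) len = TX_FIFO_SIZE;         // never copy more than the buffer can hold
if (len > 0) {
    fifo_s_gets(&uart_tx_fifo, (char *)buf, len);   // take the data out of the FIFO
    HAL_UART_Transmit_DMA(&huart1, buf, len);       // send
}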
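
One detail the snippet leaves open is what happens when the previous transfer is still running: HAL_UART_Transmit_DMA then returns HAL_BUSY, and data already taken out of the FIFO would be lost. A possible way around this, sketched below for a PC rather than the target, is a busy flag that is set when a transfer starts and cleared on completion. The flat tx_queue array is only a stand-in for the real transmit FIFO, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TX_BUF_SIZE   100
#define TX_QUEUE_SIZE 256

static uint8_t  tx_buf[TX_BUF_SIZE];      /* buffer handed to the DMA */
static uint8_t  tx_queue[TX_QUEUE_SIZE];  /* stand-in for the transmit FIFO */
static uint16_t tx_queue_rd = 0, tx_queue_wr = 0;
static volatile bool tx_busy = false;     /* true while a transfer is in flight */

/* Application side: queue bytes for sending. Returns the number accepted. */
uint16_t tx_enqueue(const uint8_t *data, uint16_t len)
{
    assert(data != NULL);
    uint16_t room = (uint16_t)(TX_QUEUE_SIZE - tx_queue_wr);
    if (len > room) len = room;           /* this simple queue does not wrap */
    memcpy(&tx_queue[tx_queue_wr], data, len);
    tx_queue_wr += len;
    return len;
}

/* Main-loop side: start at most one transfer. Returns the bytes handed over. */
uint16_t tx_pump(void)
{
    if (tx_busy) return 0;                /* previous buffer still being sent */
    uint16_t len = (uint16_t)(tx_queue_wr - tx_queue_rd);
    if (len == 0) return 0;               /* nothing to send */
    if (len > TX_BUF_SIZE) len = TX_BUF_SIZE;
    memcpy(tx_buf, &tx_queue[tx_queue_rd], len);
    tx_queue_rd += len;
    tx_busy = true;
    /* On the target, HAL_UART_Transmit_DMA(&huart1, tx_buf, len) would go here. */
    return len;
}

/* On the target this would be called from HAL_UART_TxCpltCallback(). */
void tx_complete(void)
{
    tx_busy = false;
}
```

With 150 bytes queued, the first tx_pump() call hands over 100 bytes, further calls return 0 until tx_complete() runs, and the next call sends the remaining 50.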

FIFO queue

First In, First Out (FIFO) may sound unfamiliar, but you probably know it under the name queue. In this article, a FIFO serves as the buffer between the DMA transmit and receive buffers (RX_Buf, TX_Buf) and the application. That is rather abstract, so look at the data flow for reception shown in the figure below. The data flows in the opposite direction when sending.

When the serial port receives data, the DMA transfers it from the serial port's data register into RX_Buf, the receive buffer set aside in memory, and generates an interrupt (half-full, full or idle). In the interrupt callback, the data in RX_Buf is written into the FIFO. The application only needs to check whether the FIFO is empty and can read the data whenever it is not.

This may seem superfluous: there is already a DMA receive buffer, so why not read the data directly from RX_Buf? The problem is that RX_Buf can only be processed correctly inside the callback of an interrupt generated by the DMA. Even while the interrupt callback is blocked, the serial port keeps receiving and the DMA keeps transferring data into RX_Buf, but the size of RX_Buf is limited, so later data overwrites earlier data. The data therefore has to be processed as soon as it arrives, or it is lost. A FIFO can overflow too, but that is less likely to happen and relatively simple to deal with.

Another reason to use a FIFO is that it isolates the application layer from the driver layer. The application does not need to care how much data ends up in RX_Buf under which circumstances; it just reads from the FIFO. The serial DMA interrupt callback also takes a fixed form: it simply pushes the data into the FIFO. In scenarios with variable-length data and large data volumes, a FIFO is without doubt a necessary component.
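
To make the producer/consumer split concrete, here is a minimal byte ring buffer for a PC. It is a stand-in written for this illustration, not the FIFO library used in this article: the demo_fifo names are invented, and it has no locking, so it shows only the data flow and the overflow behaviour.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_FIFO_SIZE 64   /* capacity in bytes */

typedef struct {
    uint8_t  data[DEMO_FIFO_SIZE];
    uint16_t head;   /* next write index */
    uint16_t tail;   /* next read index */
    uint16_t used;   /* bytes currently stored */
} demo_fifo;

/* Producer side (the DMA callback): append up to len bytes.
 * Returns the number of bytes stored. */
uint16_t demo_fifo_puts(demo_fifo *f, const uint8_t *src, uint16_t len)
{
    assert(f != 0 && src != 0);
    uint16_t n = 0;
    while (n < len && f->used < DEMO_FIFO_SIZE) {
        f->data[f->head] = src[n++];
        f->head = (uint16_t)((f->head + 1) % DEMO_FIFO_SIZE);
        f->used++;
    }
    return n;   /* less than len: the FIFO was full and the rest was dropped */
}

/* Consumer side (the application): read up to len bytes.
 * Returns the number of bytes read. */
uint16_t demo_fifo_gets(demo_fifo *f, uint8_t *dst, uint16_t len)
{
    assert(f != 0 && dst != 0);
    uint16_t n = 0;
    while (n < len && f->used > 0) {
        dst[n++] = f->data[f->tail];
        f->tail = (uint16_t)((f->tail + 1) % DEMO_FIFO_SIZE);
        f->used--;
    }
    return n;
}
```

The application side then reduces to calling demo_fifo_gets() whenever used is non-zero, without knowing how the bytes were split across DMA events.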

Attentive readers may have noticed that there is also a "FIFO" in the STM32CubeMX screenshot of the serial DMA configuration above. How does it differ from the FIFO described here? This confused me as well, so I will explain it briefly.

This FIFO and the DMA's FIFO are not the same FIFO

The DMA has a FIFO too, but its role is to add a FIFO buffer between the serial port's data register and the memory buffer. The data flow is as follows.

Since the serial port's data register holds only one byte, a DMA in direct mode must perform one transfer to the memory buffer for every byte. In practice, the DMA's FIFO simply collects a batch of data and sends it in one go, which reduces software overhead and the number of transfers on the AHB bus. It suits scenarios where the data is continuous and the system has other tasks with high overhead. However, because the DMA FIFO has to accumulate a full batch before sending and sends nothing until then, it has its limitations. This article does not use the DMA's FIFO; it uses direct mode.
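
The limitation is easy to see with a little arithmetic. The sketch below is a deliberately simplified model that assumes the DMA FIFO is flushed to memory only in whole batches of a fixed threshold; the 4-byte threshold in the example and the function name are assumptions for illustration, not values taken from this project.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t written;  /* bytes that have reached the memory buffer */
    uint32_t held;     /* bytes still waiting inside the DMA FIFO */
} fifo_split;

/* Simplified model: data leaves the DMA FIFO only in whole batches of
 * threshold_bytes; anything short of a full batch stays behind. */
fifo_split dma_fifo_split(uint32_t received, uint32_t threshold_bytes)
{
    assert(threshold_bytes > 0);
    fifo_split s;
    s.held    = received % threshold_bytes;
    s.written = received - s.held;
    return s;
}
```

Under this model, a 17-byte packet with a 4-byte threshold leaves 1 byte stuck in the DMA FIFO until more data arrives, whereas a threshold of 1, which behaves like direct mode, delivers all 17 bytes.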

Porting the FIFO

After all this talk, it is finally time to write code. Instead of implementing a FIFO ring buffer myself, I ported the FIFO used in the RoboMaster AI robot's firmware.

  1. Copy the files fifo.c and fifo.h from the repo above into your own project.

  2. In fifo.h, delete #include "sys.h". Then find sys.h at the link above, copy the following lines that implement the mutex into fifo.h, and additionally include the header file cmsis_gcc.h:

    #include <cmsis_gcc.h>
    #define MUTEX_DECLARE(mutex) unsigned long mutex
    #define MUTEX_INIT(mutex)    do{mutex = 0;}while(0)
    #define MUTEX_LOCK(mutex)    do{__disable_irq();}while(0)
    #define MUTEX_UNLOCK(mutex)  do{__enable_irq();}while(0)
    
  3. This FIFO library uses malloc to create queues dynamically. If you use an operating system, replace it with the operating system's memory management API. This article, however, does not create queues dynamically.

Using the FIFO

The FIFO has already appeared in the serial DMA reception and serial DMA send sections; here are the calls again.

fifo_s_puts(&uart_rx_fifo, &USART1_Rx_buf[Rx_buf_pos], Rx_length);	// write the data into the FIFO
uint16_t len = fifo_s_used(&uart_tx_fifo);		// length of the data waiting to be sent
fifo_s_gets(&uart_tx_fifo, (char *)buf, len);	// take the data out of the FIFO

Stress test

A send/receive pipeline like this is of course unnecessary in a low-speed environment (115200 bps), but how high a baud rate it can handle and how stable it is remain open questions. So it needs to be tested.

I tested with serial port modules based on two chips, the PL2303 and the FT232.

PL2303

According to its data sheet, the PL2303 supports serial baud rates from 75 bps to 6 Mbps. In my tests, however, the maximum baud rate was about 970000 bps; above that no data could be received at all, which is far below the expected value. I hope someone kind enough can tell me what is going on.

Next came the communication stability test. I sent 17-byte data packets to the microcontroller at the minimum automatic resend interval of 10 ms, and the microcontroller echoed all the data back. The test ran for 58 minutes, sending 1958.69 KB and receiving 1958.69 KB with no packet loss, so stability passed. In the screenshot, Tx is larger than Rx because the last packet had been sent but not yet received when the test was stopped; the two values were equal throughout the run.

Another small problem with the PL2303 is that its driver has issues on Win10: you need to download and install an older version of the driver yourself to use it.

FT232

The FT232 data sheet has the following description:

  • Data transfer rates from 300 baud to 3 Mbaud (RS422/RS485 and TTL levels) and 300 baud to 1 Mbaud (RS232)

The maximum baud rate I measured with the FT232-based serial port module I have on hand is 2 Mbps, which finally meets expectations.

The stability test at 2 Mbps also went very well: it ran for 66 minutes without packet loss.

I then used a logic analyzer for a simple measurement of the actual communication delay at 2 Mbps. The test method was to send a 17-byte data packet, which the microcontroller echoes back in full over the serial port after receiving it:

  • The forwarding delay is around 400 μs. In my program, this time is mainly determined by how often the FIFO is checked for data, which is currently a nominal 1000 Hz.

  • A single 17-byte data packet lasts 84 μs, and the whole send-and-receive process takes about 0.5 ms.

For comparison, at 115200 bps a single 17-byte data packet lasts about 1500 μs, and the whole send-and-receive process of the packet takes about 3100 μs.
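
The measured packet durations agree with a simple calculation. The sketch below assumes an 8N1 frame, that is, 10 bits on the wire per byte (the frame format is an assumption here); packet_time_us is a name made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Time on the wire for n_bytes, in microseconds, rounded to the nearest
 * microsecond. bits_per_frame is 10 for 8N1 (1 start + 8 data + 1 stop). */
uint32_t packet_time_us(uint32_t n_bytes, uint32_t bits_per_frame, uint32_t baud)
{
    assert(baud > 0);
    uint64_t bits = (uint64_t)n_bytes * bits_per_frame;
    return (uint32_t)((bits * 1000000u + baud / 2) / baud);
}
```

For 17 bytes this gives 85 μs at 2 Mbps and 1476 μs at 115200 bps, close to the 84 μs and roughly 1500 μs seen on the logic analyzer.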

According to online blogs, the STM32F407 supports up to 10.5 Mbps, but I did not find this in the manual. Either way, 2 Mbps is certainly not its limit. When the microcontroller is connected to a computer, the speed is limited by the serial port module, and 2 Mbps is basically the ceiling; serial communication between two microcontrollers, however, still has potential to be tapped.

Origin blog.csdn.net/wanglei_11/article/details/131576165