STM32 X-CUBE-AI: PyTorch model deployment process

Overview

The X-CUBE-AI expansion package for STM32CUBEMX covers the full deployment flow: model conversion, CUBEAI model verification, and CUBEAI model application.
The deep learning model used here is a PyTorch network containing multiple LSTMs and fully connected layers (including Dropout and activation function layers).

Version:

STM32CUBEMX: 6.8.1
X-CUBE-AI: 8.1.0 (recommended; this version updates its LSTM support)
ONNX: 1.14.0

References

If you encounter errors or bugs, you can ask questions on the ST Community forum: ST Community
CUBEAI Getting Started Guide Download address: X-CUBE-AI Getting Started Guide Manual
Official application example: Deployment example

STM32CUBEAI installation

There are already many tutorials covering installation of the CUBEAI expansion package, so it is not repeated here: CUBEAI installation
Note that STM32CUBEMX may not offer the latest CUBEAI version for installation; you can download the latest version (which updates the model implementations) from the ST website: https://www.st.com/zh/embedded-software/x-cube-ai.html

CUBEAI model support

Currently, CUBEAI supports three types of models:

  1. Keras: .h5
  2. TensorFlow Lite: .tflite
  3. All models that can be converted into ONNX format: .onnx

To deploy with CUBEAI, the PyTorch-generated .pth model must first be converted to .onnx.

LSTM model conversion considerations

  1. Due to limitations of the CUBEAI expansion package and ONNX regarding LSTM conversion, set batch_first=False when building the PyTorch model (this does not affect model training or application). After setting it, pay attention to the layout of the input and output data.
  2. Inside the LSTM model, slice the data before the fully connected layer and keep only the last time step, to avoid conversion problems.
  3. For multiple LSTMs, the forward function can take multiple inputs x, each of which is the input of one LSTM.
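The three points above can be combined into a minimal PyTorch sketch. Everything here (layer sizes, the two-branch layout, the class name) is illustrative, not the article's actual network:

```python
import torch
import torch.nn as nn

# Minimal two-LSTM model following the three conversion rules above.
class TwoBranchLSTM(nn.Module):
    def __init__(self, in1=4, in2=6, hidden=32, out_dim=3):
        super().__init__()
        # Rule 1: batch_first=False -> inputs are (seq_len, batch, features)
        self.lstm1 = nn.LSTM(in1, hidden, batch_first=False)
        self.lstm2 = nn.LSTM(in2, hidden, batch_first=False)
        self.drop = nn.Dropout(0.2)
        self.fc = nn.Linear(2 * hidden, out_dim)

    # Rule 3: one forward() argument per LSTM branch
    def forward(self, x1, x2):
        y1, _ = self.lstm1(x1)
        y2, _ = self.lstm2(x2)
        # Rule 2: slice out the last time step before the fully connected layer
        h = torch.cat((y1[-1], y2[-1]), dim=1)
        return self.fc(self.drop(h))

model = TwoBranchLSTM().eval()
x1 = torch.randn(10, 1, 4)   # (seq_length, batch=1, input_num)
x2 = torch.randn(10, 1, 6)
with torch.no_grad():
    out = model(x1, x2)      # shape (1, 3)
```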

Model conversion

  1. Pytorch->Onnx
    Use the torch.onnx.export() function, in which dynamic axes can be declared. There are many tutorials on this function, so it is not repeated here: Pytorch to ONNX and verification
    Since deployment corresponds to the model application stage, the input can be fixed at batch=1; seq_length and input_num can also be declared as dynamic axes (after seq_length is declared dynamic, the sample data CUBEMX uses for verification has seq_length=1).

  2. Onnx->STM32
    The general process can refer to: STM32 Model Verification
    Select the corresponding model in STM32CUBEMX, and change the model name to ease subsequent network deployment (do not use the default network name).
    Various problems may occur during the verification phase; the error type reported by the tool indicates the cause.

Model application

So far, the PyTorch model has been successfully converted to ONNX and has passed verification in CUBEMX. The following covers the model application part on the STM32.

After opening the project in Keil, the model-related files generated by CUBEMX can be found in the project directory. The main files used are modelName.c and modelName.h, where modelName is the model name defined in CUBEMX.
The main functions used are as follows:

  1. ai_modelName_create_and_init: used for model creation and initialization
  2. ai_modelName_inputs_get: used to get the model input buffers
  3. ai_modelName_outputs_get: used to get the model output buffers
  4. ai_modelName_run: used to run the model feed-forward and obtain the output
  5. ai_mnetwork_get_error: used to obtain the model error code, for debugging

The relevant parameters and usage of each function are as follows.
Similar codes can be found in: X-CUBE-AI Getting Started Manual

1 Error type and code

/*!
 * @enum ai_error_type
 * @ingroup ai_platform
 *
 * Generic enum to list network error types.
 */
typedef enum {
  AI_ERROR_NONE                         = 0x00,     /*!< No error */
  AI_ERROR_TOOL_PLATFORM_API_MISMATCH   = 0x01,
  AI_ERROR_TYPES_MISMATCH               = 0x02,
  AI_ERROR_INVALID_HANDLE               = 0x10,
  AI_ERROR_INVALID_STATE                = 0x11,
  AI_ERROR_INVALID_INPUT                = 0x12,
  AI_ERROR_INVALID_OUTPUT               = 0x13,
  AI_ERROR_INVALID_PARAM                = 0x14,
  AI_ERROR_INVALID_SIGNATURE            = 0x15,
  AI_ERROR_INVALID_SIZE                 = 0x16,
  AI_ERROR_INVALID_VALUE                = 0x17,
  AI_ERROR_INIT_FAILED                  = 0x30,
  AI_ERROR_ALLOCATION_FAILED            = 0x31,
  AI_ERROR_DEALLOCATION_FAILED          = 0x32,
  AI_ERROR_CREATE_FAILED                = 0x33,
} ai_error_type;

/*!
 * @enum ai_error_code
 * @ingroup ai_platform
 *
 * Generic enum to list network error codes.
 */
typedef enum {
  AI_ERROR_CODE_NONE                = 0x0000,    /*!< No error */
  AI_ERROR_CODE_NETWORK             = 0x0010,
  AI_ERROR_CODE_NETWORK_PARAMS      = 0x0011,
  AI_ERROR_CODE_NETWORK_WEIGHTS     = 0x0012,
  AI_ERROR_CODE_NETWORK_ACTIVATIONS = 0x0013,
  AI_ERROR_CODE_LAYER               = 0x0014,
  AI_ERROR_CODE_TENSOR              = 0x0015,
  AI_ERROR_CODE_ARRAY               = 0x0016,
  AI_ERROR_CODE_INVALID_PTR         = 0x0017,
  AI_ERROR_CODE_INVALID_SIZE        = 0x0018,
  AI_ERROR_CODE_INVALID_FORMAT      = 0x0019,
  AI_ERROR_CODE_OUT_OF_RANGE        = 0x0020,
  AI_ERROR_CODE_INVALID_BATCH       = 0x0021,
  AI_ERROR_CODE_MISSED_INIT         = 0x0030,
  AI_ERROR_CODE_IN_USE              = 0x0040,
  AI_ERROR_CODE_LOCK                = 0x0041,
} ai_error_code;

2 Model creation and initialization

/*!
 * @brief Create and initialize a neural network (helper function)
 * @ingroup pytorch_ftc_lstm
 * @details Helper function to instantiate and to initialize a network. It returns an object to handle it;
 * @param network an opaque handle to the network context
 * @param activations array of addresses of the activations buffers
 * @param weights array of addresses of the weights buffers
 * @return an error code reporting the status of the API on exit
 */
AI_API_ENTRY
ai_error ai_modelName_create_and_init(
  ai_handle* network, const ai_handle activations[], const ai_handle weights[]);

Focus on the input parameters network and activations: their data types are ai_handle (i.e. void*). The initialization method is as follows:

	ai_error err;
	ai_handle network = AI_HANDLE_NULL;
	const ai_handle act_addr[] = { activations };

	// Instantiate the neural network
	err = ai_modelName_create_and_init(&network, act_addr, NULL);
	if (err.type != AI_ERROR_NONE)
	{
		printf("E: AI error - type=%d code=%d\r\n", err.type, err.code);
	}

3 Get input and output data variables

/*!
 * @brief Get network inputs array pointer as a ai_buffer array pointer.
 * @ingroup pytorch_ftc_lstm
 * @param network an opaque handle to the network context
 * @param n_buffer optional parameter to return the number of outputs
 * @return a ai_buffer pointer to the inputs arrays
 */
AI_API_ENTRY
ai_buffer* ai_modelName_inputs_get(
  ai_handle network, ai_u16 *n_buffer);

/*!
 * @brief Get network outputs array pointer as a ai_buffer array pointer.
 * @ingroup pytorch_ftc_lstm
 * @param network an opaque handle to the network context
 * @param n_buffer optional parameter to return the number of outputs
 * @return a ai_buffer pointer to the outputs arrays
 */
AI_API_ENTRY
ai_buffer* ai_modelName_outputs_get(
  ai_handle network, ai_u16 *n_buffer);

You need to create the input and output data first:

// Input and output buffer structs
ai_buffer* ai_input;
ai_buffer* ai_output;

// The ai_buffer struct is defined as follows
/*!
 * @struct ai_buffer
 * @ingroup ai_platform
 * @brief Memory buffer storing data (optional) with a shape, size and type.
 * This datastruct is used also for network querying, where the data field
 * may be NULL.
 */
typedef struct ai_buffer_ {
  ai_buffer_format        format;     /*!< buffer format */
  ai_handle               data;       /*!< pointer to buffer data */
  ai_buffer_meta_info*    meta_info;  /*!< pointer to buffer metadata info */
  /* New 7.1 fields */
  ai_flags                flags;      /*!< shape optional flags */
  ai_size                 size;       /*!< number of elements of the buffer (including optional padding) */
  ai_buffer_shape         shape;      /*!< n-dimensional shape info */
} ai_buffer;

Then call the function to assign the structure:

	ai_input = ai_modelName_inputs_get(network, NULL);
	ai_output = ai_modelName_outputs_get(network, NULL);

Next, you need to assign values to the data fields of these structs. ai_input and ai_output hold the input and output addresses; for models with multiple inputs, you can index the inputs as an array:

// Single input
ai_float *pIn;
ai_input[0].data = AI_HANDLE_PTR(pIn);

// Multiple inputs
ai_float *pIn[AI_MODELNAME_IN_NUM];
for (int i = 0; i < AI_MODELNAME_IN_NUM; i++)
{
	ai_input[i].data = AI_HANDLE_PTR(pIn[i]);
}

// Output
ai_float *pOut;
ai_output[0].data = AI_HANDLE_PTR(pOut);

In the multi-input case, pIn is an array of pointers, one per input; AI_MODELNAME_IN_NUM is a macro giving the number of model inputs.
AI_HANDLE_PTR is a macro that casts an ai_float* pointer to the ai_handle type.

#define AI_HANDLE_PTR(ptr_)           ((ai_handle)(ptr_))

4 Get model feed-forward output

/*!
 * @brief Run the network and return the output
 * @ingroup pytorch_ftc_lstm
 *
 * @details Runs the network on the inputs and returns the corresponding output.
 * The size of the input and output buffers is stored in this
 * header generated by the code generation tool. See AI_PYTORCH_FTC_LSTM_*
 * defines into file @ref pytorch_ftc_lstm.h for all network sizes defines
 *
 * @param network an opaque handle to the network context
 * @param[in] input buffer with the input data
 * @param[out] output buffer with the output data
 * @return the number of input batches processed (default 1) or <= 0 if it fails
 * in case of error the error type could be queried by 
 * using @ref ai_pytorch_ftc_lstm_get_error
 */
AI_API_ENTRY
ai_i32 ai_modelName_run(
  ai_handle network, const ai_buffer* input, ai_buffer* output);

The function takes the network handle and the input and output buffer pointers, and returns the number of input batches processed (this should be 1 in the application stage). You can check whether the model ran successfully by testing whether the return value is 1.

	printf("---------Running Network-------- \r\n");
	batch = ai_modelName_run(network, ai_input, ai_output);
	printf("---------Running End-------- \r\n");
	if (batch != BATCH) {
		err = ai_mnetwork_get_error(network);
		printf("E: AI error - type=%d code=%d\r\n", err.type, err.code);
		Error_Handler();
	}

After running, the model output can be obtained by reading the pOut array.

void printData_(ai_float *pOut, ai_i8 num)
{
	printf("(Total Num: %d): ", num);
	for (int i = 0; i < num; i++)
	{
		if (i == num - 1)
		{
			printf("%.4f. \r\n", pOut[i]);
		}
		else
		{
			printf("%.4f, ", pOut[i]);
		}
	}
}

Model application summary

The above steps can be packaged into AI_Init and AI_Run routines according to the method in the official deployment example.

Summary

Bugs encountered during deployment will be added here as they come up.


Origin blog.csdn.net/qq_48691686/article/details/131438689