1. About MMDeploy
MMDeploy is OpenMMLab's model deployment toolbox. It provides a unified deployment workflow across the OpenMMLab algorithm libraries. With MMDeploy, developers can generate the SDK required for a given hardware target directly from the training repo, which saves considerable adaptation time. It also provides a set of tools that make it easier to deploy OpenMMLab algorithms to a variety of devices and platforms.
2. Environment installation
Tip: Do not use the deploy tool bundled with the mmseg project; use the separate mmdeploy project from OpenMMLab instead (a lesson learned the hard way).
Since this is inference deployment, the model has most likely already been trained, so the basic OpenMMLab packages (mmsegmentation, mmcv-full...) should already be installed. You can therefore skip the official website's steps for installing mmseg and mmcv: just activate your local training environment (for example, openmmlab) and install the missing packages.
The packages that need to be installed are: mmdeploy, onnx, tensorrt, pycuda, and so on.
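Before installing anything, it can help to see which of these packages the activated training environment already has. The following is a minimal sketch, assuming the import names match the package names listed above; the helper name find_missing is made up for illustration.

```python
import importlib.util

# Import names assumed to match the packages listed above; adjust to your setup.
REQUIRED_MODULES = ["mmdeploy", "onnx", "tensorrt", "pycuda"]


def find_missing(modules):
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All listed packages are importable.")
```

Run it inside the activated environment; anything it reports as missing is what still needs to be installed.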
Precompiled packages are available for the platforms and devices listed below. If yours is not among them, download the source code and build and install it yourself.
2.1 Precompiled installation (Linux-x86_64, CUDA 11.x, TensorRT 8.2.3.0):
wget https://github.com/open-mmlab/mmdeploy/releases/download/v0.12.0/mmdeploy-0.12.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz
tar -zxvf mmdeploy-0.12.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0.tar.gz
cd mmdeploy-0.12.0-linux-x86_64-cuda11.1-tensorrt8.2.3.0
pip install dist/mmdeploy-0.12.0-py3-none-linux_x86_64.whl
pip install sdk/python/mmdeploy_python-0.12.0-cp38-none-linux_x86_64.whl
cd ..
# Install the TensorRT inference engine
# !!! Download the TensorRT-8.2.3.0 CUDA 11.x package from the NVIDIA website and extract it into the current directory
pip install TensorRT-8.2.3.0/python/tensorrt-8.2.3.0-cp38-none-linux_x86_64.whl
pip install pycuda
export TENSORRT_DIR=$(pwd)/TensorRT-8.2.3.0
export LD_LIBRARY_PATH=${TENSORRT_DIR}/lib:$LD_LIBRARY_PATH
# !!! Download the cuDNN 8.2.1 CUDA 11.x package from the NVIDIA website and extract it into the current directory
export CUDNN_DIR=$(pwd)/cuda
export LD_LIBRARY_PATH=$CUDNN_DIR/lib64:$LD_LIBRARY_PATH
2.2 Precompiled installation (Linux-x86_64, CUDA 11.x, ONNX Runtime):
wget https://github.com/open-mmlab/mmdeploy/releases/download/v0.12.0/mmdeploy-0.12.0-linux-x86_64-onnxruntime1.8.1.tar.gz
tar -zxvf mmdeploy-0.12.0-linux-x86_64-onnxruntime1.8.1.tar.gz
cd mmdeploy-0.12.0-linux-x86_64-onnxruntime1.8.1
pip install dist/mmdeploy-0.12.0-py3-none-linux_x86_64.whl
pip install sdk/python/mmdeploy_python-0.12.0-cp3X-none-linux_x86_64.whl # replace cp3X with the tag that matches your Python version
cd ..
# Install the ONNX Runtime inference engine
pip install onnxruntime-gpu==1.8.1 # install the GPU version
pip install onnxruntime==1.8.1 # or install the CPU version
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
export ONNXRUNTIME_DIR=$(pwd)/onnxruntime-linux-x64-1.8.1
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH
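The export lines in sections 2.1 and 2.2 only affect the current shell, so a forgotten export is an easy mistake to make. The following is a rough sketch of a sanity check, assuming the variable names and lib subdirectories used above; the helper check_backend_env is hypothetical and compares paths as plain strings.

```python
import os


def check_backend_env(var_name, lib_subdir, environ=None):
    """Return a list of problems found for one backend directory variable."""
    environ = os.environ if environ is None else environ
    root = environ.get(var_name)
    if not root:
        return [f"{var_name} is not set"]
    problems = []
    lib_dir = os.path.join(root, lib_subdir)
    if not os.path.isdir(lib_dir):
        problems.append(f"{lib_dir} does not exist")
    # Plain string comparison: a trailing slash or symlink would not match
    search_path = environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    if lib_dir not in search_path:
        problems.append(f"{lib_dir} is not on LD_LIBRARY_PATH")
    return problems


if __name__ == "__main__":
    # Variable names and subdirectories taken from the export lines above
    checks = [("TENSORRT_DIR", "lib"), ("CUDNN_DIR", "lib64"), ("ONNXRUNTIME_DIR", "lib")]
    for var, sub in checks:
        for problem in check_backend_env(var, sub):
            print("warning:", problem)
```

Only the variables for the backend you actually installed need to pass; warnings for the other backend can be ignored.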
3. Converting an mmseg PyTorch model to ONNX
Once the environment is ready, use MMDeploy's tools/deploy.py
to convert an OpenMMLab PyTorch model into a format supported by the inference backend.
python tools/deploy.py \
$DEPLOY_CFG \
$MODEL_CFG \
$PTH_MODEL_PATH \
--work-dir $OUT_PATH \
--show --device cuda --dump-info
where:
DEPLOY_CFG is the path of a deployment config file under the mmdeploy project, ./mmdeploy/configs/mmseg/XXX.py
MODEL_CFG is the config file used for your own training, usually in the same directory as the pth model
PTH_MODEL_PATH is the path of the pth model file to be converted
OUT_PATH is the directory where the output onnx model file and the corresponding json files are stored
For example:
python tools/deploy.py \
/root/workspace/mmdeploy/configs/mmseg/segmentation_onnxruntime_dynamic.py \
/root/workspace/mmseg/MMSEG_DEPLOY/uper_swin_base.py \
/root/workspace/mmseg/MMSEG_DEPLOY/best_mIoU_epoch_500.pth \
--work-dir /root/workspace/mmseg/MMSEG_DEPLOY/OUTPUT \
--show --device cuda --dump-info
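When converting several checkpoints, it may be convenient to assemble this command in Python rather than retype it. The following is a minimal sketch, not part of MMDeploy: build_deploy_command is a hypothetical helper, and it only uses the flags shown above. The optional img parameter reflects my understanding that the upstream script also accepts an input image path as a positional argument after the checkpoint; treat that as an assumption and check it against your mmdeploy version.

```python
import shlex


def build_deploy_command(deploy_cfg, model_cfg, checkpoint, work_dir,
                         img=None, device="cuda", show=True, dump_info=True):
    """Assemble the tools/deploy.py invocation as an argument list."""
    cmd = ["python", "tools/deploy.py", deploy_cfg, model_cfg, checkpoint]
    if img is not None:
        # Assumption: an input image follows the checkpoint as a positional argument
        cmd.append(img)
    cmd += ["--work-dir", work_dir]
    if show:
        cmd.append("--show")
    cmd += ["--device", device]
    if dump_info:
        cmd.append("--dump-info")
    return cmd


if __name__ == "__main__":
    cmd = build_deploy_command(
        "/root/workspace/mmdeploy/configs/mmseg/segmentation_onnxruntime_dynamic.py",
        "/root/workspace/mmseg/MMSEG_DEPLOY/uper_swin_base.py",
        "/root/workspace/mmseg/MMSEG_DEPLOY/best_mIoU_epoch_500.pth",
        "/root/workspace/mmseg/MMSEG_DEPLOY/OUTPUT",
    )
    # Print the command for review before running it
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) from the mmdeploy project root once the printed command looks right.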
4. Inference with the ONNX model file
The inference process is relatively simple, and you can adapt it freely to your own needs:
from mmdeploy.apis import inference_model

result = inference_model(
    model_cfg='data/project/mmseg/weights/ersi_uper_swin_base.py',
    deploy_cfg='/data/project/mmdeploy-master/configs/mmseg/segmentation_onnxruntime_dynamic.py',
    backend_files=['/data/project/end2end.onnx'],
    img='in_img_path/xxx.png',
    device='cuda:0')
where:
model_cfg is the config file used during training
deploy_cfg is the configs/mmseg/XXX.py file under the mmdeploy project
backend_files is a list containing the path of the onnx model just converted
img is the original image to run inference on
device is the device to run inference on, for example cuda:0
Note: The result returned here is a list. To save it, first convert it to a NumPy array, take the dimension you need, and then write it out as an image. Alternatively, follow the approach in mmseg/tools/test.py: append the prediction for every image to one large list and save them in order at the end.
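import cv2
import numpy as np
pred = np.array(result)[0, :, :].astype(np.uint8)  # first prediction as a uint8 class-index map
cv2.imwrite(out_img_path, pred)  # save the result as an image; out_img_path is your output path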
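An image saved this way holds raw class indices (0, 1, 2, ...), so it usually looks almost black in an image viewer. The following is a minimal sketch of one way to make it readable, assuming result is a list of 2-D class maps as described in the note above; the helper names and palette colors are made up for illustration.

```python
import numpy as np


def result_to_mask(result, index=0):
    """Convert one entry of the inference result list into a 2-D uint8 class map."""
    mask = np.asarray(result[index])
    if mask.ndim != 2:
        raise ValueError(f"expected a 2-D class map, got shape {mask.shape}")
    return mask.astype(np.uint8)


def colorize(mask, palette):
    """Map each class index in mask to the RGB color at that index in palette."""
    palette = np.asarray(palette, dtype=np.uint8)
    if mask.max() >= len(palette):
        raise ValueError("palette has fewer colors than classes in the mask")
    return palette[mask]  # integer-array indexing: (H, W) -> (H, W, 3)


if __name__ == "__main__":
    # Stand-in for the list returned by inference_model: one 2x3 class map
    fake_result = [np.array([[0, 1, 1], [2, 0, 1]])]
    palette = [(0, 0, 0), (255, 0, 0), (0, 255, 0)]  # hypothetical colors
    color = colorize(result_to_mask(fake_result), palette)
    print(color.shape)  # (2, 3, 3)
```

Since cv2.imwrite expects BGR channel order, reverse the last axis (color[..., ::-1]) before saving if the palette is given in RGB.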