Image segmentation and background replacement with the U2-Net model, accelerated by OpenVINO on the AIxBoard development board

Author: Kang Yaoming - Intel Edge Computing Innovation Ambassador

1. Introduction to AIxBoard

1. Background

AIxBoard is an embedded AI development board launched by Blue Frog Intelligence in 2023. It is a member of the official Intel development kit series and is designed for entry-level AI applications and edge smart devices. The board is compact yet capable, which makes it a practical small computer for makers and developers. With the OpenVINO toolkit, its CPU and iGPU deliver strong AI inference performance and can run multiple neural networks in parallel in applications such as image classification, object detection, segmentation, and speech processing.

2. Configuration

3. Computing power

With the OpenVINO toolkit, the board can run heterogeneous CPU+iGPU inference. The computing power of the iGPU is about 0.6 TOPS.

4. Advantages

• Ready out of the box: no additional accessories to buy and no system image to flash; just power it on

• Cost-effective: 8 GB of onboard memory, 64 GB of storage, a gigabit Wi-Fi 6 wireless network card, and CPU+iGPU heterogeneous computing

• Versatile: runs Win10/Win11 and desktop Linux smoothly, with good software compatibility

• Easy to use: rich example resources and a low learning threshold

• Officially certified: an officially recommended Intel development kit with guaranteed quality

• Customizable: the boot logo can be replaced and the product can be customized

5. Board photo

2. Environment configuration

1. Download and install Anaconda

Using the Tsinghua mirror in China is recommended, as it speeds up the download:

https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/

After completing the installation by following the prompts, open Anaconda.

Click the command-line tool as shown below:

Update conda, then switch pip to the Tsinghua mirror:

pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple

2. Install the openvino package

For details, refer to the official website: https://www.intel.cn/content/www/cn/zh/developer/tools/openvino-toolkit/download.html?ENVIRONMENT=DEV_TOOLS&OP_SYSTEM=WINDOWS&VERSION=v_2023_0_1&DISTRIBUTION=PIP&FRAMEWORK=ONNX%2CPYTORCH%2CTENSORFLOW_2

pip install openvino-dev[ONNX,pytorch,tensorflow2]==2023.0.1

3. Check whether the installation succeeded

Run the following command and check that OpenVINO lists CPU and GPU among the supported devices. If it does, as shown in the figure below, the installation succeeded and the inference devices were found.

benchmark_app --help
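The same check can be made from Python. The following is a minimal sketch, assuming the installed openvino package exposes openvino.runtime.Core (as the 2023.0 release does). The helper name list_inference_devices is hypothetical, and it returns an empty list when OpenVINO cannot be imported.

```python
from typing import List


def list_inference_devices() -> List[str]:
    """Return the device names OpenVINO Runtime can see, or [] if OpenVINO is missing."""
    try:
        from openvino.runtime import Core
    except ImportError:
        return []
    return list(Core().available_devices)


if __name__ == "__main__":
    # On the AIxBoard the list is expected to include "CPU" and "GPU".
    print(list_inference_devices())
```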

3. Image segmentation and background replacement demo with the U2-Net model

Clone the repository:

git clone https://github.com/openvinotoolkit/openvino_notebooks.git

1. Model introduction

U2-Net is a salient object detection (SOD) network built from Residual U-blocks (RSU). Its design lets the network go deeper while keeping high-resolution feature maps, at a low memory and computational cost. U2-Net is a two-level nested U-shaped structure, as shown in the figure below. The top level is a large U-shaped structure (the cubes in the figure) consisting of 11 stages, and each stage is filled with a well-configured RSU (the bottom-level U-structure). This nesting lets the network extract intra-stage multi-scale features and aggregate inter-stage multi-level features more effectively.

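2. Model download

from collections import namedtuple

from pathlib import Path

# Model configuration record: name, weights URL, model class, constructor arguments.

model_config = namedtuple("ModelConfig", ["name", "url", "model", "model_args"])

u2net_lite = model_config(

    name="u2net_lite",

    url="https://drive.google.com/uc?id=1rbSTGKAE-MTxBYHd-51l2hMOQPT_7EPy",

    model=U2NETP,

    model_args=(),

)

# Select the configuration to use; MODEL_DIR = "model" is an assumed directory name.

u2net_model = u2net_lite

MODEL_DIR = "model"

model_path = Path(MODEL_DIR) / u2net_model.name / Path(u2net_model.name).with_suffix(".pth")

The code above defines the model configuration and the local path, model_path, where the original model weights file is stored once it has been downloaded.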
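The download step itself can follow a simple "fetch only if missing" pattern. The following is a minimal sketch, not the notebook's exact code: ensure_weights is a hypothetical helper, and the downloader is passed in as a function so that any tool with a (url, output) calling convention can be used.

```python
from pathlib import Path
from typing import Callable


def ensure_weights(model_path: Path, url: str, download: Callable[[str, str], object]) -> Path:
    """Fetch the weights into model_path unless the file is already there."""
    if not model_path.exists():
        # Create the parent directory (for example model/u2net_lite/) first.
        model_path.parent.mkdir(parents=True, exist_ok=True)
        download(url, str(model_path))
    return model_path


# Usage with the gdown package (an assumption; it must be installed separately):
#   import gdown
#   ensure_weights(model_path, u2net_model.url, gdown.download)
```

Because the function returns early when the file exists, rerunning the cell does not download the weights again.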

3. Export to ONNX

Load the model and the pretrained weights, then export the model to ONNX format. PyTorch stores all model parameters in an internal dictionary called state_dict.

import torch

# Load the model.

net = u2net_model.model(*u2net_model.model_args)

net.eval()



# Load the weights.

print(f"Loading model weights from: '{model_path}'")

net.load_state_dict(state_dict=torch.load(model_path, map_location="cpu"))



torch.onnx.export(net, torch.zeros((1,3,512,512)), "u2net.onnx")

4. Convert to IR format

Use the Model Optimizer Python API to convert the exported ONNX model to OpenVINO IR format with FP16 precision. The mean values are embedded in the model, and the input is scaled by the standard deviation through the scale_values parameter. With these options, there is no need to normalize the input data before passing it through the network. The mean and standard deviation values can be found in the dataloader file in the U2-Net repository; they are multiplied by 255 to support images with pixel values in the range 0-255.

from openvino.tools import mo

model_ir = mo.convert_model(

    "u2net.onnx",

    mean_values=[123.675, 116.28, 103.53],

    scale_values=[58.395, 57.12, 57.375],

    compress_to_fp16=True,

)
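The "multiplied by 255" step can be checked with a little arithmetic. The sketch below assumes the per-channel statistics on the 0-1 scale are the values obtained by dividing the mean_values and scale_values above by 255. The two helper functions are hypothetical and only illustrate that normalizing a raw 0-255 pixel with the scaled statistics matches normalizing the 0-1 pixel with the original ones.

```python
# Per-channel statistics on the 0-1 scale (assumed: mean_values / 255 and
# scale_values / 255 from the conversion call).
MEAN_01 = (0.485, 0.456, 0.406)
STD_01 = (0.229, 0.224, 0.225)

mean_values = [round(m * 255, 3) for m in MEAN_01]
scale_values = [round(s * 255, 3) for s in STD_01]
print(mean_values)   # [123.675, 116.28, 103.53]
print(scale_values)  # [58.395, 57.12, 57.375]


def normalize_01(pixel: int, mean: float, std: float) -> float:
    """Dataloader-style normalization: scale to 0-1, then standardize."""
    return (pixel / 255 - mean) / std


def normalize_255(pixel: int, mean: float, scale: float) -> float:
    """Normalization on raw 0-255 input: subtract the mean, then divide by the scale."""
    return (pixel - mean) / scale


# Both forms agree up to floating-point error for a sample red-channel pixel.
print(abs(normalize_01(128, MEAN_01[0], STD_01[0])
          - normalize_255(128, mean_values[0], scale_values[0])) < 1e-9)  # True
```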

5. Image preprocessing

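The OpenVINO IR model expects RGB input, while the image is loaded in OpenCV's BGR channel order. Convert the image to RGB, resize it to 512 x 512, and transpose the dimensions from height, width, channels to the batch, channels, height, width layout that the model expects.

import cv2

import numpy as np

# load_image is assumed to be the helper from the repository's notebook_utils module.

IMAGE_URI = "https://storage.openvinotoolkit.org/repositories/openvino_notebooks/data/data/image/coco_hollywood.jpg"
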

image = cv2.cvtColor(

    src=load_image(IMAGE_URI),

    code=cv2.COLOR_BGR2RGB,

)

resized_image = cv2.resize(src=image, dsize=(512, 512))

input_image = np.expand_dims(np.transpose(resized_image, (2, 0, 1)), 0)
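The layout change is easy to verify on a stand-in array. The following is a minimal sketch that uses a synthetic image instead of the downloaded one; the channel values 10, 20, and 30 are arbitrary markers.

```python
import numpy as np

# Stand-in for a resized RGB image: height x width x channels (HWC).
resized_image = np.zeros((512, 512, 3), dtype=np.uint8)
resized_image[..., 0] = 10  # R
resized_image[..., 1] = 20  # G
resized_image[..., 2] = 30  # B

# Move the channels first (CHW), then add a batch dimension (NCHW).
chw = np.transpose(resized_image, (2, 0, 1))
input_image = np.expand_dims(chw, 0)

print(resized_image.shape)  # (512, 512, 3)
print(input_image.shape)    # (1, 3, 512, 512)
```

After the transpose, input_image[0, 1] holds the whole green channel, which is the layout used for the export with torch.zeros((1,3,512,512)).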

6. Perform inference

Load the model into OpenVINO Runtime and set device_name to "CPU" to run inference on the CPU.

import time

from openvino.runtime import Core

# Load the network to OpenVINO Runtime.

ie = Core()

compiled_model_ir = ie.compile_model(model=model_ir, device_name="CPU")

# Get the names of input and output layers.

input_layer_ir = compiled_model_ir.input(0)

output_layer_ir = compiled_model_ir.output(0)



# Do inference on the input image.

start_time = time.perf_counter()

result = compiled_model_ir([input_image])[output_layer_ir]

end_time = time.perf_counter()

print(

    f"Inference finished. Inference time: {end_time-start_time:.3f} seconds, "

    f"FPS: {1/(end_time-start_time):.2f}."

)

After inference completes, the execution time and FPS are printed, as shown in the figure below:

After switching the device to GPU, the inference time drops by nearly a factor of 10 and the FPS rises by nearly a factor of 10. Task Manager shows GPU utilization at only about 50%, so the acceleration from the GPU is considerable even though it is not fully loaded.
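A single timed call can include one-off setup cost, so a CPU-versus-GPU comparison is usually more stable when averaged over several runs after a warm-up. The following is a minimal sketch of that idea; measure_fps is a hypothetical helper, not part of OpenVINO.

```python
import time
from typing import Callable, Tuple


def measure_fps(run_once: Callable[[], object], warmup: int = 2, runs: int = 10) -> Tuple[float, float]:
    """Return (mean seconds per call, calls per second) after a few warm-up calls."""
    for _ in range(warmup):
        run_once()  # not timed: the first calls often include one-off setup cost
    start = time.perf_counter()
    for _ in range(runs):
        run_once()
    mean_seconds = (time.perf_counter() - start) / runs
    fps = 1.0 / mean_seconds if mean_seconds > 0 else float("inf")
    return mean_seconds, fps


# Usage with the compiled model from the inference step:
#   mean_s, fps = measure_fps(lambda: compiled_model_ir([input_image])[output_layer_ir])
```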

7. Results visualization

After post-processing with OpenCV, the figure shows, from left to right, the original image, the segmentation result, and the original image with the background removed.

resized_result = np.rint(

    cv2.resize(src=np.squeeze(result), dsize=(image.shape[1], image.shape[0]))

).astype(np.uint8)



# Create a copy of the image and set all background values to 255 (white).

bg_removed_result = image.copy()

bg_removed_result[resized_result == 0] = 255



import matplotlib.pyplot as plt

fig, ax = plt.subplots(nrows=1, ncols=3, figsize=(20, 7))

ax[0].imshow(image)

ax[1].imshow(resized_result, cmap="gray")

ax[2].imshow(bg_removed_result)

for a in ax:

    a.axis("off")
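The mask logic can be followed on a tiny example. The sketch below uses a hypothetical 2 x 2 saliency map with values between 0 and 1 in place of the real model output, and a uniform grey image in place of the photo.

```python
import numpy as np

# Hypothetical 2x2 saliency map (values in 0-1) and a matching RGB image.
saliency = np.array([[0.9, 0.2],
                     [0.6, 0.1]], dtype=np.float32)
image = np.full((2, 2, 3), 100, dtype=np.uint8)

# np.rint rounds to the nearest integer, so this acts as a threshold at 0.5.
mask = np.rint(saliency).astype(np.uint8)

# Paint every background pixel (mask == 0) white in a copy of the image.
bg_removed = image.copy()
bg_removed[mask == 0] = 255

print(mask.tolist())                # [[1, 0], [1, 0]]
print(bg_removed[..., 0].tolist())  # [[100, 255], [100, 255]]
```

Indexing the three-channel image with the two-dimensional boolean mask sets all three channels of each selected pixel at once.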

8. Add background

Load the new background image (a picture of the Great Wall), convert it to RGB with cv2.cvtColor, and resize it to match the original image. Then combine it with the segmented foreground so that the dog appears in front of the Great Wall.

background_image = cv2.cvtColor(src=load_image(BACKGROUND_FILE), code=cv2.COLOR_BGR2RGB)

background_image = cv2.resize(src=background_image, dsize=(image.shape[1], image.shape[0]))



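# Set all the foreground pixels from the result to 0 in the background image.

background_image[resized_result == 1] = 0

# Set all the background pixels to 0 in a copy of the original image, so that

# adding the two uint8 images never wraps around.

foreground_image = image.copy()

foreground_image[resized_result == 0] = 0

new_image = background_image + foreground_image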



# Save the generated image.

new_image_path = Path(f"{OUTPUT_DIR}/{Path(IMAGE_URI).stem}-{Path(BACKGROUND_FILE).stem}.jpg")

cv2.imwrite(filename=str(new_image_path), img=cv2.cvtColor(new_image, cv2.COLOR_RGB2BGR))



# Display the original image and the image with the new background side by side

fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(18, 7))

ax[0].imshow(image)

ax[1].imshow(new_image)

for a in ax:

    a.axis("off")

plt.show()
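The compositing step can be checked on tiny arrays. The following is a minimal sketch with made-up pixel values (100 for the photo, 40 for the new scene) that selects each pixel with np.where rather than adding images.

```python
import numpy as np

mask = np.array([[1, 0],
                 [1, 0]], dtype=np.uint8)            # 1 = foreground
image = np.full((2, 2, 3), 100, dtype=np.uint8)      # stand-in for the photo
background = np.full((2, 2, 3), 40, dtype=np.uint8)  # stand-in for the new scene

# Broadcast the 2-D mask over the colour channels and pick a source per pixel.
new_image = np.where(mask[..., np.newaxis] == 1, image, background)

print(new_image[..., 0].tolist())  # [[100, 40], [100, 40]]
print(new_image.dtype)             # uint8
```

Selecting pixels avoids arithmetic on pixel values altogether. This matters because NumPy uint8 addition wraps modulo 256, so any pixel where both summands are non-zero would come out wrong.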

4. Conclusion

The AIxBoard comes with 8 GB of onboard memory, 64 GB of storage, a gigabit Wi-Fi 6 wireless network card, CPU+iGPU heterogeneous computing, and an M.2 slot for an expansion drive. Its Intel Celeron N5105 is a quad-core processor from the Jasper Lake series, positioned as an embedded CPU, and it does very well in terms of performance, power consumption, and heat dissipation. In this test, running the U2-Net model on the integrated GPU cut the inference time by nearly a factor of 10 and raised the FPS by nearly a factor of 10. GPU utilization was only about 50%, which suggests there is still headroom to release more computing power. Overall, the board is a handy tool for AI developers.

Origin blog.csdn.net/gc5r8w07u/article/details/131957943