EGE-UNet

EGE-UNet surpasses existing state-of-the-art methods on two mainstream skin lesion segmentation datasets, ISIC2017 and ISIC2018. Compared with TransFuse, it maintains excellent segmentation performance while reducing the parameter count and computational cost by factors of 494 and 160, respectively.

Paper link: https://arxiv.org/pdf/2307.08473.pdf

This post introduces recent work from Shanghai Jiao Tong University published at MICCAI 2023: the Efficient Group Enhanced UNet (EGE-UNet), a heavily modified U-Net designed to address the problems faced in medical image segmentation, especially skin lesion segmentation. Because it was developed for mobile health applications, it targets the high parameter count and computational load of many current models.

In simple terms, EGE-UNet fuses two main modules:

  • Group multi-axis Hadamard Product Attention module (GHPA)

  • Group Aggregation Bridge module (GAB)

Among them, GHPA builds on the Hadamard Product Attention mechanism (HPA): it groups the input features and applies HPA along different axes to extract lesion information from multiple perspectives. The approach is inspired by Multi-Head Self-Attention (MHSA), but HPA reduces the model size because its complexity is linear, unlike the quadratic complexity of MHSA.
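To make the linear-versus-quadratic contrast concrete, here is a minimal sketch. It counts only the multiplications for the attention score matrix in self-attention against those for an element-wise (Hadamard) product with a weight tensor of the same shape as the input. The function names are illustrative and do not come from the paper, and the counts ignore projections and softmax.

```python
def mhsa_score_multiplies(n_tokens: int, dim: int) -> int:
    """Multiplications needed to form the N x N score matrix Q K^T."""
    return n_tokens * n_tokens * dim


def hpa_multiplies(n_tokens: int, dim: int) -> int:
    """Multiplications for an element-wise product between the input
    and a weight tensor of the same shape (one multiply per element)."""
    return n_tokens * dim


def hadamard(x, w):
    """Element-wise product of two equally shaped 2-D lists."""
    return [[a * b for a, b in zip(row_x, row_w)] for row_x, row_w in zip(x, w)]


if __name__ == "__main__":
    # Quadrupling the number of tokens multiplies the self-attention
    # cost by 16 but the Hadamard cost only by 4.
    for n in (64, 256, 1024):
        print(n, mhsa_score_multiplies(n, 32), hpa_multiplies(n, 32))
```

The point of the comparison is the growth rate: as the feature map gets larger, the score matrix grows with the square of the number of positions, while the Hadamard product grows in proportion to it.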

GAB, on the other hand, effectively extracts multi-scale information, which is crucial for medical image segmentation. It does so by using group aggregation to fuse high-level semantic features and low-level detail features of different scales with the masks generated by the decoder.

Finally, by fusing these two modules, the authors propose EGE-UNet, a model that achieves excellent segmentation performance with an extremely low parameter count and computational complexity. The model focuses not only on performance but also on usability in real-world environments.

According to the experiments reported in the paper, EGE-UNet surpasses existing state-of-the-art methods on the two mainstream skin lesion segmentation datasets, ISIC2017 and ISIC2018. Compared with TransFuse, it maintains excellent segmentation performance while reducing the parameter count and computational cost by factors of 494 and 160, respectively. To the best of the authors' knowledge, this is the first model whose parameter size is limited to 50KB, a testament to its efficiency and practicality!

Method
Framework

As shown in the paper's framework figure, EGE-UNet follows a U-shaped architecture with symmetric encoder and decoder parts. The encoder consists of six stages, with channel counts of {8, 16, 24, 32, 48, 64}. The first three stages use ordinary convolutions, while the last three use the proposed GHPA to extract representation information from multiple views.
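The encoder layout can be written down as a small table-building function. The channel counts and the conv/GHPA split come from the description above. The 256x256 input size, the 3 input channels, and the 2x downsampling after each of the first five stages are assumptions made only for illustration and are not stated in this post.

```python
ENCODER_CHANNELS = [8, 16, 24, 32, 48, 64]  # from the text


def encoder_layout(input_size=256, in_channels=3):
    """Return one dict per encoder stage.

    Assumptions (not from the text): square input of `input_size`,
    `in_channels` input channels, and 2x downsampling after every
    stage except the last.
    """
    layout = []
    size = input_size
    prev = in_channels
    for stage, out_ch in enumerate(ENCODER_CHANNELS, start=1):
        # Stages 1-3 use ordinary convolutions, stages 4-6 use GHPA.
        block = "conv" if stage <= 3 else "GHPA"
        layout.append(
            {"stage": stage, "block": block, "in": prev, "out": out_ch, "size": size}
        )
        prev = out_ch
        if stage < len(ENCODER_CHANNELS):
            size //= 2  # assumed downsampling between stages
    return layout


if __name__ == "__main__":
    for row in encoder_layout():
        print(row)
```

Here `size` is the spatial resolution at which each stage operates under the stated assumptions.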

In contrast to the simple skip connections in UNet, EGE-UNet integrates a GAB at every stage between the encoder and decoder. In addition, the model uses deep supervision to generate mask predictions at different scales, which are used in the loss function and also serve as one of the inputs to GAB. Through the integration of these modules, EGE-UNet significantly reduces the parameter count and computational load while improving segmentation performance over previous methods.

GHPA module

The input is divided equally into four groups along the channel dimension. HPA is performed on the height-width, channel-height, and channel-width axes of the first, second, and third groups, respectively. For the last group, only a depthwise separable convolution (DW) is applied to the feature map. Finally, the four groups are concatenated along the channel dimension, and another DW is applied to integrate information from the different perspectives.
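The grouping described above can be sketched in plain Python on nested lists. This is a simplified illustration, not the authors' implementation: the weight shapes (`p_hw`, `p_ch`, `p_cw`) are hypothetical, the weights are used directly without any resizing or extra convolution, and the depthwise convolution is replaced by a stand-in `dw` function that defaults to the identity.

```python
def ghpa_sketch(x, p_hw, p_ch, p_cw, dw=lambda g: g):
    """Simplified grouped multi-axis Hadamard product.

    x    : feature map as nested lists [C][H][W], C divisible by 4
    p_hw : weights [C/4][H][W] for the height-width group
    p_ch : weights [C/4][H]    for the channel-height group
    p_cw : weights [C/4][W]    for the channel-width group
    dw   : stand-in for a depthwise convolution (identity by default)
    All weight shapes are assumptions made for this sketch.
    """
    c = len(x)
    assert c % 4 == 0, "channels must split into four equal groups"
    q = c // 4
    g1, g2, g3, g4 = (x[i * q:(i + 1) * q] for i in range(4))
    hs, ws = range(len(x[0])), range(len(x[0][0]))

    # Group 1: one weight per (channel, row, column) position.
    o1 = [[[g1[k][i][j] * p_hw[k][i][j] for j in ws] for i in hs] for k in range(q)]
    # Group 2: weight indexed by (channel, row), shared across columns.
    o2 = [[[g2[k][i][j] * p_ch[k][i] for j in ws] for i in hs] for k in range(q)]
    # Group 3: weight indexed by (channel, column), shared across rows.
    o3 = [[[g3[k][i][j] * p_cw[k][j] for j in ws] for i in hs] for k in range(q)]
    # Group 4: depthwise convolution only (stand-in).
    o4 = dw(g4)

    # Concatenate along channels, then apply the final DW (stand-in).
    return dw(o1 + o2 + o3 + o4)


if __name__ == "__main__":
    x = [[[1, 2], [3, 4]] for _ in range(4)]  # C=4, H=2, W=2
    out = ghpa_sketch(x, [[[2, 2], [2, 2]]], [[10, 100]], [[1, 0]])
    for channel in out:
        print(channel)
```

Every operation here is a single multiply per element, which is where the linear cost of HPA comes from.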

GAB module

It is well known that obtaining multi-scale information is crucial for dense prediction tasks. Therefore, this paper introduces GAB, which accepts three inputs:

  1. Low-level features

  2. High-level features

  3. Mask

As shown in the paper's GAB figure, the high-level features are first resized using a depthwise separable convolution (DW) and bilinear interpolation to match the size of the low-level features. Second, the two feature maps are each divided into four groups along the channel dimension, and one group of low-level features is concatenated with one group of high-level features to obtain four groups of fused features. The mask is then concatenated to each group of fused features. Next, dilated convolutions with a kernel size of 3 and different dilation rates are applied to the different groups in order to extract information at different scales. Finally, the four groups are concatenated along the channel dimension, and an ordinary convolution with a kernel size of 1 is applied to enable interaction between features at different scales.
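The channel bookkeeping in these steps can be checked with a short sketch. It assumes that the high-level features have already been brought to the same channel count as the low-level features, that the mask has a single channel, and that each dilated convolution preserves its channel count. The dilation rates `(1, 2, 5, 7)` are chosen for illustration, since the text above only says the rates differ. The effective kernel size of a dilated convolution, `d * (k - 1) + 1`, is a standard formula.

```python
def gab_channel_plan(channels, dilations=(1, 2, 5, 7), kernel_size=3):
    """Per-group channel counts and receptive sizes for a GAB-like block.

    channels  : channel count of the low-level feature map (the
                high-level map is assumed to match after resizing)
    dilations : one dilation rate per group (illustrative values)
    """
    assert channels % 4 == 0, "channels must split into four equal groups"
    per_group = channels // 4
    plan = []
    for d in dilations:
        # One low-level group + one high-level group + 1-channel mask.
        fused = per_group + per_group + 1
        plan.append(
            {
                "in_channels": fused,
                "dilation": d,
                # Spatial extent covered by a dilated k x k kernel.
                "effective_kernel": d * (kernel_size - 1) + 1,
            }
        )
    # Channels entering the final 1x1 convolution after concatenation.
    concat_channels = sum(g["in_channels"] for g in plan)
    return plan, concat_channels


if __name__ == "__main__":
    plan, total = gab_channel_plan(16)
    for group in plan:
        print(group)
    print("channels into 1x1 conv:", total)
```

Under these assumptions, a feature map with C channels yields 2C + 4 channels after concatenation, which the final 1x1 convolution then mixes across scales.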

Finally, since different GABs require mask information at different scales, deep supervision is used to compute the loss at different stages, which yields more accurate mask information.
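A deep-supervision loss of this kind is typically a weighted sum of per-stage losses. The sketch below assumes a binary cross-entropy plus Dice loss at each stage, a common choice for binary segmentation that this post does not specify, and takes the per-stage weights as a hypothetical argument. Predictions and targets are flattened lists of probabilities and 0/1 labels.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy over flattened probabilities."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(pred)


def dice_loss(pred, target, smooth=1.0):
    """1 - soft Dice coefficient, with additive smoothing."""
    inter = sum(p * t for p, t in zip(pred, target))
    return 1 - (2 * inter + smooth) / (sum(pred) + sum(target) + smooth)


def deep_supervision_loss(stage_preds, stage_targets, weights):
    """Weighted sum of per-stage (BCE + Dice) losses.

    stage_preds / stage_targets: one flattened mask per supervised stage,
    each target already resized to that stage's scale (assumed).
    weights: hypothetical per-stage weights, same length as the stages.
    """
    return sum(
        w * (bce(p, t) + dice_loss(p, t))
        for p, t, w in zip(stage_preds, stage_targets, weights)
    )


if __name__ == "__main__":
    preds = [[0.9, 0.2, 0.8, 0.1], [0.7, 0.3]]
    targets = [[1, 0, 1, 0], [1, 0]]
    print(deep_supervision_loss(preds, targets, weights=[1.0, 0.5]))
```

Supervising every scale is what makes the intermediate masks accurate enough to be useful as GAB inputs.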

Experiments

In the experiments, EGE-UNet was validated on two publicly available skin lesion segmentation datasets (ISIC2017 and ISIC2018), where it outperformed existing methods. On the ISIC2017 dataset, compared with larger models such as TransFuse, EGE-UNet not only performs better but also significantly reduces the parameter count and computation, by factors of 494 and 160, respectively.

Moreover, among lightweight models, EGE-UNet surpasses UNeXt-S, improving mIoU by 1.55% and DSC by 0.97% while reducing parameters and computation by 17% and 72%. Furthermore, EGE-UNet is the first to reduce the parameter size to about 50KB while maintaining excellent segmentation performance.
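For readers unfamiliar with the two metrics, the following sketch computes IoU and the Dice similarity coefficient (DSC) for a single pair of binary masks given as flattened 0/1 lists. How the paper aggregates IoU into mIoU over the dataset is not covered here, and returning 1.0 when both masks are empty is a convention chosen for this sketch.

```python
def confusion(pred, target):
    """Count true positives, false positives and false negatives."""
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    return tp, fp, fn


def iou(pred, target):
    """Intersection over union: TP / (TP + FP + FN)."""
    tp, fp, fn = confusion(pred, target)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


def dsc(pred, target):
    """Dice similarity coefficient: 2 TP / (2 TP + FP + FN)."""
    tp, fp, fn = confusion(pred, target)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


if __name__ == "__main__":
    pred = [1, 1, 0, 0]
    target = [1, 0, 1, 0]
    print(iou(pred, target), dsc(pred, target))
```

For a single mask pair the two metrics are tied by DSC = 2 * IoU / (1 + IoU), so DSC is never lower than IoU.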

In the ablation experiments, the authors also demonstrate the effectiveness of the proposed GHPA and GAB modules. They not only improve performance, but also significantly reduce parameters and computation.

Summary

The paper proposes two novel modules, GHPA and GAB, which greatly reduce model complexity while improving performance. Building on these two modules, it constructs EGE-UNet for the skin lesion segmentation task, and the experimental results show that the method achieves state-of-the-art performance while significantly reducing resource requirements.

 
