Image segmentation method sharing | Brain cancer image segmentation method based on an optimized ensemble ConvNet

Foreword

Omitted.

1 Method

A single neural network used for image segmentation typically yields low accuracy that falls short of expectations, so the author builds an optimized ensemble of deep neural networks adapted to the brain tumor segmentation task. Specifically, a lightweight ensemble of 2 networks is proposed, each selectively trained on the training set. The outputs of these networks are segmentation maps that differ in how they segment the tumor sub-regions; the segmentation maps are finally combined to obtain the final prediction. The training details of the two networks are as follows.

1.1 Subnetwork 1: 3D-ConvNet

The first sub-model used in the ensemble is the 3D-ConvNet. It uses multi-fiber units with weighted atrous convolutions to obtain multi-scale representations for 3D volume segmentation, as shown in Figure 1. Additionally, the network is fine-tuned to improve segmentation.

Before the data are fed into the network for training, they are enhanced with various augmentation techniques such as cropping, rotation, and mirroring. The model is trained for 150 epochs with a patch size of 128×128, using a combination of the dice loss and focal loss functions. The fine-tuned hyperparameters are shown in Table 1.

 

Zero padding is applied to the MRI data so that the original 240×240×155 volumes become 240×240×160, a depth evenly divisible by the network. Once the data are ready for validation, the trained network is used to generate probability maps, which are then used by the ensemble to predict the final outcome.
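The padding step above can be sketched as follows (a minimal NumPy illustration, not the authors' code; the function name and the symmetric split of the padding are assumptions):

```python
import numpy as np

def pad_depth(volume: np.ndarray, target_depth: int = 160) -> np.ndarray:
    """Zero-pad the last (depth) axis of a 3D MRI volume up to target_depth."""
    depth = volume.shape[-1]
    missing = target_depth - depth
    assert missing >= 0, "volume is already deeper than the target"
    before = missing // 2          # split the padding (roughly) evenly
    after = missing - before
    return np.pad(volume, ((0, 0), (0, 0), (before, after)), mode="constant")

volume = np.random.rand(240, 240, 155)
padded = pad_depth(volume)
print(padded.shape)  # (240, 240, 160)
```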

1.2 Subnetwork 2: 3D U-shaped ConvNet

The second sub-model of the ensemble is a 3-dimensional U-Net variant that differs from the classic U-ConvNet architecture in two respects: the ReLU activation function is replaced by Leaky-ReLU, and batch normalization (BN) is replaced by instance normalization (IN). It is trained from scratch on the MBTSC-2019 dataset, as shown in Figure 2, where the purple boxes represent 3D convolution units, the red lines represent max-pooling operations, the blue lines represent trilinear upsampling operations, and the orange lines represent merge operations. The network is likewise fine-tuned to improve the segmentation effect.

The MRI volumes are first reduced in size by cropping. They are then resampled to the median voxel spacing of the heterogeneous data, followed by z-score normalization. To train the network, the input patch size is 128×128×128 voxels and the batch size is 2. Different augmentation techniques such as rotation, mirror flipping, and gamma correction are applied to the data during learning to avoid overfitting, thereby improving the segmentation accuracy of the model. The loss function combines the cross-entropy loss and the dice loss. Table 2 details the hyperparameters used in training.

Validation is performed on image patches that overlap by half their size, with voxels near the patch center assigned a higher weight. During testing, additional augmented data are obtained by mirror flipping along the image axes. The output of the 3D U-shaped ConvNet is likewise a probability map used for the ensemble.
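The center-weighted aggregation of overlapping patches can be sketched as follows (a hypothetical NumPy illustration; the Gaussian form of the weight map and the `sigma_scale` value are assumptions, not details from the paper):

```python
import numpy as np

def center_weight_map(patch_size, sigma_scale=0.125):
    """Gaussian weight map that peaks at the patch centre (hypothetical shape)."""
    coords = [np.arange(s) - (s - 1) / 2.0 for s in patch_size]
    grid = np.meshgrid(*coords, indexing="ij")
    sq = sum((g / (s * sigma_scale)) ** 2 for g, s in zip(grid, patch_size))
    w = np.exp(-0.5 * sq)
    return w / w.max()

def aggregate(prob_patches, positions, volume_shape, patch_size):
    """Average overlapping patch predictions, weighting centre voxels higher."""
    acc = np.zeros(volume_shape)
    norm = np.zeros(volume_shape)
    w = center_weight_map(patch_size)
    for p, (x, y, z) in zip(prob_patches, positions):
        sl = (slice(x, x + patch_size[0]),
              slice(y, y + patch_size[1]),
              slice(z, z + patch_size[2]))
        acc[sl] += p * w      # weighted accumulation of probabilities
        norm[sl] += w         # track total weight per voxel
    return acc / np.maximum(norm, 1e-8)
```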

1.3 Details of the integration method

The ensemble proposed by the author is not a simple average of the predictions (i.e., probability maps) generated by the two models; instead, an "optimized integration" strategy was arrived at after rigorous testing, after which the outputs of the two models are combined as shown in Figure 3.

 

The trained networks are tested on the validation set to obtain the corresponding segmentation results. The predictions of the individual models are evaluated independently on the online MBTSC server to determine their effectiveness at segmenting each tumor region. The dice scores of the two models are then compared to determine which sub-network is more accurate for a specific tumor region. Experiments showed that the 3D-ConvNet performs better at segmenting the enhancing tumor, while the 3D U-shaped ConvNet segments the tumor core more accurately. Finally, for the whole tumor, combining the predictions of both networks outperforms either independent segmentation result. The final integration scheme for the three regions is therefore: (1) for the tumor core, use only the output of the 3D U-shaped ConvNet; (2) for the enhancing tumor, use only the output of the 3D-ConvNet; (3) for the whole tumor, assign equal weight to the outputs of the two sub-networks. Note that the prediction stage is evaluated on the online server, from which the dice score of the author's ensemble is finally obtained. These results are discussed in more detail later.
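The three-region rule above can be sketched as a simple fusion function (illustrative Python; the dictionary keys and the toy one-element probability maps are assumptions for demonstration, real maps are 3D volumes):

```python
import numpy as np

def fuse(p1, p2):
    """Region-wise combination of the two sub-networks' probability maps."""
    return {
        "tumor_core": p2["tumor_core"],                                # rule (1): 3D U-shaped ConvNet only
        "enhancing": p1["enhancing"],                                  # rule (2): 3D-ConvNet only
        "whole_tumor": 0.5 * (p1["whole_tumor"] + p2["whole_tumor"]),  # rule (3): equal weights
    }

p1 = {"tumor_core": np.array([0.2]), "enhancing": np.array([0.9]), "whole_tumor": np.array([0.4])}
p2 = {"tumor_core": np.array([0.8]), "enhancing": np.array([0.1]), "whole_tumor": np.array([0.6])}
out = fuse(p1, p2)
```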

1.4 Introduction to some techniques

Here are some techniques that are not commonly used in general deep networks but have a significant effect when applied in the author's method:

(1) z-score normalization

Z-score normalization, also known as standard-score normalization, subtracts the mean from a value and then divides by the standard deviation, as shown in formula (1):

z = (x − μ) / σ (1)

Where μ is the mean and σ is the standard deviation. Through it, data measured on different scales can be converted to a common z-score scale and then compared, improving the comparability of the data while weakening the direct interpretability of the raw values. In addition, it can help prevent the covariance matrix from becoming ill-conditioned: it is equivalent to transforming the covariance matrix into a correlation matrix, which is less susceptible to disturbance.
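A minimal NumPy sketch of z-score normalization as described above:

```python
import numpy as np

def z_score(x: np.ndarray) -> np.ndarray:
    """Formula (1): subtract the mean, then divide by the standard deviation."""
    return (x - x.mean()) / x.std()

data = np.array([1.0, 2.0, 3.0, 4.0])
z = z_score(data)
# After normalization the data have zero mean and unit standard deviation.
```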

(2) Gamma correction

Gamma correction calibrates the gamma curve of an image (in computer imaging, the curve relating screen output voltage to the corresponding brightness is called the gamma curve) to perform nonlinear tone editing. It can separate the dark and light parts of the image signal and adjust the ratio between the two, thereby improving image contrast, as shown in formula (2), where the bracketed power term is the core of gamma correction:

I_out = (I_in)^γ (2)

Where I_out is the output pixel value, I_in is the input pixel value, and γ is the correction factor.
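A minimal sketch of gamma correction (the rescaling by the image maximum so that the power is applied on values in [0, 1] is an illustrative assumption, not taken from the paper):

```python
import numpy as np

def gamma_correct(img: np.ndarray, gamma: float) -> np.ndarray:
    """Apply I_out = (I_in)**gamma on an image scaled to [0, 1]."""
    scale = img.max() if img.max() > 0 else 1.0
    return ((img / scale) ** gamma) * scale

pixels = np.array([0.0, 0.25, 1.0])
out = gamma_correct(pixels, 0.5)  # gamma < 1 brightens dark regions: 0.25 -> 0.5
```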

(3) dice score and dice loss function

The dice score is a measure of set similarity, usually used to compute the similarity of two samples; its value ranges over [0, 1], as shown in formula (3):

Dice(X, Y) = 2|X ∩ Y| / (|X| + |Y|) (3)

For the image segmentation task in this paper, X is the segmentation with ground-truth labels and Y is the predicted segmentation; the denominator is the sum of the numbers of pixels in X and Y, and the numerator is twice the intersection of X and Y (multiplied by 2 to cancel the double counting of the intersection pixels in the denominator). The dice loss is expressed as

L_dice = 1 − 2|X ∩ Y| / (|X| + |Y|) (4)
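A minimal NumPy sketch of the dice score and dice loss on binary masks (the smoothing term `eps` is a common implementation detail to avoid division by zero, not from the paper):

```python
import numpy as np

def dice_score(x: np.ndarray, y: np.ndarray, eps: float = 1e-7) -> float:
    """Formula (3): 2|X ∩ Y| / (|X| + |Y|) on binary masks."""
    inter = np.sum(x * y)
    return (2.0 * inter + eps) / (np.sum(x) + np.sum(y) + eps)

def dice_loss(x: np.ndarray, y: np.ndarray) -> float:
    """Formula (4): 1 minus the dice score."""
    return 1.0 - dice_score(x, y)

truth = np.array([1, 1, 0, 0])
pred = np.array([1, 0, 0, 0])
print(round(dice_score(truth, pred), 3))  # 0.667
```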

(4) Cross entropy loss and focal loss function

The cross-entropy loss function takes the form shown in formula (5):

L_CE = −[d · log y + (1 − d) · log(1 − y)] (5)

Where d is a Boolean label and y is the output of a floating-point activation function. For ordinary cross entropy, the larger the output probability for a positive sample, the smaller the loss; and the smaller the output probability for a negative sample, the smaller the loss. As a result, the loss converges slowly when iterating over a large number of easy samples and may not be optimized to the optimum.

Focal loss function improvement 1

First, a modulating factor is added to the original cross-entropy loss so that more attention is paid to hard-to-distinguish samples, where γ is usually set to 2:

L = −(1 − y)^γ · d · log y − y^γ · (1 − d) · log(1 − y)

Focal loss function improvement 2

The focal loss function also introduces a balance factor α to offset the uneven proportion of positive and negative samples, with α ranging from 0 to 1:

L = −α(1 − y)^γ · d · log y − (1 − α) · y^γ · (1 − d) · log(1 − y)

When α is greater than 0.5, the relative weight of the d = 1 (positive) samples is increased, helping to keep positive and negative samples balanced.
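The two improvements can be combined into one binary focal loss sketch (illustrative; the defaults `gamma = 2` and `alpha = 0.5` follow the values mentioned above):

```python
import numpy as np

def focal_loss(y, d, gamma=2.0, alpha=0.5):
    """Binary focal loss: modulating factor (1-y)**gamma plus balance factor alpha."""
    y = np.clip(y, 1e-7, 1.0 - 1e-7)                            # numerical stability
    pos = -alpha * (1.0 - y) ** gamma * d * np.log(y)           # hard positives weigh more
    neg = -(1.0 - alpha) * y ** gamma * (1.0 - d) * np.log(1.0 - y)
    return float(np.mean(pos + neg))

# An easy, well-classified positive contributes far less than a hard one:
easy = focal_loss(np.array([0.95]), np.array([1.0]))
hard = focal_loss(np.array([0.30]), np.array([1.0]))
```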

2 Experimental results and analysis

2.1 Dataset

The multimodal brain tumor segmentation challenge (MBTSC) [16] is a platform for evaluating the development of machine-learning-based tumor segmentation methods. Its benchmark dataset includes glioma 3D MRI images (both LGG and HGG) and ground-truth labels annotated by professional doctors; the multimodal scans provided can be used for training and validating neural networks on specific segmentation tasks [6, 11].

The MBTSC-2019 dataset [16] is used for the experiments. It is a comprehensive dataset collected from 19 different organizations and institutions, containing multimodal MRI scans of each patient, namely T1-weighted images, contrast-enhanced T1-weighted images, T2-weighted images, and FLAIR, with the tumor sub-regions already segmented. The data were preprocessed for normalization: they were skull-stripped, aligned to an anatomical template, and resampled to 1 mm³ resolution. The dimension of each sequence is 240×240×155. Example images from the training set and their corresponding ground-truth labels are shown in Figure 4, where the tumor regions are color-coded: the enhancing tumor is shown in red, the whole tumor in green, and the tumor core in blue and red. As can be seen from Figure 4, the manual labels highlight three tumor regions: peritumoral edema, enhancing tumor, and the necrotic and non-enhancing core.

 

The training set of MBTSC-2019 is used to train the model, and the validation set is used to evaluate the proposed ensemble method. The training set consists of 259 HGG and 76 LGG patients, all with professionally annotated ground-truth labels. The validation set includes 125 cases of unknown grade, whose true labels are not publicly available.

It is worth mentioning that the proposed method effectively improves accuracy even though, unlike some other methods, the author did not use any external dataset for pre-training during the training phase. Furthermore, since access to the MBTSC-2019 test set is limited to competition participants, the test results are reported on the MBTSC-2019 validation set. The segmentation results of the proposed ensemble are first reported on the validation set and then compared with existing state-of-the-art methods.

2.2 Experimental results and analysis

The 2 sub-networks are selectively trained on the MBTSC-2019 training set (n = 335) and tested on the provided MBTSC-2019 validation set (n = 125); the segmentation maps of these models are then intelligently combined to give the final prediction of tumor tissue type. The final results show that the proposed method obtains a dice score of 0.749 for enhancing tumor segmentation, 0.905 for whole tumor segmentation, and 0.847 for tumor core segmentation. Figures 5-7 show an example of segmentation results on an MRI FLAIR sequence, in transverse, coronal, and sagittal section respectively. For this example, segmentation maps are first generated from the two sub-models separately, and the final merged output is then displayed. The patient's dice scores for enhancing tumor, whole tumor, and tumor core were 0.927, 0.947, and 0.924, respectively.

 Different integration techniques are further analyzed to determine whether there are differences among these methods and to determine the most accurate method. The analysis results are listed in Table 3.

The U-shaped ConvNet segments the tumor core more accurately; for the whole tumor, the two sub-networks perform similarly, and combining their results yields a slight improvement. Compared with a simple average, the proposed "optimized integration" scheme achieves better accuracy.

2.3 Comparison with the top 3 methods in the final ranking of the MBTSC-2019 competition

The optimized ensemble ConvNet method proposed by the author was evaluated on the MBTSC-2019 validation set and then compared with the dice scores of the top three methods in the MBTSC-2019 competition; the results are listed in Table 4.

As Table 4 shows, the cascaded U-ConvNet [29] achieved the best results in the challenge; the author's method shows a clear gap on the enhancing tumor, while its results for the whole tumor and tumor core are not far behind. It obtains better tumor-core results than the bag-of-tricks ConvNet [30], with only a small gap on the enhancing tumor and whole tumor. Similarly, compared with the triple-plane cascaded ConvNet [31], the proposed method segments the tumor core more accurately.

2.4 Comparison with the results of other methods

Table 5 shows the comparison with the methods in the literature [23-27] (validation is performed on the MBTSC-2019 dataset under the original settings). For a fair comparison, no other external data are used during training, and the best performance per method is underlined.

 

As Table 5 shows, except for the enhancing tumor, the proposed optimized ensemble ConvNet obtains better segmentation results than the other networks for the whole tumor and tumor core. The reasons for not choosing a higher-performing backbone as a sub-network are as follows: first, from the perspective of clinical medicine, the tumor core and whole tumor are used more often and carry more diagnostic weight than the enhancing tumor; second, under the MBTSC rules, the scoring importance is tumor core > whole tumor > enhancing tumor; finally, the focus of the author's method is to validate the prior-based integration, so "Occam's razor" fully applies and there is no need to swap in other, more advanced networks. From this point of view, the optimized integration of the 3D-ConvNet and the 3D U-shaped ConvNet presents a promising direction for brain tumor image segmentation.

Its high efficiency ultimately yields better segmentation accuracy than other excellent methods of the same period.

3 Discussion

The author proposes an ensemble of a 3D-ConvNet and a 3D U-shaped ConvNet for brain tumor segmentation on multimodal MRI data, obtaining competitive classification accuracy on the validation set. The method achieves average dice scores of 0.750, 0.906, and 0.846 for the enhancing tumor, whole tumor, and tumor core respectively, outperforming many current methods.

Although the method performs well on the whole tumor and tumor core categories, the segmentation accuracy for the enhancing tumor still needs improvement. An interesting thresholding scheme was implemented in [29]: if the enhancing tumor is smaller than a set threshold, the region is relabeled as necrotic tissue, which significantly improves the classification accuracy of enhancing tumors. The author's work still has room for improvement. First, the proposed segmentation ensemble has so far been evaluated only on the official MBTSC-2019 validation set; it could be further verified on individual clinical MRI data. Second, the author performed no extensive preprocessing of the dataset or postprocessing of the results. By comparison, many works normalize their data by intensity normalization [32] and bias correction [33] to minimize variability, and postprocessing methods such as conditional random fields [34] have likewise been shown to improve segmentation accuracy. Despite these areas that need to be supplemented and improved, the proposed optimized ensemble method still shows efficient and robust tumor segmentation accuracy across multimodal regions. In future work, besides the aspects mentioned above, the author plans to add some clinical data to the ensemble and further tune the hyperparameters to improve the model.

4 Conclusion

Automatic and accurate segmentation of brain tumors from multimodal magnetic resonance images is very important for the diagnosis and treatment of related brain cancers. The author proposes an optimized ensemble ConvNet method that integrates the segmentation maps of a 3D-ConvNet that performed well in MBTSC-2018 with those of a 3D U-shaped ConvNet that also performs well on medical segmentation. Specifically, the two models are trained on the multimodal dataset used in MBTSC-2019, and different segmentation roles are assigned by analyzing their differential performance. Finally, the optimized ensemble is verified on the validation set, and the results show that the proposed method achieves better accuracy on multimodal tumor images than other methods of the same period.

5 References

Omitted.

Interested readers can download the full paper from CNKI (Zhiwang).


Article source: Han Bing, Wang Peng, Zhou Yi. Brain cancer image segmentation method based on optimized ensemble ConvNet [J]. Journal of Anhui University (Natural Science Edition), 2022,46(06)


Origin blog.csdn.net/Bella_zhang0701/article/details/128135225