Paper reading-Multi-scale Domain-adversarial Multiple-instance CNN for Cancer Subtype Classification

Paper: Multi-scale Domain-adversarial Multiple-instance CNN for Cancer Subtype Classification with Unannotated Histopathological Images
Venue: CVPR 2020
Affiliations: Nagoya Institute of Technology; Nagoya University Hospital; RIKEN (Japan)

Abstract

This paper proposes a method for classifying cancer subtypes from unannotated pathological images; the method automatically detects the cancerous regions in a whole slide image (WSI). The main difficulties are:

(1) A whole slide image of a pathological tissue section can be as large as 40000 × 40000 pixels, so annotating cancer regions on a WSI is very costly;

(2) Local and global features have to be extracted at different magnifications;

(3) The model must be robust to the systematic variation of H&E staining and stably detect features at all levels.

To solve these problems, the paper combines multiple-instance learning, domain-adversarial training and multi-scale learning in one framework, which classifies the subtypes of malignant lymphoma in H&E-stained tissue sections without region annotations.

# Section I Introduction


This paper designs a new neural network for cancer subtype classification. It takes digital images of H&E-stained pathological tissue sections as input. The whole slide image (WSI) of a tissue section is usually very large, so it is normally cut into patches before being fed into the network. This would require labeling the tumor area for each patch, and the labeling cost is too high to ignore. If unlabeled patches are used for classification instead, three problems arise:


(1) Cancerous and normal regions are intertwined in a WSI, and the pathologist needs to identify the tumor region before classifying the subtype;

(2) Staining varies considerably with experimental conditions. Different batches show large color differences after H&E staining, and pathologists must make their judgment despite these differences;

(3) During diagnosis, pathologists continually adjust the magnification and reach an overall conclusion by observing features of the tissue at several magnifications.



The paper therefore designs a practical CNN model that imitates the way pathologists actually diagnose. To solve the three problems above at the same time, it combines multiple-instance learning (MIL), domain-adversarial training (DA) and multi-scale learning (MS) into one CNN framework, and evaluates it on subtype classification of 196 malignant lymphoma cases from 80 hospitals. The results are close to the pathologists' diagnoses, and the model attends to tumor regions in a way similar to a real pathologist.

The main contributions of the paper are:

(1) A CNN based on MIL+DA+MS is proposed for classifying H&E-stained pathological images;

(2) The network is successfully applied to subtype classification of 196 cases of malignant lymphoma;

(3) Comparison with immunostaining, which is used to confirm malignant lymphoma, shows that the network follows the same diagnostic process as a pathologist, focusing on tumor regions at different magnifications.

# Section II Preliminaries


Part A Problem setup

The problem is set up as binary classification of pathological WSIs. The image is still cut into patches, each 224 × 224 pixels. Because a positive WSI does not contain only positive information (that is, tumor regions), the paper borrows the idea of MIL and learns from bags of instances: a positive bag contains at least one positive instance, while a negative bag contains only negative instances.


Multiple-instance learning is then used to classify unseen bags. To imitate viewing at different magnifications, each bag also contains images taken at different magnifications; the superscript s denotes the magnification.
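The bag rule described above can be sketched in a few lines. This is only an illustration of the standard MIL assumption, not code from the paper: `bag_label` and the example bags are hypothetical names, and in real training the instance labels are unknown (only the bag label is observed).

```python
from typing import Sequence


def bag_label(instance_labels: Sequence[int]) -> int:
    """Standard MIL assumption: a bag is positive (1) if at least one
    instance is positive, and negative (0) only if every instance is negative."""
    return int(any(label == 1 for label in instance_labels))


# Hypothetical bags: 1 marks a tumor patch, 0 a non-tumor patch.
positive_bag = [0, 0, 1, 0]   # one tumor patch is enough
negative_bag = [0, 0, 0]      # no tumor patch at all

positive_result = bag_label(positive_bag)
negative_result = bag_label(negative_bag)
```

The network never sees the per-patch labels used here; it has to infer which patches matter from the bag label alone, which is why annotation of tumor regions is not needed.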
Part B Domain-adversarial training
DA is needed because pathological staining varies greatly. Different procedures and experimental environments lead to inconsistent staining, so pre-processing such as color normalization or color augmentation is often required before training.
DA training, by contrast, selectively ignores inconsistent information that is irrelevant to the classification task. Previous work [23] has shown that this works better than simply using image augmentation. The paper therefore also adopts DA, selectively discarding case-specific information in the image so that staining differences between individual cases can be ignored.
For domain-adversarial learning, see:
Domain-Adversarial Training of Neural Networks
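The referenced DANN paper implements this idea with a gradient reversal layer: the features pass through unchanged on the forward pass, but the gradient coming back from the domain predictor is multiplied by a negative factor, so the feature extractor is pushed to make domains harder to tell apart. The sketch below shows only that mechanism with hand-written forward and backward functions; the class name is hypothetical, and whether the paper discussed here uses exactly this layer is an assumption.

```python
import numpy as np


class GradientReversal:
    """Identity on the forward pass, sign-flipped and scaled gradient on the
    backward pass (the mechanism used in DANN). Hypothetical minimal class."""

    def __init__(self, lam: float) -> None:
        self.lam = lam  # strength of the domain-adversarial regularization

    def forward(self, features: np.ndarray) -> np.ndarray:
        # The domain predictor sees the features unchanged.
        return features

    def backward(self, grad_from_domain_predictor: np.ndarray) -> np.ndarray:
        # The feature extractor receives the reversed gradient, so it is
        # updated to increase the domain predictor's loss.
        return -self.lam * grad_from_domain_predictor


layer = GradientReversal(lam=0.5)
features = np.array([1.0, -2.0, 3.0])
upstream_grad = np.array([0.2, 0.4, -0.6])

forward_out = layer.forward(features)
reversed_grad = layer.backward(upstream_grad)
```

In an autograd framework the same behavior is usually written as a custom function with these two passes.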
Part C Multi-scale pathology image analysis
Pathologists need to observe tissue samples at different magnifications for a comprehensive analysis, for example observing tissue structure at low magnification and cell nuclei at high magnification. This hierarchy is therefore also used in pathological image analysis frameworks, and there are two main approaches. One takes a low-resolution image as input, detects a region of interest (ROI), and then performs a more detailed analysis on the ROI. The other automatically selects the appropriate resolution from the image itself, for example by designing a mixture-of-experts network that chooses among images of different resolutions.
To imitate the way pathologists keep adjusting the magnification of the microscope, the paper takes a new approach: each MIL bag contains patches from different locations and at different resolutions at the same time, instead of using the traditional hierarchy or selecting a single resolution.

# Section III Proposed Method

In this paper, the H&E-stained whole slide image is fed into the neural network, and subtype classification is performed from the class predicted for each bag.
The overall network structure is shown in Fig4. It consists of 3 modules and 2 stages. The feature extractor maps each input image to a Q-dimensional feature vector h.
The bag class label predictor is a neural network with an attention mechanism that maps the feature vectors in a bag to a class probability for the whole bag. The domain predictor maps a feature vector to the probability of belonging to each domain.
Stage 1 is a single-scale DA-MIL network, which mainly produces the bag label for the images at each magnification.
Stage 2 is a multi-scale DA-MIL network, in which more feature extractor modules are inserted.
The two stages are described in detail below:
Part A Stage 1: SS-DA-MIL
Stage 1 computes the bag label from the patches at each magnification separately. Note that a domain-adversarial term is added to the cross-entropy loss. The label of the whole bag depends mainly on the instances with larger attention weights; the DA regularization is introduced mainly to suppress the influence of staining differences.
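To make the role of the attention weights concrete, here is a minimal NumPy sketch of attention pooling over the instances of one bag. It uses a common parameterization from the attention-based MIL literature (a tanh layer followed by a softmax over instances); the post does not give the paper's exact formula, so the function name, the parameters `V` and `w`, and all shapes are assumptions made for illustration.

```python
import numpy as np


def attention_pool(H: np.ndarray, V: np.ndarray, w: np.ndarray):
    """Pool instance features into one bag feature.

    H: (n_instances, Q) instance feature vectors
    V: (L, Q) and w: (L,) are attention parameters (random here, learned in practice)
    Returns the attention weights (n_instances,) and the bag feature (Q,).
    """
    scores = np.tanh(H @ V.T) @ w              # one scalar score per instance
    scores = scores - scores.max()             # stabilize the softmax
    weights = np.exp(scores) / np.exp(scores).sum()
    bag_feature = weights @ H                  # attention-weighted sum of instances
    return weights, bag_feature


rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))    # 5 patches, Q = 8 (toy sizes)
V = rng.normal(size=(4, 8))
w = rng.normal(size=4)

weights, bag_feature = attention_pool(H, V, w)
```

Because the weights sum to 1, instances with large weights dominate the bag feature and therefore the bag label; the same weights are what an attention heat map visualizes.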
Part B Stage 2: MS-DA-MIL
In the second stage, each bag contains patches collected at different magnifications. The feature extractors inserted here are the ones trained in stage 1.

# Section IV Experiments

Part A Dataset
The data consists of 196 clinical cases from 80 different institutions, and the diagnoses come from an expert in diagnosis. Diagnosing the subtypes of malignant lymphoma is complicated and difficult. There are many subtypes, and reaching a final conclusion requires combining the H&E pathological images with the immunohistochemical staining results of the same patient. The paper classifies only 5 subtypes, namely:
DLBCL, diffuse large B-cell lymphoma;
AITL, angioimmunoblastic T-cell lymphoma;
HLMC, classical Hodgkin's lymphoma mixed cellularity;
HLNS, classical Hodgkin's lymphoma nodular sclerosis.
DLBCL itself has two subtypes: GCB (germinal center B-cell) and non-GCB.
The experiment first performs binary classification, separating DLBCL, which contains two subtypes, from the other four types. The binary classifier is therefore responsible for distinguishing DLBCL from non-DLBCL samples.
The instances in positive bags are all collected from DLBCL tumor regions, because samples from the non-tumor regions of DLBCL resemble non-DLBCL samples. Negative bags contain patches collected from the non-tumor regions of DLBCL and patches collected from non-DLBCL images.
The final DLBCL class contains 98 samples, and the non-DLBCL class also contains 98 samples.

Part B Experiment Settings
Images at 10x and 20x magnification are used, so S = 2.
Dataset split: training : validation : test = 60% : 20% : 20%.
From each of the two magnifications, 100 patches of 224 × 224 pixels are randomly extracted, 200 patches in total.
Data augmentation is then applied.
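The split and the patch sampling can be sketched as follows. This is a rough illustration under stated assumptions, not the authors' pipeline: the function names are hypothetical, the split is assumed to be done per case, the slide size of 40000 × 40000 is taken from the abstract and reused for both magnifications, and no tissue or background filtering is done.

```python
import random


def split_cases(case_ids, seed=0):
    """Shuffle case IDs and split them roughly 60% / 20% / 20%."""
    ids = list(case_ids)
    random.Random(seed).shuffle(ids)
    n_train = int(0.6 * len(ids))
    n_val = int(0.2 * len(ids))
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


def sample_patch_corners(width, height, n_patches=100, patch=224, seed=0):
    """Draw top-left corners of random patches that fit inside the slide."""
    rng = random.Random(seed)
    return [
        (rng.randint(0, width - patch), rng.randint(0, height - patch))
        for _ in range(n_patches)
    ]


train_ids, val_ids, test_ids = split_cases(range(196))

# One bag: 100 patch positions at each of the S = 2 magnifications.
bag = {
    scale: sample_patch_corners(40000, 40000, seed=i)
    for i, scale in enumerate(("10x", "20x"))
}
total_patches = sum(len(corners) for corners in bag.values())
```

With 196 cases the rounding gives 117 training, 39 validation and 40 test cases; the exact counts used in the paper are not stated in this post.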
Training runs for 10 epochs. The DA regularization parameter lambda also has to be set; it is determined by a formula (the equation image is missing from this post) in which alpha is a hyperparameter.
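Since the formula image did not survive, the sketch below shows the schedule used in the DANN paper referenced in Part B of Section II, where lambda grows smoothly from 0 toward 1 as training progresses. Treating alpha as the steepness constant of that schedule, defining progress as epoch divided by the number of epochs, and the default value 10 (the constant used in DANN) are all assumptions; the paper's own formula may differ.

```python
import math


def da_lambda(progress: float, alpha: float = 10.0) -> float:
    """DANN-style schedule: 0 at the start of training, approaching 1 at the end.

    progress: fraction of training completed, in [0, 1] (assumed definition)
    alpha:    steepness hyperparameter (assumed to play the role of DANN's constant)
    """
    return 2.0 / (1.0 + math.exp(-alpha * progress)) - 1.0


n_epochs = 10
schedule = [da_lambda(epoch / n_epochs) for epoch in range(n_epochs)]
```

Starting from lambda = 0 lets the label predictor learn from clean gradients first, and the adversarial signal is phased in later.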
The feature extractor is a VGG16 pre-trained on ImageNet, with output dimension 25088.
An FC layer in the label predictor then reduces this to a 512-dimensional feature vector.
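The figure 25088 follows from the VGG16 architecture: its convolutional part has five 2 × 2 max-pooling layers and ends with 512 channels, so a 224 × 224 patch becomes a 7 × 7 × 512 feature map. A short check of the arithmetic (the function name is hypothetical):

```python
def vgg16_conv_output_dim(input_size: int = 224) -> int:
    """Flattened size of the VGG16 convolutional output for a square input."""
    n_pool_layers = 5      # each 2x2 max-pool halves the spatial size
    out_channels = 512     # channels of the last convolutional block
    side = input_size // (2 ** n_pool_layers)   # 224 // 32 = 7
    return out_channels * side * side


feature_dim = vgg16_conv_output_dim(224)   # 512 * 7 * 7
```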
Part C Results
Table I shows the experimental results. Compared with the other patch-based methods, MS-DA-MIL achieves the best classification accuracy, and it also outperforms the single-scale DA-MIL, which reflects the value of multi-scale information.
The paper also visualizes the distribution of attention weights when identifying DLBCL. From left to right, Fig5 shows the original H&E-stained image, the attention weight map, and the result of CD20 immunohistochemical staining. Row 1 is at 10x magnification and row 2 at 20x. In the attention map, the weights are normalized to the range 0 to 1, where blue means a weight of 0 and red means a weight of 1 (that is, DLBCL). The red regions, which mark DLBCL-positive areas, coincide with the brown regions in the CD20 stain, which shows that the attention regions computed by MS-DA-MIL are meaningful.
Fig6 illustrates, through heat maps of the attention weights, that attention is more pronounced in the 10x image for some cases and in the 20x image for others. Using images at several magnifications during classification is therefore necessary.

# Section V Conclusion

The CNN framework designed in this paper combines the advantages of multiple-instance learning (MIL), domain-adversarial training (DA) and multi-scale learning (MS). Its purpose is to imitate the judgment process of real pathologists. It effectively completes subtype classification on 196 cases of malignant lymphoma and reaches state-of-the-art results among patch-based methods.

Origin blog.csdn.net/qq_37151108/article/details/107230474