Overlapping cervical cell image segmentation based on deep convolutional neural networks

Note: This article is a reading guide for Accurate segmentation of overlapping cells in cervical cytology with deep convolutional neural networks, published in Neurocomputing.

Overview

  • Field: automated segmentation for cervical precancer cytology screening
  • Methods: a new framework based on deep convolutional neural networks (DCNNs) is proposed, with a double-window-based cell detection and localization method:
    1. Image pixels are first roughly classified into nucleus, cytoplasm, and background using TernausNet.
    2. Cytoplasm is then segmented with a DeepLab V2 model.

Introduction

At present, most overlapping cell segmentation methods use traditional techniques from medical image processing or computer vision to segment the nucleus and cytoplasm. These
methods include:

  1. Wang et al. proposed a framework that applies mean shift clustering and related algorithms to detect nuclei and overlapping cytoplasm in cervical cytology extended depth of field (EDF) and volume images. Rough cell boundaries are first estimated with a defined similarity measure and then refined at the pixel level with a coarse-to-fine strategy.
  2. Nosrati and Hamarneh combined a maximally stable extremal region detector and a random decision forest to find likely nucleus locations, and segmented the corresponding cytoplasm using an active contour model.
  3. Yan et al. developed an automated cell segmentation scheme for genome-wide RNAi screening images based on a multi-directional level set approach.
  4. Tareef et al. devised a two-stage segmentation method that segments nuclei and cell clusters using locally discriminative shape and appearance cues from a set of image superpixels, and used a multi-pass fast watershed method to segment nuclei and cytoplasm from touching or overlapping cell clumps.

Limitation:

  • When cells are densely aggregated, these methods may fail to segment highly overlapping cells.

Cell recognition process

In this paper, a segmentation method for overlapping cells in cervical cytology based on a deep learning model and computer image analysis techniques is proposed. As shown in Figure 1, the approach performs cell segmentation through a two-stage framework, in which the regions of interest from the cell detection stage serve as training samples for the subsequent cytoplasmic segmentation stage.
Its main steps are

  • Image pixels are first roughly classified into nucleus, cytoplasm, and background using the TernausNet model. TernausNet is an improvement of the U-Net structure widely used in image segmentation; its pre-trained encoder drastically reduces training time while helping to prevent overfitting.
  • A double-window-based cell localization method then determines the overlapping cell regions. Different from methods that combine DCNN-based semantic segmentation with traditional image processing, this paper performs DCNN-based instance segmentation combined with cell detection to segment overlapping cells. Instance segmentation must correctly detect individual objects and assign a different label to each object, so that the machine can identify and extract each cervical cell from touching or overlapping cell clumps.

The paper uses the mainstream DeepLab V2 model to segment overlapping cells. The DeepLab model has been widely used in semantic image segmentation tasks and validated on a large number of datasets. In addition, the loss function is redesigned so that the model handles cell pixels and boundaries better. The contributions of this paper mainly include:

  • A segmentation framework that effectively combines DCNNs and image processing techniques is designed to accurately separate nuclei and cytoplasm in highly overlapping cell clumps; it can be adapted and extended to other types of cytological images, such as breast cells.
  • By considering cell edge information, a new loss function is designed to make the DCNN model more sensitive to cell pixels and boundaries, thereby improving segmentation performance (a sketch of such an edge-weighted loss follows below).
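The paper's exact loss is not reproduced in this note, so the following is a minimal sketch of one common way to make cross-entropy edge-sensitive: pixels on label boundaries (found with a morphological gradient of the label map) receive a larger weight. The weighting scheme and `edge_weight` value are assumptions for illustration, not the authors' formula:

```python
import torch
import torch.nn.functional as F

def edge_weighted_cross_entropy(logits, target, edge_weight=5.0):
    """Cross-entropy where pixels on label boundaries get a larger weight.
    logits: (B, 3, H, W) network output; target: (B, H, W) integer labels."""
    # boundary map: a pixel whose label differs from a neighbor is an edge pixel
    t = target.float().unsqueeze(1)                     # (B, 1, H, W)
    dilated = F.max_pool2d(t, 3, stride=1, padding=1)   # local max of labels
    eroded = -F.max_pool2d(-t, 3, stride=1, padding=1)  # local min of labels
    edges = (dilated != eroded).float().squeeze(1)      # 1 on label boundaries
    weights = 1.0 + edge_weight * edges                 # up-weight boundary pixels
    per_pixel = F.cross_entropy(logits, target, reduction='none')
    return (weights * per_pixel).mean()
```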

Method

Our deep learning-based segmentation method consists of three modules: cell detection, cytoplasmic segmentation, and boundary refinement. Cell detection extracts single cells using the TernausNet model together with a double-window-based cell localization method; the improved DeepLab V2 model then segments the cytoplasm from the image background; finally, a post-processing step refines the outer contours of the cells.

Cell detection

Nucleus candidate generation

A TernausNet-based classification method is first applied to the entire image, assigning each pixel one of three labels (nucleus, cytoplasm, or background). This is a preliminary step for finding candidate nuclei and excluding background and non-nuclear material. TernausNet is built on the classic U-Net architecture and initialized with the weights of a VGG11 encoder pre-trained on the large ImageNet dataset. Different from VGG11, TernausNet replaces the fully connected layers with a single 512-channel convolutional layer serving as the central part of the network. Initializing the encoder with pre-trained weights reduces training time and helps prevent overfitting. After classification, all pixels labeled as nucleus are collected as potential nuclei for the subsequent selection process, as shown in Figure 3(b).
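For reference, a minimal PyTorch sketch of a TernausNet-style network is shown below: a U-Net decoder on top of a VGG11 encoder pre-trained on ImageNet, with the fully connected layers replaced by a convolutional center block. The decoder widths are illustrative rather than the paper's exact configuration, and the `VGG11_Weights` enum assumes a recent torchvision:

```python
import torch
import torch.nn as nn
from torchvision import models

class DecoderBlock(nn.Module):
    """Upsample by 2, then refine with a 3x3 convolution."""
    def __init__(self, in_ch, mid_ch, out_ch):
        super().__init__()
        self.block = nn.Sequential(
            nn.ConvTranspose2d(in_ch, mid_ch, kernel_size=4, stride=2, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(mid_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class TernausNetSketch(nn.Module):
    """U-Net with a VGG11 encoder pre-trained on ImageNet; outputs per-pixel
    logits over 3 classes (nucleus / cytoplasm / background)."""
    def __init__(self, num_classes=3):
        super().__init__()
        enc = models.vgg11(weights=models.VGG11_Weights.IMAGENET1K_V1).features
        self.conv1 = enc[0:2]    # 64 channels
        self.conv2 = enc[3:5]    # 128
        self.conv3 = enc[6:10]   # 256
        self.conv4 = enc[11:15]  # 512
        self.conv5 = enc[16:20]  # 512
        self.pool = nn.MaxPool2d(2, 2)
        # central part: a 512-channel convolutional block instead of FC layers
        self.center = DecoderBlock(512, 512, 256)
        self.dec5 = DecoderBlock(256 + 512, 512, 256)
        self.dec4 = DecoderBlock(256 + 512, 512, 128)
        self.dec3 = DecoderBlock(128 + 256, 256, 64)
        self.dec2 = DecoderBlock(64 + 128, 128, 32)
        self.dec1 = nn.Conv2d(32 + 64, 32, kernel_size=3, padding=1)
        self.final = nn.Conv2d(32, num_classes, kernel_size=1)

    def forward(self, x):
        c1 = self.conv1(x)
        c2 = self.conv2(self.pool(c1))
        c3 = self.conv3(self.pool(c2))
        c4 = self.conv4(self.pool(c3))
        c5 = self.conv5(self.pool(c4))
        center = self.center(self.pool(c5))
        d5 = self.dec5(torch.cat([center, c5], dim=1))  # skip connections
        d4 = self.dec4(torch.cat([d5, c4], dim=1))
        d3 = self.dec3(torch.cat([d4, c3], dim=1))
        d2 = self.dec2(torch.cat([d3, c2], dim=1))
        d1 = torch.relu(self.dec1(torch.cat([d2, c1], dim=1)))
        return self.final(d1)
```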

Figure 2. Illustration of the double-window-based cell localization method. (a) Two windows centered on nucleus N1; (b) the cell ROI (green), determined by considering the positions of N1, N2, $R_1^{N_1}$, $R_2^{N_1}$, $R_1^{N_2}$, and $R_2^{N_2}$.

Nucleus candidate selection

To reduce the false-positive nucleus pixels left after the TernausNet-based classification, an AdaBoost classifier is used to filter out erroneous nucleus candidates. Two types of texture features are computed: the gray level size zone matrix (GLSZM) and the histogram of oriented gradients (HoG). These features are used to train the classifier.

The GLSZM descriptor is an advanced statistical matrix for texture representation: the more uniform the texture (large flat zones with close gray levels), the wider and flatter the matrix. GLSZM provides a statistical representation by estimating a bivariate conditional probability density function of the image distribution, while being robust to image noise [27]. From the GLSZM, a total of 13 image attributes are computed: small/large zone emphasis, gray level/zone size non-uniformity, zone percentage, low/high gray level emphasis, small zone low/high gray level emphasis, large zone low/high gray level emphasis, and gray level/zone size variance.
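To make the descriptor concrete, here is a small NumPy/SciPy sketch of a GLSZM: zones are connected regions of equal quantized gray level, and the matrix counts zones by level and size. The 8-level quantization and the single `small_zone_emphasis` attribute shown are illustrative choices; the paper computes 13 such attributes:

```python
import numpy as np
from scipy import ndimage

def glszm(img, levels=8):
    """Gray level size zone matrix: entry (g, s-1) counts the connected
    zones of quantized gray level g that contain exactly s pixels."""
    edges = np.linspace(img.min(), img.max(), levels + 1)[1:-1]
    q = np.digitize(img, edges)                # quantize to `levels` gray levels
    counts = {}
    for g in range(levels):
        labeled, n = ndimage.label(q == g)     # connected zones of level g
        if n == 0:
            continue
        for s in np.bincount(labeled.ravel())[1:]:  # zone sizes (skip label 0)
            counts[(g, s)] = counts.get((g, s), 0) + 1
    smax = max(s for _, s in counts)
    M = np.zeros((levels, smax))
    for (g, s), c in counts.items():
        M[g, s - 1] = c
    return M

def small_zone_emphasis(M):
    """One of the 13 GLSZM attributes: emphasizes many small zones."""
    sizes = np.arange(1, M.shape[1] + 1)
    return (M / sizes**2).sum() / M.sum()
```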

HoG is a popular feature that captures the local texture properties of images. In general, a gradient image G is obtained by applying gradient filters in the horizontal and vertical directions. G is then divided into N small, non-overlapping cells. For each cell C, a histogram h_C of gradient (edge) orientations is computed. The HoG feature is the concatenation of all these histograms.
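Below is a hedged sketch of the candidate-selection step using scikit-image's HoG and scikit-learn's AdaBoost. The patch size, HoG parameters, and estimator count are assumptions, and in the paper the GLSZM attributes would be concatenated with the HoG vector before training:

```python
import numpy as np
from skimage.feature import hog
from sklearn.ensemble import AdaBoostClassifier

def patch_features(patch):
    """HoG descriptor of a fixed-size grayscale patch around a candidate
    nucleus; all patches must share one size so feature lengths match."""
    return hog(patch, orientations=9, pixels_per_cell=(8, 8),
               cells_per_block=(2, 2), block_norm='L2-Hys')

def train_candidate_filter(patches, labels):
    """patches: list of same-size 2-D arrays; labels: 1 = true nucleus,
    0 = false positive. Returns a fitted AdaBoost candidate filter."""
    X = np.stack([patch_features(p) for p in patches])
    clf = AdaBoostClassifier(n_estimators=100)
    return clf.fit(X, labels)
```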


Cell localization

According to the positions of the identified nuclei, cell regions are determined by the double-window method, as shown in Figure 3(d); these regions of interest (ROIs) are used to form the training samples for the subsequent cytoplasmic segmentation task. An ROI is defined as a rectangular window given by the six-tuple $(N_i^r, N_i^c, D_l^{N_i}, D_r^{N_i}, D_u^{N_i}, D_d^{N_i})$, where $i \in \{1, 2, \ldots, C\}$ and $C$ is the total number of nuclei; each window is centered on nucleus $N_i$. $N_i^r$ and $N_i^c$ are the image coordinates of the nucleus center, and $D_l$, $D_r$, $D_u$, and $D_d$ are the distances from $N_i$ to the four sides of the window.

The article defines two initial square ROIs to mark the true cell extent, shown in Figure 2(a) as $R_1^{N_1}$ and $R_2^{N_1}$. The two rectangular windows representing these initial ROIs are $(N_i^r, N_i^c, D_1, D_1, D_1, D_1)$ and $(N_i^r, N_i^c, D_2, D_2, D_2, D_2)$, where $D_1 > D_2$. A criterion based on the cells covered by the windows determines the final ROI size. Figure 2(a) gives an example in which two nuclei (N1 and N2) lie within one cell cluster. The $R_1$ window of N2 and the $R_2$ window of N1 overlap, so $D_l^{N_1}$ is set to half the horizontal distance between N1 and N2; otherwise $D_l^{N_1}$ stays equal to $D_1$. The same rule applies to $D_d^{N_1}$. The resulting ROI is shown in Figure 2(b). This process ensures the extraction of accurate cell regions, including the overlapping regions between densely aggregated cells.
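Read as code, one plausible interpretation of this rule is sketched below; the half-sizes `D1` and `D2`, the overlap test for the two square windows, and the per-side shrinking are illustrative assumptions rather than the paper's exact procedure:

```python
def cell_rois(centers, D1=80, D2=40):
    """Double-window localization. Each nucleus starts with a square window
    of half-size D1; whenever the R1 window of a neighbor overlaps this
    nucleus's R2 window, the facing side is shrunk to half the distance."""
    rois = []
    for i, (r, c) in enumerate(centers):
        Dl = Dr = Du = Dd = float(D1)
        for j, (r2, c2) in enumerate(centers):
            if i == j:
                continue
            # squares of half-size D1 (around N_j) and D2 (around N_i)
            # intersect iff both center offsets are below D1 + D2
            if abs(c2 - c) < D1 + D2 and abs(r2 - r) < D1 + D2:
                if c2 < c:
                    Dl = min(Dl, abs(c - c2) / 2)
                elif c2 > c:
                    Dr = min(Dr, abs(c - c2) / 2)
                if r2 < r:
                    Du = min(Du, abs(r - r2) / 2)
                elif r2 > r:
                    Dd = min(Dd, abs(r - r2) / 2)
        rois.append((r, c, Dl, Dr, Du, Dd))
    return rois

# rois = cell_rois([(120, 100), (150, 160)])  # two nearby nuclei (row, col)
```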


Figure 3. (a) Original image with human annotation; (b) nucleus candidate detection (green dots); (c) nucleus candidate selection (green dots); (d) cell localization represented by rectangular windows; (e) segmentation obtained by DeepLab V2; (f) segmentation after CRFs; (g) final segmentation after cell boundary refinement, using the weighted cross-entropy loss; (h) final segmentation using the traditional cross-entropy loss.

Cytoplasmic segmentation

Data augmentation

Since densely distributed cells are difficult to label manually, the dataset is augmented with a synthesis method before training the DeepLab V2-based segmentation algorithm. Data synthesis also provides more precise annotations of individual cell boundaries, avoiding potential errors in manual annotation. The method uses isolated cells to generate clustered cell images: individual cells are randomly selected and composed into different cell clumps according to the following two parameters (see Algorithm 1 below):

  1. Number of cells per image
  2. The overlap ratio between any pair of cells

Network architecture design

A DeepLab V2 model is built to segment cytoplasm from the image background. The model learns multi-scale contextual features through Atrous Spatial Pyramid Pooling (ASPP); atrous (dilated) convolution effectively enlarges the field of view to incorporate larger context. A pre-trained ResNet-101 serves as the main feature extractor, and a series of atrous convolutions with different dilation rates is added in the ASPP layer, replacing the last block (Conv5_x) of the model.
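As a concrete reference, here is a minimal PyTorch sketch of an ASPP head in the DeepLab V2 style, assuming the standard dilation rates (6, 12, 18, 24) and sum fusion from the original DeepLab V2 design; whether this paper changed those rates is not stated here:

```python
import torch
import torch.nn as nn

class ASPP(nn.Module):
    """Atrous Spatial Pyramid Pooling: parallel 3x3 convolutions with
    different dilation rates, summed to capture multi-scale context."""
    def __init__(self, in_ch, num_classes, rates=(6, 12, 18, 24)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch, num_classes, kernel_size=3, padding=r, dilation=r)
            for r in rates  # padding=r keeps the spatial size for 3x3 kernels
        ])

    def forward(self, x):
        # each branch sees a different receptive field; DeepLab V2 sums them
        return sum(branch(x) for branch in self.branches)

# e.g. applied to the 2048-channel output of a ResNet-101 backbone
aspp = ASPP(in_ch=2048, num_classes=2)  # cytoplasm vs. background
```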

ResNet is a deep residual learning framework that facilitates the training of deep convolutional neural networks. It contains four computation blocks with different numbers of residual units. These residual units perform a series of convolution operations and can be defined as

$$y_l = F(x_l) + x_l$$

where $x_l$ and $y_l$ are the input and output vectors of the $l$-th unit, and $F$ denotes a residual mapping with two or more convolutional layers. This formulation can be implemented in feed-forward neural networks with shortcut connections (skipping one or more layers). The shortcut connections can simply be identity mappings, which makes it possible to gain accuracy from considerably increased network depth.
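To make the formula concrete, below is a minimal PyTorch sketch of a single residual unit implementing $y_l = F(x_l) + x_l$ with a two-convolution mapping $F$ and an identity shortcut; note that ResNet-101 itself uses three-layer bottleneck units, so this shows only the basic form:

```python
import torch
import torch.nn as nn

class ResidualUnit(nn.Module):
    """y_l = F(x_l) + x_l with a two-convolution residual mapping F
    and an identity shortcut (input and output channels assumed equal)."""
    def __init__(self, channels):
        super().__init__()
        self.F = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.F(x) + x)  # shortcut: identity mapping
```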


Algorithm 1 Cell Synthesis Algorithm


Input: original images (O), nuclei masks (NM), cell masks (CM), NUM, and OR
Output: synthesized image (S)

Extract the isolated cells SC_i (i ∈ {1, 2, ..., N}) and obtain the background set B_j (j ∈ {1, 2, ..., M}) via a random selection of background pixels

Step 1: Place NUM cells with the specified OR on a 512 × 512 canvas
for i = 1 : NUM do
  Randomly choose a single cell sc_i from SC_i
  if i = 1 then
    Translate and rotate sc_i by a random distance and angle
    Randomly generate the other cell positions within a region centered on sc_i
  else
    Move sc_i to (t_x, t_y) in the region and then rotate it by a random angle
    Check the validity of the overlap ratio (or) between sc_i and sc_{i-1} against OR
    if or < OR then
      Add sc_i to the canvas
    end if
  end if
end for

Step 2: Compute the gray values of overlapping cells
for k1 = 1 : NUM do
  for k2 = 1 : NUM do
    Define a random value α ∈ [0.8, 0.99]
    Define the cell intersection region R_kk
    R_kk_new = max(sc_k1(R_kk) · sc_k2(R_kk), α × min(sc_k1(R_kk), sc_k2(R_kk)))
  end for
end for

Create S by adding a background from B_j and Gaussian noise
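For concreteness, here is a condensed NumPy sketch of Algorithm 1. It assumes isolated cell crops with intensities normalized to [0, 1] together with their binary masks; random rotation and the sampling of real background pixels are omitted for brevity, and the placement region and noise level are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

def overlap_ratio(m1, m2):
    """Fraction of the smaller cell covered by the intersection of two masks."""
    inter = np.logical_and(m1, m2).sum()
    return inter / max(1, min(m1.sum(), m2.sum()))

def place(cell, mask, size, center):
    """Paste a cell crop and its mask onto an empty size x size canvas."""
    img = np.zeros((size, size))
    msk = np.zeros((size, size), dtype=bool)
    h, w = cell.shape
    r0 = int(np.clip(center[0] - h // 2, 0, size - h))
    c0 = int(np.clip(center[1] - w // 2, 0, size - w))
    img[r0:r0 + h, c0:c0 + w] = cell
    msk[r0:r0 + h, c0:c0 + w] = mask
    return img, msk

def synthesize(cells, masks, NUM, OR, size=512):
    """Compose NUM isolated cells, keeping pairwise overlap below OR and
    darkening intersections as in Step 2 of Algorithm 1."""
    kept = []  # (intensity layer, mask) of each accepted cell
    region = size // 4  # cell centers scattered around the canvas center
    for _ in range(NUM):
        j = rng.integers(len(cells))
        center = size // 2 + rng.integers(-region, region, size=2)
        layer, m = place(cells[j], masks[j], size, center)
        if all(overlap_ratio(m, mp) < OR for _, mp in kept):
            kept.append((layer, m))
    canvas = np.zeros((size, size))
    for layer, m in kept:            # start from the plain cell intensities
        canvas[m] = layer[m]
    for a in range(len(kept)):       # then darken every pairwise intersection
        for b in range(a + 1, len(kept)):
            la, ma = kept[a]
            lb, mb = kept[b]
            inter = ma & mb
            alpha = rng.uniform(0.8, 0.99)
            canvas[inter] = np.maximum(la[inter] * lb[inter],
                                       alpha * np.minimum(la[inter], lb[inter]))
    return canvas + rng.normal(0, 0.01, canvas.shape)  # noise stands in for background
```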
	

Fully Connected Conditional Random Fields (CRFs)

After running the DeepLab model, a fully connected CRF is used as a post-processing step to obtain the final cytoplasmic segmentation. Fully connected CRFs combined with deep neural networks have shown improved segmentation performance on both natural and medical images. Pixel-level deep learning models classify each pixel independently from image-derived features, which can yield unsatisfactory segmentation results; conditional random fields incorporate prior knowledge about the relationships between pixels into the labeling process. The fully connected CRF uses the energy function

$$E = \sum_p \psi_p(x_p) + \sum_{p,q} \psi_{pq}(x_p, x_q)$$

where $p$ and $q$ index connected image pixels and $x_p$ is the label assigned to pixel $p$. The first term is the unary potential $\psi_p(x_p) = -\log(P(x_p))$, where $P(x_p)$ is the label assignment probability at pixel $p$ computed by the DeepLab model. The second term is the pairwise potential, which takes a form that allows efficient inference in fully connected CRFs:

$$\psi_{pq}(x_p, x_q) = \zeta(x_p, x_q) \sum_{m=1}^{2} \omega_m k_m(p, q)$$

where $\zeta(x_p, x_q) = 1$ if $x_p \neq x_q$ and 0 otherwise, meaning that only nodes with different labels are penalized, and $\omega_m$ ($m \in \{1, 2\}$) are predefined weights balancing the two kernels. $k_m(p, q)$ is a Gaussian kernel defined in a different feature space:

$$k_1(p, q) = \exp\!\left(-\frac{\|P_p - P_q\|^2}{2\sigma_\alpha^2} - \frac{\|C_p - C_q\|^2}{2\sigma_\beta^2}\right)$$

$$k_2(p, q) = \exp\!\left(-\frac{\|P_p - P_q\|^2}{2\sigma_\gamma^2}\right)$$

where $P_p$, $P_q$ and $C_p$, $C_q$ are the positions and RGB colors of pixels $p$ and $q$, respectively, and the parameters $\{\sigma_\alpha, \sigma_\beta, \sigma_\gamma\}$ control the scale of the Gaussian kernels. Figure 3(f) shows the segmentation results after applying the CRFs. In summary, the ResNet structure, atrous convolution, ASPP, and fully connected CRFs are combined into one segmentation network to achieve superior cytoplasmic segmentation with detailed cell boundaries.
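Below is a hedged sketch of this post-processing step using the pydensecrf package (a common implementation of Krähenbühl-style fully connected CRFs). The kernel widths `sxy`, `srgb` and the compatibility weights are illustrative stand-ins for $\{\sigma_\alpha, \sigma_\beta, \sigma_\gamma\}$ and $\omega_m$:

```python
import numpy as np
import pydensecrf.densecrf as dcrf
from pydensecrf.utils import unary_from_softmax

def crf_refine(image, probs, iters=5):
    """Fully connected CRF on top of DeepLab's per-pixel probabilities.
    image: (H, W, 3) uint8 RGB; probs: (C, H, W) softmax output."""
    C, H, W = probs.shape
    d = dcrf.DenseCRF2D(W, H, C)
    d.setUnaryEnergy(unary_from_softmax(probs))   # psi_p = -log P(x_p)
    # k2: position-only smoothness kernel (sigma_gamma)
    d.addPairwiseGaussian(sxy=3, compat=3)
    # k1: appearance kernel over position and RGB color (sigma_alpha, sigma_beta)
    d.addPairwiseBilateral(sxy=60, srgb=10,
                           rgbim=np.ascontiguousarray(image), compat=10)
    Q = d.inference(iters)
    return np.argmax(Q, axis=0).reshape(H, W)     # refined label map
```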

Cell Boundary Refinement

To obtain accurate cell contours, a segmentation method based on distance regularized level set evolution (DRLSE) is adopted, in which a distance regularization term and an external energy term drive the zero level set contour toward the desired position. The energy function $E(\phi)$ is defined as
$$E(\phi) = \omega_1 R(\phi) + \omega_2 L(\phi) + \omega_3 A(\phi)$$

where $\phi$ denotes the level set function on the image domain and $(\omega_1, \omega_2, \omega_3)$ are weights balancing the three terms: $R$ is the distance regularization term, $L$ is the length term, and $A$ is the area term. They are given by

$$R(\phi) \triangleq \int_\Omega p(|\nabla \phi|)\,dx, \qquad p(s) = \begin{cases} \frac{1}{(2\pi)^2}\left(1 - \cos(2\pi s)\right), & s \le 1 \\ \frac{1}{2}(s - 1)^2, & s \ge 1 \end{cases}$$

$$L(\phi) \triangleq \int_\Omega g\,\delta(\phi)\,|\nabla \phi|\,dx$$

$$A(\phi) \triangleq \int_\Omega g\,H(-\phi)\,dx$$

where $H$ is the Heaviside function, $\delta$ is its derivative (the Dirac delta function), and $g = 1/(1 + |\nabla G_\sigma * I|^2)$ is the edge indicator function, with $G_\sigma$ a Gaussian kernel of standard deviation $\sigma$. The segmentation map obtained from the DeepLab model serves as the initialization of the DRLSE-based segmentation. Compared with a single pixel as the initialization, the DRLSE method needs only a small number of iterations to move the zero level set from the initialized boundary to the desired cell boundary. In addition, $\omega_3$ is assigned a small value to avoid boundary leakage. The final cell boundary, shown in Figure 3(g), is obtained by DCNN segmentation followed by level set refinement.
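For intuition, a small NumPy/SciPy sketch of the edge indicator $g$ and of a DRLSE-style initialization built from the DeepLab mask is given below; the smoothing scale `sigma` and the step height `c0` are illustrative choices, and the DRLSE update itself is not reproduced:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def edge_indicator(image, sigma=1.5):
    """g = 1 / (1 + |grad(G_sigma * I)|^2): close to 0 on strong edges and
    close to 1 in flat regions, so the contour slows down at cell boundaries."""
    smoothed = gaussian_filter(image.astype(float), sigma)
    gy, gx = np.gradient(smoothed)
    return 1.0 / (1.0 + gx**2 + gy**2)

def init_level_set(mask, c0=2.0):
    """Binary DeepLab mask -> signed step function used to start DRLSE:
    -c0 inside the predicted cytoplasm, +c0 outside."""
    return np.where(mask > 0, -c0, c0)
```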


Conclusion

This article presented a DCNN-based method that addresses the problem of segmenting each individual cell from cell clumps in digitized cytology images. The designed framework has three stages: 1. cell detection; 2. cytoplasmic segmentation; 3. boundary refinement. Combining the improved DeepLab model with classical image analysis techniques provides superior performance in segmenting heterogeneous cytoplasm. In the reported experiments, the DSC, FNRo, and TPRp are 0.93, 0.11, and 0.93, respectively.

Origin: blog.csdn.net/weixin_51717597/article/details/128281285