MS【1】:Metric


Foreword

This article introduces several commonly used evaluation criteria in the field of medical image segmentation: Dice Loss, Sensitivity & Specificity, Hausdorff distance, Average surface distance, etc.


1. Dice Loss

1.1. Dice coefficient

The Dice coefficient is a set similarity measure, usually used to compute the similarity between two samples. Its value lies in $[0, 1]$; the larger the value, the more similar the two samples are.

  • For a segmentation problem, 1 is the best possible segmentation and 0 the worst
  • Used as a loss, it helps with sample (class) imbalance, but training with it can be unstable and prone to exploding gradients

Calculation formula:
$$Dice = \frac{2|X \cap Y|}{|X| + |Y|}$$

Parameter meaning:

  • $|X \cap Y|$ is the number of elements in the intersection of $X$ and $Y$
  • $|X|$ and $|Y|$ are the numbers of elements in $X$ and $Y$ respectively
  • The factor 2 in the numerator is there because the denominator counts the elements common to $X$ and $Y$ twice
  • Sometimes an optional term is added to both the numerator and the denominator: Laplace smoothing
    • It avoids a division by zero when $|X|$ and $|Y|$ are both 0
    • It reduces overfitting
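
As a minimal sketch of the formula above, the following computes the Dice coefficient for two binary masks stored as flat lists of 0/1 values. The function name `dice_coefficient` and the `smooth` parameter (the optional smoothing term) are illustrative names chosen here, not part of any particular library.

```python
def dice_coefficient(x, y, smooth=0.0):
    """Dice coefficient between two binary masks given as flat 0/1 sequences.

    `smooth` is the optional smoothing term added to the numerator and
    the denominator (an assumption of this sketch: the same value in both).
    """
    if len(x) != len(y):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(a * b for a, b in zip(x, y))  # |X ∩ Y|
    size_sum = sum(x) + sum(y)                       # |X| + |Y|
    return (2.0 * intersection + smooth) / (size_sum + smooth)


truth = [1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 1, 0]
print(dice_coefficient(truth, pred))                 # 2*2 / (3+3)
print(dice_coefficient([0, 0], [0, 0], smooth=1.0))  # both masks empty, no 0/0
```

Without the smoothing term, two empty masks would raise a `ZeroDivisionError`, which is the case the smoothing is meant to avoid.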

1.2. F1 score - Dice


| Truth \ Classified | Positive       | Negative       |
|--------------------|----------------|----------------|
| Positive           | True Positive  | False Negative |
| Negative           | False Positive | True Negative  |

  • Precision
    • The proportion of samples predicted as 1 that are actually 1
      $P = \frac{TP}{TP + FP}$
  • Recall
    • The proportion of samples that are actually 1 that are predicted as 1
      $R = \frac{TP}{TP + FN}$
  • Precision and Recall often trade off against each other
    • Raising the Precision of a model tends to lower its Recall
    • Raising the Recall of a model tends to lower its Precision

In a binary classification problem, the Dice coefficient can also be written as:
$$Dice = \frac{2TP}{FP + 2TP + FN} = F1\ score$$
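
The equality between Dice and the F1 score can be checked numerically. The sketch below counts TP, FP, FN and TN for two small label lists, then computes F1 from Precision and Recall and Dice from the counts. The helper names (`confusion_counts`, `f1_score`, `dice_from_counts`) are hypothetical, chosen only for this illustration.

```python
def confusion_counts(truth, pred):
    """Count TP, FP, FN, TN for two equal-length sequences of 0/1 labels."""
    tp = fp = fn = tn = 0
    for t, p in zip(truth, pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 1 and p == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def f1_score(tp, fp, fn):
    """Harmonic mean of Precision and Recall."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def dice_from_counts(tp, fp, fn):
    """Dice coefficient written in terms of the confusion counts."""
    return 2 * tp / (fp + 2 * tp + fn)


truth = [1, 1, 0, 0, 1, 0]
pred = [1, 0, 0, 1, 1, 0]
tp, fp, fn, tn = confusion_counts(truth, pred)
print(tp, fp, fn, tn)
print(f1_score(tp, fp, fn), dice_from_counts(tp, fp, fn))  # the two agree
```

Substituting $P$ and $R$ into $2PR / (P + R)$ and simplifying gives $2TP / (2TP + FP + FN)$, which is why the two values agree up to floating-point rounding.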

1.3. Dice Loss

The mathematical expression of Dice Loss is as follows:
$$DiceLoss = 1 - Dice = 1 - \frac{2|X \cap Y|}{|X| + |Y|}$$

When Dice Loss is used in medical image segmentation, the parameters mean:

  • $X$ represents the pixel labels of the ground-truth segmentation image
  • $Y$ represents the pixel classes of the segmentation image predicted by the model
  • $|X \cap Y|$ is approximated by the element-wise product of the predicted image and the ground-truth label image, summed over all pixels
  • $|X|$ and $|Y|$ are each approximated by the sum of the pixel values in the corresponding image

For a binary segmentation problem, the pixels of the ground-truth label image take only the two values 0 and 1, so $|X \cap Y|$ effectively zeroes out every pixel in the predicted segmentation that is not activated in the ground-truth label image. For the activated pixels, it mainly penalizes low-confidence predictions: high-confidence predictions yield a higher Dice coefficient and therefore a lower Dice Loss, namely:
$$DiceLoss = 1 - \frac{2\sum_{i=1}^N y_i \hat{y_i}}{\sum_{i=1}^N y_i + \sum_{i=1}^N \hat{y_i}}$$

Parameter meaning:

  • $y_i$ is the label value of pixel $i$
  • $\hat{y_i}$ is the predicted value of pixel $i$
  • $N$ is the total number of pixels, equal to the number of pixels in a single image multiplied by the batch size
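
A minimal sketch of this soft Dice Loss in plain Python, assuming the labels are 0/1 values and the predictions are probabilities in [0, 1], both flattened to lists. The name `soft_dice_loss` and the small `smooth` constant are assumptions of this sketch; a real training setup would use the tensor operations of its deep learning framework instead.

```python
def soft_dice_loss(labels, probs, smooth=1e-6):
    """1 - Dice, with the intersection approximated by a sum of products.

    labels: ground-truth values y_i (0 or 1)
    probs:  predicted values y_hat_i (probabilities in [0, 1])
    smooth: small constant guarding against a zero denominator (assumed value)
    """
    intersection = sum(y * p for y, p in zip(labels, probs))  # sum of y_i * y_hat_i
    denominator = sum(labels) + sum(probs)                     # sum y_i + sum y_hat_i
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


labels = [1, 1, 0, 0]
confident = [0.9, 0.8, 0.1, 0.2]  # high confidence on the foreground pixels
hesitant = [0.6, 0.5, 0.1, 0.2]   # low confidence on the foreground pixels
print(soft_dice_loss(labels, confident))
print(soft_dice_loss(labels, hesitant))  # larger loss than the confident case
```

The two predictions differ only on the foreground pixels, so the gap between the two losses comes entirely from the confidence on activated pixels.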

Dice Loss can alleviate the negative impact of foreground/background (area) imbalance in the samples. Foreground/background imbalance means that most of the image does not contain the target and only a small region does. Training with Dice Loss focuses more on mining the foreground region, that is, it favours a lower FN, but it suffers from loss saturation. Therefore, Dice Loss on its own often does not give good results and is usually combined with another loss, for example Dice Loss + CE Loss or Dice Loss + Focal Loss.


2. Sensitivity & Specificity

| Truth \ Classified | Positive       | Negative       |
|--------------------|----------------|----------------|
| Positive           | True Positive  | False Negative |
| Negative           | False Positive | True Negative  |

  • TP: P means the prediction is Positive, T (True) means the prediction is correct; TP means a positive sample is predicted as positive
  • FP: P means the prediction is Positive, F (False) means the prediction is wrong; FP means a negative sample is predicted as positive
  • TN: N means the prediction is Negative, T (True) means the prediction is correct; TN means a negative sample is predicted as negative
  • FN: N means the prediction is Negative, F (False) means the prediction is wrong; FN means a positive sample is predicted as negative
  • FP + TP = all samples classified as positive
  • TP + FN = True Positives + False Negatives = all samples that are really positive

2.1. Sensitivity

TPR: true positive rate, the proportion of all positive cases that are correctly identified as positive

Calculation formula:
$$TPR = \frac{TP}{TP + FN}$$

It can be understood as the probability that a patient who is actually sick is correctly diagnosed, that is, high sensitivity = low missed-diagnosis rate (few false negatives)

2.2. Specificity

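TNR: true negative rate, the proportion of all negative cases that are correctly identified as negative. It is the complement of the false positive rate (FPR), which describes the proportion of negative cases identified as positive cases among all negative cases

Calculation formula:
$$FPR = \frac{FP}{FP + TN}, \quad Specificity = TNR = \frac{TN}{FP + TN} = 1 - FPR$$

It can be understood as the probability that a patient who is not sick is correctly diagnosed as healthy, that is, low specificity = high misdiagnosis rate (many false positives)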
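
As a small worked example, the sketch below computes sensitivity, specificity and the false positive rate from the four confusion counts, using the standard definitions sensitivity = TP / (TP + FN) and specificity = TN / (FP + TN) = 1 - FPR. The function name and the counts are made up for illustration.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity, false positive rate)."""
    sensitivity = tp / (tp + fn)  # TPR: sick patients correctly flagged
    specificity = tn / (fp + tn)  # TNR: healthy patients correctly cleared
    fpr = fp / (fp + tn)          # healthy patients wrongly flagged
    return sensitivity, specificity, fpr


# Hypothetical screening result: 100 sick and 900 healthy patients.
sens, spec, fpr = sensitivity_specificity(tp=80, fp=30, fn=20, tn=870)
print(sens)       # 80 of the 100 sick patients are found
print(spec, fpr)  # the two always sum to 1
```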


3. Hausdorff distance

3.1. Concept

The Hausdorff distance is a distance between two subsets of a metric space. It turns the set of non-empty compact subsets of a metric space into a metric space in its own right.

Informally, two sets are close in the Hausdorff distance if every point of either set is close to some point of the other set. The Hausdorff distance is the longest distance an opponent can force you to travel by choosing a point in one of the two sets, from which you must then travel to the other set. In other words, it is the greatest of all distances from a point in one set to the nearest point in the other set.

Suppose there are two sets:
$$A = \{ a^1, a^2, \cdots, a^p \}, \quad B = \{ b^1, b^2, \cdots, b^q \}$$

3.2. One-way Hausdorff distance

Calculation formula:
$$h(A, B) = \max_{a \in A} \min_{b \in B} \| a - b \|$$
$$h(B, A) = \max_{b \in B} \min_{a \in A} \| b - a \|$$

Parameter meaning:

  • $\| a - b \|$ is the Euclidean distance between $a$ and $b$
  • $h(A, B)$ is also called the forward Hausdorff distance, and $h(B, A)$ the backward Hausdorff distance

Understanding $h(A, B)$:

  • For each point $a^i$ in set $A$, first find the point $b^j$ in set $B$ closest to it and compute the distance between $a^i$ and $b^j$; then sort these nearest-point distances and take the largest one as the value of $h(A, B)$
  • $h(A, B) = d$ means that the distance from every point in $A$ to the set $B$ does not exceed $d$
  • Note that the one-way Hausdorff distance is directed (asymmetric): in most cases $h(A, B)$ is not equal to $h(B, A)$
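
The one-way distance translates almost directly into code. The following is a brute-force sketch for small point sets given as lists of coordinate tuples; it uses `math.dist` from the Python standard library (Python 3.8+) for the Euclidean distance. The function name `directed_hausdorff_distance` and the sample points are chosen for this illustration.

```python
import math


def directed_hausdorff_distance(a_points, b_points):
    """h(A, B): the largest nearest-neighbour distance from A to B."""
    return max(
        min(math.dist(a, b) for b in b_points)  # distance from a to the set B
        for a in a_points
    )


A = [(0.0, 0.0), (4.0, 0.0)]
B = [(0.0, 1.0), (4.0, 3.0), (5.0, 3.0)]
print(directed_hausdorff_distance(A, B))  # forward distance h(A, B)
print(directed_hausdorff_distance(B, A))  # backward distance h(B, A), differs
```

This costs one distance evaluation per pair of points, which is fine for a handful of points but too slow for full 3D surfaces, where spatial indexing or distance transforms are normally used.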

Illustration:

  1. Given two point sets $A$ and $B$, find the Hausdorff distance $h(A, B)$
  2. Calculate the distances $d_{11}, d_{12}, d_{13}$ between $a_1$ and $b_1, b_2, b_3$
  3. Keep the shortest distance $d_{11}$
  4. Calculate the distances $d_{21}, d_{22}, d_{23}$ between $a_2$ and $b_1, b_2, b_3$
  5. Keep the shortest distance $d_{23}$
  6. The larger of $d_{11}$ and $d_{23}$ is the Hausdorff distance: $h(A, B) = d(a_1, b_1)$
  7. It follows that the distance from any point in $A$ to the set $B$ is at most $h(A, B)$

3.3. Two-way Hausdorff distance

Calculation formula:
$$H(A, B) = \max\{ h(A, B), h(B, A) \}$$

The two-way Hausdorff distance is the larger of the two one-way Hausdorff distances. It measures the degree of dissimilarity between two point sets: the smaller the two-way Hausdorff distance, the better the match.
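
A short sketch of the two-way distance, again brute force over lists of coordinate tuples. The helper `one_way` recomputes the one-way distance so that the block is self-contained; both function names are illustrative.

```python
import math


def one_way(a_points, b_points):
    """One-way Hausdorff distance h(A, B)."""
    return max(min(math.dist(a, b) for b in b_points) for a in a_points)


def hausdorff_distance(a_points, b_points):
    """Two-way Hausdorff distance H(A, B) = max{h(A, B), h(B, A)}."""
    return max(one_way(a_points, b_points), one_way(b_points, a_points))


A = [(0.0, 0.0), (4.0, 0.0)]
B = [(0.0, 1.0), (4.0, 3.0), (5.0, 3.0)]
print(hausdorff_distance(A, B))
print(hausdorff_distance(B, A))  # same value: H is symmetric
```

Unlike the one-way distance, the result does not depend on the order of the arguments.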

3.4. Partial Hausdorff distance

However, when the image contains noise or occlusion, the Hausdorff distance defined above can easily produce a mismatch.

Suppose $b_j$ is the point in set $B$ closest to set $A$, and $a_2$ is the point in set $A$ farthest from $b_j$. Because of the noise, the Hausdorff distance is then determined not by the distance between $a_2$ and $b_j$ but by the distance between the noise point and $b_j$, which gives a wrong result.

  • Partial one-way Hausdorff distance:
    • Calculation formula:
      • $h^{f_F}(A, B) = f_F \underset{a^i \in A}{\mathrm{th}} \min_{b^j \in B} \| a^i - b^j \|$
      • $h^{f_R}(B, A) = f_R \underset{b^j \in B}{\mathrm{th}} \min_{a^i \in A} \| b^j - a^i \|$
    • Parameter meaning:
      • $f_F, f_R \in [0, 1]$ are called the forward fraction and the backward fraction; they control the forward distance and the backward distance respectively
      • $\mathrm{th}$ denotes ranking: the nearest-point distances are sorted and the value at the given fraction is taken
      • When $f_F = f_R = 1$, the formula reduces to the original one-way Hausdorff distance
  • Partial two-way Hausdorff distance:
    • $H^{f_F f_R}(A, B) = \max\{ h^{f_F}(A, B), h^{f_R}(B, A) \}$
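
The effect of the fractions can be seen on a small example with one noise point. In the sketch below the ranked value is taken at position ceil(f * n) of the ascending nearest-point distances, clamped to at least 1; that rank convention, like the function names, is an assumption of this sketch, and other implementations may round differently.

```python
import math


def partial_one_way(a_points, b_points, fraction=1.0):
    """Partial one-way Hausdorff distance: a ranked value instead of the maximum."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must lie in [0, 1]")
    nearest = sorted(min(math.dist(a, b) for b in b_points) for a in a_points)
    rank = max(1, math.ceil(fraction * len(nearest)))  # 1-based rank (assumed convention)
    return nearest[rank - 1]


def partial_two_way(a_points, b_points, f_forward=1.0, f_backward=1.0):
    """Partial two-way Hausdorff distance."""
    return max(
        partial_one_way(a_points, b_points, f_forward),
        partial_one_way(b_points, a_points, f_backward),
    )


A = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 10.0)]  # last point is noise
B = [(0.0, 0.5), (1.0, 0.5), (2.0, 0.5)]
print(partial_one_way(A, B, fraction=1.0))   # dominated by the noise point
print(partial_one_way(A, B, fraction=0.75))  # the noise point is ignored
print(partial_two_way(A, B, f_forward=0.75, f_backward=1.0))
```

With a fraction of 1 the function returns the ordinary one-way distance; lowering the fraction discards the largest nearest-point distances, which is where isolated noise points end up.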

4. Average surface distance

4.1. Concept

The average surface distance (ASD) is the average of the surface distances of all points in a point set P. Its symmetric form is called the Average Symmetric Surface Distance (ASSD).

By encoding the voxel data and the distances between voxel points, a lookup table is built, which greatly reduces the amount of computation and the complexity of the algorithm when calculating the distances between points. Note that the distance between voxels is calculated per unit volume.

  • ASD calculation formula:
    • The average of the surface distances from all points in set $X$ to set $Y$, where the distance from a point $x$ to the set $Y$ is the distance from $x$ to its nearest point in $Y$
      $ASD(X, Y) = \sum_{x \in X} \min_{y \in Y} d(x, y) / |X|$
    • Parameter meaning:
      • $d(x, y)$ is a 3D matrix of the Euclidean distances between the two image volumes $X$ and $Y$
  • ASSD calculation formula:
    • The mean of the average surface distance from $X$ to $Y$ and the average surface distance from $Y$ to $X$
      $ASSD(X, Y) = \{ ASD(X, Y) + ASD(Y, X) \} / 2$
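
Both formulas can be sketched directly on small point sets. The code below treats $X$ and $Y$ as lists of surface point coordinates and uses brute-force nearest-point search; the function names are illustrative, and the lookup-table method described in this section is a faster route to the same kind of quantity on voxel data.

```python
import math


def average_surface_distance(x_points, y_points):
    """ASD(X, Y): mean distance from each point of X to its nearest point of Y."""
    total = sum(min(math.dist(x, y) for y in y_points) for x in x_points)
    return total / len(x_points)


def average_symmetric_surface_distance(x_points, y_points):
    """ASSD(X, Y): mean of ASD(X, Y) and ASD(Y, X)."""
    forward = average_surface_distance(x_points, y_points)
    backward = average_surface_distance(y_points, x_points)
    return (forward + backward) / 2.0


X = [(0.0, 0.0), (4.0, 0.0)]
Y = [(0.0, 1.0), (4.0, 3.0), (5.0, 3.0)]
print(average_surface_distance(X, Y))            # (1 + 3) / 2
print(average_surface_distance(Y, X))            # (1 + 3 + sqrt(10)) / 3
print(average_symmetric_surface_distance(X, Y))
```

Compared with the Hausdorff distance on the same points, the average is much less sensitive to a single far-away point, since every point contributes equally.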

4.2. Calculation process

Input A and B:

  • Similar to extracting an isosurface with marching cubes, build the normal vectors of the planes formed by the intersection points, and from the voxel spacing and the normal vectors obtain the surface distance lookup table (a list of length 256)
  • Check the data type (bool) and crop to the valid region
  • Encode by convolution with a kernel (the result is denoted CODE)
  • Set pixel values greater than 0 to 1, requiring at the same time that the value is not 255 (borders)
  • Distance transform: compute the distance from each non-zero point in the image to the nearest background point (i.e. 0) and build a distance map
  • Take CODE as input; its values 0~255 are indices into the lookup table, from which the area distance values are read to form the area distance map (surface_map)
  • Use the pixels of A's borders that are greater than 0 as the index (equivalent to selecting points) into the distance map of B; the result is the distance map between the border voxels of set A and B (denoted distances_map, 1-dimensional)
  • Use the pixels of A's borders that are greater than 0 as the index into the surface_map of A (the result is denoted surfel_map, 1-dimensional)
    • This is equivalent to obtaining the area distance map of the voxel boundary surface
  • Calculate the mean: sum(distances_map * surfel_map) / sum(surfel_map), which gives the ASD
    • The boundary surface area distance map acts as a weight here, which effectively smooths the error caused by sharp regions
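
The final averaging step is an area-weighted mean, which can be sketched on its own. The code below assumes the two 1-dimensional maps have already been extracted as plain lists of equal length; the function name and the sample values are made up for illustration, and the earlier steps (kernel encoding, lookup table, distance transform) are not reproduced here.

```python
def weighted_average_distance(distances_map, surfel_map):
    """Area-weighted mean: sum(distance * area) / sum(area)."""
    if len(distances_map) != len(surfel_map):
        raise ValueError("both maps must have the same length")
    total_area = sum(surfel_map)
    if total_area == 0:
        raise ValueError("surface has zero total area")
    weighted = sum(d * w for d, w in zip(distances_map, surfel_map))
    return weighted / total_area


distances_map = [0.0, 1.0, 4.0]  # border voxel distances to the other surface
surfel_map = [1.0, 1.0, 0.5]     # surface area weight of each border voxel
print(weighted_average_distance(distances_map, surfel_map))  # (0 + 1 + 2) / 2.5
print(sum(distances_map) / len(distances_map))               # unweighted mean
```

The voxel with the largest distance has the smallest area weight in this example, so the weighted mean is lower than the plain mean.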



Summary

This article summarizes several commonly used model evaluation criteria in the field of medical image segmentation. It will be supplemented and revised as further metrics are encountered in later papers.

Origin blog.csdn.net/HoraceYan/article/details/128640520