Agent with Warm Start and Adaptive Dynamic Termination for Plane Localization in 3D Ultrasound

Table of contents

Summary

Method

Framework

Landmark-aware alignment

Plane specific atlas construction

Test Volume Atlas Alignment

Adaptive Dynamic Termination

Experiment

Dataset Setting

Parameters

RL

RNN (Adaptive Dynamic Termination)

3D U-net (detect fetal brain landmarks)

Evaluation Standards

Results

Compare with state-of-the-art methods

Influence of Landmark Alignment Module

Analysis of Adaptive Dynamic Termination

Significance analysis

Clinical biometric assessment from SP

Qualitative evaluation

Blog related links

Paper References


Agent with Warm Start and Adaptive Dynamic Termination for Plane Localization in 3D Ultrasound [TMI 2021] [Official Code] AgentSPL. Keywords: landmark alignment, reinforcement learning, CNN, RNN, standard plane (SP) localization

Summary

In SP localization, the agent may fail to capture the target SP and, without a termination condition, keep exploring indefinitely.

One option is to extend the action space with an additional terminating action [6]. However, enlarging the action space makes training harder and can leave the agent insufficiently trained.

Other works terminate the agent search by detecting oscillation [7] or by choosing the step with the lowest Q-value [9]. Although no extra action is introduced, these methods still require the agent to run inference for the maximum number of steps, which is inefficient.

During the search the agent is often trapped in local minima, so the optimal termination step is difficult to estimate.

Therefore, a dynamic termination strategy is highly desirable in SP localization, to ensure both effectiveness and efficiency.

The newly designed adaptive dynamic termination extends the previous RL framework so that the agent search can stop early, saving up to 67% of the inference time and improving both the accuracy and the efficiency of the framework.

Method

(This post only covers the improvements; for the unchanged parts, see the post on the previous RL framework.)

Framework

Figure 2 Framework

The framework consists of three modules:

1) Landmark-aware alignment provides a warm start for efficient agent search (left)

2) SP positioning based on deep RL (middle)

3) Learning-based adaptive termination to improve localization efficiency and accuracy (right).

Landmark-aware alignment

Figure 4 Pipeline of the landmark-aware alignment module

A landmark-aware alignment module was proposed in [8] as a dedicated warm start for the search process that exploits anatomical prior knowledge.

This section details a more concrete processing pipeline. The landmark-aware module aligns US volumes to the atlas space, thereby reducing the variability caused by fetal posture and ultrasound acquisition.

As shown in Fig. 4, the alignment module consists of two steps: plane-specific atlas construction and test volume-atlas alignment. The details are described below.

In Fig. 4, Θ denotes the angle between the plane normal vectors; see Equation 5 in Section III-C of the paper.

Plane specific atlas construction

In this study, an atlas is constructed to initialize SP localization in the test volume through landmark-based registration.

The atlas selected from the training set therefore needs to contain both the landmarks used for registration and the SP parameters used for plane initialization. As shown in Fig. 4, a specific atlas is selected for each SP to improve localization accuracy, instead of using one common anatomical model for all SPs as in [3].

[3] Diagnostic Plane Extraction from 3D Parametric Surface of the Fetal Cranium

To ensure an effective initialization, the SP of the selected atlas should ideally be as close as possible to the SPs of the other training volumes.

Algorithm 1 determines the atlas volume for a particular plane from the training dataset based on the minimum plane error (i.e., the sum of the angle and distance between two planes).

In the training phase, each volume is first taken as a candidate (proxy) atlas, and landmark-based rigid registration is performed between it and the remaining volumes.

For each candidate atlas, the mean plane error between the registered planes and the ground-truth standard planes is measured, and the volume with the smallest error is selected as the final atlas.
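The atlas selection described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the paper's Algorithm 1: it assumes the rigid registration is a least-squares (Kabsch) fit on matched landmarks, that each plane is stored as a unit normal n and offset d with n·x = d, and that the plane error is the angle in degrees plus the absolute offset difference. The function names (kabsch, transform_plane, plane_error, select_atlas) are hypothetical.

```python
import numpy as np

def kabsch(src, dst):
    """Least-squares rigid transform (R, t) such that dst ~ src @ R.T + t."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    return R, dst_c - R @ src_c

def transform_plane(n, d, R, t):
    """Map the plane n.x = d through the rigid motion x' = R x + t."""
    n_new = R @ n
    return n_new, d + n_new @ t

def plane_error(n1, d1, n2, d2):
    """Assumed plane error: angle between normals (degrees) + |d1 - d2|."""
    cos = n1 @ n2 / (np.linalg.norm(n1) * np.linalg.norm(n2))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))) + abs(d1 - d2)

def select_atlas(landmarks, planes):
    """Try every volume as candidate atlas; return (best index, mean errors)."""
    mean_errors = []
    for i, (n_i, d_i) in enumerate(planes):
        errors = []
        for j, (n_j, d_j) in enumerate(planes):
            if j == i:
                continue
            # Register candidate atlas i onto volume j using the landmarks,
            # then carry the atlas plane into volume j's space.
            R, t = kabsch(landmarks[i], landmarks[j])
            n_m, d_m = transform_plane(n_i, d_i, R, t)
            errors.append(plane_error(n_m, d_m, n_j, d_j))
        mean_errors.append(float(np.mean(errors)))
    return int(np.argmin(mean_errors)), mean_errors
```

Here landmarks is a list of (k, 3) arrays (one per training volume, rows in matching order) and planes is a list of (normal, offset) pairs for one SP. Running it once per SP gives the plane-specific atlases.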

Test Volume Atlas Alignment

Unlike direct regression, the alignment module is based on landmark detection and matching.

Landmark detection is cast as a heatmap regression task [37], which avoids learning a highly abstract mapping function (i.e., from feature representations to landmark coordinates). A customized 3D U-net [38] is trained with the L2-norm regression loss:

L = Σ_{i=1}^{N} ‖H_i − Ĥ_i‖_2

where N = 3 is the number of landmarks, and H_i and Ĥ_i denote the i-th predicted landmark heatmap and the i-th standard landmark heatmap, respectively.

The standard landmark heatmaps are created by placing a Gaussian kernel at each landmark location.

During inference, the test volume is passed through the landmark detector to obtain the predicted landmark heatmaps.

The coordinate with the highest value in each landmark heatmap is taken as the final prediction.
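As a rough illustration of the heatmap pipeline, the sketch below builds a Gaussian target heatmap, evaluates a loss, and decodes a landmark by taking the arg-max. It assumes the loss is the sum over landmarks of the L2 norm of the heatmap difference; the Gaussian width sigma and the function names are assumptions made for this example, not values from the paper.

```python
import numpy as np

def gaussian_heatmap(shape, center, sigma=2.0):
    """Standard heatmap: a Gaussian kernel placed at the landmark voxel."""
    grids = np.meshgrid(*[np.arange(s) for s in shape], indexing="ij")
    dist2 = sum((g - c) ** 2 for g, c in zip(grids, center))
    return np.exp(-dist2 / (2.0 * sigma ** 2))

def heatmap_loss(predicted, standard):
    """Assumed loss: sum over landmarks of ||H_i - H_hat_i||_2."""
    return float(sum(np.linalg.norm(p - s) for p, s in zip(predicted, standard)))

def decode_landmark(heatmap):
    """Predicted landmark = voxel coordinate with the highest response."""
    return tuple(int(v) for v in np.unravel_index(np.argmax(heatmap), heatmap.shape))
```

In a real setup the predicted heatmaps would come from the 3D U-net; here any array of the same shape can stand in for them.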

A bounded environment is created for the agent by mapping the volume to the atlas space with the transformation matrix computed from the landmarks.

In addition, the annotated target plane function of the atlas is used as the agent's initial starting plane function.

Adaptive Dynamic Termination

Given the sequential nature of the iterative interaction, an additional RNN model is used, as shown in Figure 2, to learn the mapping between the Q-value sequence and the optimal termination step.

The Q-values at iteration t are defined as q_t = {q_1, q_2, ..., q_8}, one for each of the 8 candidate actions;

The Q-value sequence is the time-series matrix Q = [q_1, q_2, ..., q_n], where n is the index of the iteration step.

Taking the Q-value sequence as input, the RNN model learns the optimal termination step, defined as the step with the highest angle and distance improvement (ADI).

(ADI is defined by Equation 7 in Section III-C.)

During training, subsequences are randomly sampled from the Q-value sequence as training data, and the step with the highest ADI within the sampled interval is used as the ground truth.

Unlike previous studies [8], [9], we design a dynamic termination strategy to improve the inference efficiency of the reinforcement framework.

Specifically, the RNN model performs one inference every two iterations on the current zero-padded Q-value sequence,

which allows the search to stop early at the first iteration step where the same prediction has been repeated three times.
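The stopping rule can be sketched as a small loop. This is an illustration under stated assumptions rather than the official implementation: predict_step is a hypothetical callable standing in for the trained RNN, "repeated three times" is read as three consecutive identical predictions, and each prediction is clipped to the steps already visited.

```python
import numpy as np

def search_with_adaptive_termination(q_stream, predict_step, max_step=75,
                                     infer_every=2, repeats=3, n_actions=8):
    """Consume Q-value vectors one iteration at a time and stop early.

    Returns (predicted termination step, number of iterations actually run).
    """
    q_seq = np.zeros((max_step, n_actions))  # zero-padded Q-value sequence
    predictions, t = [], 0
    for q_t in q_stream:
        if t == max_step:
            break
        q_seq[t] = q_t
        t += 1
        if t % infer_every == 0:
            # One RNN inference every `infer_every` iterations.
            step = int(np.clip(round(float(predict_step(q_seq))), 1, t))
            predictions.append(step)
            # Stop once the last `repeats` predictions agree.
            if len(predictions) >= repeats and len(set(predictions[-repeats:])) == 1:
                return step, t
    return (predictions[-1] if predictions else t), t
```

With max_step=75, stopping after roughly 25 iterations would correspond to the "up to 67%" saving quoted above; the actual stopping point depends on the trained model.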

A previous study [8] trained the RNN in the termination module with the mean absolute error (MAE) loss function.

However, MAE has a constant back-propagation gradient and does not reflect fine-grained errors.

This study replaces it with a mean squared error (MSE) loss function to obtain a more stable training process.

Since the ground truth, i.e., the optimal termination step, is usually greater than 1 (e.g., 10∼75), the conventional MSE loss may struggle to converge because of overly large gradients during training.

An MSE loss function with a balancing hyperparameter is therefore used, defined as

L_MSE(w) = ‖f(x; w) − δG‖_2^2

where w denotes the RNN parameters, x is the input sequence of the RNN, f(x; w) represents the RNN network, and G is the optimal termination step.

The balancing hyperparameter δ = 0.01 approximately normalizes the value range of the learned steps to [0, 0.75], which simplifies training. The RNN model is trained on the inference results obtained from the training volumes.
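A quick numeric check shows why the scaling helps. The sketch assumes the balanced loss is the squared difference between the RNN output and δ·G; the function name balanced_mse is made up for this example.

```python
import numpy as np

def balanced_mse(prediction, target_step, delta=0.01):
    """Assumed form of the balanced loss: mean of (f(x; w) - delta * G)^2."""
    prediction = np.asarray(prediction, dtype=float)
    scaled_target = delta * np.asarray(target_step, dtype=float)
    return float(np.mean((prediction - scaled_target) ** 2))

# An untrained model that outputs 0 for a true termination step of 75:
plain = balanced_mse([0.0], [75], delta=1.0)      # unscaled MSE
balanced = balanced_mse([0.0], [75], delta=0.01)  # target mapped to 0.75
```

The unscaled error is four orders of magnitude larger than the scaled one, which is the gradient-size problem the balancing hyperparameter is meant to avoid. At inference time the network output would be divided by δ to recover a step index.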

Experiment

Dataset Setting

The proposed framework was validated on three different 3D US datasets:

fetal brain, fetal abdomen, and uterus.

The target SPs were:

Transventricular (TV), transthalamic (TT) and transcerebellar (TC) SPs in the fetal brain,

the abdominal (AM) SP in the fetal abdomen,

Sagittal (S), transverse (T) and coronal (C) SPs in the uterus.

Three landmarks were selected in each fetal US volume and four in each uterine US volume. Fetal brain: the genu and splenium of the corpus callosum, and the center of the cerebellar vermis;

Fetal abdomen: the entrance of the umbilical vein, the center, and the neck of the gallbladder;

Uterus: the two endometrial uterine horns, the endometrial fundus, and the fundus of the uterine wall.

A dataset of 1635 prenatal 3D US volumes was collected:

433 fetal brain, 519 fetal abdomen and 683 uterine US volumes.

Each dataset was randomly split into training, validation, and test sets:

fetal brain 313/20/100, fetal abdomen 389/20/110, uterus 519/20/144.

The average volume size is 270×207×235 for the fetal datasets and 261×175×277 for the uterus dataset.

All volumes were resampled to an isotropic voxel size of 0.5 × 0.5 × 0.5 mm³.

(Isotropic means the voxel spacing is the same in every direction, e.g. a file with spacing 1 mm × 1 mm × 1 mm.

Anisotropic means the voxel spacing differs between directions, e.g. a file with spacing 1 mm × 1 mm × 5 mm.)
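To make the isotropic/anisotropic distinction concrete, here is a minimal NumPy sketch that resamples a volume with arbitrary spacing to 0.5 mm isotropic voxels. It uses nearest-neighbour lookup to stay dependency-free; real pipelines usually interpolate (for example linearly), and the paper does not say which method was used. The function name is hypothetical.

```python
import numpy as np

def resample_isotropic(volume, spacing, new_spacing=0.5):
    """Nearest-neighbour resampling of a 3D volume to isotropic voxels."""
    factors = np.asarray(spacing, dtype=float) / new_spacing
    new_shape = np.maximum(1, np.round(np.array(volume.shape) * factors)).astype(int)
    # For each output index, look up the source voxel that contains it.
    index = [np.minimum((np.arange(n) / f).astype(int), s - 1)
             for n, f, s in zip(new_shape, factors, volume.shape)]
    return volume[np.ix_(*index)]
```

For example, a 4×4×2 volume with spacing 1 mm × 1 mm × 5 mm becomes 8×8×20 voxels at 0.5 mm.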

Four sonographers with 5 years of experience manually annotated the landmarks and SPs for all ultrasound volumes.

All annotations were reviewed under strict quality control by a senior expert with 20 years of experience.

All ultrasound data were anonymized and acquired by experts using a Mindray DC-9 ultrasound system with an integrated 3D probe, with approval from the local institutional review board.

  1. train is the training set.
  2. val is a held-out set evaluated during training to monitor progress, check for overfitting, and tune training parameters. It does not overlap with train and contributes nothing to the final trained weights.
  3. test is the set used to evaluate the model after training has finished.

Only train is strictly required for training; val is optional, and its share can be very small.

test is likewise not needed for training, but part of the data is generally reserved for it; a commonly recommended ratio is 8:1:1.


Many models are nowadays trained without a validation set: mechanisms against overfitting such as Dropout and BN are mature and work well, and in many cases fine-tuning from an existing model is harder than training from scratch. A common practice is therefore to fix the number of training iterations and evaluate the final model directly on the test set.


Parameters

The framework is implemented in PyTorch.

RL

The entire framework is trained with the Adam optimizer [40].

The DDQN was trained for 100 epochs (about 4 days).

The discount factor γ in the loss function (Equation 2) is set to 0.9 (γ ∈ [0, 1]).

The target Q-network copies the parameters of the current Q-network every 1500 iterations.

The maximum number of iterations was 75 for the fetal datasets and 30 for the uterus dataset, leaving the agent enough room to explore.

The ε of the ε-greedy action selection strategy [17] is initially set to 0.6

and multiplied by 1.01 every 10,000 iterations during training until it reaches 0.95.
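The two schedules above (ε growth and target-network synchronisation) are simple enough to write down directly. The helper names below are made up for illustration, and the exact bookkeeping in the official code may differ.

```python
def epsilon_at(iteration, start=0.6, factor=1.01, every=10_000, cap=0.95):
    """Epsilon after `iteration` training iterations: x1.01 every 10,000, capped."""
    return min(cap, start * factor ** (iteration // every))

def should_sync_target(iteration, period=1500):
    """True when the target Q-network should copy the current Q-network."""
    return iteration > 0 and iteration % period == 0
```

With these settings ε reaches the 0.95 cap after 47 updates, i.e. after 470,000 training iterations.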

# The enclosing class and __init__ are added here only so that the
# assignments run on their own; the class name is hypothetical.
class TrainingConfig:
    def __init__(self):
        # =============== define training
        # batch size, INT
        self.batch_size = 4
        # target net weight update term, INT
        # (the target Q-network copies the current Q-network's parameters
        # every 1500 iterations)
        self.target_step_counter = 1500
        # learning rate, FLOAT
        self.lr = 5e-5
        # weight decay, FLOAT
        self.weight_decay = 1e-4
        # reward decay γ, FLOAT
        self.gamma = 0.95
        # memory capacity, INT (prioritized replay buffer)
        self.memory_capacity = 15000
        # epsilon for the greedy, FLOAT
        self.epsilon = 0.6

        # =============== define default
        # gpu id
        self.gpu_id = 0
        # total epoch
        self.num_epoch = 100
        # max steps (default is for the fetal datasets; use 30 for the uterus dataset)
        self.max_step = 75

RNN (Adaptive Dynamic Termination)

The RNN is trained with the MSE loss with balancing hyperparameter δ = 0.01 described in the Adaptive Dynamic Termination section, using the inference results obtained from the training volumes.

The RNN variants (vanilla RNN and LSTM (Long Short-Term Memory) [41]) are trained with the mini-batch stochastic gradient descent (SGD) [42] optimizer:

batch size = 100, learning rate = 1e-4, momentum = 0.5, 100 epochs (about 45 minutes).

The number of hidden units is 64, and the number of RNN layers is 2.
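For orientation, the settings above map onto PyTorch roughly as follows. This is a sketch under assumptions: the class name, the single linear head reading the last time step, and the train_step helper are illustrative choices, not the official AgentSPL code.

```python
import torch
from torch import nn

class TerminationLSTM(nn.Module):
    """2-layer LSTM with 64 hidden units over zero-padded Q-value sequences."""

    def __init__(self, n_actions=8, hidden=64, layers=2):
        super().__init__()
        self.lstm = nn.LSTM(n_actions, hidden, num_layers=layers, batch_first=True)
        self.head = nn.Linear(hidden, 1)  # assumed: regress one scaled step

    def forward(self, q_seq):             # q_seq: (batch, steps, n_actions)
        out, _ = self.lstm(q_seq)
        return self.head(out[:, -1]).squeeze(-1)

def train_step(model, optimizer, q_seq, target_steps, delta=0.01):
    """One SGD update with the delta-scaled MSE target."""
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(model(q_seq), delta * target_steps)
    loss.backward()
    optimizer.step()
    return loss.item()
```

A matching optimizer would be torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.5), fed with mini-batches of 100 sequences.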

3D U-net (detect fetal brain landmarks)

Adam optimizer, batch size = 1, learning rate = 0.001, momentum = 0.5, 40 epochs.

Limited by GPU memory, the ultrasound volumes are rescaled by a factor of 0.4 for training.

Gaussian heatmaps of the landmarks are generated as the ground truth.

A customized 3D U-net [38] is trained with the L2-norm regression loss between the predicted and standard landmark heatmaps (Formula 5), where N = 3 is the number of landmarks, and H_i and Ĥ_i denote the i-th predicted landmark heatmap and standard landmark heatmap, respectively. Three landmarks of the fetal brain are detected: the genu of the corpus callosum, the splenium of the corpus callosum, and the cerebellar vermis (shown as the three red dots in (a)).

Hyperparameters were chosen on the validation set, and several metrics were used on the held-out test set to evaluate the performance of the method.

For each hyperparameter, models were trained with different values and evaluated on the validation set.

The values with the best validation performance were used as the default settings for the training phase.

In this study, three high-impact hyperparameters were searched, including the size of Replay Buffer , γ and .

Evaluation Standards

Three criteria were used for evaluation.

Spatial similarity of plane localization:

1. Dihedral angle (Ang) between the two planes

Ang = arccos( (n_p · n_g) / (|n_p| |n_g|) )

n_p and n_g are the normals of the predicted plane and the target plane.

2. Difference in Euclidean distance from the origin to the two planes (Dis)

Dis = |d_p − d_g|

d_p and d_g are the distances from the volume origin to the predicted plane and to the ground-truth plane.

(Ang and Dis are based on the plane sampling function cos(α)x + cos(β)y + cos(γ)z = d;

the effective voxel size is 0.5 mm³/voxel.)

Content similarity:

3. Structural similarity (SSIM) [43].

The ADI at iteration t is defined as the sum of the cumulative changes in distance and angle relative to the starting plane.
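The spatial metrics are easy to compute from the plane parameters. The sketch below assumes that Ang is the arccosine of the normalised dot product of the normals, that Dis is the absolute offset difference converted from voxels to millimetres with the 0.5 mm voxel size, and that ADI counts a reduction in either error as a positive improvement. The function names are illustrative.

```python
import numpy as np

def angle_deg(n_p, n_g):
    """Ang: angle between predicted and ground-truth plane normals (degrees)."""
    n_p, n_g = np.asarray(n_p, dtype=float), np.asarray(n_g, dtype=float)
    cos = n_p @ n_g / (np.linalg.norm(n_p) * np.linalg.norm(n_g))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def distance_mm(d_p, d_g, voxel_mm=0.5):
    """Dis: difference of origin-to-plane distances, in mm (assumed voxel units in)."""
    return abs(d_p - d_g) * voxel_mm

def adi(ang_start, dis_start, ang_t, dis_t):
    """Assumed ADI: cumulative drop in angle plus drop in distance since the start."""
    return (ang_start - ang_t) + (dis_start - dis_t)
```

Under this reading, the ground-truth termination step for the RNN is the iteration t at which adi(...) is largest.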

Results

Table I: Comparison of the proposed method with existing methods on the fetal datasets (mean ± std; best results in bold).

Ours: RL (WSADT)

Table II: Comparison of the proposed method with existing methods on the uterus dataset (mean ± std; best results in bold).

Table III: Ablation study of the warm start on the fetal datasets (mean ± std; best results in bold).

Table IV: Ablation study of the warm start on the uterus dataset (mean ± std; best results in bold).

Table V: Ablation study of the termination strategies on the fetal datasets (mean ± std; best results in bold).

Table VI: Ablation study of the termination strategies on the uterus dataset (mean ± std; best results in bold).

Table VII: Average termination steps of adaptive dynamic termination and active termination.

Table VIII: Ablation study of the number of layers and hidden units of the LSTM on the fetal brain dataset.

Table IX: P-values of paired t-tests between each method and our method for the three performance metrics on the fetal datasets. Bold results indicate significant differences.

Table X: P-values of paired t-tests between each method and our method for the three performance metrics on the uterus dataset. Bold results indicate significant differences.

Table XI: Quantitative analysis of segmentation performance and clinical biometric evaluation.

Figure 7. Visualization of our method on sampled SPs from the fetal ultrasound datasets. (a) transcerebellar SP, (b) transventricular SP, (c) transthalamic SP, (d) abdominal SP. For each case, the upper left is the predicted standard plane, the upper right is the ground truth, the lower left is the inference curve of the termination module, and the lower right shows the 3D spatial positions of the predicted plane and the ground truth.

Figure 8. Visualization of our method on sampled SPs from the uterus ultrasound dataset. (a) mid-sagittal SP, (b) transverse SP, (c) coronal SP. For each case, the upper left is the predicted standard plane, the upper right is the ground truth, the lower left is the inference curve of the termination module, and the lower right shows the 3D spatial positions of the predicted plane and the ground truth.

Compare with state-of-the-art methods

To test the effectiveness of the proposed method in standard plane localization, we compare it with a classical learning-based regression method (denoted Regression), the state-of-the-art automatic view planning method [9] (denoted AVP), and our previous method [8] (denoted RL-US). For a fair comparison, we used the default plane initialization strategies for both Regression and AVP, and retrained both models using publicly available implementations. We also tuned the training parameters to obtain the best localization results. As shown in Table I and Table II, our method achieves the highest accuracy on almost all metrics, which demonstrates its superior capability in standard plane localization.

Influence of Landmark Alignment Module

To verify the impact of the landmark-aware alignment module, we compare the performance of the framework with and without it. In the Pre-Regist method, the agent starts from a random plane function as in [9], and the step with the lowest Q-value [9] is chosen as the termination step. The Regist method uses the alignment module only, without any agent search. The Post-Regist method is the search result of an agent warm-started with the alignment module. For a fair comparison, Post-Regist also uses the lowest Q-value termination strategy. As shown in Table III and Table IV, the accuracy of Pre-Regist is significantly lower than that of Regist and Post-Regist. This demonstrates that the landmark-aware alignment module consistently improves plane detection accuracy. Figure 5 shows the 3D spatial distribution of the fetal brain landmarks before and after alignment. All landmarks are mapped to similar spatial locations, indicating that all fetal poses are roughly aligned.

Analysis of Adaptive Dynamic Termination

To demonstrate the impact of the proposed adaptive dynamic termination (ADT) strategy, we compare it with existing popular strategies: termination at the maximum iteration (Max-Step), at the lowest Q-value (Low Q-value [9]), and active termination with an LSTM [8] (AT-LSTM). We also compare ADT with different backbone networks, including a multi-layer perceptron (ADT-MLP), a vanilla RNN (ADT-RNN) and an LSTM (ADT-LSTM). The superscript ∗ indicates that the model uses the normalized MSE loss function (L_MSE, Equation 4). As shown in Table V and Table VI, with the adaptive dynamic termination strategy the agent avoids being trapped in poor local minima and achieves better performance. Moreover, Table VII shows that the proposed dynamic termination saves up to about 67% of the inference time, improving the efficiency of the reinforcement framework.

Significance analysis

To investigate whether the differences between methods are statistically significant, we performed paired t-tests between the results of our method and those of Regression, AVP [9], and Regist. The tests were carried out for all performance metrics: angle, distance and SSIM. The significance level was set to 0.05. The results are shown in Tables IX and X. Together with Tables I-IV, they show that our method performs best among the existing methods (Regression, AVP [9]) and Regist. Although the improvement of our method over AT-LSTM [8] is not statistically significant, our method saves up to 67% of the inference time, as shown in Table VII.
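A paired t-test of this kind can be run with SciPy's ttest_rel, which compares two methods case by case on the same test volumes. The numbers below are synthetic and only show the mechanics; they are not results from the paper.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic per-volume angle errors (degrees) for two methods on the
# same 100 test volumes; made-up values for illustration only.
ours = rng.normal(9.0, 2.0, size=100)
other = ours + rng.normal(3.0, 2.0, size=100)

t_stat, p_value = stats.ttest_rel(ours, other)
significant = p_value < 0.05  # significance level used in the post
```

The test is paired because both methods are evaluated on the same volumes, so per-volume differences rather than group means carry the signal.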

Clinical biometric assessment from SP

In this section, we further examine whether the detected planes yield biometric measurements consistent with those obtained from manually acquired planes, which is the more clinically relevant question. To obtain biometrics on the predicted planes (TT and AM), we segment the fetal head and abdomen using a pre-trained DeepLabv3+ [44]. Two minimal ellipses are then generated: one from the segmented fetal head or abdomen on the predicted plane, and one from the ground truth annotated on the target plane. We evaluate the biometric measurements with three metrics: Dice score (Dice), absolute error (A-Error) and relative error (R-Error). As shown in Table XI, the proposed method achieves good Dice scores. The absolute and relative errors are 1.125 mm and 2.05% for fetal head circumference, and 3.608 mm and 3.25% for abdominal circumference. The p-values in Table XI also show that the predicted biometrics are not significantly different from the annotations. This is comparable to human-level performance [45], [46] and suggests that the proposed method has the potential to be applied in real clinical settings.
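The three biometric metrics can be sketched as follows. The circumference uses Ramanujan's approximation for the perimeter of an ellipse, which is an assumption made here for illustration (the post does not say how the circumference is derived from the fitted ellipse), and the function names are hypothetical.

```python
import math
import numpy as np

def dice(mask_a, mask_b):
    """Dice overlap between two binary segmentation masks."""
    a, b = np.asarray(mask_a, dtype=bool), np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / total if total else 1.0

def ellipse_circumference(semi_a, semi_b):
    """Ramanujan's approximation of an ellipse perimeter from its semi-axes."""
    h = ((semi_a - semi_b) / (semi_a + semi_b)) ** 2
    return math.pi * (semi_a + semi_b) * (1 + 3 * h / (10 + math.sqrt(4 - 3 * h)))

def abs_rel_error(predicted, ground_truth):
    """Absolute error and relative error (percent) of a measurement."""
    err = abs(predicted - ground_truth)
    return err, 100.0 * err / ground_truth
```

Head or abdominal circumference would be computed once on the predicted plane and once on the annotated plane, and the two values compared with abs_rel_error.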

Qualitative evaluation

Figure 7 and Figure 8 visualize the results of the proposed method. They show the predicted planes, ground truth, termination curves and 3D spatial visualization for four randomly selected cases. The predictions are spatially close and visually similar to the ground truth. Furthermore, the method consistently reaches an ideal stopping point, whereas neither the maximum-iteration nor the lowest Q-value termination strategy finds the optimal termination step.

While RL was effective in localizing view planes in MRI [9], it failed to localize SPs in 3D US. Without an alignment module and early stopping, AVP has to perform agent training and inference in a huge search space, whereas learning-based localization methods find SPs more easily within a limited search space. This may explain the relatively low performance of [9] in Tables III and IV. The proposed landmark-aware alignment module is designed to address exactly this concern. It aligns all volumes to the same atlas space using rigid registration, which constrains the environment much as in MRI images. Furthermore, the alignment can be seen as a prior-based initialization of the agent when testing US volumes, which reduces the search space to a fine-grained subspace.

In deep RL, an appropriate termination strategy is essential, yet during the iterative search the agent is often trapped in local minima, making the optimal termination step difficult to estimate. Previous studies [7], [9] proposed several different termination strategies. However, as shown in Tables V and VI and Figures 7 and 8, these experience- or prior-knowledge-based termination strategies fail to estimate the optimal termination step in this challenging task. Meanwhile, previous studies [9], [8] let the agent terminate at a fixed maximum step by default, which makes the localization system inefficient. Our previous study designed a learning-based active termination that uses an RNN to learn the mapping between Q-value sequences and the optimal step. However, it still has to wait for the agent to complete inference. In contrast, the termination module here uses an RNN to learn the implicit relationship between the Q-value curve and the optimal termination step, which allows the agent search to be terminated dynamically. The resulting RL framework gives more accurate and efficient predictions. Note that this learning-based termination strategy is a general approach that can be applied to other similar tasks.

Blog related links

Summary of identifying and extracting standard planes in 3D ultrasound + papers + code collection - luemeon's blog - CSDN blog

Searching Collaborative Agents for Multi-plane Localization in 3D Ultrasound Multi-agent Reinforcement Learning (MARL) locates multiple standard slices of ultrasound_luemeon's blog-CSDN blog

Agent with Warm Start and Active Termination for Plane Localization in 3DUltrasound Ultrasound Standard Section Based on Reinforcement Learning_luemeon's Blog-CSDN Blog

Agent with Tangent-based Formulation and Anatomical Perception for Standard Plane Localization in 3D_luemeon's Blog-CSDN Blog

Paper References

R EFERENCES
[1] L. Salomon, Z. Alfifirevic, C. Bilardo, G. Chalouhi, T. Ghi, K. Kagan,
T. Lau, A. Papageorghiou, N. Raine-Fenning, J. Stirnemann, S. Suresh,
A. Tabor, I. Timor-Tritsch, A. Toi, and G. Yeo, “Erratum: ISUOG
practice guidelines: Performance of fifirst-trimester fetal ultrasound scan,”
Ultrasound in Obstetrics and Gynecology , vol. 41, no. 2, pp. 102–113,
2013.
[2] L. Salomon, Z. Alfifirevic, V. Berghella, C. Bilardo, E. Hernandez
Andrade, S. Johnsen, K. Kalache, K.-Y. Leung, G. Malinger, H. Munoz,
et al. , “Practice guidelines for performance of the routine mid-trimester
fetal ultrasound scan,” Ultrasound in Obstetrics & Gynecology , vol. 37,
no. 1, pp. 116–126, 2011.
[3] A. I. Namburete, R. V. Stebbing, and J. A. Noble, “Diagnostic plane
extraction from 3d parametric surface of the fetal cranium,” in MIUA ,
pp. 27–32, 2014.
[4] A. W. Moore, “Effificient memory-based learning for robot control,” 1990.
[5] A. G. Barto, R. S. Sutton, and C. W. Anderson, “Neuronlike adaptive
elements that can solve diffificult learning control problems,” IEEE
Transactions on Systems, Man, and Cybernetics , vol. SMC-13, no. 5,
pp. 834–846, 1983.
[6] J. C. Caicedo and S. Lazebnik, “Active object localization with deep
reinforcement learning,” in 2015 IEEE International Conference on
Computer Vision (ICCV) , p. Rev. 2488–2496.
[7] F.-C. Ghesu, B. Georgescu, Y. Zheng, S. Grbic, AK Maier, J. Horneg
ger, and D. Comaniciu, “Multi-scale deep reinforcement learning for
real-time 3d-landmark detection in ct scans,” IEEE Transactions on
Pattern Analysis and Machine Intelligence , vol. 41, pp. 176–189, 2019.
[8] H. Dou, X. Yang, J. Qian, W. Xue, H. Qin, X. Wang, L. Yu, S. Wang,
Y. Xiong, P.-A. Heng, et al. , “Agent with warm start and active
termination for plane localization in 3d ultrasound,” in International
Conference on Medical Image Computing and Computer-Assisted Inter
vention , pp. 290–298, Springer, 2019.
[9] A. Alansary , L. Le Folgoc , G. Vaillant , O. Oktay , Y. Li , W. Bai , .
J. Passerat-Palmbach, R. Guerrero, K. Kamnitsas, B. Hou, et al. ,
“Automatic view planning with multi-scale deep reinforcement learning
agents,” in International Conference on Medical Image Computing and
Computer-Assisted Intervention , pp. 277–285, Springer, 2018.
[10] D. Ni, X. Yang, X. Chen, C.-T. Chin, S. Chen, PA Heng, S. Li,
J. Qin, and T. Wang, “Standard plane localization in ultrasound by
radial component model and selective search,” Ultrasound in Medicine
& Biology , vol. 40, no. 11, pp. 2728–2742, 2014.
[11] X. Yang, D. Ni, J. Qin, S. Li, T. Wang, S. Chen, and PA Heng,
“Standard plane localization in ultrasound by radial component,” in
2014 IEEE 11th International Symposium on Biomedical Imaging (ISBI) ,
pp. 1180–1183, IEEE, 2014.
[12] B. Lei, L. Zhuo, S. Chen, S. Li, D. Ni, and T. Wang, “Automatic
recognition of fetal standard plane in ultrasound image,” in 2014 IEEE
11th International Symposium on Biomedical Imaging (ISBI) , pp. 85–88,
IEEE, 2014.
[13] L. Zhang, S. Chen, C. T. Chin, T. Wang, and S. Li, “Intelligent scanning:
Automated standard plane selection and biometric measurement of early
gestational sac in routine ultrasound examination,” Medical Physics ,
vol. 39, no. 8, p. 5015–5027, 2012.
[14] H. Chen, D. Ni, J. Qin, S. Li, X. Yang, T. Wang, and PA Heng, “Stan
dard plane localization in fetal ultrasound via domain transferred deep
neural networks,” IEEE Journal of Biomedical and Health Informatics ,
vol. 19, no. 5, p. 1627–1636, 2015.
[15] H. Chen, D. Ni, X. Yang, S. Li, and PA Heng, “Fetal abdominal stan
dard plane localization through representation learning with knowledge
transfer,” in International Workshop on Machine Learning in Medical
Imaging , pp. 125–132, Springer, 2014.
[16] H. Chen, L. Wu, Q. Dou, J. Qin, S. Li, J.-Z. Cheng, D. Ni, and P.-A.
Heng, “Ultrasound standard plane detection using a composite neural
network framework,” IEEE Transactions on Cybernetics , vol. 47, no. 6,
pp. 1576–1586, 2017.
[17] W. Huang, C. P. Bridge, J. A. Noble, and A. Zisserman, “Temporal
heartnet: towards human-level automatic analysis of fetal cardiac screen
ing video,” in International Conference on Medical Image Computing
and Computer-Assisted Intervention , vol. 10434, pp. 341–349, Springer,
2017.
[18] Y. Gao and J. A. Noble, “Detection and characterization of the fetal
heartbeat in free-hand ultrasound sweeps with weakly-supervised two
streams convolutional networks,” in International Conference on Medi
cal Image Computing and Computer-Assisted Intervention , vol. 10434,
pp. 305–313, Springer, 2017.
[19] Baumgartner CF, Kamnitsas K, Matthew J, Fletcher TP, Smith S,
L. M. Koch, B. Kainz, and D. Rueckert, “Sononet: real-time detection
and localisation of fetal standard scan planes in freehand ultrasound,”
IEEE Transactions on Medical Imaging , vol. 36, no. 11, pp. 2204–2215,
2017.
[20] J. Schlemper, O. Oktay, M. Schaap, M. Heinrich, B. Kainz, B. Glocker,
and D. Rueckert, “Attention gated networks: Learning to leverage salient
regions in medical images,” Medical Image Analysis , vol. 53, pp. 197–
207, 2019.
[21] L. Wu, J.-Z. Cheng, S. Li, B. Lei, T. Wang, and D. Ni, “FUIQA: Fetal
ultrasound image quality assessment with deep convolutional networks,”
IEEE Transactions on Cybernetics , vol. 47, no. 5, pp. 1336–1349, 2017.
[22] H. Luo, H. Liu, K. Li, and B. Zhang, “Automatic quality assessment
for 2d fetal sonographic standard plane based on multi-task learning,”
arXiv preprint arXiv:1912.05260 , 2019.
[23] Z. Lin, S. Li, D. Ni, Y. Liao, H. Wen, J. Du, S. Chen, T. Wang,
and B. Lei, “Multi-task learning for quality assessment of fetal head
ultrasound images,” Medical Image Analysis , vol. 58, p. 101548, 2019.
[24] P. Zhu and Z. Li, “Guideline-based machine learning for standard plane
extraction in 3d cardiac ultrasound,” in Medical Computer Vision and
Bayesian and Graphical Models for Biomedical Imaging , pp. 137–147,
Springer, 2016.
[25] S. Nie, J. Yu, P. Chen, Y. Wang, and J. Q. Zhang, “Automatic detection
of standard sagittal plane in the fifirst trimester of pregnancy using 3-
d ultrasound data,” Ultrasound in Medicine & Biology , vol. 43, no. 1,
pp. 286–300, 2017.
[26] C. Lorenz, T. Brosch, C. Ciofolo-Veit, T. Klinder, T. Lefevre,
A. Cavallaro, I. Salim, A. T. Papageorghiou, C. Raynaud, D. Roundhill,
et al., “Automated abdominal plane and circumference estimation in
3d us for fetal screening,” in Medical Imaging 2018: Image Processing,
vol. 10574, p. 105740I, International Society for Optics and Photonics,
2018.
[27] K. Chykeyuk, M. Yaqub, and J. A. Noble, “Class-specific regression
random forest for accurate extraction of standard planes from 3d
echocardiography,” in International MICCAI Workshop on Medical
Computer Vision, pp. 53–62, Springer, 2013.
[28] H. Ryou, M. Yaqub, A. Cavallaro, F. Roseman, A. Papageorghiou, and
J. A. Noble, “Automated 3d ultrasound biometry planes extraction for
first trimester fetal assessment,” in International Workshop on Machine
Learning in Medical Imaging, pp. 196–204, Springer, 2016.
[29] A. Schmidt-Richberg, N. Schadewaldt, T. Klinder, M. Lenga, R. Trahms,
E. Canfield, D. Roundhill, and C. Lorenz, “Offset regression networks
for view plane estimation in 3d fetal ultrasound,” in Medical Imaging
2019: Image Processing, vol. 10949, p. 109493K, International Society
for Optics and Photonics, 2019.
[30] Y. Li, B. Khanal, B. Hou, A. Alansary, J. J. Cerrolaza, M. Sinclair,
J. Matthew, C. Gupta, C. Knight, B. Kainz, et al., “Standard plane
detection in 3d fetal ultrasound using an iterative transformation
network,” in International Conference on Medical Image Computing and
Computer-Assisted Intervention, pp. 392–400, Springer, 2018.
[31] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness,
M. G. Bellemare, A. Graves, M. Riedmiller, A. K. Fidjeland, G. Ostrovski,
et al., “Human-level control through deep reinforcement learning,”
Nature, vol. 518, no. 7540, pp. 529–533, 2015.
[32] C. J. Watkins and P. Dayan, “Q-learning,” Machine Learning, vol. 8,
no. 3-4, pp. 279–292, 1992.
[33] Z. Wang, T. Schaul, M. Hessel, H. Hasselt, M. Lanctot, and N. Freitas,
“Dueling network architectures for deep reinforcement learning,” in
International Conference on Machine Learning, pp. 1995–2003, 2016.
[34] K. Simonyan and A. Zisserman, “Very deep convolutional networks for
large-scale image recognition,” arXiv preprint arXiv:1409.1556, 2014.
[35] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep
network training by reducing internal covariate shift,” arXiv preprint
arXiv:1502.03167, 2015.
[36] T. Schaul, J. Quan, I. Antonoglou, and D. Silver, “Prioritized experience
replay,” International Conference on Learning Representations, 2016.
[37] R. Huang, W. Xie, and J. A. Noble, “VP-Nets: Efficient automatic
localization of key brain structures in 3d fetal neurosonography,” Medical
Image Analysis, vol. 47, pp. 127–139, 2018.
[38] Ö. Çiçek, A. Abdulkadir, S. S. Lienkamp, T. Brox, and O. Ronneberger,
“3d U-Net: learning dense volumetric segmentation from sparse
annotation,” in International Conference on Medical Image Computing and
Computer-assisted Intervention, pp. 424–432, Springer, 2016.
[39] A. Paszke, S. Gross, F. Massa, A. Lerer, J. Bradbury, G. Chanan,
T. Killeen, Z. Lin, N. Gimelshein, L. Antiga, et al., “Pytorch: An
imperative style, high-performance deep learning library,” in Advances
in Neural Information Processing Systems, pp. 8026–8037, 2019.
[40] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,”
arXiv preprint arXiv:1412.6980, 2014.
[41] S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural
Computation, vol. 9, no. 8, pp. 1735–1780, 1997.
[42] L. Bottou and O. Bousquet, “The tradeoffs of large scale learning,”
Advances in Neural Information Processing Systems, vol. 20,
pp. 161–168, 2007.
[43] Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image
quality assessment: from error visibility to structural similarity,” IEEE
Transactions on Image Processing, vol. 13, no. 4, pp. 600–612, 2004.
[44] L.-C. Chen, Y. Zhu, G. Papandreou, F. Schroff, and H. Adam,
“Encoder-decoder with atrous separable convolution for semantic image
segmentation,” in Proceedings of the European Conference on Computer Vision
(ECCV), pp. 801–818, 2018.
[45] L. Wu, Y. Xin, S. Li, T. Wang, P.-A. Heng, and D. Ni, “Cascaded fully
convolutional networks for automatic prenatal ultrasound image
segmentation,” in 2017 IEEE 14th International Symposium on Biomedical
Imaging (ISBI 2017), pp. 663–666, IEEE, 2017.
[46] T. L. van den Heuvel, D. de Bruijn, C. L. de Korte, and B. van
Ginneken, “Automated measurement of fetal head circumference using
2d ultrasound images,” PLoS ONE, vol. 13, no. 8, p. e0200412, 2018.

Origin blog.csdn.net/qq_28838891/article/details/126758394