Enhancing Deep Learning CT pancreas segmentation with Test-Time Augmentation and Merging Techniques to Leverage Inter-rater variability and uncertainty
- dc.contributor.author Sastre García, Blanca
- dc.date.accessioned 2023-10-18T14:14:45Z
- dc.date.available 2023-10-18T14:14:45Z
- dc.date.issued 2023-10-18
- dc.description Master's thesis for: Master in Computational Biomedical Engineering. Tutors: Miguel Ángel González Ballester, Meritxell Riera i Marín, Javier García López
- dc.description.abstract In recent years, Deep Learning (DL) models have achieved state-of-the-art performance in automatic segmentation of medical images. However, good results require large, accurately annotated training datasets. In practice, especially in medicine, acquiring such datasets is challenging because they must be manually annotated by experts, a time-consuming task that requires specific skills. Consequently, these datasets often include noisy labels, where the structures of interest are not well defined and annotations vary among experts. This introduces uncertainty in the predictions of neural network models, which needs to be identified, measured, and addressed. In this project, a pipeline is proposed to improve the outputs of a 3D U-net model trained with a noisy dataset generated from the public Computed Tomography (CT) pancreas dataset provided by the Medical Segmentation Decathlon (MSD) Challenge. Once the noisy model is trained, the Test-Time Augmentation (TTA) technique is applied to each test image, generating a set of 50 images with different rotation angles and their corresponding automatic segmentations produced by the noisy model. To obtain a consensus label from them, different merging algorithms are used (intersection, union, majority voting, and STAPLE), and the aleatoric uncertainty is computed as the variance among them. Finally, in areas of high uncertainty, the noisy output is relabeled. Each label is compared with the clean label in terms of Dice coefficient (DC) to evaluate whether any merging algorithm or the relabeled output improves on the noisy model's output. The employed 3D U-net model, with the selected hyperparameters and the available dataset, achieves a DC of 0.6301 on the test set when trained with the clean labels.
When the noisy dataset is used for training, the performance decreases, as expected, to 0.6050. The results show no significant difference in DC among the merging algorithms, with a maximum of 0.6423 for the STAPLE method and a minimum of 0.5740 for intersection. However, the STAPLE method incorporates the model's variability in its predictions, resulting in a more comprehensive output and better performance than the clean model. The relabeled output does not improve on the noisy output, yielding a DC of 0.5940, but it is 78% similar to the output of the clean model. This result shows the significance of the relabeling process in refining the output of a noisy model, bringing it closer to the results obtained when training with correct labels. The main limitations of this project include the difficulty of acquiring large and accurate datasets to investigate the problem, the high computational cost of training DL models, and the complex and variable sizes and shapes of the pancreas between patients. Further work is needed to draw more robust conclusions, for instance by evaluating the results on a larger test set and by applying this methodology to a different dataset.
- dc.format.mimetype application/pdf
- dc.identifier.uri http://hdl.handle.net/10230/58091
- dc.language eng
- dc.language.iso eng
- dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
- dc.rights.accessRights info:eu-repo/semantics/openAccess
- dc.rights.uri https://creativecommons.org/licenses/by-nc-nd/3.0/es/deed.es
- dc.subject.keyword Imaging segmentation
- dc.subject.keyword 3D U-net
- dc.subject.keyword Noisy labels
- dc.subject.keyword TTA
- dc.subject.keyword Merging algorithms
- dc.title Enhancing Deep Learning CT pancreas segmentation with Test-Time Augmentation and Merging Techniques to Leverage Inter-rater variability and uncertainty
- dc.type info:eu-repo/semantics/masterThesis
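
The merging step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the thesis code. It assumes the TTA predictions are binary masks that have already been rotated back into the original image frame, and it treats the variance as a per-voxel variance across those masks. The function names (`merge_masks`, `voxel_variance`, `dice`) are hypothetical, and STAPLE is omitted because it requires an iterative estimation that does not fit a short example.

```python
import numpy as np


def merge_masks(masks):
    """Merge aligned binary masks of shape (n_predictions, ...).

    Assumption: every mask is already in the original image frame.
    """
    masks = np.asarray(masks, dtype=bool)
    n = masks.shape[0]
    votes = masks.sum(axis=0)  # number of predictions marking each voxel
    return {
        "intersection": votes == n,   # all predictions agree on foreground
        "union": votes > 0,           # at least one prediction says foreground
        "majority": votes * 2 > n,    # strictly more than half say foreground
    }


def voxel_variance(masks):
    """Per-voxel variance across predictions, used here as an uncertainty map."""
    return np.asarray(masks, dtype=float).var(axis=0)


def dice(pred, ref):
    """Dice coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    denom = pred.sum() + ref.sum()
    if denom == 0:
        return 1.0  # convention chosen here: two empty masks match perfectly
    return 2.0 * np.logical_and(pred, ref).sum() / denom


if __name__ == "__main__":
    # Toy example: three predictions over four voxels.
    preds = [[1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 1, 0]]
    merged = merge_masks(preds)
    print(merged["majority"].astype(int))
    print(voxel_variance(preds))
    print(dice(merged["majority"], [1, 1, 1, 0]))
```

Voxels with high variance are those where the augmented predictions disagree, which is where the abstract's relabeling step would act. The threshold that defines "high" is not stated in this record and is left out of the sketch.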