
Deep learning-based MR-to-CT synthesis: The influence of varying gradient echo-based MR images as input channels

Florkow, Mateusz C, Zijlstra, Frank, Willemsen, Koen, Maspero, Matteo, van den Berg, Cornelis A T, Kerkmeijer, Linda G W, Castelein, René M, Weinans, Harrie, Viergever, Max A, van Stralen, Marijn, Seevinck, Peter R

DOI: https://doi.org/10.1002/mrm.28008

Magnetic Resonance in Medicine 83 (4), p. 1429-1441

Abstract

PURPOSE: To study the influence of gradient echo-based contrasts as input channels to a 3D patch-based neural network trained for synthetic CT (sCT) generation in canine and human populations.
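As an illustration only (not the authors' implementation), the sketch below shows how a 3D patch-based, UNet-style network can accept a variable number of MR images as input channels. It assumes a PyTorch implementation; the patch size, layer widths, and depth are illustrative choices and do not reflect the network used in the study.

# Minimal sketch (assumed PyTorch): a 3D patch-based UNet-style model whose
# first convolution accepts 1 to 4 MR images stacked as input channels.
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    """Two 3D convolutions with ReLU, preserving the spatial patch size."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
        nn.Conv3d(out_ch, out_ch, kernel_size=3, padding=1), nn.ReLU(inplace=True),
    )

class PatchUNet3D(nn.Module):
    def __init__(self, in_channels):
        super().__init__()
        self.enc1 = conv_block(in_channels, 32)
        self.enc2 = conv_block(32, 64)
        self.pool = nn.MaxPool3d(2)
        self.up = nn.Upsample(scale_factor=2, mode="trilinear", align_corners=False)
        self.dec1 = conv_block(64 + 32, 32)
        self.out = nn.Conv3d(32, 1, kernel_size=1)  # single-channel HU output

    def forward(self, x):
        e1 = self.enc1(x)                                     # full-resolution features
        e2 = self.enc2(self.pool(e1))                         # downsampled features
        d1 = self.dec1(torch.cat([self.up(e2), e1], dim=1))   # skip connection
        return self.out(d1)                                   # synthetic CT patch in HU

# Example: a hypothetical four-channel configuration (e.g. water, fat,
# in-phase, opposed-phase) on an illustrative 32^3 patch.
model = PatchUNet3D(in_channels=4)
patch = torch.randn(1, 4, 32, 32, 32)
print(model(patch).shape)  # torch.Size([1, 1, 32, 32, 32])

Switching between input configurations would then amount to changing in_channels and stacking the corresponding MR images along the channel dimension.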

METHODS: Magnetic resonance images and CT scans of human and canine pelvic regions were acquired and paired using nonrigid registration. Magnitude MR images and Dixon-reconstructed water, fat, in-phase, and opposed-phase images were obtained from a single T1-weighted multi-echo gradient-echo acquisition. From this set, 6 input configurations were defined, each containing 1 to 4 MR images used as input channels. For each configuration, a UNet-derived deep learning model was trained for synthetic CT generation. Reconstructed Hounsfield unit maps were evaluated with peak SNR, mean absolute error, and mean error. The Dice similarity coefficient and surface distance maps were used to assess the geometric fidelity of bones. Repeatability was estimated by replicating the training up to 10 times.
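To make the evaluation concrete, the snippet below gives straightforward NumPy implementations of the reported intensity metrics (mean absolute error, mean error, peak SNR) and the Dice similarity coefficient. It is a minimal sketch, not the authors' code; the body mask, bone threshold, and PSNR peak value are assumptions made for the example.

# Illustrative metric implementations (assumed NumPy), comparing a synthetic
# CT (sct) with the reference CT (ct), both in Hounsfield units.
import numpy as np

def mae(sct, ct, mask):
    """Mean absolute error within a mask."""
    return np.mean(np.abs(sct[mask] - ct[mask]))

def me(sct, ct, mask):
    """Mean (signed) error within a mask."""
    return np.mean(sct[mask] - ct[mask])

def psnr(sct, ct, mask, peak=3000.0):
    """Peak SNR in dB; the peak value is an assumption for the example."""
    mse = np.mean((sct[mask] - ct[mask]) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def dice(bone_sct, bone_ct):
    """Dice similarity coefficient between two boolean bone masks."""
    inter = np.logical_and(bone_sct, bone_ct).sum()
    return 2.0 * inter / (bone_sct.sum() + bone_ct.sum())

# Example on synthetic volumes; in practice sct and ct are registered HU maps
# and bone masks could be obtained, e.g., by thresholding above 200 HU.
ct = np.random.uniform(-1000, 2000, size=(64, 64, 64))
sct = ct + np.random.normal(0, 40, size=ct.shape)
body = np.ones_like(ct, dtype=bool)
print(mae(sct, ct, body), me(sct, ct, body), psnr(sct, ct, body))
print(dice(sct > 200, ct > 200))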

RESULTS: Seventeen canines and 23 human subjects were included in the study. The performance and repeatability of single-channel models depended on the TE-related water-fat interference, with variations of up to 17% in mean absolute error overall and up to 28% in bone. Repeatability, Dice similarity coefficient, and mean absolute error were statistically significantly better in multichannel models, with mean absolute errors ranging from 33 to 40 Hounsfield units in humans and from 35 to 47 Hounsfield units in canines.

CONCLUSION: Significant differences in performance and robustness of deep learning models for synthetic CT generation were observed depending on the input configuration. In-phase images outperformed opposed-phase images, and Dixon-reconstructed multichannel inputs outperformed single-channel inputs.