publication

Deep Learning-Based Segmentation of Locally Advanced Breast Cancer on MRI in Relation to Residual Cancer Burden: A Multiinstitutional Cohort Study

Janse, Markus H.A., Janssen, Liselore M., van der Velden, Bas H.M., Moman, Maaike R., Wolters-van der Ben, Elian J.M., Kock, Marc C.J.M., Viergever, Max A., van Diest, Paul J., Gilhuijs, Kenneth G.A.

DOI: https://doi.org/10.1002/jmri.28679

Journal of Magnetic Resonance Imaging 58 (6), p. 1739-1749

Abstract

Background: While several methods have been proposed for automated assessment of breast-cancer response to neoadjuvant chemotherapy on breast MRI, limited information is available about their performance across multiple institutions.

Purpose: To assess the value and robustness of deep learning-derived volumes of locally advanced breast cancer (LABC) on MRI to infer the presence of residual disease after neoadjuvant chemotherapy.

Study Type: Retrospective.

Subjects: Training cohort: 102 consecutive female patients with LABC scheduled for neoadjuvant chemotherapy (NAC) from a single institution (age: 25–73 years). Independent testing cohort: 55 consecutive female patients with LABC from four institutions (age: 25–72 years).

Field Strength/Sequence: Training cohort: single vendor 1.5 T or 3.0 T. Testing cohort: multivendor 3.0 T. Gradient echo dynamic contrast-enhanced (DCE) sequences.

Assessment: A convolutional neural network (nnU-Net) was trained to segment LABC. Based on resulting tumor volumes, an extremely randomized tree model was trained to assess residual cancer burden (RCB)-0/I vs. RCB-II/III. An independent model was developed using functional tumor volume (FTV). Models were tested on an independent testing cohort and response assessment performance and robustness across multiple institutions were assessed.

Statistical Tests: The receiver operating characteristic (ROC) was used to calculate the area under the ROC curve (AUC). DeLong's method was used to compare AUCs. Correlations were calculated using Pearson's method. P values <0.05 were considered significant.

Results: Automated segmentation resulted in a median (interquartile range [IQR]) Dice score of 0.87 (0.62–0.93), with similar volumetric measurements (R = 0.95, P < 0.05). Automated volumetric measurements were significantly correlated with FTV (R = 0.80). Tumor volume derived from deep learning of DCE-MRI was associated with RCB, yielding an AUC of 0.76 to discriminate between RCB-0/I and RCB-II/III, performing similarly to the FTV-based model (AUC = 0.77, P = 0.66). Performance was comparable across institutions (IQR AUC: 0.71–0.84).

Data Conclusion: Deep learning-based segmentation estimates changes in tumor load on DCE-MRI that are associated with RCB after NAC and is robust against variations between institutions.

Evidence Level: 2.

Technical Efficacy: Stage 4.
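For readers unfamiliar with the two headline metrics reported above, the following is a minimal sketch of how a Dice score and an AUC can be computed. It is not the authors' implementation: the function names `dice_score` and `roc_auc` are hypothetical, the masks and scores are made-up toy values rather than study data, and the AUC uses the pairwise (Mann-Whitney) formulation instead of any particular library routine.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], ref: Sequence[int]) -> float:
    """Dice overlap between two flattened binary masks (1 = tumor voxel)."""
    if len(pred) != len(ref):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, r in zip(pred, ref) if p and r)
    total = sum(1 for p in pred if p) + sum(1 for r in ref if r)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only, not from the study).
automated_mask = [0, 1, 1, 1, 0, 0]
reference_mask = [0, 1, 1, 0, 1, 0]
dice = dice_score(automated_mask, reference_mask)

# Hypothetical model scores; label 1 = RCB-II/III, label 0 = RCB-0/I.
model_scores = [0.9, 0.8, 0.35, 0.4, 0.2]
true_labels = [1, 1, 1, 0, 0]
auc = roc_auc(model_scores, true_labels)

print(f"Dice = {dice:.3f}, AUC = {auc:.3f}")
```

In this toy case two of the three voxels in each mask overlap, and five of the six positive-negative pairs are ordered correctly. Comparing two AUCs with DeLong's method, as done in the study, requires additional variance estimation that this sketch does not cover.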