A study compared the performance of different models using routinely collected clinicopathological data from women with HR-positive/HER2-negative breast cancer
Machine learning (ML) and deep learning (DL) models can predict survival after neoadjuvant chemotherapy in HR-positive/HER2-negative breast cancer using routine clinicopathological data, according to findings from a single-centre Italian study published in ESMO Real World Data and Digital Oncology (ESMO Real World Data and Digital Oncology, Volume 10, 100184).
Predicting long-term survival for women with HR-positive/HER2-negative breast cancer undergoing neoadjuvant chemotherapy is challenging because of the heterogeneous nature of the disease and the occurrence of late recurrences. Pathological complete response (pCR) is considered a robust predictor of recurrence risk (Lancet. 2014 Jul 12;384(9938):164-72). While traditional survival models such as Cox regression (COX) have limitations, ML and DL methods have emerged as valid alternatives that can process high-dimensional data and capture complex non-linear relationships.
In the retrospective, longitudinal cohort study CORALAINE, five ML and four DL models were trained on a combination of commonly available clinicopathological features at baseline and post-surgery pathological features collected from 572 women with a confirmed diagnosis of HR-positive/HER2-negative invasive breast cancer. Clinicopathological information and survival data were retrospectively collected from electronic medical records, including reports from multidisciplinary tumour boards.
The models were compared on their ability to predict disease-free survival (DFS) and overall survival (OS) after neoadjuvant chemotherapy. “A DL model showed the best performance (C-index: 0.70 for DFS, 0.68 for OS), but its advantage over simpler ML models was modest and came at the cost of interpretability,” notes Dr Rodrigo Dienstmann of Oncoclínicas & Co, Brazil, and Vall d’Hebron Institute of Oncology, Spain, Editor-in-Chief of the ESMO peer-reviewed journal.
As highlighted by the study authors, a good concordance index for survival prediction can be achieved even when complex or high-dimensional data are not available, relying instead on data that are routinely collected in clinical practice.
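For readers unfamiliar with the metric, the concordance index (C-index) can be read as the share of comparable patient pairs in which the patient with the higher predicted risk has the event first: 0.5 corresponds to random ranking and 1.0 to perfect ranking. The Python sketch below shows one common formulation (Harrell's C-index) on a small invented dataset. The function name, the toy follow-up times, the event flags and the risk scores are all illustrative assumptions and are not taken from the study, whose implementation is not described here.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable pairs ranked correctly.

    times       -- follow-up time for each patient
    events      -- 1 if the event was observed, 0 if the patient was censored
    risk_scores -- model output, higher means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` is the patient with the shorter follow-up.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the shorter follow-up ended in an event;
        # pairs with tied times are skipped in this simplified sketch.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher predicted risk, earlier event
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Invented toy data, for illustration only (not from the study).
toy_times = [5, 8, 12, 20, 30]          # months of follow-up
toy_events = [1, 0, 1, 1, 0]            # 1 = event observed, 0 = censored
toy_risk = [0.9, 0.4, 0.7, 0.3, 0.5]    # hypothetical model risk scores

toy_c_index = concordance_index(toy_times, toy_events, toy_risk)
print(f"C-index on toy data: {toy_c_index:.3f}")
```

In this toy example seven pairs are comparable and six of them are ranked correctly, so the C-index is 6/7, or about 0.86. Censored patients contribute only to pairs in which the other patient had an event earlier, which is why the metric suits survival data with incomplete follow-up.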
“These findings suggest that practical, interpretable models may be sufficient for clinical use in small datasets,” concludes Dienstmann.