Machine Learning Algorithm to Estimate Distant Breast Cancer Recurrence at the Population Level with Administrative Data
Hava Izci,1 Gilles Macq,2 Tim Tambuyzer,2 Harlinde De Schutter,2 Hans Wildiers,1,3 Francois P Duhoux,4 Evandro de Azambuja,5 Donatienne Taylor,6 Gracienne Staelens,7 Guy Orye,8 Zuzana Hlavata,9 Helga Hellemans,10 Carine De Rop,11 Patrick Neven,1,3 Freija Verdoodt2 1KU Leuven - University of Leuven,...
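Main Authors: | Izci H, Macq G, Tambuyzer T, De Schutter H, Wildiers H, Duhoux FP, de Azambuja E, Taylor D, Staelens G, Orye G, Hlavata Z, Hellemans H, De Rop C, Neven P, Verdoodt F |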
---|---|
Format: | Article |
Language: | English |
Published: | Dove Medical Press, 2023-05-01 |
Series: | Clinical Epidemiology |
Subjects: | machine learning, breast cancer, distant metastases, recurrences, algorithm, administrative data |
Online Access: | https://www.dovepress.com/machine-learning-algorithm-to-estimate-distant-breast-cancer-recurrenc-peer-reviewed-fulltext-article-CLEP |
_version_ | 1797831832984616960 |
---|---|
author | Izci H Macq G Tambuyzer T De Schutter H Wildiers H Duhoux FP de Azambuja E Taylor D Staelens G Orye G Hlavata Z Hellemans H De Rop C Neven P Verdoodt F |
author_sort | Izci H |
collection | DOAJ |
description | Hava Izci,1 Gilles Macq,2 Tim Tambuyzer,2 Harlinde De Schutter,2 Hans Wildiers,1,3 Francois P Duhoux,4 Evandro de Azambuja,5 Donatienne Taylor,6 Gracienne Staelens,7 Guy Orye,8 Zuzana Hlavata,9 Helga Hellemans,10 Carine De Rop,11 Patrick Neven,1,3 Freija Verdoodt2 1KU Leuven - University of Leuven, Department of Oncology, Leuven, B-3000, Belgium; 2Belgian Cancer Registry, Research Department, Brussels, Belgium; 3University Hospitals Leuven, Multidisciplinary Breast Center, Leuven, B-3000, Belgium; 4Department of Medical Oncology, King Albert II Cancer Institute, Cliniques Universitaires Saint-Luc, Brussels, Belgium; 5Institut Jules Bordet and l’Université Libre de Bruxelles (U.L.B), Brussels, Belgium; 6CHU UCL Namur, Site Sainte-Elisabeth, Namur, Belgium; 7Multidisciplinary Breast Center, General Hospital Groeninge, Kortrijk, Belgium; 8Department of Obstetrics and Gynecology, Jessa Hospital, Hasselt, Belgium; 9Department of Medical Oncology, CHR Mons-Hainaut, Mons, Hainaut, Belgium; 10Department of Obstetrics and Gynaecology, AZ Delta, Roeselaere, Belgium; 11Department of Obstetrics and Gynaecology, Imelda Hospital, Bonheiden, Belgium. Correspondence: Hava Izci, KU Leuven, Department of Oncology, Herestraat 49 Box 7003-06, Leuven, 3000, Belgium, Email hava.izci@kuleuven.be Purpose: High-quality population-based cancer recurrence data are scarce, mainly because of the complexity and cost of registration. For the first time in Belgium, we developed a tool to estimate distant recurrence after a breast cancer diagnosis at the population level, based on real-world cancer registration and administrative data. Methods: Data on distant cancer recurrence (including progression) from patients diagnosed with breast cancer between 2009 and 2014 were collected from medical files at 9 Belgian centers to serve as the gold standard for training, testing, and externally validating an algorithm. Distant recurrence was defined as the occurrence of distant metastases between 120 days and 10 years after the primary diagnosis, with follow-up until December 31, 2018. Data from the gold standard were linked to population-based data from the Belgian Cancer Registry (BCR) and administrative data sources. Potential features to detect recurrences in administrative data were defined based on expert opinion from breast oncologists, and subsequently selected using bootstrap aggregation. Based on the selected features, classification and regression tree (CART) analysis was performed to construct an algorithm for classifying patients as having a distant recurrence or not. Results: A total of 2507 patients were included, of whom 216 had a distant recurrence in the clinical data set. The algorithm showed a sensitivity of 79.5% (95% CI 68.8–87.8%), a positive predictive value (PPV) of 79.5% (95% CI 68.8–87.8%), and an accuracy of 96.7% (95% CI 95.4–97.7%). The external validation resulted in a sensitivity of 84.1% (95% CI 74.4–91.3%), a PPV of 84.1% (95% CI 74.4–91.3%), and an accuracy of 96.8% (95% CI 95.4–97.9%). Conclusion: Our algorithm detected distant breast cancer recurrences with a good overall accuracy of 96.8%, as observed in the first multi-centric external validation exercise. Keywords: machine learning, breast cancer, distant metastases, recurrences, algorithm, administrative data |
first_indexed | 2024-04-09T13:58:03Z |
format | Article |
id | doaj.art-a45b40a659244509b06f4b8491ebfcaf |
institution | Directory Open Access Journal |
issn | 1179-1349 |
language | English |
last_indexed | 2024-04-09T13:58:03Z |
publishDate | 2023-05-01 |
publisher | Dove Medical Press |
record_format | Article |
series | Clinical Epidemiology |
title | Machine Learning Algorithm to Estimate Distant Breast Cancer Recurrence at the Population Level with Administrative Data |
topic | machine learning breast cancer distant metastases recurrences algorithm administrative data |
url | https://www.dovepress.com/machine-learning-algorithm-to-estimate-distant-breast-cancer-recurrenc-peer-reviewed-fulltext-article-CLEP |
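
The description above reports sensitivity, PPV, and accuracy for the classification algorithm. As a minimal sketch of how these three measures follow from a confusion matrix, the Python below computes them from four counts. The class name, function names, and the example counts are hypothetical and chosen only for illustration; they are not taken from the study. The example also shows that sensitivity and PPV coincide whenever the number of false positives equals the number of false negatives, which is one way identical values for the two measures can arise.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from comparing predicted labels with a gold standard."""
    tp: int  # recurrence predicted, recurrence in gold standard
    fp: int  # recurrence predicted, no recurrence in gold standard
    fn: int  # no recurrence predicted, recurrence in gold standard
    tn: int  # no recurrence predicted, no recurrence in gold standard


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true recurrences that were detected: TP / (TP + FN)."""
    return c.tp / (c.tp + c.fn)


def ppv(c: ConfusionCounts) -> float:
    """Share of predicted recurrences that were real: TP / (TP + FP)."""
    return c.tp / (c.tp + c.fp)


def accuracy(c: ConfusionCounts) -> float:
    """Share of all patients classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fp + c.fn + c.tn)


# Hypothetical counts, NOT the study's data: FP equals FN here,
# so sensitivity and PPV come out identical.
example = ConfusionCounts(tp=80, fp=20, fn=20, tn=880)

if __name__ == "__main__":
    print(f"sensitivity = {sensitivity(example):.3f}")
    print(f"PPV         = {ppv(example):.3f}")
    print(f"accuracy    = {accuracy(example):.3f}")
```

Because recurrences are a small share of all patients, accuracy is dominated by the true negatives, so it can be high even when sensitivity and PPV are noticeably lower.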