Feature Selection is Critical for 2-Year Prognosis in Advanced Stage High Grade Serous Ovarian Cancer by Using Machine Learning
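Main Authors: | Alexandros Laios, Angeliki Katsenou, Yong Sheng Tan, Racheal Johnson, Mohamed Otify, Angelika Kaufmann, Sarika Munot, Amudha Thangavelu, Richard Hutson, Tim Broadhead, Georgios Theophilou, David Nugent, Diederick De Jong |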
---|---|
Format: | Article |
Language: | English |
Published: | SAGE Publishing, 2021-10-01 |
Series: | Cancer Control |
Online Access: | https://doi.org/10.1177/10732748211044678 |
_version_ | 1818740041905602560 |
---|---|
author | Alexandros Laios MD, PhD; Angeliki Katsenou PhD; Yong Sheng Tan MBBS; Racheal Johnson MBChB, MRCOG; Mohamed Otify MBBCh, MRCOG, MSc; Angelika Kaufmann MBBS, MRCOG, PhD; Sarika Munot MBBS; Amudha Thangavelu MBChB, MRCOG, MD; Richard Hutson MBBS, MRCOG, MD; Tim Broadhead MBChB, MRCOG; Georgios Theophilou MBBS, MRCOG, MD; David Nugent MBChB, MRCOG, PhD; Diederick De Jong MBBCh, PhD, MSc |
author_sort | Alexandros Laios MD, PhD |
collection | DOAJ |
description | Introduction: Accurate prediction of patient prognosis can be especially useful for selecting the best treatment protocols. Machine Learning can serve this purpose by making predictions based upon generalizable clinical patterns embedded within learning datasets. We designed a study to support feature selection for the 2-year prognostic period and compared the performance of several Machine Learning prediction algorithms for accurate 2-year prognosis estimation in advanced-stage high grade serous ovarian cancer (HGSOC) patients. Methods: The prognosis estimation was formulated as a binary classification problem. The dataset was split into training and test cohorts with repeated random sampling until there was no significant difference (p = 0.20) between the two cohorts. Ten-fold cross-validation was applied. Various state-of-the-art supervised classifiers were used. For feature selection, in addition to an exhaustive search for the best combination of features, we used the chi-square test of independence and the MRMR method. Results: Two hundred nine patients were identified. The model's mean prediction accuracy reached 73%. We demonstrated that Support-Vector-Machine and Ensemble Subspace Discriminant algorithms outperformed Logistic Regression in accuracy indices. The probability of achieving a cancer-free state was maximised with a combination of primary cytoreduction, good performance status and maximal surgical effort (AUC 0.63). Standard chemotherapy, performance status, tumour load and residual disease were consistently predictive of the mid-term overall survival (AUC 0.63–0.66). The model recall and precision were greater than 80%. Conclusion: Machine Learning appears to be promising for accurate prognosis estimation. Appropriate feature selection is required when building an HGSOC model for 2-year prognosis prediction. We provide evidence as to what combination of prognosticators leads to the largest impact on the HGSOC 2-year prognosis. |
first_indexed | 2024-12-18T01:34:26Z |
format | Article |
id | doaj.art-ccc639714bb1408c8eb74fdc4be154b5 |
institution | Directory Open Access Journal |
issn | 1073-2748 |
language | English |
last_indexed | 2024-12-18T01:34:26Z |
publishDate | 2021-10-01 |
publisher | SAGE Publishing |
record_format | Article |
series | Cancer Control |
title | Feature Selection is Critical for 2-Year Prognosis in Advanced Stage High Grade Serous Ovarian Cancer by Using Machine Learning |
url | https://doi.org/10.1177/10732748211044678 |
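
The description names the chi-square test of independence as one of the feature selection methods for a binary outcome. The sketch below illustrates the general idea only and is not the authors' code: it computes the Pearson chi-square statistic between each categorical feature and a binary label, then ranks features by that statistic. The function names, the feature names, and the eight-row toy data are all hypothetical and invented for illustration; they are not taken from the study.

```python
from collections import Counter


def chi_square_statistic(feature, outcome):
    """Pearson chi-square statistic for independence of two categorical sequences."""
    if len(feature) != len(outcome):
        raise ValueError("feature and outcome must have the same length")
    n = len(feature)
    observed = Counter(zip(feature, outcome))  # counts per (feature, outcome) cell
    row_totals = Counter(feature)              # marginal counts of feature values
    col_totals = Counter(outcome)              # marginal counts of outcome values
    statistic = 0.0
    for row_value, row_total in row_totals.items():
        for col_value, col_total in col_totals.items():
            # Expected cell count if feature and outcome were independent.
            expected = row_total * col_total / n
            statistic += (observed[(row_value, col_value)] - expected) ** 2 / expected
    return statistic


def rank_features(columns, outcome):
    """Return (name, statistic) pairs sorted from most to least associated."""
    scores = {name: chi_square_statistic(values, outcome)
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data (not from the study): 1 = outcome reached, 0 = not reached.
toy_outcome = [1, 1, 1, 1, 0, 0, 0, 0]
toy_columns = {
    "toy_feature_a": [1, 1, 1, 1, 0, 0, 0, 0],  # matches the outcome exactly
    "toy_feature_b": [1, 0, 1, 0, 1, 0, 1, 0],  # unrelated to the outcome
    "toy_feature_c": [1, 1, 1, 0, 1, 0, 0, 0],  # partly related to the outcome
}

ranking = rank_features(toy_columns, toy_outcome)
for name, score in ranking:
    print(f"{name}: chi-square = {score:.2f}")
```

A larger statistic indicates a stronger departure from independence, so features at the top of the ranking are the first candidates to keep. In practice the statistic would be converted to a p-value using the appropriate degrees of freedom, and small expected cell counts would call for caution.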