Prediction of clinical trial enrollment rates

Clinical trials represent a critical milestone of translational and clinical sciences. However, poor recruitment to clinical trials has been a long-standing problem affecting institutions all over the world. One way to reduce the cost incurred by insufficient enrollment is to minimize initiating tri...

Bibliographic Details
Main Authors: Cameron Bieganek, Constantin Aliferis, Sisi Ma
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2022-01-01
Series: PLoS ONE
Online Access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8870517/?tool=EBI
Collection: DOAJ
Description: Clinical trials represent a critical milestone of translational and clinical sciences. However, poor recruitment to clinical trials has been a long-standing problem affecting institutions all over the world. One way to reduce the cost incurred by insufficient enrollment is to avoid initiating trials that are most likely to fall short of their enrollment goal. Hence, the ability to predict, before a trial starts, whether it will meet its enrollment goal is highly beneficial. In the current study, we leveraged a data set extracted from ClinicalTrials.gov that consists of 46,724 U.S.-based clinical trials from 1990 to 2020. We constructed 4,636 candidate predictors from data collected by ClinicalTrials.gov and external sources, and predicted enrollment rates using various state-of-the-art machine learning methods. Using a nested time series cross-validation design, our models achieved good predictive performance that is generalizable to future data and stable over time. Moreover, information content analysis revealed study-design-related features to be the most informative feature type regarding enrollment. Models built with study-design-related features perform only marginally worse than models built with all features (AUC = 0.76 ± 0.02 vs. AUC = 0.78 ± 0.03). The results presented can form the basis for data-driven decision support systems to assess whether proposed clinical trials would likely meet their enrollment goal.
ID: doaj.art-7b5edd38916c4334afcd89ea07db766e
Institution: Directory of Open Access Journals
ISSN: 1932-6203
Volume: 17, Issue: 2
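
Illustrative note on the evaluation design: the description mentions a nested time series cross-validation design but does not spell out how the folds were built. The sketch below is only one plausible reading, not the authors' procedure. It assumes trials are grouped by calendar year, that outer folds use an expanding training window, and that inner folds (for model selection) are drawn only from each outer training window. The function names and the window sizes (16, 5, 8, 4) are hypothetical and chosen purely for illustration.

```python
from typing import Iterator, List, Sequence, Tuple

# One fold: (training years, test years).
Fold = Tuple[List[int], List[int]]
NestedFold = Tuple[List[int], List[int], List[Fold]]


def time_series_folds(
    years: Sequence[int], min_train: int, test_size: int
) -> Iterator[Fold]:
    """Yield expanding-window folds; every test year follows every training year."""
    ordered = sorted(set(years))
    split = min_train
    while split + test_size <= len(ordered):
        yield ordered[:split], ordered[split:split + test_size]
        split += test_size


def nested_time_series_folds(
    years: Sequence[int],
    min_train: int,
    test_size: int,
    inner_min_train: int,
    inner_test_size: int,
) -> Iterator[NestedFold]:
    """Yield (outer_train, outer_test, inner_folds).

    Inner folds are built only from outer_train, so model selection
    never sees data from the outer test years.
    """
    for outer_train, outer_test in time_series_folds(years, min_train, test_size):
        inner = list(
            time_series_folds(outer_train, inner_min_train, inner_test_size)
        )
        yield outer_train, outer_test, inner


if __name__ == "__main__":
    # Hypothetical settings over the 1990-2020 span named in the description.
    for outer_train, outer_test, inner in nested_time_series_folds(
        range(1990, 2021), 16, 5, 8, 4
    ):
        print(
            f"train {outer_train[0]}-{outer_train[-1]}, "
            f"test {outer_test[0]}-{outer_test[-1]}, "
            f"{len(inner)} inner folds"
        )
```

Splitting by time rather than at random keeps each test set strictly later than its training data, which is what makes a performance estimate meaningful for trials proposed in the future.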