Evaluation of machine learning methods for covariate data imputation in pharmacometrics


Bibliographic Details
Main Authors: Dominic Stefan Bräm, Uri Nahum, Andrew Atkinson, Gilbert Koch, Marc Pfister
Format: Article
Language: English
Published: Wiley 2022-12-01
Series: CPT: Pharmacometrics & Systems Pharmacology
Online Access: https://doi.org/10.1002/psp4.12874
Collection: DOAJ
Description: Missing data create challenges in clinical research because they lead to loss of statistical power and potentially to biased results. Missing covariate data must be handled with suitable approaches to prepare datasets for pharmacometric analyses, such as population pharmacokinetic and pharmacodynamic analyses. To this end, various statistical methods have been widely adopted. Here, we introduce two machine‐learning (ML) methods capable of imputing missing covariate data in a pharmacometric setting. Based on a previously published pharmacometric analysis, we simulated multiple missing data scenarios. We compared the performance of four established statistical methods (listwise deletion, mean imputation, standard multiple imputation [hereafter “Norm”], and predictive mean matching [PMM]) and two ML‐based methods (random forest [RF] and artificial neural networks [ANNs]) to handle missing covariate data in a statistically plausible manner. The investigated ML‐based methods can be used to impute missing covariate data in a pharmacometric setting. Both traditional imputation approaches and ML‐based methods perform well in the scenarios studied, with some restrictions for individual methods. The three methods exhibiting the best performance in terms of least bias for the investigated scenarios are the statistical method PMM and the two ML‐based methods RF and ANN. ML‐based approaches achieved results comparable to those of the best‐performing established method, PMM. Furthermore, ML methods provide added flexibility when encountering more complex nonlinear relationships, especially when associated parameters are suitably tuned to enhance predictive performance.
ISSN: 2163-8306
Volume: 11
Issue: 12
Pages: 1638-1648
Affiliation (all authors): Pediatric Pharmacology and Pharmacometrics, University Children's Hospital Basel (UKBB), University of Basel, Basel, Switzerland
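
Illustration: three of the established approaches named in the description (listwise deletion, mean imputation, and predictive mean matching) can be sketched in plain Python. This is only a minimal sketch under stated assumptions and not the authors' implementation: the function names, the synthetic data, the single predictor, the missing-completely-at-random mechanism, and the donor count `k` are all hypothetical. The PMM variant shown is simplified (a single imputation from one fitted regression, without the parameter draws used in multiple imputation), and the RF and ANN methods are not shown.

```python
import random
import statistics


def simulate(n=200, missing_fraction=0.3, seed=1):
    """Synthetic data (assumption): covariate y depends linearly on predictor x.

    Returns the complete rows and a copy in which y is masked completely at random.
    """
    rng = random.Random(seed)
    full = []
    for _ in range(n):
        x = rng.uniform(1.0, 10.0)
        y = 2.0 + 3.0 * x + rng.gauss(0.0, 1.0)
        full.append((x, y))
    masked = [(x, None if rng.random() < missing_fraction else y) for x, y in full]
    return full, masked


def listwise_deletion(rows):
    """Drop every record whose covariate is missing (None)."""
    return [(x, y) for x, y in rows if y is not None]


def mean_imputation(rows):
    """Replace each missing covariate with the mean of the observed values."""
    observed_mean = statistics.fmean(y for _, y in rows if y is not None)
    return [(x, observed_mean if y is None else y) for x, y in rows]


def pmm_imputation(rows, k=5, rng=None):
    """Simplified predictive mean matching with one predictor.

    Fit y ~ x on the observed records, then, for each missing y, pick at random
    one of the k observed records whose predicted value is closest to the
    prediction for the incomplete record, and copy that donor's observed y.
    """
    rng = rng or random.Random(0)
    observed = listwise_deletion(rows)
    x_bar = statistics.fmean(x for x, _ in observed)
    y_bar = statistics.fmean(y for _, y in observed)
    sxx = sum((x - x_bar) ** 2 for x, _ in observed)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in observed) / sxx
    intercept = y_bar - slope * x_bar

    def predict(x):
        return intercept + slope * x

    imputed = []
    for x, y in rows:
        if y is None:
            target = predict(x)
            donors = sorted(observed, key=lambda d: abs(predict(d[0]) - target))[:k]
            y = rng.choice(donors)[1]
        imputed.append((x, y))
    return imputed


def mean_abs_error(full, imputed, masked):
    """Mean absolute error over the records that were originally missing."""
    errors = [abs(f[1] - i[1]) for f, i, m in zip(full, imputed, masked) if m[1] is None]
    return statistics.fmean(errors)


if __name__ == "__main__":
    full, masked = simulate()
    print("records kept by listwise deletion:", len(listwise_deletion(masked)))
    print("MAE, mean imputation:", mean_abs_error(full, mean_imputation(masked), masked))
    print("MAE, simplified PMM:", mean_abs_error(full, pmm_imputation(masked), masked))
```

Because PMM copies values from observed donors, every imputed value is one that actually occurs in the data, whereas mean imputation assigns the same value to every incomplete record.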