Effects of influential points and sample size on the selection and replicability of multivariable fractional polynomial models

Bibliographic Details
Main Authors:	Willi Sauerbrei, Edwin Kipruto, James Balmford
Format:	Article
Language:	English
Published:	BMC 2023-04-01
Series:	Diagnostic and Prognostic Research
Subjects:	Continuous variable, Fractional polynomial, Influential point, Model building, Sample size, Simulated data
Online Access:	https://doi.org/10.1186/s41512-023-00145-1

author	Willi Sauerbrei, Edwin Kipruto, James Balmford
collection	DOAJ
description	Abstract
Background: The multivariable fractional polynomial (MFP) approach combines variable selection using backward elimination with a function selection procedure (FSP) for fractional polynomial (FP) functions. It is a relatively simple approach which can be easily understood without advanced training in statistical modeling. For continuous variables, a closed test procedure is used to decide between no effect, linear, FP1, or FP2 functions. Influential points (IPs) and small sample sizes can both have a strong impact on a selected function and MFP model.
Methods: We used simulated data with six continuous and four categorical predictors to illustrate approaches which can help to identify IPs with an influence on function selection and the MFP model. Approaches use leave-one-out or leave-two-out and two related techniques for a multivariable assessment. In eight subsamples, we also investigated the effects of sample size and model replicability, the latter by using three non-overlapping subsamples with the same sample size. For better illustration, a structured profile was used to provide an overview of all analyses conducted.
Results: The results showed that one or more IPs can drive the functions and models selected. In addition, with a small sample size, MFP was not able to detect some non-linear functions and the selected model differed substantially from the true underlying model. However, when the sample size was relatively large and regression diagnostics were carefully conducted, MFP selected functions or models that were similar to the underlying true model.
Conclusions: For smaller sample sizes, IPs and low power are important reasons why the MFP approach may not be able to identify underlying functional relationships for continuous variables, and selected models might differ substantially from the true model. However, for larger sample sizes, a carefully conducted MFP analysis is often a suitable way to select a multivariable regression model which includes continuous variables. In such a case, MFP can be the preferred approach to derive a multivariable descriptive model.
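Illustrative note (not part of the catalog record and not the authors' code): the abstract's idea of checking whether single observations drive the selected function can be sketched for the simplest case. The sketch below assumes one positive continuous predictor and a Gaussian outcome, uses the conventional FP power set {-2, -1, -0.5, 0, 0.5, 1, 2, 3} with power 0 read as log(x), and simply picks the FP1 power with the smallest residual sum of squares. It does not implement the closed test procedure, FP2 functions, or backward elimination, and all function names are hypothetical.

```python
import numpy as np

# Conventional FP power set; power 0 denotes log(x) by convention.
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)


def fp_transform(x, p):
    """Return the FP1 transform of x (x must be strictly positive)."""
    x = np.asarray(x, dtype=float)
    return np.log(x) if p == 0 else x ** p


def rss_fp1(x, y, p):
    """Residual sum of squares of the least-squares fit y ~ b0 + b1 * x^(p)."""
    X = np.column_stack([np.ones(len(x)), fp_transform(x, p)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)


def best_fp1_power(x, y):
    """Pick the FP1 power with the smallest RSS (a simplification:
    no significance testing against linear or null models)."""
    return min(FP_POWERS, key=lambda p: rss_fp1(x, y, p))


def leave_one_out_powers(x, y):
    """Re-select the FP1 power n times, each time omitting one observation.
    Entry i is the power chosen when observation i is left out."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    keep = np.ones(len(x), dtype=bool)
    selected = []
    for i in range(len(x)):
        keep[i] = False
        selected.append(best_fp1_power(x[keep], y[keep]))
        keep[i] = True
    return selected


if __name__ == "__main__":
    # Hypothetical noise-free data following a log relationship.
    x = np.linspace(1.0, 10.0, 50)
    y = 2.0 + 3.0 * np.log(x)
    full = best_fp1_power(x, y)
    loo = leave_one_out_powers(x, y)
    changed = [i for i, p in enumerate(loo) if p != full]
    print("power on full data:", full)
    print("observations whose removal changes the power:", changed)
```

An observation whose removal changes the selected power is a candidate influential point in the sense used in the abstract; with real, noisy data such candidates would need follow-up with proper regression diagnostics.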
first_indexed	2024-04-09T16:20:36Z
format	Article
id	doaj.art-95dfafb53054422485a7efb4799a79fe
institution	Directory Open Access Journal
issn	2397-7523
language	English
last_indexed	2024-04-09T16:20:36Z
publishDate	2023-04-01
publisher	BMC
record_format	Article
series	Diagnostic and Prognostic Research
spelling	doaj.art-95dfafb53054422485a7efb4799a79fe; 2023-04-23T11:31:25Z; eng; BMC; Diagnostic and Prognostic Research; 2397-7523; 2023-04-01; 71117; 10.1186/s41512-023-00145-1; Effects of influential points and sample size on the selection and replicability of multivariable fractional polynomial models; Willi Sauerbrei (0), Edwin Kipruto (1), James Balmford (2); Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg; Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg; Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center, University of Freiburg; https://doi.org/10.1186/s41512-023-00145-1; Continuous variable, Fractional polynomial, Influential point, Model building, Sample size, Simulated data
title	Effects of influential points and sample size on the selection and replicability of multivariable fractional polynomial models
topic	Continuous variable, Fractional polynomial, Influential point, Model building, Sample size, Simulated data
url	https://doi.org/10.1186/s41512-023-00145-1

Similar Items