Preferring Box-Cox transformation, instead of log transformation to convert skewed distribution of outcomes to normal in medical research

Background: When dealing with a skewed outcome, researchers often use log-transformation to convert the data to a normal distribution and then apply commonly used statistical tests such as the t-test, linear regression, etc. However, log-transformed data are not always normal. In such situations, Box-Cox transf...

Bibliographic Details
Main Authors: S. Marimuthu, Thenmozhi Mani, Thambu David Sudarsanam, Sebastian George, L. Jeyaseelan
Format: Article
Language: English
Published: Elsevier 2022-05-01
Series: Clinical Epidemiology and Global Health
Subjects:
Online Access: http://www.sciencedirect.com/science/article/pii/S2213398422000859
_version_ 1811239496135475200
author S. Marimuthu
Thenmozhi Mani
Thambu David Sudarsanam
Sebastian George
L. Jeyaseelan
collection DOAJ
description Background: When dealing with a skewed outcome, researchers often use log-transformation to convert the data to a normal distribution and then apply commonly used statistical tests such as the t-test, linear regression, etc. However, log-transformed data are not always normal. In such situations, the Box-Cox transformation (BCT) can be used to transform skewed data to normality. A problem arises, however, when researchers want to predict the outcome on the original scale. Therefore, the aim of this paper is to demonstrate the use of BCT for a skewed outcome and to predict the outcome on the original scale using a regression method. Materials and method: The Cost of Pyelonephritis in Type-2 Diabetes (COPID) study data were used to demonstrate the BCT and the back-transformation method. This study was conducted among patients admitted to the general medical wards of a tertiary care hospital in south India. The BCT was applied to total cost to normalize its distribution. Multiple linear regression was fitted and the predicted values were back-transformed to the original scale. Results: The estimated lambda was −0.36. After BCT, total cost was approximately normal (p-value = 0.621). The residual plots suggested that the errors are normally distributed and that the variance is constant. The median (IQR) of the observed total cost was 57694 (42405, 98621), whereas that of the predicted total cost was 58317 (44270, 95375). Conclusion: When the data are skewed, log-transformation is not appropriate in all scenarios. However, BCT will ensure a normal distribution after transformation, and the outcome can be back-transformed to the original scale given the covariates.
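The transform, regress, back-transform workflow summarized in the description can be sketched roughly as follows. This is a minimal illustration, not the authors' code: only the value lambda = −0.36 comes from the abstract. The function names, the single covariate (length of stay) and the cost figures are hypothetical, a one-covariate least-squares fit stands in for the multiple linear regression, and the plain inverse transform used here is not the approximate estimator named in the record's keywords.

```python
import math

LAMBDA = -0.36  # estimated lambda reported in the abstract


def boxcox(y, lam):
    """Box-Cox transform of a strictly positive value y."""
    if y <= 0:
        raise ValueError("Box-Cox requires y > 0")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


def inv_boxcox(z, lam):
    """Map a transformed value z back to the original scale."""
    if lam == 0:
        return math.exp(z)
    base = lam * z + 1.0
    if base <= 0:
        raise ValueError("z is outside the range of the transform")
    return base ** (1.0 / lam)


def fit_simple_ols(x, z):
    """Least-squares intercept and slope for z = a + b * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    slope = sxz / sxx
    intercept = mean_z - slope * mean_x
    return intercept, slope


# Hypothetical data (not from the COPID study):
# length of stay in days as the covariate, total cost as the skewed outcome.
days = [3, 5, 7, 10, 14, 21]
cost = [30000.0, 42000.0, 57000.0, 80000.0, 120000.0, 250000.0]

# 1. Transform the outcome, 2. fit on the transformed scale,
# 3. predict and map the predictions back to the original scale.
z = [boxcox(c, LAMBDA) for c in cost]
intercept, slope = fit_simple_ols(days, z)
predicted_cost = [inv_boxcox(intercept + slope * d, LAMBDA) for d in days]
```

Because the inverse of a nonlinear transform is applied directly to the fitted mean, the back-transformed values in this sketch are closer to conditional medians than to conditional means on the original scale; a bias-adjusted back-transformation would be needed to estimate means.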
first_indexed 2024-04-12T13:01:54Z
format Article
id doaj.art-d4355158bee94340b6d89060c1c55da6
institution Directory Open Access Journal
issn 2213-3984
language English
last_indexed 2024-04-12T13:01:54Z
publishDate 2022-05-01
publisher Elsevier
record_format Article
series Clinical Epidemiology and Global Health
spelling doaj.art-d4355158bee94340b6d89060c1c55da6 2022-12-22T03:32:09Z eng Elsevier Clinical Epidemiology and Global Health 2213-3984 2022-05-01 15 101043
Preferring Box-Cox transformation, instead of log transformation to convert skewed distribution of outcomes to normal in medical research
S. Marimuthu (0), Thenmozhi Mani (1), Thambu David Sudarsanam (2), Sebastian George (3), L. Jeyaseelan (4)
(0) Department of Biostatistics, Christian Medical College, Vellore, Tamil Nadu, India
(1) Department of Biostatistics, Christian Medical College, Vellore, Tamil Nadu, India
(2) Department of Medicine, and Clinical Epidemiology Unit, Christian Medical College, Vellore, Tamil Nadu, India
(3) Department of Statistical Sciences, Mangattuparamba Campus, Kannur University, Kerala, 670567, India
(4) College of Medicine, MBR University for Medical and Health Sciences, Dubai, 505055, United Arab Emirates (corresponding author)
http://www.sciencedirect.com/science/article/pii/S2213398422000859
An approximate estimator; Back transformation; Box-Cox transformation; Log-transformation; Skewed-outcome data
title Preferring Box-Cox transformation, instead of log transformation to convert skewed distribution of outcomes to normal in medical research
topic An approximate estimator
Back transformation
Box-Cox transformation
Log-transformation
Skewed-outcome data
url http://www.sciencedirect.com/science/article/pii/S2213398422000859