Determining Dimensionality with Dichotomous Variables: A Monte Carlo Simulation Study and Applications to Missing Data in Longitudinal Research

Dichotomous data correspond with various types of commonly encountered data, e.g., positive/negative, case/control, missing/observed, in many fields, including medicine, health, and social sciences. Despite their ubiquity, criteria for determining dimensionality from dichotomous variables are not ye...


Bibliographic Details
Main Authors: Ting Dai, Adam Davey
Format: Article
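Language: English
Published: MDPI AG 2023-03-01
Series: Mathematics
Subjects: dimensionality determination; binary variable; dichotomous variable; principal component analysis; parallel analysis; factor analysis
Online Access: https://www.mdpi.com/2227-7390/11/6/1411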
collection DOAJ
description Dichotomous data correspond with various types of commonly encountered data, e.g., positive/negative, case/control, missing/observed, in many fields, including medicine, health, and social sciences. Despite their ubiquity, criteria for determining dimensionality from dichotomous variables are not yet established. We conducted a large-scale simulation (Study 1) to evaluate four criteria—Kaiser, empirical Kaiser, parallel analysis, and profile likelihood—to determine the dimensionality of dichotomous data across combinations of correlation matrices (Pearson r or tetrachoric ρ) and analysis methods (principal component analysis or exploratory factor analysis), and combinations of study characteristics: sample sizes (100, 250, and 1000), variable splits (10%/90%, 25%/75%, and 50%/50%), dimensions (1, 3, 5, and 10), and items per dimension (3, 5, and 10) with 1000 replications per condition. Parallel analysis performed best, recovering dimensionality in 87.9% of replications when using principal component analysis with Pearson correlations. Guidance for selecting criteria is provided. In Study 2, we applied this dimensionality reduction approach to two different longitudinal data sets where missing data posed difficulty for multivariate data analysis. The applications of this approach to longitudinal data suggest that the exploration of resulting missing data meta-patterns is useful in practice.
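The best-performing combination named in the description (parallel analysis on principal components of Pearson correlations) can be sketched as follows. This is a minimal illustration and not the authors' code: the function names `simulate_dichotomous` and `parallel_analysis`, the 0.8 loading, the 95th-percentile threshold, and the choice to draw reference data as independent Bernoulli variables with the observed item proportions are all assumptions made for this example.

```python
import numpy as np


def simulate_dichotomous(n, n_dims, items_per_dim, loading=0.8, seed=0):
    """Hypothetical generator: threshold continuous items driven by
    independent latent factors at zero, giving a 50%/50% split."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n, n_dims))
    # Each factor drives `items_per_dim` consecutive columns.
    latent = np.repeat(factors, items_per_dim, axis=1) * loading
    latent += rng.standard_normal(latent.shape) * np.sqrt(1 - loading**2)
    return (latent > 0).astype(float)


def sorted_eigenvalues(data):
    """Eigenvalues of the Pearson correlation matrix, largest first."""
    corr = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]


def parallel_analysis(data, n_sim=200, quantile=0.95, seed=0):
    """Count leading observed eigenvalues that exceed the chosen quantile
    of eigenvalues from random data of the same shape.

    Assumes no column of `data` is constant (the correlation matrix would
    be undefined otherwise).
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = sorted_eigenvalues(data)
    props = data.mean(axis=0)  # keep each item's observed split
    simulated = np.empty((n_sim, p))
    for s in range(n_sim):
        random_data = (rng.random((n, p)) < props).astype(float)
        simulated[s] = sorted_eigenvalues(random_data)
    threshold = np.quantile(simulated, quantile, axis=0)
    exceeds = observed > threshold
    # Retain components up to the first one that fails the comparison.
    return p if exceeds.all() else int(np.argmin(exceeds))


if __name__ == "__main__":
    data = simulate_dichotomous(n=1000, n_dims=3, items_per_dim=5)
    print("retained components:", parallel_analysis(data))
```

The reference data here preserve each item's marginal proportion, so skewed splits such as 10%/90% are compared against equally skewed random items; other variants of parallel analysis use permuted or normally distributed reference data instead.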
id doaj.art-b2205ec6fde5493d921cc7705d8d6ddf
institution Directory Open Access Journal
issn 2227-7390
DOI: 10.3390/math11061411
Citation: Mathematics (MDPI AG, ISSN 2227-7390), vol. 11, no. 6, article 1411, 2023-03-01
Affiliations: Ting Dai, Department of Educational Psychology, University of Illinois Chicago, Chicago, IL 60607, USA; Adam Davey, Department of Behavioral Health and Nutrition, University of Delaware, Newark, DE 19716, USA
topic dimensionality determination
binary variable
dichotomous variable
principal component analysis
parallel analysis
factor analysis