Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data
Biological data are frequently nonlinear, heteroscedastic and conditionally dependent, and often researchers deal with missing data. To account for characteristics common in biological data in one algorithm, we developed the mixed cumulative probit (MCP), a novel latent trait model that is a formal...
Main Authors: | Kyra E. Stull, Elaine Y. Chu, Louise K. Corron, Michael H. Price |
---|---|
Format: | Article |
Language: | English |
Published: | The Royal Society, 2023-03-01 |
Series: | Royal Society Open Science |
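Subjects: | Bayesian statistics; information theory; heteroscedasticity; conditional dependence; age estimation; subadult |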
Online Access: | https://royalsocietypublishing.org/doi/10.1098/rsos.220963 |
_version_ | 1797858273645297664 |
---|---|
author | Kyra E. Stull; Elaine Y. Chu; Louise K. Corron; Michael H. Price |
author_facet | Kyra E. Stull; Elaine Y. Chu; Louise K. Corron; Michael H. Price |
author_sort | Kyra E. Stull |
collection | DOAJ |
description | Biological data are frequently nonlinear, heteroscedastic and conditionally dependent, and often researchers deal with missing data. To account for characteristics common in biological data in one algorithm, we developed the mixed cumulative probit (MCP), a novel latent trait model that is a formal generalization of the cumulative probit model usually used in transition analysis. Specifically, the MCP accommodates heteroscedasticity, mixtures of ordinal and continuous variables, missing values, conditional dependence and alternative specifications of the mean response and noise response. Cross-validation selects the best model parameters (mean response and the noise response for simple models, as well as conditional dependence for multivariate models), and the Kullback–Leibler divergence evaluates information gain during posterior inference to quantify mis-specified models (conditionally dependent versus conditionally independent). Two continuous and four ordinal skeletal and dental variables collected from 1296 individuals (aged birth to 22 years) from the Subadult Virtual Anthropology Database are used to introduce and demonstrate the algorithm. In addition to describing the features of the MCP, we provide material to help fit novel datasets using the MCP. The flexible, general formulation with model selection provides a process to robustly identify the modelling assumptions that are best suited for the data at hand. |
first_indexed | 2024-04-09T21:10:40Z |
format | Article |
id | doaj.art-3be6093f9ca54a3ca76be89ba2e264e8 |
institution | Directory Open Access Journal |
issn | 2054-5703 |
language | English |
last_indexed | 2024-04-09T21:10:40Z |
publishDate | 2023-03-01 |
publisher | The Royal Society |
record_format | Article |
series | Royal Society Open Science |
spelling | doaj.art-3be6093f9ca54a3ca76be89ba2e264e8; 2023-03-28T20:17:14Z; eng; The Royal Society; Royal Society Open Science; 2054-5703; 2023-03-01; 10; 3; 10.1098/rsos.220963; Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data; Kyra E. Stull (0), Elaine Y. Chu (1), Louise K. Corron (2), Michael H. Price (3); Department of Anthropology, University of Nevada, 1664 North Virginia Street, Stop 0096, Reno, NV 89557, USA; Department of Anthropology, University of Nevada, 1664 North Virginia Street, Stop 0096, Reno, NV 89557, USA; Department of Anthropology, University of Nevada, 1664 North Virginia Street, Stop 0096, Reno, NV 89557, USA; Complexity Nexus LLC, Pittsburg, PA, USA; Biological data are frequently nonlinear, heteroscedastic and conditionally dependent, and often researchers deal with missing data. To account for characteristics common in biological data in one algorithm, we developed the mixed cumulative probit (MCP), a novel latent trait model that is a formal generalization of the cumulative probit model usually used in transition analysis. Specifically, the MCP accommodates heteroscedasticity, mixtures of ordinal and continuous variables, missing values, conditional dependence and alternative specifications of the mean response and noise response. Cross-validation selects the best model parameters (mean response and the noise response for simple models, as well as conditional dependence for multivariate models), and the Kullback–Leibler divergence evaluates information gain during posterior inference to quantify mis-specified models (conditionally dependent versus conditionally independent). Two continuous and four ordinal skeletal and dental variables collected from 1296 individuals (aged birth to 22 years) from the Subadult Virtual Anthropology Database are used to introduce and demonstrate the algorithm. In addition to describing the features of the MCP, we provide material to help fit novel datasets using the MCP. The flexible, general formulation with model selection provides a process to robustly identify the modelling assumptions that are best suited for the data at hand.; https://royalsocietypublishing.org/doi/10.1098/rsos.220963; Bayesian statistics; information theory; heteroscedasticity; conditional dependence; age estimation; subadult |
spellingShingle | Kyra E. Stull; Elaine Y. Chu; Louise K. Corron; Michael H. Price; Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data; Royal Society Open Science; Bayesian statistics; information theory; heteroscedasticity; conditional dependence; age estimation; subadult |
title | Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data |
title_full | Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data |
title_fullStr | Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data |
title_full_unstemmed | Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data |
title_short | Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data |
title_sort | mixed cumulative probit a multivariate generalization of transition analysis that accommodates variation in the shape spread and structure of data |
topic | Bayesian statistics; information theory; heteroscedasticity; conditional dependence; age estimation; subadult |
url | https://royalsocietypublishing.org/doi/10.1098/rsos.220963 |
work_keys_str_mv | AT kyraestull mixedcumulativeprobitamultivariategeneralizationoftransitionanalysisthataccommodatesvariationintheshapespreadandstructureofdata AT elaineychu mixedcumulativeprobitamultivariategeneralizationoftransitionanalysisthataccommodatesvariationintheshapespreadandstructureofdata AT louisekcorron mixedcumulativeprobitamultivariategeneralizationoftransitionanalysisthataccommodatesvariationintheshapespreadandstructureofdata AT michaelhprice mixedcumulativeprobitamultivariategeneralizationoftransitionanalysisthataccommodatesvariationintheshapespreadandstructureofdata |
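
The description field of this record mentions a cumulative probit model with a mean response and a noise response, and the use of the Kullback–Leibler divergence to evaluate information gain during posterior inference. The following is a minimal, univariate sketch of those two ideas and is not the authors' implementation. The power-law mean, the linear noise function, the threshold values, the uniform prior over a discrete age grid, and all function names are assumptions chosen for illustration only.

```python
import math


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordinal_probs(x, thresholds, mean_fn, noise_fn):
    """Stage probabilities of a cumulative probit at age x.

    The latent trait is assumed normal with mean mean_fn(x) and standard
    deviation noise_fn(x); letting the noise depend on x is what makes
    this sketch heteroscedastic.
    """
    mu, sd = mean_fn(x), noise_fn(x)
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [norm_cdf((c - mu) / sd) for c in cuts]
    return [cdf[m + 1] - cdf[m] for m in range(len(cuts) - 1)]


def posterior_on_grid(stage, grid, prior, thresholds, mean_fn, noise_fn):
    """Discrete posterior over age given one observed ordinal stage."""
    like = [ordinal_probs(x, thresholds, mean_fn, noise_fn)[stage] for x in grid]
    unnorm = [l * p for l, p in zip(like, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def kl_divergence(p, q):
    """KL divergence D(p || q) in nats for two discrete distributions."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


# Hypothetical parameterization (illustrative values, not fitted to any data).
def mean_fn(x):
    return x ** 0.5  # assumed power-law mean response


def noise_fn(x):
    return 0.3 * (1.0 + 0.05 * x)  # assumed noise that grows linearly with age


thresholds = [1.5, 3.0, 4.0]            # three cut points, so four ordinal stages
grid = [i * 0.1 for i in range(221)]    # ages 0 to 22 years in 0.1-year steps
prior = [1.0 / len(grid)] * len(grid)   # assumed uniform prior over the grid

post = posterior_on_grid(2, grid, prior, thresholds, mean_fn, noise_fn)
info_gain = kl_divergence(post, prior)  # information gained from the observation
print(f"information gain: {info_gain:.3f} nats")
```

A larger divergence between posterior and prior means the observed stage narrowed the age estimate more. Comparing such values between a conditionally dependent and a conditionally independent model would require a multivariate likelihood, which this sketch does not attempt.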