Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data

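Biological data are frequently nonlinear, heteroscedastic and conditionally dependent, and often researchers deal with missing data. To account for characteristics common in biological data in one algorithm, we developed the mixed cumulative probit (MCP), a novel latent trait model that is a formal generalization of the cumulative probit model usually used in transition analysis. Specifically, the MCP accommodates heteroscedasticity, mixtures of ordinal and continuous variables, missing values, conditional dependence and alternative specifications of the mean response and noise response. Cross-validation selects the best model parameters (mean response and the noise response for simple models, as well as conditional dependence for multivariate models), and the Kullback–Leibler divergence evaluates information gain during posterior inference to quantify mis-specified models (conditionally dependent versus conditionally independent). Two continuous and four ordinal skeletal and dental variables collected from 1296 individuals (aged birth to 22 years) from the Subadult Virtual Anthropology Database are used to introduce and demonstrate the algorithm. In addition to describing the features of the MCP, we provide material to help fit novel datasets using the MCP. The flexible, general formulation with model selection provides a process to robustly identify the modelling assumptions that are best suited for the data at hand.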
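
To make the terms in the abstract concrete, the following is a minimal Python sketch of a heteroscedastic cumulative probit for a single ordinal variable, followed by a grid posterior over age and its Kullback–Leibler divergence from the prior. It is not the authors' implementation. The power-law mean, the linear noise function, the threshold values, the parameter values and all function names are illustrative assumptions that do not come from this record.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def ordinal_probs(x, thresholds, rho=0.5, s=0.3, kappa=0.1):
    """P(stage = m | age x) for m = 0..len(thresholds).

    Assumed (hypothetical) forms:
      mean response  g(x)     = x ** rho
      noise response sigma(x) = s * (1 + kappa * x)   # spread grows with age
    A latent trait g(x) + noise is cut into ordered stages by the thresholds.
    """
    mean = x ** rho
    sd = s * (1.0 + kappa * x)
    cuts = [-math.inf] + list(thresholds) + [math.inf]
    cdf = [norm_cdf((c - mean) / sd) for c in cuts]
    return [cdf[m + 1] - cdf[m] for m in range(len(cuts) - 1)]

def age_posterior(stage, ages, prior, thresholds):
    """Posterior over a grid of ages given one observed ordinal stage."""
    like = [ordinal_probs(x, thresholds)[stage] for x in ages]
    unnorm = [l * p for l, p in zip(like, prior)]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions on the same grid."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Illustrative setup: ages 0 to 22 years in half-year steps, uniform prior.
ages = [0.5 * i for i in range(45)]
prior = [1.0 / len(ages)] * len(ages)
thresholds = [1.5, 2.5, 3.5]  # hypothetical cut points on the latent scale

post = age_posterior(2, ages, prior, thresholds)
print("information gain (nats):", kl_divergence(post, prior))
```

The divergence printed at the end measures how far one observation moves the age distribution away from the prior, which is the sense of "information gain" used in the abstract. Setting `kappa=0` recovers a homoscedastic cumulative probit in this sketch.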


Bibliographic Details
Main Authors: Kyra E. Stull, Elaine Y. Chu, Louise K. Corron, Michael H. Price
Format: Article
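Language: English
Published: The Royal Society 2023-03-01
Series: Royal Society Open Science
Subjects: Bayesian statistics; information theory; heteroscedasticity; conditional dependence; age estimation; subadult
Online Access: https://royalsocietypublishing.org/doi/10.1098/rsos.220963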
author Kyra E. Stull
Elaine Y. Chu
Louise K. Corron
Michael H. Price
collection DOAJ
description Biological data are frequently nonlinear, heteroscedastic and conditionally dependent, and often researchers deal with missing data. To account for characteristics common in biological data in one algorithm, we developed the mixed cumulative probit (MCP), a novel latent trait model that is a formal generalization of the cumulative probit model usually used in transition analysis. Specifically, the MCP accommodates heteroscedasticity, mixtures of ordinal and continuous variables, missing values, conditional dependence and alternative specifications of the mean response and noise response. Cross-validation selects the best model parameters (mean response and the noise response for simple models, as well as conditional dependence for multivariate models), and the Kullback–Leibler divergence evaluates information gain during posterior inference to quantify mis-specified models (conditionally dependent versus conditionally independent). Two continuous and four ordinal skeletal and dental variables collected from 1296 individuals (aged birth to 22 years) from the Subadult Virtual Anthropology Database are used to introduce and demonstrate the algorithm. In addition to describing the features of the MCP, we provide material to help fit novel datasets using the MCP. The flexible, general formulation with model selection provides a process to robustly identify the modelling assumptions that are best suited for the data at hand.
first_indexed 2024-04-09T21:10:40Z
format Article
id doaj.art-3be6093f9ca54a3ca76be89ba2e264e8
institution Directory Open Access Journal
issn 2054-5703
language English
last_indexed 2024-04-09T21:10:40Z
publishDate 2023-03-01
publisher The Royal Society
series Royal Society Open Science
spelling doaj.art-3be6093f9ca54a3ca76be89ba2e264e8 2023-03-28T20:17:14Z eng
The Royal Society
Royal Society Open Science, ISSN 2054-5703
2023-03-01, volume 10, issue 3
DOI: 10.1098/rsos.220963
Kyra E. Stull, Elaine Y. Chu, Louise K. Corron: Department of Anthropology, University of Nevada, 1664 North Virginia Street, Stop 0096, Reno, NV 89557, USA
Michael H. Price: Complexity Nexus LLC, Pittsburgh, PA, USA
title Mixed cumulative probit: a multivariate generalization of transition analysis that accommodates variation in the shape, spread and structure of data
topic Bayesian statistics
information theory
heteroscedasticity
conditional dependence
age estimation
subadult
url https://royalsocietypublishing.org/doi/10.1098/rsos.220963