Bayesian Nonparametric Mixture Estimation for Time-Indexed Functional Data in R

We present growfunctions, an R package that offers Bayesian nonparametric estimation models for the analysis of dependent, noisy time series data indexed by a collection of domains. This data structure arises from combining periodically published government survey statistics, such as are reported in the Current...

Bibliographic Details
Main Author: Terrance D. Savitsky
Format: Article
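Language: English
Published: Foundation for Open Access Statistics 2016-08-01
Series: Journal of Statistical Software
Subjects: Gaussian process; Gaussian Markov random field; Dirichlet process; Bayesian hierarchical models; time series; functional data; R; C++
Online Access: https://www.jstatsoft.org/index.php/jss/article/view/2800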
_version_ 1828436010818600960
author Terrance D. Savitsky
collection DOAJ
description We present growfunctions, an R package that offers Bayesian nonparametric estimation models for the analysis of dependent, noisy time series data indexed by a collection of domains. This data structure arises from combining periodically published government survey statistics, such as those reported in the Current Population Survey (CPS). The CPS publishes monthly, by-state estimates of employment levels, where each state expresses a noisy time series. Published state-level estimates from the CPS are composed from household survey responses in a model-free manner and express high levels of volatility due to insufficient sample sizes. Existing software solutions borrow information over a modeled time-based dependence to extract a denoised time series for each domain. These solutions, however, ignore the dependence among the domains, which may be additionally leveraged to improve estimation efficiency. The growfunctions package offers two fully nonparametric mixture models that simultaneously estimate both a time-indexed and a domain-indexed dependence structure for a collection of time series: (1) A Gaussian process (GP) construction, which is parameterized through the covariance matrix, estimates a latent function for each domain. The covariance parameters of the latent functions are indexed by domain under a Dirichlet process prior that permits estimation of the dependence among functions across the domains. (2) An intrinsic Gaussian Markov random field prior construction provides an alternative to the GP that expresses different computation and estimation properties. In addition to performing denoised estimation of latent functions from published domain estimates, growfunctions allows estimation of collections of functions for observation units (e.g., households), rather than aggregated domains, by accounting for an informative sampling design under which the probabilities for inclusion of observation units are related to the response variable. growfunctions includes plot functions that allow visual assessment of the fit performance and dependence structure of the estimated functions. Computational efficiency is achieved by performing the sampling for estimation functions using compiled C++.
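The idea of denoising one domain's time series under an intrinsic Gaussian Markov random field prior can be sketched in a few lines. The Python sketch below is an illustration only and is not the growfunctions API: the function names (rw1_precision_bands, solve_tridiagonal, igmrf_smooth, roughness) are hypothetical, the prior is assumed to be a first-order random walk, the smoothing weight lam is fixed by hand rather than estimated, and neither the Dirichlet process mixture across domains nor any posterior sampling is attempted. Under those assumptions, with y = f + noise, the posterior mean of f solves (I + lam * Q) f = y, where Q is the tridiagonal random-walk structure matrix.

```python
import math
import random


def rw1_precision_bands(n):
    """Diagonal and off-diagonal of the first-order random-walk structure
    matrix Q = D'D, where D takes first differences of n time points."""
    diag = [2.0] * n
    diag[0] = diag[-1] = 1.0
    off = [-1.0] * (n - 1)
    return diag, off


def solve_tridiagonal(diag, off, rhs):
    """Solve a symmetric tridiagonal system (n >= 2) with the Thomas algorithm."""
    n = len(rhs)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = off[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off[i - 1] * c[i - 1]
        if i < n - 1:
            c[i] = off[i] / denom
        d[i] = (rhs[i] - off[i - 1] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def igmrf_smooth(y, lam):
    """Posterior mean of the latent function for one series:
    solves (I + lam * Q) f = y, with lam = noise variance * prior precision."""
    diag, off = rw1_precision_bands(len(y))
    a_diag = [1.0 + lam * q for q in diag]
    a_off = [lam * q for q in off]
    return solve_tridiagonal(a_diag, a_off, list(y))


def roughness(f):
    """Sum of squared first differences, i.e. the quadratic form f'Qf."""
    return sum((b - a) ** 2 for a, b in zip(f, f[1:]))


if __name__ == "__main__":
    # Hypothetical noisy monthly series for a single domain (simulated data).
    random.seed(1)
    truth = [math.sin(2 * math.pi * t / 24) for t in range(48)]
    noisy = [v + random.gauss(0.0, 0.3) for v in truth]
    smooth = igmrf_smooth(noisy, lam=5.0)
    print("roughness before:", round(roughness(noisy), 3))
    print("roughness after: ", round(roughness(smooth), 3))
```

Because Q annihilates constant vectors (this is what makes the prior intrinsic), the smoothed series keeps the same total as the input while its roughness cannot exceed that of the input. Larger values of lam pull the estimate toward a flatter curve.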
first_indexed 2024-12-10T19:19:15Z
format Article
id doaj.art-150209287e934b1f86df0077751f51e9
institution Directory Open Access Journal
issn 1548-7660
language English
last_indexed 2024-12-10T19:19:15Z
publishDate 2016-08-01
publisher Foundation for Open Access Statistics
record_format Article
series Journal of Statistical Software
spelling doaj.art-150209287e934b1f86df0077751f51e9 2022-12-22T01:36:32Z eng Foundation for Open Access Statistics Journal of Statistical Software 1548-7660 2016-08-01 721134 10.18637/jss.v072.i02 1029 https://www.jstatsoft.org/index.php/jss/article/view/2800
title Bayesian Nonparametric Mixture Estimation for Time-Indexed Functional Data in R
title_sort bayesian nonparametric mixture estimation for time indexed functional data in r
topic Gaussian process
Gaussian Markov random field
Dirichlet process
Bayesian hierarchical models
time series
functional data
R
C++
url https://www.jstatsoft.org/index.php/jss/article/view/2800
work_keys_str_mv AT terrancedsavitsky bayesiannonparametricmixtureestimationfortimeindexedfunctionaldatainr