Automated Model Selection of the Two-Layer Mixtures of Gaussian Process Functional Regressions for Curve Clustering and Prediction

Bibliographic Details
Main Authors: Chengxin Gong, Jinwen Ma
Format: Article
Language: English
Published: MDPI AG 2023-06-01
Series: Mathematics
Online Access: https://www.mdpi.com/2227-7390/11/12/2592
Description
Summary: As a statistical learning model for curve clustering analysis, the two-layer mixtures of Gaussian process functional regressions (TMGPFR) model was developed to fit sample curves drawn from a number of independent information sources or stochastic processes. Since the sample curves from a given stochastic process naturally form a curve cluster, model selection for TMGPFRs, i.e., choosing the number of mixtures of Gaussian process functional regressions (MGPFRs) in the upper layer, corresponds to discovering the number and structure of clusters in the curve data. This is challenging because conventional model selection criteria, such as BIC and cross-validation, do not yield stable results in practice, even at the heavy cost of repeated computation. In this paper, we improve the original TMGPFR model and propose a Bayesian Ying-Yang (BYY) annealing learning algorithm that learns the parameters of the improved model with automated model selection. Experimental results on both synthetic and real-world datasets demonstrate that the proposed algorithm can select the correct model automatically during parameter learning.
ISSN: 2227-7390