Posterior contraction rate of sparse latent feature models with application to proteomics

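The Indian buffet process (IBP) and phylogenetic Indian buffet process (pIBP) can be used as prior models to infer latent features in a dataset. The theoretical properties of these models are under-explored, however, especially in high-dimensional settings. In this paper, we show that under a mild sparsity condition, the posterior distribution of the latent feature matrix, generated via IBP or pIBP priors, converges to the true latent feature matrix asymptotically. We derive the posterior convergence rate, referred to as the contraction rate. We show that the convergence results remain valid even when the dimensionality of the latent feature matrix increases with the sample size, thereby making posterior inference valid in high-dimensional settings. We demonstrate the theoretical results using computer simulation, in which the parallel-tempering Markov chain Monte Carlo method is applied to overcome computational hurdles. The practical utility of the derived properties is demonstrated by inferring the latent features in a reverse phase protein array (RPPA) dataset under the IBP prior model.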


Bibliographic Details
Main Authors:	Tong Li, Tianjian Zhou, Kam-Wah Tsui, Lin Wei, Yuan Ji
Format:	Article
Language:	English
Published:	Taylor & Francis Group 2022-01-01
Series:	Statistical Theory and Related Fields
Subjects:	high dimension; Indian buffet process; latent feature; Markov chain Monte Carlo; posterior convergence; reverse phase protein arrays
Online Access:	http://dx.doi.org/10.1080/24754269.2021.1974664

_version_	1827809220382162944
author	Tong Li, Tianjian Zhou, Kam-Wah Tsui, Lin Wei, Yuan Ji
collection	DOAJ
description	The Indian buffet process (IBP) and phylogenetic Indian buffet process (pIBP) can be used as prior models to infer latent features in a dataset. The theoretical properties of these models are under-explored, however, especially in high-dimensional settings. In this paper, we show that under a mild sparsity condition, the posterior distribution of the latent feature matrix, generated via IBP or pIBP priors, converges to the true latent feature matrix asymptotically. We derive the posterior convergence rate, referred to as the contraction rate. We show that the convergence results remain valid even when the dimensionality of the latent feature matrix increases with the sample size, thereby making posterior inference valid in high-dimensional settings. We demonstrate the theoretical results using computer simulation, in which the parallel-tempering Markov chain Monte Carlo method is applied to overcome computational hurdles. The practical utility of the derived properties is demonstrated by inferring the latent features in a reverse phase protein array (RPPA) dataset under the IBP prior model.
first_indexed	2024-03-11T22:38:29Z
format	Article
id	doaj.art-0c95c9b856d3488292e996311217c0b3
institution	Directory of Open Access Journals
issn	2475-4269 2475-4277
language	English
last_indexed	2024-03-11T22:38:29Z
publishDate	2022-01-01
publisher	Taylor & Francis Group
record_format	Article
series	Statistical Theory and Related Fields
spelling	doaj.art-0c95c9b856d3488292e996311217c0b3 | 2023-09-22T09:19:46Z | eng | Taylor & Francis Group | Statistical Theory and Related Fields | 2475-4269 | 2475-4277 | 2022-01-01 | 6 | 1 | 29 | 39 | 10.1080/24754269.2021.1974664 | 1974664 | Posterior contraction rate of sparse latent feature models with application to proteomics | Tong Li [0] | Tianjian Zhou [1] | Kam-Wah Tsui [2] | Lin Wei [3] | Yuan Ji [4] | Columbia University | Colorado State University | University of Wisconsin–Madison | NorthShore University HealthSystem | University of Chicago | The Indian buffet process (IBP) and phylogenetic Indian buffet process (pIBP) can be used as prior models to infer latent features in a dataset. The theoretical properties of these models are under-explored, however, especially in high-dimensional settings. In this paper, we show that under a mild sparsity condition, the posterior distribution of the latent feature matrix, generated via IBP or pIBP priors, converges to the true latent feature matrix asymptotically. We derive the posterior convergence rate, referred to as the contraction rate. We show that the convergence results remain valid even when the dimensionality of the latent feature matrix increases with the sample size, thereby making posterior inference valid in high-dimensional settings. We demonstrate the theoretical results using computer simulation, in which the parallel-tempering Markov chain Monte Carlo method is applied to overcome computational hurdles. The practical utility of the derived properties is demonstrated by inferring the latent features in a reverse phase protein array (RPPA) dataset under the IBP prior model. | http://dx.doi.org/10.1080/24754269.2021.1974664 | high dimension | Indian buffet process | latent feature | Markov chain Monte Carlo | posterior convergence | reverse phase protein arrays
title	Posterior contraction rate of sparse latent feature models with application to proteomics
topic	high dimension; Indian buffet process; latent feature; Markov chain Monte Carlo; posterior convergence; reverse phase protein arrays
url	http://dx.doi.org/10.1080/24754269.2021.1974664


Similar Items