Replicability analysis in genome-wide association studies via Cartesian hidden Markov models

Abstract Background: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), an association is more convincing if it can be replicated in more than one...

Bibliographic Details
Main Authors:	Pengfei Wang, Wensheng Zhu
Format:	Article
Language:	English
Published:	BMC 2019-03-01
Series:	BMC Bioinformatics
Subjects:	GWAS; Cartesian hidden Markov model; Replicability analysis
Online Access:	http://link.springer.com/article/10.1186/s12859-019-2707-7

_version_	1811339840837386240
author	Pengfei Wang Wensheng Zhu
author_facet	Pengfei Wang Wensheng Zhu
author_sort	Pengfei Wang
collection	DOAJ
description	Abstract Background: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), an association is more convincing if it can be replicated in more than one study. Since neighboring single nucleotide polymorphisms (SNPs) often exhibit high correlation, it is desirable to properly exploit the dependency information among adjacent SNPs in replicability analysis. In this paper, we propose a novel multiple testing procedure based on the Cartesian hidden Markov model (CHMM), called the repLIS procedure, for replicability analysis across two studies, which characterizes the local dependence structure among adjacent SNPs via a four-state Markov chain. Results: Theoretical results show that the repLIS procedure can control the false discovery rate (FDR) at the nominal level α and is optimal in the sense that it has the smallest false non-discovery rate (FNR) among all α-level multiple testing procedures. We carry out simulation studies to compare our repLIS procedure with existing methods, including the Benjamini-Hochberg (BH) procedure and the empirical Bayes approach called repfdr. Finally, we apply the repLIS and repfdr procedures to the replicability analysis of psychiatric disorder data sets collected by the Psychiatric Genomics Consortium (PGC) and the Wellcome Trust Case Control Consortium (WTCCC). Both the simulation studies and the real data analysis show that the repLIS procedure is valid and achieves higher efficiency than its competitors. Conclusions: In replicability analysis, our repLIS procedure controls the FDR at the pre-specified level α and can achieve higher efficiency by exploiting the dependency information among adjacent SNPs.
first_indexed	2024-04-13T18:32:33Z
format	Article
id	doaj.art-bc036cff8d2742bdb84361d4f7493793
institution	Directory of Open Access Journals
issn	1471-2105
language	English
last_indexed	2024-04-13T18:32:33Z
publishDate	2019-03-01
publisher	BMC
record_format	Article
series	BMC Bioinformatics
spelling	doaj.art-bc036cff8d2742bdb84361d4f7493793 2022-12-22T02:35:01Z eng BMC BMC Bioinformatics 1471-2105 2019-03-01 201112 10.1186/s12859-019-2707-7 Replicability analysis in genome-wide association studies via Cartesian hidden Markov models Pengfei Wang0 Wensheng Zhu1 Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University Abstract Background: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), an association is more convincing if it can be replicated in more than one study. Since neighboring single nucleotide polymorphisms (SNPs) often exhibit high correlation, it is desirable to properly exploit the dependency information among adjacent SNPs in replicability analysis. In this paper, we propose a novel multiple testing procedure based on the Cartesian hidden Markov model (CHMM), called the repLIS procedure, for replicability analysis across two studies, which characterizes the local dependence structure among adjacent SNPs via a four-state Markov chain. Results: Theoretical results show that the repLIS procedure can control the false discovery rate (FDR) at the nominal level α and is optimal in the sense that it has the smallest false non-discovery rate (FNR) among all α-level multiple testing procedures. We carry out simulation studies to compare our repLIS procedure with existing methods, including the Benjamini-Hochberg (BH) procedure and the empirical Bayes approach called repfdr. Finally, we apply the repLIS and repfdr procedures to the replicability analysis of psychiatric disorder data sets collected by the Psychiatric Genomics Consortium (PGC) and the Wellcome Trust Case Control Consortium (WTCCC). Both the simulation studies and the real data analysis show that the repLIS procedure is valid and achieves higher efficiency than its competitors. Conclusions: In replicability analysis, our repLIS procedure controls the FDR at the pre-specified level α and can achieve higher efficiency by exploiting the dependency information among adjacent SNPs. http://link.springer.com/article/10.1186/s12859-019-2707-7 GWAS Cartesian hidden Markov model Replicability analysis
spellingShingle	Pengfei Wang Wensheng Zhu Replicability analysis in genome-wide association studies via Cartesian hidden Markov models BMC Bioinformatics GWAS Cartesian hidden Markov model Replicability analysis
title	Replicability analysis in genome-wide association studies via Cartesian hidden Markov models
title_full	Replicability analysis in genome-wide association studies via Cartesian hidden Markov models
title_fullStr	Replicability analysis in genome-wide association studies via Cartesian hidden Markov models
title_full_unstemmed	Replicability analysis in genome-wide association studies via Cartesian hidden Markov models
title_short	Replicability analysis in genome-wide association studies via Cartesian hidden Markov models
title_sort	replicability analysis in genome wide association studies via cartesian hidden markov models
topic	GWAS; Cartesian hidden Markov model; Replicability analysis
url	http://link.springer.com/article/10.1186/s12859-019-2707-7
work_keys_str_mv	AT pengfeiwang replicabilityanalysisingenomewideassociationstudiesviacartesianhiddenmarkovmodels AT wenshengzhu replicabilityanalysisingenomewideassociationstudiesviacartesianhiddenmarkovmodels

Similar Items