Replicability analysis in genome-wide association studies via Cartesian hidden Markov models

Abstract. Background: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), it would be convincing to detect an association that can be replicated in more than one...

Bibliographic Details
Main Authors: Pengfei Wang, Wensheng Zhu
Format: Article
Language: English
Published: BMC 2019-03-01
Series: BMC Bioinformatics
Subjects: GWAS; Cartesian hidden Markov model; Replicability analysis
Online Access: http://link.springer.com/article/10.1186/s12859-019-2707-7
author Pengfei Wang
Wensheng Zhu
collection DOAJ
description Abstract. Background: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), it would be convincing to detect an association that can be replicated in more than one study. Since neighboring single nucleotide polymorphisms (SNPs) often exhibit high correlation, it is desirable to properly exploit the dependency information among adjacent SNPs in replicability analysis. In this paper, we propose a novel multiple testing procedure based on the Cartesian hidden Markov model (CHMM), called the repLIS procedure, for replicability analysis across two studies, which can characterize the local dependence structure among adjacent SNPs via a four-state Markov chain. Results: Theoretical results show that the repLIS procedure controls the false discovery rate (FDR) at the nominal level α and is optimal in the sense that it has the smallest false non-discovery rate (FNR) among all α-level multiple testing procedures. We carry out simulation studies to compare our repLIS procedure with existing methods, including the Benjamini-Hochberg (BH) procedure and the empirical Bayes approach called repfdr. Finally, we apply our repLIS procedure and the repfdr procedure to replicability analyses of psychiatric disorder data sets collected by the Psychiatric Genomics Consortium (PGC) and the Wellcome Trust Case Control Consortium (WTCCC). Both the simulation studies and the real data analysis show that the repLIS procedure is valid and achieves higher efficiency than its competitors. Conclusions: In replicability analysis, our repLIS procedure controls the FDR at the pre-specified level α and can achieve greater efficiency by exploiting the dependency information among adjacent SNPs.
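Two ideas named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' repLIS implementation, whose details the abstract does not give. The first part assumes that each study assigns every SNP a hidden binary state (0 = no association, 1 = association), so that the joint state across two studies takes four values; this reading of "four-state" and the names `JOINT_STATES` and `is_replicated` are assumptions made for illustration. The second part is the standard Benjamini-Hochberg step-up procedure, one of the baseline methods the abstract mentions; the function name `benjamini_hochberg` is likewise hypothetical.

```python
from itertools import product

# Assumption for illustration: one hidden binary state per study and SNP.
# The joint state over two studies is the Cartesian product of the two
# binary state spaces, which gives four states.
JOINT_STATES = list(product((0, 1), repeat=2))


def is_replicated(state):
    """A signal counts as replicated only if both studies are in state 1."""
    return state == (1, 1)


def benjamini_hochberg(p_values, alpha):
    """Standard BH step-up procedure; returns one rejection flag per p-value."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= k * alpha / m.
    k = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k = rank
    # Reject the k hypotheses with the smallest p-values.
    rejected = [False] * m
    for i in order[:k]:
        rejected[i] = True
    return rejected


if __name__ == "__main__":
    print(JOINT_STATES)
    print([s for s in JOINT_STATES if is_replicated(s)])
    print(benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.2], alpha=0.05))
```

The BH procedure treats the p-values as an unordered set, whereas the abstract's point is that a Markov chain over the joint states can exploit dependence between adjacent SNPs; the sketch only shows the baseline and the state space, not that dependence modeling.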
first_indexed 2024-04-13T18:32:33Z
format Article
id doaj.art-bc036cff8d2742bdb84361d4f7493793
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-04-13T18:32:33Z
publishDate 2019-03-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-bc036cff8d2742bdb84361d4f7493793; 2022-12-22T02:35:01Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2019-03-01; 20; 1; 1; 12; 10.1186/s12859-019-2707-7; Replicability analysis in genome-wide association studies via Cartesian hidden Markov models; Pengfei Wang (0); Wensheng Zhu (1); Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University; Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University; http://link.springer.com/article/10.1186/s12859-019-2707-7; GWAS; Cartesian hidden Markov model; Replicability analysis
title Replicability analysis in genome-wide association studies via Cartesian hidden Markov models
topic GWAS
Cartesian hidden Markov model
Replicability analysis
url http://link.springer.com/article/10.1186/s12859-019-2707-7