Replicability analysis in genome-wide association studies via Cartesian hidden Markov models
Abstract. Background: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), an association is more convincing when it can be replicated in more than one...
Main Authors: | Pengfei Wang, Wensheng Zhu |
---|---|
Format: | Article |
Language: | English |
Published: | BMC 2019-03-01 |
Series: | BMC Bioinformatics |
Subjects: | GWAS, Cartesian hidden Markov model, Replicability analysis |
Online Access: | http://link.springer.com/article/10.1186/s12859-019-2707-7 |
_version_ | 1811339840837386240 |
---|---|
author | Pengfei Wang Wensheng Zhu |
author_facet | Pengfei Wang Wensheng Zhu |
author_sort | Pengfei Wang |
collection | DOAJ |
description | Abstract. Background: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), an association is more convincing when it can be replicated in more than one study. Since neighboring single nucleotide polymorphisms (SNPs) often exhibit high correlation, it is desirable to properly exploit the dependency information among adjacent SNPs in replicability analysis. In this paper, we propose a novel multiple testing procedure for replicability analysis across two studies, called the repLIS procedure, which is based on the Cartesian hidden Markov model (CHMM) and characterizes the local dependence structure among adjacent SNPs via a four-state Markov chain. Results: Theoretical results show that the repLIS procedure controls the false discovery rate (FDR) at the nominal level α and is optimal in the sense that it has the smallest false non-discovery rate (FNR) among all α-level multiple testing procedures. We carry out simulation studies to compare the repLIS procedure with existing methods, including the Benjamini-Hochberg (BH) procedure and the empirical Bayes approach repfdr. Finally, we apply the repLIS and repfdr procedures to replicability analyses of psychiatric disorder data sets collected by the Psychiatric Genomics Consortium (PGC) and the Wellcome Trust Case Control Consortium (WTCCC). Both the simulation studies and the real data analysis show that the repLIS procedure is valid and achieves higher efficiency than its competitors. Conclusions: In replicability analysis, the repLIS procedure controls the FDR at the pre-specified level α and achieves higher efficiency by exploiting the dependency information among adjacent SNPs. |
first_indexed | 2024-04-13T18:32:33Z |
format | Article |
id | doaj.art-bc036cff8d2742bdb84361d4f7493793 |
institution | Directory of Open Access Journals |
issn | 1471-2105 |
language | English |
last_indexed | 2024-04-13T18:32:33Z |
publishDate | 2019-03-01 |
publisher | BMC |
record_format | Article |
series | BMC Bioinformatics |
spelling | doaj.art-bc036cff8d2742bdb84361d4f7493793; 2022-12-22T02:35:01Z; eng; BMC; BMC Bioinformatics; 1471-2105; 2019-03-01; 20; 1; 1; 12; 10.1186/s12859-019-2707-7; Replicability analysis in genome-wide association studies via Cartesian hidden Markov models; Pengfei Wang (0); Wensheng Zhu (1); Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University; Key Laboratory for Applied Statistics of MOE, School of Mathematics and Statistics, Northeast Normal University; Abstract. Background: Replicability analysis, which aims to detect replicated signals, is attracting increasing attention in modern scientific applications. For example, in genome-wide association studies (GWAS), an association is more convincing when it can be replicated in more than one study. Since neighboring single nucleotide polymorphisms (SNPs) often exhibit high correlation, it is desirable to properly exploit the dependency information among adjacent SNPs in replicability analysis. In this paper, we propose a novel multiple testing procedure for replicability analysis across two studies, called the repLIS procedure, which is based on the Cartesian hidden Markov model (CHMM) and characterizes the local dependence structure among adjacent SNPs via a four-state Markov chain. Results: Theoretical results show that the repLIS procedure controls the false discovery rate (FDR) at the nominal level α and is optimal in the sense that it has the smallest false non-discovery rate (FNR) among all α-level multiple testing procedures. We carry out simulation studies to compare the repLIS procedure with existing methods, including the Benjamini-Hochberg (BH) procedure and the empirical Bayes approach repfdr. Finally, we apply the repLIS and repfdr procedures to replicability analyses of psychiatric disorder data sets collected by the Psychiatric Genomics Consortium (PGC) and the Wellcome Trust Case Control Consortium (WTCCC). Both the simulation studies and the real data analysis show that the repLIS procedure is valid and achieves higher efficiency than its competitors. Conclusions: In replicability analysis, the repLIS procedure controls the FDR at the pre-specified level α and achieves higher efficiency by exploiting the dependency information among adjacent SNPs.; http://link.springer.com/article/10.1186/s12859-019-2707-7; GWAS; Cartesian hidden Markov model; Replicability analysis |
spellingShingle | Pengfei Wang Wensheng Zhu Replicability analysis in genome-wide association studies via Cartesian hidden Markov models BMC Bioinformatics GWAS Cartesian hidden Markov model Replicability analysis |
title | Replicability analysis in genome-wide association studies via Cartesian hidden Markov models |
title_full | Replicability analysis in genome-wide association studies via Cartesian hidden Markov models |
title_fullStr | Replicability analysis in genome-wide association studies via Cartesian hidden Markov models |
title_full_unstemmed | Replicability analysis in genome-wide association studies via Cartesian hidden Markov models |
title_short | Replicability analysis in genome-wide association studies via Cartesian hidden Markov models |
title_sort | replicability analysis in genome wide association studies via cartesian hidden markov models |
topic | GWAS Cartesian hidden Markov model Replicability analysis |
url | http://link.springer.com/article/10.1186/s12859-019-2707-7 |
work_keys_str_mv | AT pengfeiwang replicabilityanalysisingenomewideassociationstudiesviacartesianhiddenmarkovmodels AT wenshengzhu replicabilityanalysisingenomewideassociationstudiesviacartesianhiddenmarkovmodels |
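
The abstract describes a four-state Markov chain over two studies and a procedure that controls the FDR at level α. The record does not give the algorithm itself, so the following is only a minimal sketch under stated assumptions: the four states are taken to be the Cartesian product of per-study association indicators, the test statistic is assumed to be the posterior probability that a SNP is not associated in both studies, and rejection uses the generic step-up rule common to local-index-of-significance procedures. The names `STATES`, `composite_null_probability`, and `step_up_reject` are hypothetical, and the posterior probabilities are assumed to come from an already fitted model.

```python
import numpy as np

# Assumed joint states for two studies: (state in study 1, state in study 2),
# where 0 means no association and 1 means association.
STATES = [(0, 0), (0, 1), (1, 0), (1, 1)]


def composite_null_probability(posteriors):
    """Return P(SNP is NOT associated in both studies | data) for each SNP.

    posteriors: array of shape (m, 4); column j holds the posterior
    probability of STATES[j]. These values are assumed to be supplied by a
    fitted model (hypothetical input, not computed here).
    """
    posteriors = np.asarray(posteriors, dtype=float)
    return 1.0 - posteriors[:, STATES.index((1, 1))]


def step_up_reject(null_prob, alpha):
    """Generic step-up rule on posterior null probabilities.

    Sort the statistics in ascending order, find the largest k whose first k
    sorted values have a mean of at most alpha, and reject those k SNPs.
    Returns a boolean mask in the original SNP order.
    """
    null_prob = np.asarray(null_prob, dtype=float)
    order = np.argsort(null_prob, kind="stable")
    running_mean = np.cumsum(null_prob[order]) / np.arange(1, null_prob.size + 1)
    passing = np.nonzero(running_mean <= alpha)[0]
    reject = np.zeros(null_prob.size, dtype=bool)
    if passing.size:
        reject[order[: passing[-1] + 1]] = True
    return reject


# Illustrative (made-up) posteriors for four SNPs; each row sums to 1.
example_posteriors = np.array([
    [0.01, 0.01, 0.01, 0.97],
    [0.70, 0.10, 0.10, 0.10],
    [0.02, 0.03, 0.05, 0.90],
    [0.30, 0.30, 0.30, 0.10],
])
example_mask = step_up_reject(composite_null_probability(example_posteriors), alpha=0.1)
```

With these made-up inputs and α = 0.1, the two SNPs whose joint-association posteriors are 0.97 and 0.90 are rejected, because their composite-null probabilities average 0.065; adding a third SNP would push the running mean above α.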