Reliable and fast estimation of recombination rates by convergence diagnosis and parallel Markov Chain Monte Carlo

Genetic recombination is an essential event during the process of meiosis resulting in an exchange of segments between paired chromosomes. Estimating recombination rate is crucial for understanding evolution. Experimental methods are normally difficult and limited to small scale estimations. Thus st...


Bibliographic Details
Main Authors: Guo, Jing; Jain, Ritika; Yang, Peng; Fan, Rui; Kwoh, Chee Keong; Zheng, Jie
Other Authors: School of Computer Engineering
Format: Journal Article
Language: English
Published: 2013
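Subjects: DRNTU::Engineering::Computer science and engineering::Theory of computation::Analysis of algorithms and problem complexity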
Online Access: https://hdl.handle.net/10356/80080
http://hdl.handle.net/10220/17663
author Guo, Jing
Jain, Ritika
Yang, Peng
Fan, Rui
Kwoh, Chee Keong
Zheng, Jie
author2 School of Computer Engineering
collection NTU
description Genetic recombination is an essential event in meiosis that results in the exchange of segments between paired chromosomes. Estimating the recombination rate is crucial for understanding evolution. Experimental methods are usually difficult and limited to small-scale estimation, so statistical methods that use population genetic data are important for large-scale analysis. LDhat is a widely used statistical method that predicts recombination rates with a reversible-jump MCMC (rjMCMC) algorithm. Because the rjMCMC scheme is complex, LDhat may take a long time to produce results for large SNP data sets. In addition, the original program requires the rjMCMC parameters, which directly affect the results, to be set manually. To address these issues, we designed an improved algorithm based on LDhat that uses MCMC convergence diagnostics to choose parameter values automatically and to monitor mixing. Parallel computation was then employed to further accelerate the new program. The new algorithms were tested on ten samples from the HapMap phase 2 data sets. Compared with the previous code, they produced nearly identical outputs while running significantly faster, which indicates that they are more efficient and reliable for estimating recombination rates. The stand-alone package is freely available for download at the link below.
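The description does not say which convergence diagnostic the authors use or how their chains are parallelized. Purely as an illustration of the general idea (run several chains concurrently, then let a diagnostic decide when to stop instead of fixing the run length by hand), the sketch below applies the Gelman-Rubin potential scale reduction factor to a toy random-walk Metropolis sampler. It is not rjMCMC and not LDhat or CPLDhat code; every function name, the 1.1 threshold, the half-chain burn-in, and the doubling schedule are assumptions made for this example. Threads are used only to keep the sketch self-contained; CPU-bound Python work would normally use processes.

```python
import math
import statistics
from concurrent.futures import ThreadPoolExecutor


def run_chain(seed, n_steps, start):
    """Toy random-walk Metropolis sampler targeting a standard normal."""
    import random
    rng = random.Random(seed)
    x = start
    samples = []
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, 1.0)
        # Log acceptance ratio for an N(0, 1) target.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x = proposal
        samples.append(x)
    return samples


def gelman_rubin(chains):
    """Potential scale reduction factor for two or more equal-length chains."""
    n = len(chains[0])
    means = [statistics.fmean(c) for c in chains]
    within = statistics.fmean([statistics.variance(c) for c in chains])
    between = n * statistics.variance(means)
    var_hat = (n - 1) / n * within + between / n
    return math.sqrt(var_hat / within)


def sample_until_converged(n_chains=4, n_steps=200, max_steps=20000,
                           threshold=1.1):
    """Double the run length until the diagnostic falls below the threshold.

    Requires n_chains >= 2. Returns (kept samples, diagnostic, run length).
    """
    # Overdispersed starting points, spread evenly over [-10, 10].
    starts = [-10.0 + 20.0 * i / (n_chains - 1) for i in range(n_chains)]
    while True:
        with ThreadPoolExecutor(max_workers=n_chains) as pool:
            chains = list(pool.map(run_chain, range(n_chains),
                                   [n_steps] * n_chains, starts))
        # Discard the first half of each chain as burn-in (an assumption).
        kept = [c[n_steps // 2:] for c in chains]
        r_hat = gelman_rubin(kept)
        if r_hat < threshold or n_steps >= max_steps:
            return kept, r_hat, n_steps
        n_steps *= 2


if __name__ == "__main__":
    kept, r_hat, n_steps = sample_until_converged()
    print(f"stopped at {n_steps} steps per chain, R-hat = {r_hat:.3f}")
```

A diagnostic value close to 1 suggests that the chains have mixed; a larger value means the between-chain spread still dominates and the run should continue.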
first_indexed 2024-10-01T03:14:12Z
format Journal Article
id ntu-10356/80080
institution Nanyang Technological University
language English
last_indexed 2024-10-01T03:14:12Z
publishDate 2013
record_format dspace
spelling ntu-10356/80080 2020-05-28T07:18:48Z
MOE (Min. of Education, S’pore)
Accepted version
2013-11-15T04:31:57Z 2019-12-06T13:40:18Z 2013-11-15T04:31:57Z 2019-12-06T13:40:18Z 2013 2013
Journal Article
Guo, J., Jain, R., Yang, P., Fan, R., Kwoh, C. K., & Zheng, J. (2013). Reliable and Fast Estimation of Recombination Rates by Convergence Diagnosis and Parallel Markov Chain Monte Carlo. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 99, 1.
1545-5963
https://hdl.handle.net/10356/80080
http://hdl.handle.net/10220/17663
10.1109/TCBB.2013.133
en
IEEE/ACM transactions on computational biology and bioinformatics
© 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Published version of this article is available at http://dx.doi.org/10.1109/TCBB.2013.133. CPLDhat is an open source Java program.
application/pdf application/octet-stream application/pdf application/pdf
title Reliable and fast estimation of recombination rates by convergence diagnosis and parallel Markov Chain Monte Carlo
topic DRNTU::Engineering::Computer science and engineering::Theory of computation::Analysis of algorithms and problem complexity
url https://hdl.handle.net/10356/80080
http://hdl.handle.net/10220/17663