A Survey of SNP Data Analysis
Every person differs from every other person regarding their physical appearance, susceptibility to disease, response to medications, and so on. However, 99.9 percent of human DNA is the same. As such, differences in human genomes are very worthy of study. Single-Nucleotide Polymorphisms (SNPs) are...
Main Authors: | Xiaojun Ding, Xuan Guo |
---|---|
Format: | Article |
Language: | English |
Published: | Tsinghua University Press, 2018-09-01 |
Series: | Big Data Mining and Analytics |
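Subjects: | snp interactions; snp combinations; gwas; case-control study; disease association analysis; cross-phenotype association studies |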
Online Access: | https://www.sciopen.com/article/10.26599/BDMA.2018.9020015 |
_version_ | 1811250416989503488 |
---|---|
author | Xiaojun Ding Xuan Guo |
author_facet | Xiaojun Ding Xuan Guo |
author_sort | Xiaojun Ding |
collection | DOAJ |
description | Every person differs from every other person regarding their physical appearance, susceptibility to disease, response to medications, and so on. However, 99.9 percent of human DNA is the same. As such, differences in human genomes are very worthy of study. Single-Nucleotide Polymorphisms (SNPs) are the simplest form and most common source of genetic polymorphism. SNPs have been used to successfully identify defective genes that cause Mendelian diseases. However, most common human diseases are complex and are caused by multiple SNPs. Each SNP explains only a small fraction of genetic causes. Experiments on individual SNPs may reveal their non-detectable effects on complex diseases. Pathogenesis is a complicated topic, and it is difficult to correctly predict multiple SNPs. As such, the analysis of SNP data is a critical task in the study of genetic diseases. In this paper, we divide the methods for genome-wide SNP data analysis into two categories: single-trait Genome-Wide Association Studies (GWAS) in which pathology is mined from data of a single phenotype, and multiple-trait GWAS which identifies cross-phenotype associations. For single-trait GWAS, we review methods ranging from the simple to the complex, including TEAM, BOOST, AntEpiSeeker, SNPRuler, EDCF, HiSeeker, ORF, MLR-tagging, MSCD, and MIC. For multiple-trait GWAS, we describe methods in terms of their employed regression models, dimension-reduction methods, and meta-analysis methods. We also list the advantages and disadvantages of these methods. Finally, we discuss the future directions of SNP data analysis for genome-wide association. |
first_indexed | 2024-04-12T16:04:23Z |
format | Article |
id | doaj.art-57deddb47a3a4e45be5ec65d5c0d3cb7 |
institution | Directory of Open Access Journals |
issn | 2096-0654 |
language | English |
last_indexed | 2024-04-12T16:04:23Z |
publishDate | 2018-09-01 |
publisher | Tsinghua University Press |
record_format | Article |
series | Big Data Mining and Analytics |
spelling | doaj.art-57deddb47a3a4e45be5ec65d5c0d3cb7; 2022-12-22T03:26:07Z; eng; Tsinghua University Press; Big Data Mining and Analytics; 2096-0654; 2018-09-01; vol. 1, no. 3, pp. 173-190; 10.26599/BDMA.2018.9020015; A Survey of SNP Data Analysis; Xiaojun Ding (School of Computer Science and Engineering, Yulin Normal University, Yulin 537000, and School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou 341000, China); Xuan Guo (Department of Computer Science and Engineering, University of North Texas, Denton, TX 76203-5017, USA); https://www.sciopen.com/article/10.26599/BDMA.2018.9020015; snp interactions; snp combinations; gwas; case-control study; disease association analysis; cross-phenotype association studies |
title | A Survey of SNP Data Analysis |
topic | snp interactions snp combinations gwas case-control study disease association analysis cross-phenotype association studies |
url | https://www.sciopen.com/article/10.26599/BDMA.2018.9020015 |