EpiMOGA: An Epistasis Detection Method Based on a Multi-Objective Genetic Algorithm

In genome-wide association studies, detecting high-order epistasis is important for analyzing the occurrence of complex human diseases and explaining missing heritability. However, there are various challenges in the actual high-order epistasis detection process due to the large amount of data, “sma...

Bibliographic Details
Main Authors: Yuanyuan Chen, Fengjiao Xu, Cong Pian, Mingmin Xu, Lingpeng Kong, Jingya Fang, Zutan Li, Liangyun Zhang
Format: Article
Language: English
Published: MDPI AG 2021-01-01
Series: Genes
Subjects:
Online Access: https://www.mdpi.com/2073-4425/12/2/191
collection DOAJ
description In genome-wide association studies, detecting high-order epistasis is important for understanding how complex human diseases arise and for explaining missing heritability. In practice, however, high-order epistasis detection is hampered by the large volume of data, the “small sample size problem”, and the diversity of disease models. This paper proposes a multi-objective genetic algorithm (EpiMOGA) for detecting epistasis among single nucleotide polymorphisms (SNPs). Two objectives guide the search of the genetic algorithm: the K2 score, based on the Bayesian network criterion, and the Gini index, which measures diversity in the binary classification problem. Experiments were performed on 26 simulated datasets generated from different disease models and on a real Alzheimer’s disease dataset. The results indicated that EpiMOGA outperformed other related and competitive methods in both detection efficiency and accuracy, especially on small-sample-size datasets, and that its performance remained stable across datasets of different disease models. In addition, EpiMOGA identified a number of SNP loci and second-order epistasis associated with Alzheimer’s disease, indicating that the method is capable of identifying high-order epistasis from genome-wide data and can be applied to the study of complex diseases.
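The two objectives named in the description can be illustrated with a minimal sketch. It assumes the logarithmic K2 score commonly used in epistasis detection and a sample-weighted Gini impurity over genotype combinations; the record does not give the paper's exact formulas, so both definitions, along with the function names `genotype_class_counts`, `k2_score`, `gini_score`, and `dominates`, are illustrative assumptions rather than the authors' implementation. Lower values are treated as better for both objectives.

```python
from math import lgamma


def genotype_class_counts(genotypes, labels):
    """Count controls (label 0) and cases (label 1) per genotype combination."""
    counts = {}
    for combo, label in zip(genotypes, labels):
        cell = counts.setdefault(tuple(combo), [0, 0])
        cell[label] += 1
    return counts


def k2_score(counts):
    """Log-form K2 score (assumed definition): for each genotype combination,
    log((n + 1)!) - log(controls!) - log(cases!), summed over combinations."""
    score = 0.0
    for controls, cases in counts.values():
        n = controls + cases
        # lgamma(k + 1) equals log(k!)
        score += lgamma(n + 2) - lgamma(controls + 1) - lgamma(cases + 1)
    return score


def gini_score(counts):
    """Sample-weighted Gini impurity of case/control status (assumed definition)."""
    total = sum(controls + cases for controls, cases in counts.values())
    score = 0.0
    for controls, cases in counts.values():
        n = controls + cases
        p_case = cases / n
        impurity = 1.0 - p_case ** 2 - (1.0 - p_case) ** 2
        score += (n / total) * impurity
    return score


def dominates(a, b):
    """Pareto dominance for minimization: a is no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


# Toy data: four samples, two SNPs, labels 0 = control, 1 = case.
labels = [0, 0, 1, 1]
informative = [(0, 0), (0, 0), (1, 1), (1, 1)]    # genotypes separate the classes
uninformative = [(0, 0), (0, 0), (0, 0), (0, 0)]  # genotypes carry no signal

inf_counts = genotype_class_counts(informative, labels)
unf_counts = genotype_class_counts(uninformative, labels)
inf_objectives = (k2_score(inf_counts), gini_score(inf_counts))
unf_objectives = (k2_score(unf_counts), gini_score(unf_counts))
```

In a multi-objective genetic algorithm, a candidate SNP combination whose objective pair dominates another would be preferred during selection; here the informative pair dominates the uninformative one on both objectives.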
id doaj.art-6e1f4da2c8654240a016dc705927d037
institution Directory Open Access Journal
issn 2073-4425
doi 10.3390/genes12020191
volume 12
issue 2
article_number 191
affiliation Yuanyuan Chen, Fengjiao Xu, Cong Pian, Liangyun Zhang: Department of Mathematics, College of Science, Nanjing Agricultural University, Nanjing 210095, China
Mingmin Xu, Lingpeng Kong, Jingya Fang, Zutan Li: College of Agriculture, Nanjing Agricultural University, Nanjing 210095, China
topic genome-wide association studies
high-order epistasis
genetic algorithms
multi-objective optimization
Alzheimer’s disease