Capturing the spectrum of interaction effects in genetic association studies by simulated evaporative cooling network analysis.

Evidence from human genetic studies of several disorders suggests that interactions between alleles at multiple genes play an important role in influencing phenotypic expression. Analytical methods for identifying Mendelian disease genes are not appropriate when applied to common multigenic diseases...

Bibliographic Details
Main Authors: Brett A McKinney, James E Crowe, Jingyu Guo, Dehua Tian
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2009-03-01
Series: PLoS Genetics
Online Access: http://europepmc.org/articles/PMC2653647?pdf=render
Description: Evidence from human genetic studies of several disorders suggests that interactions between alleles at multiple genes play an important role in influencing phenotypic expression. Analytical methods for identifying Mendelian disease genes are not appropriate when applied to common multigenic diseases, because such methods investigate association with the phenotype only one genetic locus at a time. New strategies are needed that can capture the spectrum of genetic effects, from Mendelian to multifactorial epistasis. Random Forests (RF) and Relief-F are two powerful machine-learning methods that have been studied as filters for genetic case-control data due to their ability to account for the context of alleles at multiple genes when scoring the relevance of individual genetic variants to the phenotype. However, when variants interact strongly, the independence assumption of RF in the tree node-splitting criterion leads to diminished importance scores for relevant variants. Relief-F, on the other hand, was designed to detect strong interactions but is sensitive to large backgrounds of variants that are irrelevant to classification of the phenotype, which is an acute problem in genome-wide association studies. To overcome the weaknesses of these data mining approaches, we develop Evaporative Cooling (EC) feature selection, a flexible machine learning method that can integrate multiple importance scores while removing irrelevant genetic variants. To characterize detailed interactions, we construct a genetic-association interaction network (GAIN), whose edges quantify the synergy between variants with respect to the phenotype. We use simulation analysis to show that EC is able to identify a wide range of interaction effects in genetic association data. We apply the EC filter to a smallpox vaccine cohort study of single nucleotide polymorphisms (SNPs) and infer a GAIN for a collection of SNPs associated with adverse events. Our results suggest an important role for hubs in SNP disease susceptibility networks. The software is available at (http://sites.google.com/site/McKinneyLab/software).
Record ID: doaj.art-65d68866e1cd476fa7e913bc7ab3da4a
Institution: Directory of Open Access Journals
ISSN: 1553-7390, 1553-7404
Citation: PLoS Genetics, vol. 5, no. 3, e1000432 (2009-03-01)
DOI: 10.1371/journal.pgen.1000432