NIMEFI: gene regulatory network inference using multiple ensemble feature importance algorithms.

One of the long-standing open challenges in computational systems biology is the topology inference of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, have been established to benchmark network inference techniques using gene express...

Bibliographic Details
Main Authors: Joeri Ruyssinck, Vân Anh Huynh-Thu, Pierre Geurts, Tom Dhaene, Piet Demeester, Yvan Saeys
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2014-01-01
Series: PLoS ONE
Online Access: http://europepmc.org/articles/PMC3965471?pdf=render
collection DOAJ
description One of the long-standing open challenges in computational systems biology is inferring the topology of gene regulatory networks from high-throughput omics data. Recently, two community-wide efforts, DREAM4 and DREAM5, were established to benchmark network inference techniques using gene expression measurements. In these challenges, the overall top performer was the GENIE3 algorithm. This method decomposes the network inference task into a separate regression problem for each gene in the network, in which the expression values of a particular target gene are predicted using all other genes as possible predictors. Next, using tree-based ensemble methods, an importance measure is calculated for each predictor gene with respect to the target gene, and a high feature importance is taken as putative evidence of a regulatory link between the two genes. The contribution of this work is twofold. First, we generalize the regression decomposition strategy of GENIE3 to other feature importance methods. We compare the performance of support vector regression, the elastic net, random forest regression, symbolic regression, and their ensemble variants in this setting to the original GENIE3 algorithm. To create the ensemble variants, we propose a subsampling approach that allows us to cast any feature selection algorithm that produces a feature ranking into an ensemble feature importance algorithm. We demonstrate that the ensemble setting is key to the network inference task, as only ensemble variants achieve top performance. As a second contribution, we explore the effect of using rankwise averaged predictions of multiple ensemble algorithms as opposed to only one. We name this approach NIMEFI (Network Inference using Multiple Ensemble Feature Importance algorithms) and show that it outperforms all individual methods in general, although on a specific network a single method can perform better. An implementation of NIMEFI has been made publicly available.
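The three ideas in the description above (one ranking problem per target gene, a subsampling ensemble around a base ranker, and rankwise averaging across methods) can be illustrated with a small sketch. This is not the published NIMEFI implementation. As stand-ins for the regression methods named in the abstract, it scores predictors by absolute Pearson and Spearman correlation, and it aggregates subsamples by counting how often a predictor lands in the top `top_k` of a ranking. The aggregation rule, all function names, and all parameter defaults are assumptions made for illustration only.

```python
import random

def ranks(values):
    """1-based ranks (1 = smallest); ties keep their original order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for pos, i in enumerate(order, start=1):
        out[i] = pos
    return out

def abs_pearson(x, y):
    """Stand-in base scorer: absolute Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return abs(sxy) / (sxx * syy) ** 0.5 if sxx > 0 and syy > 0 else 0.0

def abs_spearman(x, y):
    """Second stand-in scorer: Pearson correlation computed on ranks."""
    return abs_pearson(ranks(x), ranks(y))

def ensemble_importance(expr, target, scorer, rounds=50, frac=0.8, top_k=1, seed=0):
    """Subsampling ensemble: fraction of rounds in which each predictor
    is among the top_k highest-scoring genes for this target (assumed rule)."""
    rng = random.Random(seed)
    predictors = [g for g in expr if g != target]
    n = len(expr[target])
    size = min(n, max(2, int(frac * n)))
    hits = {g: 0 for g in predictors}
    for _ in range(rounds):
        idx = rng.sample(range(n), size)  # subsample without replacement
        y = [expr[target][i] for i in idx]
        scores = {g: scorer([expr[g][i] for i in idx], y) for g in predictors}
        for g in sorted(predictors, key=lambda g: scores[g], reverse=True)[:top_k]:
            hits[g] += 1
    return {g: hits[g] / rounds for g in predictors}

def edge_ranking(expr, scorer, **kwargs):
    """Solve one problem per target gene, then rank all (regulator, target) edges."""
    edges = {}
    for target in expr:
        for reg, imp in ensemble_importance(expr, target, scorer, **kwargs).items():
            edges[(reg, target)] = imp
    ordered = sorted(edges, key=lambda e: edges[e], reverse=True)
    return {e: pos for pos, e in enumerate(ordered, start=1)}

def rank_average(expr, scorers, **kwargs):
    """Average each edge's rank over several ensemble methods; best edge first."""
    per_method = [edge_ranking(expr, s, **kwargs) for s in scorers]
    mean_rank = {e: sum(r[e] for r in per_method) / len(per_method)
                 for e in per_method[0]}
    return sorted(mean_rank, key=lambda e: mean_rank[e])

def toy_expression(n=30, seed=1):
    """Hypothetical toy data: gene B depends on gene A; C and D are noise."""
    rng = random.Random(seed)
    a = [rng.gauss(0, 1) for _ in range(n)]
    return {
        "A": a,
        "B": [2 * v + 0.1 * rng.gauss(0, 1) for v in a],
        "C": [rng.gauss(0, 1) for _ in range(n)],
        "D": [rng.gauss(0, 1) for _ in range(n)],
    }
```

Calling `rank_average(toy_expression(), [abs_pearson, abs_spearman])` returns every ordered gene pair sorted by averaged rank. Because correlation is symmetric, this sketch cannot distinguish the direction of a link, which a regression-based importance measure might. Tied edges keep their insertion order.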
first_indexed 2024-04-12T23:11:17Z
id doaj.art-1003171ba3b04017858801a530216305
institution Directory Open Access Journal
issn 1932-6203
last_indexed 2024-04-12T23:11:17Z
doi 10.1371/journal.pone.0092709
citation PLoS ONE 9(3): e92709