Analysis of merged whole blood transcriptomic datasets to identify circulating molecular biomarkers of feed efficiency in growing pigs

Bibliographic Details
Main Authors: Farouk Messad, Isabelle Louveau, David Renaudeau, Hélène Gilbert, Florence Gondret
Format: Article
Language: English
Published: BMC 2021-07-01
Series: BMC Genomics
Subjects: Biomarkers; Blood; Feed efficiency; Gradient TreeNet boosting; Microarray; Random Forest
Online Access: https://doi.org/10.1186/s12864-021-07843-4
_version_ 1818456563837304832
author Farouk Messad
Isabelle Louveau
David Renaudeau
Hélène Gilbert
Florence Gondret
author_sort Farouk Messad
collection DOAJ
description Abstract. Background: Improving feed efficiency (FE) is an important goal because of its economic and environmental significance for farm animal production. The FE phenotype is complex: it is based on measurements of individual feed consumption and average daily gain during a test period, which are costly and time-consuming to obtain. Identifying reliable predictors of FE is one strategy to reduce phenotyping effort. Results: Whole-blood gene expression data from three independent experiments were combined and analyzed with machine learning algorithms to propose molecular biomarkers of FE traits in growing pigs. The datasets included Large White pigs from two lines divergently selected for residual feed intake (RFI), a measure of net FE, for which individual feed conversion ratio (FCR) and blood microarray data were available. Merging the three datasets yielded FCR values (mean = 2.85; min = 1.92; max = 5.00) for a total of n = 148 pigs, covering a wide range of body weights (15 to 115 kg) and test period durations (2 to 9 weeks). Random forest (RF) and gradient tree boosting (GTB) were applied to the whole-blood transcripts (26,687 annotated molecular probes) to identify the most important variables for two tasks: binary classification of pigs into RFI groups and quantitative prediction of FCR. The dataset was split into a learning set (n = 74) and a validation set (n = 74). Iterative variable selection identified about three hundred molecular probes (328 to 391), participating in various biological pathways, as important predictors of RFI or FCR. With the GTB algorithm, simpler models were proposed that combined 34 unique expressed genes to classify pigs into RFI groups (100% success) and 25 unique expressed genes to predict FCR values (R² = 0.80, RMSE = 8%). RF models were slightly less accurate in classification and markedly less accurate in regression.
Conclusion: From small subsets of genes expressed in whole blood, it is possible to predict both the binary class and the individual value of feed efficiency. These predictive models offer good prospects for identifying animals with higher feed efficiency in precision farming applications.
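The workflow summarized in the description (an even learning/validation split, gradient tree boosting on expression values, then R² and RMSE on held-out animals) can be illustrated with a minimal sketch. This is not the authors' implementation: the record names Gradient TreeNet boosting, whereas the code below is plain least-squares gradient boosting with one-split trees written from scratch. The data are synthetic, the four "probes" and every function name are hypothetical, and expressing RMSE as a percentage of the mean observed FCR is an assumption, since the abstract does not say how its 8% was computed.

```python
import random
from statistics import mean


def fit_stump(X, residuals):
    """Return the one-split regression stump with the lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue  # guard against a degenerate midpoint
            ml, mr = mean(left), mean(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, ml, mr)
    return best[1:]  # (feature index, threshold, left value, right value)


def predict_stump(stump, row):
    j, thr, left_value, right_value = stump
    return left_value if row[j] <= thr else right_value


def fit_gtb(X, y, n_trees=40, learning_rate=0.3):
    """Least-squares gradient boosting: each stump is fitted to current residuals."""
    base = mean(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals)
        stumps.append(stump)
        pred = [pi + learning_rate * predict_stump(stump, row)
                for pi, row in zip(pred, X)]
    return base, learning_rate, stumps


def predict_gtb(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, row) for s in stumps)


def r_squared(y_true, y_pred):
    m = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - m) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def relative_rmse(y_true, y_pred):
    """RMSE as a percentage of the mean observed value (an assumed convention)."""
    rmse = (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5
    return 100 * rmse / mean(y_true)


# Synthetic stand-in data: 148 animals, 4 hypothetical probes, FCR centred on 2.85.
random.seed(0)
n_pigs, n_probes = 148, 4
X = [[random.gauss(0, 1) for _ in range(n_probes)] for _ in range(n_pigs)]
fcr = [2.85 + 0.4 * row[0] - 0.3 * row[1] + random.gauss(0, 0.1) for row in X]

# Even split into learning (n = 74) and validation (n = 74) sets.
indices = list(range(n_pigs))
random.shuffle(indices)
learn_idx, valid_idx = indices[:74], indices[74:]
X_learn, y_learn = [X[i] for i in learn_idx], [fcr[i] for i in learn_idx]
X_valid, y_valid = [X[i] for i in valid_idx], [fcr[i] for i in valid_idx]

model = fit_gtb(X_learn, y_learn)
r2_learn = r_squared(y_learn, [predict_gtb(model, row) for row in X_learn])
pred_valid = [predict_gtb(model, row) for row in X_valid]
r2_valid = r_squared(y_valid, pred_valid)
rmse_valid_pct = relative_rmse(y_valid, pred_valid)

print(f"learning R2 = {r2_learn:.2f}")
print(f"validation R2 = {r2_valid:.2f}, relative RMSE = {rmse_valid_pct:.1f}%")
```

Reporting the metrics on the validation half rather than the learning half is the point of the split: the learning-set R² only shows how closely the boosted stumps fit the animals they were trained on.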
first_indexed 2024-12-14T22:28:40Z
format Article
id doaj.art-e334b1ba05ae45c48e3243432de32a56
institution Directory Open Access Journal
issn 1471-2164
language English
last_indexed 2024-12-14T22:28:40Z
publishDate 2021-07-01
publisher BMC
record_format Article
series BMC Genomics
spelling doaj.art-e334b1ba05ae45c48e3243432de32a56
2022-12-21T22:45:18Z
eng
BMC
BMC Genomics
1471-2164
2021-07-01
22 1 1 14
10.1186/s12864-021-07843-4
Analysis of merged whole blood transcriptomic datasets to identify circulating molecular biomarkers of feed efficiency in growing pigs
Farouk Messad (PEGASE, INRAE, Institut Agro)
Isabelle Louveau (PEGASE, INRAE, Institut Agro)
David Renaudeau (PEGASE, INRAE, Institut Agro)
Hélène Gilbert (GenPhySE, INRAE, INP-ENVT)
Florence Gondret (PEGASE, INRAE, Institut Agro)
https://doi.org/10.1186/s12864-021-07843-4
Biomarkers
Blood
Feed efficiency
Gradient TreeNet boosting
Microarray
Random Forest
title Analysis of merged whole blood transcriptomic datasets to identify circulating molecular biomarkers of feed efficiency in growing pigs
topic Biomarkers
Blood
Feed efficiency
Gradient TreeNet boosting
Microarray
Random Forest
url https://doi.org/10.1186/s12864-021-07843-4