Genomic prediction through machine learning and neural networks for traits with epistasis

Genome-wide selection (GWS) is one of the contributions of molecular genetics to breeding. Machine learning (ML) and artificial neural network (ANN) methods are non-parameterized and can develop more accurate and parsimonious models for GWS analysis. Multivariate Adaptive Regression Splines (MARS) is con...

Bibliographic Details
Main Authors: Weverton Gomes da Costa, Maurício de Oliveira Celeri, Ivan de Paiva Barbosa, Gabi Nunes Silva, Camila Ferreira Azevedo, Aluizio Borem, Moysés Nascimento, Cosme Damião Cruz
Format: Article
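Language: English
Published: Elsevier 2022-01-01
Series: Computational and Structural Biotechnology Journal
Subjects: Genome wide selection; Quantitative trait locus; Non-additive effects; Multivariate adaptive regression splines; Genome-enabled prediction
Online Access: http://www.sciencedirect.com/science/article/pii/S2001037022004342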
collection DOAJ
description Genome-wide selection (GWS) is one of the contributions of molecular genetics to breeding. Machine learning (ML) and artificial neural network (ANN) methods are non-parameterized and can develop more accurate and parsimonious models for GWS analysis. Multivariate Adaptive Regression Splines (MARS) is considered one of the most flexible ML methods, automatically modeling nonlinearities and interactions among the predictor variables. This study aimed to evaluate and compare ANN- and ML-based methods, including MARS, with G-BLUP for GWS. An F2 population of 1000 individuals genotyped for 4010 SNP markers was simulated, together with twelve traits from a model considering epistatic effects, with the number of QTLs ranging from eight to 480 and heritability (h2) of 0.3, 0.5, or 0.8. Variation in heritability and in the number of QTLs affected the performance of the methods. For quantitative traits (40, 80, 120, 240, and 480 QTLs), the highest R2 values were observed for the Radial Basis Function (RBF) network and G-BLUP, followed by Random Forest (RF), Bagging (BA), and Boosting (BO). RF and BA also showed better results for traits with h2 of 0.3, with R2 values of 16.51% and 16.30%, respectively, while MARS methods showed better results for oligogenic traits, with R2 values ranging from 39.12% to 43.20% at h2 of 0.5 and from 59.92% to 78.56% at h2 of 0.8. Non-additive MARS methods also showed high R2 for traits with high heritability and 240 or more QTLs. ANN and ML methods are powerful tools for predicting genetic values of traits with epistatic effects across different degrees of heritability and numbers of QTLs.
first_indexed 2024-04-11T05:18:44Z
format Article
id doaj.art-b43284a44af440cdbded5705888b9609
institution Directory Open Access Journal
issn 2001-0370
language English
last_indexed 2024-04-11T05:18:44Z
spelling doaj.art-b43284a44af440cdbded5705888b9609 2022-12-24T04:54:34Z eng Elsevier Computational and Structural Biotechnology Journal 2001-0370 2022-01-01 vol. 20, pp. 5490-5499
Genomic prediction through machine learning and neural networks for traits with epistasis
Weverton Gomes da Costa: Department of General Biology, Bioinformatics Laboratory, Federal University of Viçosa, Viçosa, MG, Brazil; Corresponding author.
Maurício de Oliveira Celeri: Department of Statistics, Laboratory of Computational Intelligence and Statistical Learning, Federal University of Viçosa – UFV, Viçosa, MG, Brazil
Ivan de Paiva Barbosa: Department of Agronomy, Federal University of Viçosa, Viçosa, MG, Brazil
Gabi Nunes Silva: Department of Mathematics and Statistics, Federal University of Rondônia, Ji-Paraná Campus, RO, Brazil
Camila Ferreira Azevedo: Department of Agronomy, Federal University of Viçosa, Viçosa, MG, Brazil
Aluizio Borem: Department of Agronomy, Federal University of Viçosa, Viçosa, MG, Brazil
Moysés Nascimento: Department of Statistics, Laboratory of Computational Intelligence and Statistical Learning, Federal University of Viçosa – UFV, Viçosa, MG, Brazil
Cosme Damião Cruz: Department of General Biology, Bioinformatics Laboratory, Federal University of Viçosa, Viçosa, MG, Brazil
title Genomic prediction through machine learning and neural networks for traits with epistasis
topic Genome wide selection
Quantitative trait locus
Non-additive effects
Multivariate adaptive regression splines
Genome-enabled prediction
url http://www.sciencedirect.com/science/article/pii/S2001037022004342