Using machine learning to improve the accuracy of genomic prediction of reproduction traits in pigs

Abstract Background Recently, machine learning (ML) has become attractive in genomic prediction, but its superiority in genomic prediction over conventional (ss) GBLUP methods and the choice of optimal ML methods need to be investigated. Results In this study, 2566 Chinese Yorkshire pigs with reprod...

Bibliographic Details
Main Authors: Xue Wang, Shaolei Shi, Guijiang Wang, Wenxue Luo, Xia Wei, Ao Qiu, Fei Luo, Xiangdong Ding
Format: Article
Language: English
Published: BMC 2022-05-01
Series: Journal of Animal Science and Biotechnology
Subjects: Genomic prediction; Machine learning; Pig; Prediction accuracy
Online Access: https://doi.org/10.1186/s40104-022-00708-0
author Xue Wang
Shaolei Shi
Guijiang Wang
Wenxue Luo
Xia Wei
Ao Qiu
Fei Luo
Xiangdong Ding
collection DOAJ
description Abstract

Background: Machine learning (ML) has recently attracted attention in genomic prediction, but whether it outperforms conventional (ss)GBLUP methods, and which ML methods are optimal, still need to be investigated.

Results: In this study, 2566 Chinese Yorkshire pigs with reproduction trait records were genotyped with the GenoBaits Porcine SNP 50 K and PorcineSNP50 panels. Four ML methods were implemented: support vector regression (SVR), kernel ridge regression (KRR), random forest (RF) and Adaboost.R2. The utility of ML methods in genomic prediction was explored through 20 replicates of fivefold cross-validation (CV) and one prediction for younger individuals. In CV, ML methods significantly outperformed the conventional methods genomic BLUP (GBLUP), single-step GBLUP (ssGBLUP) and the Bayesian method BayesHE, improving on their genomic prediction accuracy by 19.3%, 15.0% and 20.8%, respectively. In addition, ML methods yielded smaller mean squared error (MSE) and mean absolute error (MAE) in all scenarios. ssGBLUP improved accuracy by 3.8% on average compared with GBLUP, and the accuracy of BayesHE was close to that of GBLUP. In genomic prediction of younger individuals, RF and Adaboost.R2_KRR performed better than GBLUP and BayesHE, while ssGBLUP performed comparably with RF. For total number of piglets born, ssGBLUP yielded slightly higher accuracy and lower MSE than Adaboost.R2_KRR, whereas for number of piglets born alive, Adaboost.R2_KRR performed significantly better than ssGBLUP. Among ML methods, Adaboost.R2_KRR consistently performed well in our study. Our findings also demonstrated that optimal hyperparameters are useful for ML methods: compared with default hyperparameters, tuning gave an average improvement of 14.3% in CV and 21.8% in the prediction of younger individuals.

Conclusion: Our findings demonstrated that ML methods had better overall prediction performance than conventional genomic selection methods and could be new options for genomic prediction. Among ML methods, Adaboost.R2_KRR consistently performed well in our study, and tuning hyperparameters is necessary for ML methods. The optimal hyperparameters depend on the characteristics of the traits, the datasets, and so on.
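The evaluation design described in the abstract (20 replicates of fivefold CV, scored by accuracy, MSE and MAE) can be sketched roughly as follows. This is only an illustration, not the authors' pipeline: it assumes accuracy is taken as the Pearson correlation between observed and predicted values, which the abstract does not specify, and the function names (`repeated_kfold`, `evaluate`, `simple_linear`) and the toy data are hypothetical. Any of the compared models could be plugged in through the `fit_predict` callable.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation, used here as an assumed stand-in for 'accuracy'."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mse(obs, pred):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs)


def mae(obs, pred):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(obs, pred)) / len(obs)


def repeated_kfold(n, k=5, replicates=20, seed=0):
    """Yield (train_idx, test_idx) for every fold of every replicate."""
    rng = random.Random(seed)
    for _ in range(replicates):
        idx = list(range(n))
        rng.shuffle(idx)  # new random partition for each replicate
        folds = [idx[i::k] for i in range(k)]
        for f in range(k):
            train = [i for g in range(k) if g != f for i in folds[g]]
            yield train, folds[f]


def evaluate(X, y, fit_predict, k=5, replicates=20, seed=0):
    """Average r, MSE and MAE over all folds and replicates.

    fit_predict(X_train, y_train, X_test) must return one prediction per
    test row; it is a hypothetical hook for any model (GBLUP, KRR, RF, ...).
    """
    totals = {"r": 0.0, "mse": 0.0, "mae": 0.0}
    n_folds = 0
    for train, test in repeated_kfold(len(y), k, replicates, seed):
        pred = fit_predict([X[i] for i in train],
                           [y[i] for i in train],
                           [X[i] for i in test])
        obs = [y[i] for i in test]
        totals["r"] += pearson(obs, pred)
        totals["mse"] += mse(obs, pred)
        totals["mae"] += mae(obs, pred)
        n_folds += 1
    return {name: total / n_folds for name, total in totals.items()}


def simple_linear(X_train, y_train, X_test):
    """Placeholder model (assumption): least squares on the first feature."""
    xs = [row[0] for row in X_train]
    mx, my = sum(xs) / len(xs), sum(y_train) / len(y_train)
    slope = (sum((x - mx) * (t - my) for x, t in zip(xs, y_train))
             / sum((x - mx) ** 2 for x in xs))
    return [my + slope * (row[0] - mx) for row in X_test]


if __name__ == "__main__":
    # Toy data only; real inputs would be SNP genotypes and trait records.
    rng = random.Random(1)
    X = [[float(i)] for i in range(50)]
    y = [2.0 * row[0] + 1.0 + rng.gauss(0.0, 5.0) for row in X]
    print(evaluate(X, y, simple_linear))
```

Fixing the seed keeps the same 100 train/test splits across models, so differences in the averaged metrics reflect the models rather than the partitioning.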
first_indexed 2024-04-13T18:54:21Z
format Article
id doaj.art-24973a2e33f047cba503200044a0a46e
institution Directory Open Access Journal
issn 2049-1891
language English
last_indexed 2024-04-13T18:54:21Z
publishDate 2022-05-01
publisher BMC
record_format Article
series Journal of Animal Science and Biotechnology
spelling doaj.art-24973a2e33f047cba503200044a0a46e; 2022-12-22T02:34:19Z; eng; BMC; Journal of Animal Science and Biotechnology; ISSN 2049-1891; 2022-05-01; vol. 13, no. 1, pp. 1-12; 10.1186/s40104-022-00708-0
Author affiliations:
Xue Wang, Shaolei Shi, Ao Qiu, Xiangdong Ding: Key Laboratory of Animal Genetics and Breeding of Ministry of Agriculture and Rural Affairs, National Engineering Laboratory of Animal Breeding, College of Animal Science and Technology, China Agricultural University
Guijiang Wang, Wenxue Luo, Fei Luo: Hebei Province Animal Husbandry and Improved Breeds Work Station
Xia Wei: Zhangjiakou Dahao Heshan New Agricultural Development Co., Ltd
title Using machine learning to improve the accuracy of genomic prediction of reproduction traits in pigs
topic Genomic prediction
Machine learning
Pig
Prediction accuracy
url https://doi.org/10.1186/s40104-022-00708-0