Improving Genomic Prediction with Machine Learning Incorporating TPE for Hyperparameters Optimization

Bibliographic Details
Main Authors: Mang Liang, Bingxing An, Keanning Li, Lili Du, Tianyu Deng, Sheng Cao, Yueying Du, Lingyang Xu, Xue Gao, Lupei Zhang, Junya Li, Huijiang Gao
Format: Article
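Language: English
Published: MDPI AG 2022-11-01
Series: Biology
Subjects: hyperparameters optimization, tree-structured Parzen estimator, genomic prediction, machine learning
Online Access: https://www.mdpi.com/2079-7737/11/11/1647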
author Mang Liang
Bingxing An
Keanning Li
Lili Du
Tianyu Deng
Sheng Cao
Yueying Du
Lingyang Xu
Xue Gao
Lupei Zhang
Junya Li
Huijiang Gao
collection DOAJ
description Owing to its excellent predictive ability, machine learning has been considered the most powerful tool for analyzing high-throughput sequencing genome data. However, the complicated process of tuning hyperparameters greatly impedes the wider application of machine learning in animal and plant breeding programs. Therefore, we integrated an automatic hyperparameter tuning algorithm, the tree-structured Parzen estimator (TPE), with machine learning to simplify its use for genomic prediction (GP). In this study, we applied TPE to optimize the hyperparameters of kernel ridge regression (KRR) and support vector regression (SVR). To evaluate the performance of TPE, we compared the prediction accuracy of KRR-TPE and SVR-TPE with that of genomic best linear unbiased prediction (GBLUP) and of KRR-RS, KRR-Grid, SVR-RS, and SVR-Grid, which tune the hyperparameters of KRR and SVR by random search (RS) and grid search (Grid), on a simulated dataset and on real datasets. The results indicated that KRR-TPE achieved the strongest prediction ability when all populations were considered and was the most convenient to use. In particular, for the Chinese Simmental beef cattle and Loblolly pine populations, the prediction accuracy of KRR-TPE improved on that of GBLUP by an average of 8.73% and 6.08%, respectively. Our study will greatly promote the application of machine learning in GP and further accelerate breeding progress.
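To make the TPE idea in the description concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes a single hyperparameter searched on a log10 scale, a fixed kernel bandwidth, and a hypothetical `toy_loss` function standing in for the cross-validated error of a model such as KRR. All function and parameter names here are illustrative assumptions.

```python
import math
import random


def parzen_density(x, centers, bandwidth):
    """Parzen (kernel density) estimate: mean of Gaussian kernels on past points."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - c) / bandwidth) ** 2) for c in centers)
    # Small floor keeps the ratio l(x) / g(x) finite far from all centers.
    return total / (len(centers) * norm) + 1e-12


def tpe_minimize(objective, low, high, n_trials=60, n_startup=10,
                 gamma=0.25, n_candidates=24, seed=0):
    """Simplified TPE loop for one continuous hyperparameter (illustrative only)."""
    if n_startup < 2:
        raise ValueError("n_startup must be at least 2")
    rng = random.Random(seed)
    bandwidth = (high - low) / 10.0  # assumption: fixed bandwidth, not adaptive
    trials = []  # list of (x, loss) pairs
    for i in range(n_trials):
        if i < n_startup:
            # Warm-up phase: plain random search.
            x = rng.uniform(low, high)
        else:
            # Split past trials into a "good" group (lowest losses) and the rest.
            ranked = sorted(trials, key=lambda t: t[1])
            n_good = min(len(ranked) - 1, max(1, int(gamma * len(ranked))))
            good = [t[0] for t in ranked[:n_good]]
            bad = [t[0] for t in ranked[n_good:]]
            # Draw candidates near good points, clipped to the search range.
            candidates = []
            for _ in range(n_candidates):
                c = rng.gauss(rng.choice(good), bandwidth)
                candidates.append(min(high, max(low, c)))
            # Keep the candidate with the largest density ratio l(x) / g(x).
            x = max(candidates,
                    key=lambda c: parzen_density(c, good, bandwidth)
                    / parzen_density(c, bad, bandwidth))
        trials.append((x, objective(x)))
    best = min(trials, key=lambda t: t[1])
    return best, trials


def toy_loss(log10_penalty):
    """Hypothetical stand-in for a validation loss; minimum at log10_penalty = -2."""
    return (log10_penalty + 2.0) ** 2


if __name__ == "__main__":
    best, trials = tpe_minimize(toy_loss, low=-6.0, high=2.0)
    print("best log10 penalty:", best[0], "loss:", best[1])
```

In practice one would replace `toy_loss` with a cross-validated error for the model being tuned and would likely rely on an established TPE implementation rather than this sketch, which omits priors, adaptive bandwidths, and multi-dimensional or categorical search spaces.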
first_indexed 2024-03-09T18:28:11Z
format Article
id doaj.art-41e07fee139348bdae5abb1f9ab9497f
institution Directory of Open Access Journals
issn 2079-7737
language English
last_indexed 2024-03-09T18:28:11Z
publishDate 2022-11-01
publisher MDPI AG
record_format Article
series Biology
spelling doaj.art-41e07fee139348bdae5abb1f9ab9497f
2023-11-24T07:45:00Z
eng
MDPI AG
Biology
2079-7737
2022-11-01
volume 11, issue 11, article 1647
doi 10.3390/biology11111647
Improving Genomic Prediction with Machine Learning Incorporating TPE for Hyperparameters Optimization
Mang Liang, Bingxing An, Keanning Li, Lili Du, Tianyu Deng, Sheng Cao, Yueying Du, Lingyang Xu, Xue Gao, Lupei Zhang, Junya Li, Huijiang Gao
Institute of Animal Sciences, Chinese Academy of Agricultural Sciences, Beijing 100193, China (affiliation of all twelve authors)
title Improving Genomic Prediction with Machine Learning Incorporating TPE for Hyperparameters Optimization
topic hyperparameters optimization
tree-structured Parzen estimator
genomic prediction
machine learning
url https://www.mdpi.com/2079-7737/11/11/1647