MSXFGP: combining improved sparrow search algorithm with XGBoost for enhanced genomic prediction

Abstract. Background: With the significant reduction in the cost of high-throughput sequencing, genomic selection has developed rapidly in plant breeding. Although researchers have proposed numerous genomic selection methods, the existing genomic selecti...

Bibliographic Details
Main Authors: Ganghui Zhou, Jing Gao, Dongshi Zuo, Jin Li, Rui Li
Format: Article
Language: English
Published: BMC 2023-10-01
Series: BMC Bioinformatics
Subjects: Genome selection, Sparrow search algorithm, XGBoost, Parameter optimization, Feature selection
Online Access: https://doi.org/10.1186/s12859-023-05514-7
author Ganghui Zhou
Jing Gao
Dongshi Zuo
Jin Li
Rui Li
collection DOAJ
description Abstract. Background: With the significant reduction in the cost of high-throughput sequencing, genomic selection has developed rapidly in plant breeding. Although researchers have proposed numerous genomic selection methods, existing methods still suffer from poor prediction accuracy in practical applications. Results: This paper proposes MSXFGP, a genomic prediction method that uses a multi-strategy improved sparrow search algorithm (SSA) to optimize XGBoost parameters and feature selection. First, logistic chaos mapping, elite learning, adaptive parameter adjustment, Lévy flight, and an early-stopping strategy are incorporated into the SSA. These additions enhance the algorithm's global and local search capabilities, improving its convergence accuracy and stability. The improved SSA is then used to optimize XGBoost parameters and feature selection concurrently, yielding the new genomic selection method MSXFGP. Using both the coefficient of determination (R²) and the Pearson correlation coefficient as evaluation metrics, MSXFGP was compared against six existing genomic selection models across six datasets. The results show that the prediction accuracy of MSXFGP is comparable to or better than that of widely used genomic selection methods, and that it achieves better accuracy when R² is used as the evaluation metric. In addition, this research provides a user-friendly Python utility to help breeders apply the method. MSXFGP is available at https://github.com/DIBreeding/MSXFGP. Conclusions: The experimental results show that the prediction accuracy of MSXFGP is comparable to or better than that of existing genomic selection methods, providing a new approach for plant genomic selection.
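The abstract evaluates models with both the coefficient of determination and the Pearson correlation coefficient, and the two can disagree. The sketch below is only an illustration of the standard textbook definitions, not code from the MSXFGP repository; the function names and the toy phenotype values are assumptions made up for this example. It shows that a constant offset in the predictions leaves the Pearson correlation unchanged but lowers R².

```python
import math


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(y_true, y_pred):
    """Pearson correlation: covariance scaled by both standard deviations."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    mean_pred = sum(y_pred) / n
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return cov / math.sqrt(var_true * var_pred)


# Hypothetical toy values, not data from the paper.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
unbiased = [1.1, 1.9, 3.2, 3.8, 5.0]
shifted = [p + 2.0 for p in unbiased]  # same ranking, constant offset

for name, pred in (("unbiased", unbiased), ("shifted", shifted)):
    print(name, "R2 =", r2_score(observed, pred), "r =", pearson_r(observed, pred))
```

Because R² penalizes systematic bias while the Pearson coefficient measures only linear association, a model's ranking against other methods can differ between the two metrics.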
first_indexed 2024-03-10T16:56:57Z
format Article
id doaj.art-7bde3de7406b4864ac45cbcc0fa3570d
institution Directory Open Access Journal
issn 1471-2105
language English
last_indexed 2024-03-10T16:56:57Z
publishDate 2023-10-01
publisher BMC
record_format Article
series BMC Bioinformatics
spelling doaj.art-7bde3de7406b4864ac45cbcc0fa3570d 2023-11-20T11:06:18Z eng BMC BMC Bioinformatics 1471-2105 2023-10-01 241121 10.1186/s12859-023-05514-7 MSXFGP: combining improved sparrow search algorithm with XGBoost for enhanced genomic prediction
Ganghui Zhou, Jing Gao, Dongshi Zuo, Jin Li, Rui Li (all: College of Computer and Information Engineering, Inner Mongolia Agricultural University)
title MSXFGP: combining improved sparrow search algorithm with XGBoost for enhanced genomic prediction
topic Genome selection
Sparrow search algorithm
XGBoost
Parameter optimization
Feature selection
url https://doi.org/10.1186/s12859-023-05514-7