Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area

Weak-classifier ensemble algorithms based on the decision tree model mainly include bagging (e.g., random forest, RF) and boosting (e.g., gradient boosting decision tree, eXtreme gradient boosting); the former reduces variance to lower the overall generalization error, while the latter f...


Bibliographic Details
Main Authors: Mingjing Fan, Keyan Xiao, Li Sun, Shuai Zhang, Yang Xu
Format: Article
Language: English
Published: MDPI AG 2022-12-01
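Series: Minerals
Subjects: mineral prospectivity mapping; machine learning; hyperparameter optimization; gradient boosting decision tree
Online Access: https://www.mdpi.com/2075-163X/12/12/1621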
_version_ 1797456160950845440
author Mingjing Fan
Keyan Xiao
Li Sun
Shuai Zhang
Yang Xu
author_sort Mingjing Fan
collection DOAJ
description Weak-classifier ensemble algorithms based on the decision tree model mainly include bagging (e.g., random forest, RF) and boosting (e.g., gradient boosting decision tree, eXtreme gradient boosting); the former reduces variance to lower the overall generalization error, while the latter reduces bias to the same end. Because the idea is straightforward, these methods are prevalent in mineral prospectivity mapping (MPM). However, an unavoidable problem in applying them is hyperparameter tuning, which is laborious and time-consuming, so how to select hyperparameters suited to a specific task is worth investigating. In this paper, a tree Parzen estimator-based gradient boosting decision tree model (TPE-GBDT) was introduced for hyperparameter tuning (e.g., loss criterion, n_estimators, learning_rate, max_features, subsample, max_depth, min_impurity_decrease). Geological data for the gold deposits in the Xiong’ershan area were then used to create training data for MPM and to compare the training results of TPE-GBDT and random search-GBDT. The TPE-GBDT model obtained higher accuracy than random search-GBDT in a shorter time over the same parameter space, indicating that the algorithm is superior to random search in principle and better suited to complex hyperparameter tuning. Subsequently, five-fold cross-validation, confusion matrices, and success-rate curves were employed as validation measures to evaluate the overall performance of the hyperparameter optimization models, and the predictive models scored well. Finally, using the maximum Youden index as the threshold to separate prospective from non-prospective areas, the high-prospectivity area derived by the TPE-GBDT model (10.22% of the total study area) contained > 90% of the known deposits and provides a preferred range for future exploration work.
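The thresholding step mentioned at the end of the description can be illustrated with a minimal sketch. It assumes the usual definition of the Youden index, J = sensitivity + specificity - 1 (equivalently, true positive rate minus false positive rate), and picks the score cut-off that maximizes J. The function name `youden_threshold` and the toy scores and labels are hypothetical and are not taken from the paper; the record does not say how the authors implemented this step.

```python
from typing import List, Tuple


def youden_threshold(scores: List[float], labels: List[int]) -> Tuple[float, float]:
    """Return (threshold, J) maximizing Youden's J = TPR - FPR.

    A cell counts as predicted prospective when its score >= threshold.
    Ties in J keep the highest threshold encountered first.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative label")

    best_threshold, best_j = float("inf"), float("-inf")
    # Every distinct score is a candidate cut-off, scanned from high to low.
    for threshold in sorted(set(scores), reverse=True):
        # Confusion-matrix counts at this cut-off.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy prospectivity scores (hypothetical, not data from the paper):
# label 1 = cell with a known deposit, label 0 = non-deposit cell.
scores = [0.95, 0.90, 0.80, 0.60, 0.40, 0.30, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0, 0]
threshold, j = youden_threshold(scores, labels)
```

On this toy input, cells scoring at or above the returned threshold would be mapped as prospective and the rest as non-prospective. The scan is quadratic in the number of cells, which is fine for illustration but would normally be replaced by a sorted single pass on a full raster.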
first_indexed 2024-03-09T16:03:23Z
format Article
id doaj.art-841dd45c6b7f43d6874dd1a39100491d
institution Directory Open Access Journal
issn 2075-163X
language English
last_indexed 2024-03-09T16:03:23Z
publishDate 2022-12-01
publisher MDPI AG
record_format Article
series Minerals
spelling doaj.art-841dd45c6b7f43d6874dd1a39100491d 2023-11-24T16:52:54Z eng
MDPI AG, Minerals, ISSN 2075-163X, 2022-12-01, vol. 12, no. 12, art. 1621, doi 10.3390/min12121621
Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area
0 Mingjing Fan: MNR Key Laboratory of Metallogeny and Mineral Resource Assessment, Institute of Mineral Resources, Chinese Academy of Geological Sciences, Beijing 100037, China
1 Keyan Xiao: MNR Key Laboratory of Metallogeny and Mineral Resource Assessment, Institute of Mineral Resources, Chinese Academy of Geological Sciences, Beijing 100037, China
2 Li Sun: MNR Key Laboratory of Metallogeny and Mineral Resource Assessment, Institute of Mineral Resources, Chinese Academy of Geological Sciences, Beijing 100037, China
3 Shuai Zhang: China Aero Geophysical Survey and Remote Sensing Center for Natural Resources, Beijing 100083, China
4 Yang Xu: MNR Key Laboratory of Metallogeny and Mineral Resource Assessment, Institute of Mineral Resources, Chinese Academy of Geological Sciences, Beijing 100037, China
https://www.mdpi.com/2075-163X/12/12/1621
mineral prospectivity mapping; machine learning; hyperparameter optimization; gradient boosting decision tree
title Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area
topic mineral prospectivity mapping
machine learning
hyperparameter optimization
gradient boosting decision tree
url https://www.mdpi.com/2075-163X/12/12/1621