Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area
Weak-classifier ensemble algorithms based on the decision tree model mainly include bagging (e.g., random forest, RF) and boosting (e.g., gradient boosting decision tree, eXtreme gradient boosting); the former lowers the overall generalization error by reducing variance, while the latter f...
Main Authors: | Mingjing Fan, Keyan Xiao, Li Sun, Shuai Zhang, Yang Xu |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2022-12-01 |
Series: | Minerals |
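Subjects: | mineral prospectivity mapping; machine learning; hyperparameter optimization; gradient boosting decision tree |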
Online Access: | https://www.mdpi.com/2075-163X/12/12/1621 |
_version_ | 1797456160950845440 |
---|---|
author | Mingjing Fan; Keyan Xiao; Li Sun; Shuai Zhang; Yang Xu |
author_facet | Mingjing Fan; Keyan Xiao; Li Sun; Shuai Zhang; Yang Xu |
author_sort | Mingjing Fan |
collection | DOAJ |
description | Weak-classifier ensemble algorithms based on the decision tree model mainly include bagging (e.g., random forest, RF) and boosting (e.g., gradient boosting decision tree, eXtreme gradient boosting); the former lowers the overall generalization error by reducing variance, while the latter does so by reducing bias. Because the underlying idea is straightforward, these algorithms are prevalent in MPM (mineral prospectivity mapping). However, an unavoidable problem in applying such methods is hyperparameter tuning, which is laborious and time-consuming, so selecting hyperparameters suited to a specific task is worth investigating. In this paper, a tree Parzen estimator-based GBDT (gradient boosting decision tree) model (TPE-GBDT) was introduced for hyperparameter tuning (e.g., loss criterion, n_estimators, learning_rate, max_features, subsample, max_depth, min_impurity_decrease). Then, geological data for gold deposits in the Xiong’ershan area were used to create training data for MPM and to compare the training results of TPE-GBDT and random search-GBDT. The TPE-GBDT model obtained higher accuracy than random search-GBDT in a shorter time for the same parameter space, which suggests that TPE is better suited than random search to complex hyperparameter tuning. Subsequently, five-fold cross-validation, confusion matrices, and success-rate curves were employed to evaluate the overall performance of the hyperparameter optimization models, and the predictive models scored well. Finally, using the maximum Youden index as the threshold to separate prospective from non-prospective areas, the high-prospectivity area derived by the TPE-GBDT model (10.22% of the total study area) contained > 90% of the known deposits and provides a preferred range for future exploration work. |
first_indexed | 2024-03-09T16:03:23Z |
format | Article |
id | doaj.art-841dd45c6b7f43d6874dd1a39100491d |
institution | Directory Open Access Journal |
issn | 2075-163X |
language | English |
last_indexed | 2024-03-09T16:03:23Z |
publishDate | 2022-12-01 |
publisher | MDPI AG |
record_format | Article |
series | Minerals |
spelling | doaj.art-841dd45c6b7f43d6874dd1a39100491d; 2023-11-24T16:52:54Z; eng; MDPI AG; Minerals; ISSN 2075-163X; 2022-12-01; vol. 12, no. 12, art. 1621; DOI 10.3390/min12121621; Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area; Mingjing Fan (0), Keyan Xiao (1), Li Sun (2), Shuai Zhang (3), Yang Xu (4); affiliations (0), (1), (2), (4): MNR Key Laboratory of Metallogeny and Mineral Resource Assessment, Institute of Mineral Resources, Chinese Academy of Geological Sciences, Beijing 100037, China; affiliation (3): China Aero Geophysical Survey and Remote Sensing Center for Natural Resources, Beijing 100083, China; https://www.mdpi.com/2075-163X/12/12/1621; mineral prospectivity mapping; machine learning; hyperparameter optimization; gradient boosting decision tree |
spellingShingle | Mingjing Fan; Keyan Xiao; Li Sun; Shuai Zhang; Yang Xu; Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area; Minerals; mineral prospectivity mapping; machine learning; hyperparameter optimization; gradient boosting decision tree |
title | Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area |
title_full | Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area |
title_fullStr | Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area |
title_full_unstemmed | Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area |
title_short | Automated Hyperparameter Optimization of Gradient Boosting Decision Tree Approach for Gold Mineral Prospectivity Mapping in the Xiong’ershan Area |
title_sort | automated hyperparameter optimization of gradient boosting decision tree approach for gold mineral prospectivity mapping in the xiong ershan area |
topic | mineral prospectivity mapping; machine learning; hyperparameter optimization; gradient boosting decision tree |
url | https://www.mdpi.com/2075-163X/12/12/1621 |
work_keys_str_mv | AT mingjingfan automatedhyperparameteroptimizationofgradientboostingdecisiontreeapproachforgoldmineralprospectivitymappinginthexiongershanarea AT keyanxiao automatedhyperparameteroptimizationofgradientboostingdecisiontreeapproachforgoldmineralprospectivitymappinginthexiongershanarea AT lisun automatedhyperparameteroptimizationofgradientboostingdecisiontreeapproachforgoldmineralprospectivitymappinginthexiongershanarea AT shuaizhang automatedhyperparameteroptimizationofgradientboostingdecisiontreeapproachforgoldmineralprospectivitymappinginthexiongershanarea AT yangxu automatedhyperparameteroptimizationofgradientboostingdecisiontreeapproachforgoldmineralprospectivitymappinginthexiongershanarea |
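
The description above uses the maximum Youden index as the threshold that separates prospective from non-prospective areas. The following is a minimal sketch of that thresholding step in plain Python, assuming binary labels (1 = known deposit, 0 = non-deposit) and model-predicted probabilities. The function name `youden_threshold` and the toy data are hypothetical illustrations, not taken from the article.

```python
def youden_threshold(y_true, y_score):
    """Return (threshold, J) maximizing Youden's J = sensitivity + specificity - 1.

    A sample is classified as positive when its score is >= threshold.
    """
    positives = sum(1 for y in y_true if y == 1)
    negatives = len(y_true) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative sample")

    best_threshold, best_j = None, float("-inf")
    # Every distinct score is a candidate cut-off.
    for threshold in sorted(set(y_score)):
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < threshold)
        j = tp / positives + tn / negatives - 1.0
        if j > best_j:  # strict comparison keeps the lowest threshold on ties
            best_threshold, best_j = threshold, j
    return best_threshold, best_j


# Toy data (hypothetical): 1 = known deposit cell, 0 = non-deposit cell.
labels = [0, 0, 0, 0, 1, 0, 1, 1]
scores = [0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.70, 0.90]
threshold, j = youden_threshold(labels, scores)
print(f"threshold={threshold:.2f}, J={j:.2f}")
```

Cells scoring at or above the returned threshold would be mapped as prospective. On real data the scores would come from the tuned classifier rather than a hand-written list.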