Ensemble Modelling of Skipjack Tuna (<i>Katsuwonus pelamis</i>) Habitats in the Western North Pacific Using Satellite Remotely Sensed Data; a Comparative Analysis Using Machine-Learning Models

To examine skipjack tuna’s habitat utilization in the western North Pacific (WNP), we used an ensemble modelling approach, which applied a fisher-derived presence-only dataset and three satellite remote-sensing predictor variables. The skipjack tuna data were compiled from daily point fishing data i...

Bibliographic Details
Main Authors: Robinson Mugo, Sei-Ichi Saitoh
Format: Article
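Language: English
Published: MDPI AG 2020-08-01
Series: Remote Sensing
Subjects: ensemble modelling; machine learning; skipjack tuna; western north pacific; satellite remote sensing; fisheries oceanography
Online Access: https://www.mdpi.com/2072-4292/12/16/2591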
_version_ 1797558734450327552
author Robinson Mugo
Sei-Ichi Saitoh
collection DOAJ
description To examine skipjack tuna’s habitat utilization in the western North Pacific (WNP), we used an ensemble modelling approach, which applied a fisher-derived presence-only dataset and three satellite remote-sensing predictor variables. The skipjack tuna data were compiled from daily point fishing data into monthly composites and re-gridded to a quarter-degree resolution to match the environmental predictor variables: sea surface temperature (SST), sea surface chlorophyll-a (SSC) and sea surface height anomalies (SSHA), which were also processed at quarter-degree spatial resolution. Using the <i>sdm</i> package operated in RStudio software, we constructed habitat models over a 9-month period, from March to November 2004, using 17 algorithms, with a 70:30 split of training and test data, and with bootstrapping and 10 runs as parameter settings for our models. Model performance was evaluated using the area under the curve (AUC) of the receiver operating characteristic (ROC), the point biserial correlation coefficient (COR), the true skill statistic (TSS) and Cohen’s kappa (<i>k</i>) metrics. We analyzed the response curves for each predictor variable per algorithm, the variable importance information and the ROC plots. Ensemble predictions of habitats were weighted with the TSS metric. Model performance varied across algorithms, with the Support Vector Machines (SVM), Boosted Regression Trees (BRT), Random Forests (RF), Multivariate Adaptive Regression Splines (MARS), Generalized Additive Models (GAM), Classification and Regression Trees (CART), Multi-Layer Perceptron (MLP), Recursive Partitioning and Regression Trees (RPART), and Maximum Entropy (MAXENT) showing consistently higher performance than the other algorithms, while the Flexible Discriminant Analysis (FDA), Mixture Discriminant Analysis (MDA), Bioclim (BIOC), Domain (DOM), Maxlike (MAXL), Mahalanobis Distance (MAHA) and Radial Basis Function (RBF) had lower performance. We found inter-algorithm variations in predictor variable responses. We conclude that the multi-algorithm modelling approach enabled us to assess the variability in algorithm performance and hence provided a data-driven basis for building the ensemble model. Given the inter-algorithm variations observed, the ensemble prediction maps gave a better picture of skipjack tuna habitat utilization than would have been achieved by a single algorithm.
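The description states that the ensemble predictions were weighted with the TSS metric. The following is a minimal Python sketch of that idea, not the authors' implementation (the study used the <i>sdm</i> package in R). The function names, the 0.5 threshold, the clipping of negative TSS values to zero and the toy data are all assumptions made for illustration. TSS is computed here in its standard form, sensitivity + specificity - 1.

```python
from typing import List, Sequence


def true_skill_statistic(observed: Sequence[int],
                         predicted: Sequence[float],
                         threshold: float = 0.5) -> float:
    """TSS = sensitivity + specificity - 1 for one model.

    `observed` holds 1 (presence) or 0 (absence or pseudo-absence);
    `predicted` holds habitat suitability scores in [0, 1].
    The 0.5 threshold is an assumption, not a value from the record.
    """
    tp = fn = tn = fp = 0
    for obs, prob in zip(observed, predicted):
        is_predicted_presence = prob >= threshold
        if obs and is_predicted_presence:
            tp += 1
        elif obs:
            fn += 1
        elif is_predicted_presence:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one presence and one absence")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def tss_weighted_ensemble(predictions: Sequence[Sequence[float]],
                          tss_scores: Sequence[float]) -> List[float]:
    """Weighted mean of per-model predictions, one value per grid cell.

    Negative TSS scores are clipped to zero (an assumption), so a model
    that performs worse than chance contributes nothing.
    """
    weights = [max(score, 0.0) for score in tss_scores]
    total = sum(weights)
    if total == 0:
        raise ValueError("all models have zero weight")
    n_cells = len(predictions[0])
    return [
        sum(w * model[i] for w, model in zip(weights, predictions)) / total
        for i in range(n_cells)
    ]


# Toy data (hypothetical): six grid cells, two models.
observed = [1, 1, 1, 0, 0, 0]
model_a = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]
model_b = [0.9, 0.7, 0.6, 0.2, 0.1, 0.3]

scores = [true_skill_statistic(observed, m) for m in (model_a, model_b)]
ensemble = tss_weighted_ensemble([model_a, model_b], scores)
```

In this toy case the second model separates presences from absences perfectly, so it receives three times the weight of the first and dominates the ensemble value in cells where the two disagree.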
first_indexed 2024-03-10T17:34:56Z
format Article
id doaj.art-901ccf03cf0b47b0af70bf1e54a5287a
institution Directory Open Access Journal
issn 2072-4292
language English
last_indexed 2024-03-10T17:34:56Z
publishDate 2020-08-01
publisher MDPI AG
record_format Article
series Remote Sensing
spelling doaj.art-901ccf03cf0b47b0af70bf1e54a5287a; 2023-11-20T09:53:27Z; eng; MDPI AG; Remote Sensing; 2072-4292; 2020-08-01; volume 12, issue 16, article 2591; 10.3390/rs12162591; Robinson Mugo (Regional Center for Mapping of Resources for Development, Nairobi 00618, Kenya); Sei-Ichi Saitoh (Arctic Research Center, Hokkaido University, Sapporo 001-0021, Japan); https://www.mdpi.com/2072-4292/12/16/2591
title Ensemble Modelling of Skipjack Tuna (<i>Katsuwonus pelamis</i>) Habitats in the Western North Pacific Using Satellite Remotely Sensed Data; a Comparative Analysis Using Machine-Learning Models
topic ensemble modelling
machine learning
skipjack tuna
western north pacific
satellite remote sensing
fisheries oceanography
url https://www.mdpi.com/2072-4292/12/16/2591