The dimensionality reductions of environmental variables have a significant effect on the performance of species distribution models

Abstract How to effectively obtain species‐related low‐dimensional data from massive sets of environmental variables has become an urgent problem for species distribution models (SDMs). In this study, we explore whether dimensionality reduction of environmental variables can improve the predictive perf...

Bibliographic Details
Main Authors: Hao‐Tian Zhang, Wen‐Yong Guo, Wen‐Ting Wang
Format: Article
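Language: English
Published: Wiley 2023-11-01
Series: Ecology and Evolution
Subjects: dimensionality reduction techniques; environmental variables; model complexity; predictive performance; sample sizes; species distribution models
Online Access: https://doi.org/10.1002/ece3.10747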
author Hao‐Tian Zhang
Wen‐Yong Guo
Wen‐Ting Wang
collection DOAJ
description Abstract How to effectively obtain species‐related low‐dimensional data from massive sets of environmental variables has become an urgent problem for species distribution models (SDMs). In this study, we explore whether dimensionality reduction of environmental variables can improve the predictive performance of SDMs. We first used two linear (principal component analysis (PCA) and independent component analysis) and two nonlinear (kernel principal component analysis (KPCA) and uniform manifold approximation and projection) dimensionality reduction techniques (DRTs) to reduce the dimensionality of high‐dimensional environmental data. We then built five SDMs on the dimensionality‐reduced environmental variables for 23 real plant species and nine virtual species, and compared their predictive performance with that of SDMs built on environmental variables selected using Pearson's correlation coefficient (PCC). In addition, we studied the effects of DRTs, model complexity, and sample size on the predictive performance of SDMs. With every DRT except KPCA, SDMs predicted better than with PCC‐based selection, and SDMs using linear DRTs predicted better than those using nonlinear DRTs. Moreover, applying DRTs to environmental variables affected the predictive performance of SDMs no less than model complexity and sample size did. At the complex level of model complexity, PCA gave the largest improvement in predictive performance, 2.55% over PCC. At the middle level of sample size, PCA improved predictive performance by 2.68% over PCC. Our study demonstrates that DRTs have a significant effect on the predictive performance of SDMs. Specifically, linear DRTs, especially PCA, are more effective at improving predictive performance when models are relatively complex or sample sizes are large.
first_indexed 2024-03-09T14:14:09Z
format Article
id doaj.art-d466a2cb75fc4c18a8a55fef8b083794
institution Directory Open Access Journal
issn 2045-7758
language English
last_indexed 2024-03-09T14:14:09Z
publishDate 2023-11-01
publisher Wiley
record_format Article
series Ecology and Evolution
spelling Ecology and Evolution, vol. 13, no. 11 (2023-11-01), ISSN 2045-7758, https://doi.org/10.1002/ece3.10747
Hao‐Tian Zhang: School of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, China
Wen‐Yong Guo: Research Center for Global Change and Complex Ecosystems, School of Ecological and Environmental Sciences, East China Normal University, Shanghai, China
Wen‐Ting Wang: School of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, China
title The dimensionality reductions of environmental variables have a significant effect on the performance of species distribution models
topic dimensionality reduction techniques
environmental variables
model complexity
predictive performance
sample sizes
species distribution models
url https://doi.org/10.1002/ece3.10747
work_keys_str_mv AT haotianzhang thedimensionalityreductionsofenvironmentalvariableshaveasignificanteffectontheperformanceofspeciesdistributionmodels
AT wenyongguo thedimensionalityreductionsofenvironmentalvariableshaveasignificanteffectontheperformanceofspeciesdistributionmodels
AT wentingwang thedimensionalityreductionsofenvironmentalvariableshaveasignificanteffectontheperformanceofspeciesdistributionmodels
AT haotianzhang dimensionalityreductionsofenvironmentalvariableshaveasignificanteffectontheperformanceofspeciesdistributionmodels
AT wenyongguo dimensionalityreductionsofenvironmentalvariableshaveasignificanteffectontheperformanceofspeciesdistributionmodels
AT wentingwang dimensionalityreductionsofenvironmentalvariableshaveasignificanteffectontheperformanceofspeciesdistributionmodels
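
The abstract contrasts two ways of preparing environmental predictors for an SDM: selecting variables by Pearson's correlation coefficient (PCC) and replacing them with PCA components. The sketch below illustrates the two ideas on synthetic data only; it is not the authors' implementation. The function names `select_by_pcc` and `pca_reduce`, the 0.7 correlation threshold, the number of components, and the simulated "environmental layers" are all assumptions made for illustration, since the record does not state them.

```python
import numpy as np


def select_by_pcc(X, threshold=0.7):
    """Greedy PCC filter: keep a variable only if its absolute Pearson
    correlation with every already-kept variable is <= threshold.
    The threshold value is an assumption, not taken from the record."""
    corr = np.corrcoef(X, rowvar=False)
    kept = []
    for j in range(X.shape[1]):
        if all(abs(corr[j, k]) <= threshold for k in kept):
            kept.append(j)
    return kept


def pca_reduce(X, n_components):
    """Standardize the columns of X and project them onto the first
    n_components principal axes, computed with an SVD."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    scores = Z @ Vt[:n_components].T
    explained = S**2 / np.sum(S**2)
    return scores, explained[:n_components]


# Hypothetical environmental layers sampled at 200 sites.
rng = np.random.default_rng(0)
n_sites = 200
temperature = rng.normal(size=n_sites)
rainfall = rng.normal(size=n_sites)
X = np.column_stack([
    temperature,
    temperature + 0.1 * rng.normal(size=n_sites),  # near-duplicate of temperature
    rainfall,
    rainfall + 0.1 * rng.normal(size=n_sites),     # near-duplicate of rainfall
    rng.normal(size=n_sites),                      # unrelated variable
])

kept = select_by_pcc(X)                       # indices of retained variables
scores, explained = pca_reduce(X, n_components=3)

print("variables kept by PCC filter:", kept)
print("PCA score matrix shape:", scores.shape)
print("explained variance ratios:", np.round(explained, 3))
```

Either `X[:, kept]` or `scores` could then be passed to an SDM as its predictor matrix. The PCC filter discards whole variables, whereas PCA keeps a linear combination of all of them, which is the distinction the abstract's comparison rests on.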