A fast method for fitting integrated species distribution models

Abstract Integrated distribution models (IDMs) predict where species might occur using data from multiple sources, a technique thought to be especially useful when data from any individual source are scarce. Recent advances allow us to fit such models with latent terms to account for dependence with...

Bibliographic Details
Main Authors: Elliot Dovers, Gordana C. Popovic, David I. Warton
Format: Article
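Language: English
Published: Wiley 2024-01-01
Series: Methods in Ecology and Evolution
Subjects: data fusion, data integration, ecology, Laplace approximation, log‐Gaussian Cox process, presence/absence data
Online Access: https://doi.org/10.1111/2041-210X.14252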
author Elliot Dovers
Gordana C. Popovic
David I. Warton
collection DOAJ
description Abstract Integrated distribution models (IDMs) predict where species might occur using data from multiple sources, a technique thought to be especially useful when data from any individual source are scarce. Recent advances allow us to fit such models with latent terms to account for dependence within and between data sources, but they are computationally challenging to fit. We propose a fast new methodology for fitting integrated distribution models using presence/absence and presence‐only data, via a spatial random effects approach combined with automatic differentiation. We have written an R package (called scampr) for straightforward implementation of our approach. We use simulation to demonstrate that our approach has comparable performance to INLA—a common framework for fitting IDMs—but with computation times up to an order of magnitude faster. We also use simulation to look at when IDMs can be expected to outperform models fitted to a single data source, and find that the amount of benefit gained from using an IDM is a function of the relative amount of additional information available from incorporating a second data source into the model. We apply our method to predict 29 plant species in NSW, Australia, and find particular benefit in predictive performance when data from a single source are scarce and when compared to models for presence‐only data. Our faster methods of fitting IDMs make it feasible to more deeply explore the model space (e.g. comparing different ways to model latent terms), and in future work, to consider extensions to more complex models, for example the multi‐species setting.
first_indexed 2024-03-08T15:29:12Z
format Article
id doaj.art-28f84f4c14dc45b69e3387fa79e04a09
institution Directory Open Access Journal
issn 2041-210X
language English
last_indexed 2024-03-08T15:29:12Z
publishDate 2024-01-01
publisher Wiley
record_format Article
series Methods in Ecology and Evolution
spelling doaj.art-28f84f4c14dc45b69e3387fa79e04a09
2024-01-10T06:33:14Z
eng
Wiley
Methods in Ecology and Evolution
2041-210X
2024-01-01
15
1
191
203
10.1111/2041-210X.14252
A fast method for fitting integrated species distribution models
Elliot Dovers (0), School of Mathematics and Statistics and Evolution & Ecology Research Centre, UNSW Sydney, Sydney, New South Wales, Australia
Gordana C. Popovic (1), School of Mathematics and Statistics and Evolution & Ecology Research Centre, UNSW Sydney, Sydney, New South Wales, Australia
David I. Warton (2), School of Mathematics and Statistics and Evolution & Ecology Research Centre, UNSW Sydney, Sydney, New South Wales, Australia
https://doi.org/10.1111/2041-210X.14252
data fusion
data integration
ecology
Laplace approximation
log‐Gaussian Cox process
presence/absence data
title A fast method for fitting integrated species distribution models
topic data fusion
data integration
ecology
Laplace approximation
log‐Gaussian Cox process
presence/absence data
url https://doi.org/10.1111/2041-210X.14252
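
The description above refers to fitting one model to both presence/absence and presence-only data. As a rough illustration only, and not the scampr implementation or its API, the Python sketch below adds a Poisson point process log-likelihood (presence-only, with the integral approximated by quadrature) to a complementary log-log Bernoulli log-likelihood (presence/absence), with both parts sharing one environmental slope. The latent spatial random effects, Laplace approximation, and automatic differentiation named in the description are deliberately left out. All function names, parameter names, and data values are hypothetical.

```python
import math


def log_intensity(intercept, slope, x):
    """Log of the species intensity at a location with covariate value x."""
    return intercept + slope * x


def po_loglik(intercept, slope, presence_x, quad_x, quad_w):
    """Poisson point process log-likelihood for presence-only records.

    The integral of the intensity over the study region is approximated
    by a weighted sum over quadrature points (quad_x with weights quad_w).
    """
    point_term = sum(log_intensity(intercept, slope, x) for x in presence_x)
    integral = sum(
        w * math.exp(log_intensity(intercept, slope, x))
        for x, w in zip(quad_x, quad_w)
    )
    return point_term - integral


def pa_loglik(intercept, slope, site_x, site_y, site_area=1.0):
    """Presence/absence log-likelihood with a complementary log-log link.

    If counts at a site are Poisson with mean mu, then
    P(absence) = exp(-mu) and P(presence) = 1 - exp(-mu).
    """
    total = 0.0
    for x, y in zip(site_x, site_y):
        mu = site_area * math.exp(log_intensity(intercept, slope, x))
        if y == 1:
            total += math.log(-math.expm1(-mu))  # log(1 - exp(-mu))
        else:
            total += -mu
    return total


def integrated_loglik(params, po, pa):
    """Sum of both log-likelihoods, assuming the data sets are independent.

    params = (beta0, delta, beta1): beta0 is the intercept for the
    presence/absence data, beta0 + delta is the presence-only intercept
    (delta absorbs reporting effort, an assumption of this sketch), and
    beta1 is the shared environmental slope.
    """
    beta0, delta, beta1 = params
    return (
        po_loglik(beta0 + delta, beta1, **po)
        + pa_loglik(beta0, beta1, **pa)
    )


# Tiny made-up data set, for illustration only.
po_data = {
    "presence_x": [0.1, 0.5, 0.9],
    "quad_x": [0.25, 0.75],
    "quad_w": [1.0, 1.0],
}
pa_data = {"site_x": [0.2, 0.8], "site_y": [1, 0]}

print(integrated_loglik((0.0, 0.0, 0.0), po_data, pa_data))
```

In practice this function would be handed to a numerical optimiser to estimate the parameters. Sharing the slope between the two parts is what lets a scarce data source borrow information from the other.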