Multi‐Model Prediction of West Nile Virus Neuroinvasive Disease With Machine Learning for Identification of Important Regional Climatic Drivers

Abstract West Nile virus (WNV) is the leading cause of mosquito‐borne illness in the continental United States (CONUS). Spatial heterogeneity in historical incidence, environmental factors, and complex ecology make prediction of spatiotemporal variation in WNV transmission challenging. Machine learn...

Bibliographic Details
Main Authors: Karen M. Holcomb, J. Erin Staples, Randall J. Nett, Charles B. Beard, Lyle R. Petersen, Stanley G. Benjamin, Benjamin W. Green, Hunter Jones, Michael A. Johansson
Format: Article
Language: English
Published: American Geophysical Union (AGU) 2023-11-01
Series: GeoHealth
Subjects: West Nile virus, machine learning, prediction, climate, variable importance
Online Access: https://doi.org/10.1029/2023GH000906
author Karen M. Holcomb
J. Erin Staples
Randall J. Nett
Charles B. Beard
Lyle R. Petersen
Stanley G. Benjamin
Benjamin W. Green
Hunter Jones
Michael A. Johansson
collection DOAJ
description Abstract West Nile virus (WNV) is the leading cause of mosquito‐borne illness in the continental United States (CONUS). Spatial heterogeneity in historical incidence, environmental factors, and complex ecology make prediction of spatiotemporal variation in WNV transmission challenging. Machine learning provides promising tools for identification of important variables in such situations. To predict annual WNV neuroinvasive disease (WNND) cases in CONUS (2015–2021), we fitted 10 probabilistic models ranging in complexity from naïve to machine learning algorithms, and an ensemble. We made predictions in each of nine climate regions on a hexagonal grid and evaluated each model's predictive accuracy. Using the machine learning models (random forest and neural network), we identified the relative importance and variation in ranking of predictors (historical WNND cases, climate anomalies, human demographics, and land use) across regions. We found that historical WNND cases and population density were among the most important factors, while anomalies in temperature and precipitation often had relatively low importance. While the relative performance of each model varied across climatic regions, the magnitude of difference between models was small. All models except the naïve model had non‐significant differences in performance relative to the baseline model (negative binomial model fit per hexagon). No model, including the ensemble or more complex machine learning models, outperformed models based on historical case counts at the hexagon or region level; these models are good forecasting benchmarks. Further work is needed to assess whether predictive capacity can be improved beyond that of these historical baselines.
first_indexed 2024-03-09T14:30:07Z
format Article
id doaj.art-ca143f64a797489f89bbe0bf3af92899
institution Directory Open Access Journal
issn 2471-1403
language English
last_indexed 2024-03-09T14:30:07Z
publishDate 2023-11-01
publisher American Geophysical Union (AGU)
record_format Article
series GeoHealth
spelling doaj.art-ca143f64a797489f89bbe0bf3af92899
2023-11-28T02:35:46Z
eng
American Geophysical Union (AGU)
GeoHealth
2471-1403
2023-11-01
Volume 7, Issue 11, pages n/a
10.1029/2023GH000906
Multi‐Model Prediction of West Nile Virus Neuroinvasive Disease With Machine Learning for Identification of Important Regional Climatic Drivers
Karen M. Holcomb (Global Systems Laboratory, National Oceanic and Atmospheric Administration, Boulder, CO, USA)
J. Erin Staples (Division of Vector‐Borne Diseases, Centers for Disease Control and Prevention, Fort Collins, CO, USA)
Randall J. Nett (Division of Vector‐Borne Diseases, Centers for Disease Control and Prevention, Fort Collins, CO, USA)
Charles B. Beard (Division of Vector‐Borne Diseases, Centers for Disease Control and Prevention, Fort Collins, CO, USA)
Lyle R. Petersen (Division of Vector‐Borne Diseases, Centers for Disease Control and Prevention, Fort Collins, CO, USA)
Stanley G. Benjamin (Global Systems Laboratory, National Oceanic and Atmospheric Administration, Boulder, CO, USA)
Benjamin W. Green (Global Systems Laboratory, National Oceanic and Atmospheric Administration, Boulder, CO, USA)
Hunter Jones (Climate Prediction Office, National Oceanic and Atmospheric Administration, Silver Spring, MD, USA)
Michael A. Johansson (Division of Vector‐Borne Diseases, Centers for Disease Control and Prevention, San Juan, PR, USA)
https://doi.org/10.1029/2023GH000906
West Nile virus; machine learning; prediction; climate; variable importance
title Multi‐Model Prediction of West Nile Virus Neuroinvasive Disease With Machine Learning for Identification of Important Regional Climatic Drivers
topic West Nile virus
machine learning
prediction
climate
variable importance
url https://doi.org/10.1029/2023GH000906