Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea

Abstract Severe fever with thrombocytopenia syndrome (SFTS) is an emerging tick-borne infectious disease in China, Japan, and Korea. This study aimed to estimate the monthly SFTS occurrence and the monthly number of SFTS cases in the geographical area in Korea using epidemiological data including de...

Bibliographic Details
Main Authors: Giphil Cho, Seungheon Lee, Hyojung Lee
Format: Article
Language: English
Published: Nature Portfolio 2021-11-01
Series: Scientific Reports
Online Access: https://doi.org/10.1038/s41598-021-01361-9
author Giphil Cho
Seungheon Lee
Hyojung Lee
collection DOAJ
description Abstract Severe fever with thrombocytopenia syndrome (SFTS) is an emerging tick-borne infectious disease in China, Japan, and Korea. This study aimed to estimate the monthly occurrence of SFTS and the monthly number of SFTS cases by geographical area in Korea using epidemiological data that include demographic, geographic, and meteorological factors. Important features were chosen through univariate feature selection. Two machine learning models were analyzed: the classification model in machine learning (CMML) and the regression model in machine learning (RMML). We developed a novel model, defined as modified-RMML, that incorporates the CMML results into RMML. Feature importance was computed to assess the contribution of each feature to estimating the number of SFTS cases with modified-RMML. With respect to accuracy, modified-RMML reduced the mean squared error (MSE) on the test data by 12.6–52.2% compared with RMML across five machine learning methods. During the period of increasing SFTS cases, from May to October, modified-RMML gave more accurate estimates. The feature importance analysis showed that climate factors such as average maximum temperature and precipitation, as well as mountain visitors and the estimate of SFTS occurrence obtained from CMML, had high Gini importance. The novel model incorporating CMML and RMML improves the accuracy of the estimation of SFTS cases. According to the model, climate factors, including temperature and relative humidity, and mountain visitors play important roles in SFTS transmission in Korea. Our findings highlight the need for guidelines that help mountain visitors prevent SFTS transmission. Moreover, the model provides important insights for establishing control interventions that support early identification of SFTS cases.
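The two-stage idea in the abstract (feeding the classifier's occurrence estimate into the regressor as an extra feature) can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the three covariates are hypothetical stand-ins, random forests are only one possible learner, and the use of out-of-fold probabilities on the training set is an assumption made here to limit leakage. The calls are from scikit-learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import cross_val_predict, train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's data): each row is a region-month
# with three hypothetical covariates; counts are mostly zero, as for a rare disease.
n = 600
X = rng.normal(size=(n, 3))
risk = 1.5 * X[:, 0] + X[:, 1] - 1.0
y_cases = rng.poisson(np.exp(np.clip(risk, -5, 2)))  # monthly case counts
y_occurs = (y_cases > 0).astype(int)                 # 1 if any case occurred

X_tr, X_te, c_tr, c_te, o_tr, o_te = train_test_split(
    X, y_cases, y_occurs, test_size=0.3, random_state=0
)

# Stage 1 (classification, analogous to CMML): estimate occurrence probability.
# Out-of-fold predictions are used for the training rows (an assumption of this sketch).
clf = RandomForestClassifier(n_estimators=200, random_state=0)
p_tr = cross_val_predict(clf, X_tr, o_tr, cv=5, method="predict_proba")[:, 1]
clf.fit(X_tr, o_tr)
p_te = clf.predict_proba(X_te)[:, 1]

# Stage 2 (regression): baseline uses the covariates only (analogous to RMML);
# the modified model appends the stage-1 estimate as one more feature.
X_tr_mod = np.column_stack([X_tr, p_tr])
X_te_mod = np.column_stack([X_te, p_te])

baseline = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, c_tr)
modified = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr_mod, c_tr)

mse_base = mean_squared_error(c_te, baseline.predict(X_te))
mse_mod = mean_squared_error(c_te, modified.predict(X_te_mod))

# Relative change in test MSE, in percent (positive means the modified model is better).
reduction_pct = 100.0 * (mse_base - mse_mod) / mse_base

# Impurity-based feature importances of the modified regressor;
# the last entry belongs to the appended occurrence estimate.
importances = modified.feature_importances_

print(f"baseline MSE={mse_base:.3f}  modified MSE={mse_mod:.3f}  change={reduction_pct:.1f}%")
print("importances:", np.round(importances, 3))
```

On synthetic data the sign and size of the MSE change carry no meaning for SFTS; the sketch only shows where the classifier output enters the regression and how a percentage MSE reduction and per-feature importances would be computed.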
first_indexed 2024-12-19T02:29:27Z
format Article
id doaj.art-01764c284e954f4a826c5c1d17b14126
institution Directory Open Access Journal
issn 2045-2322
language English
last_indexed 2024-12-19T02:29:27Z
publishDate 2021-11-01
publisher Nature Portfolio
record_format Article
series Scientific Reports
spelling doaj.art-01764c284e954f4a826c5c1d17b14126; 2022-12-21T20:39:42Z; eng; Nature Portfolio; Scientific Reports; 2045-2322; 2021-11-01; volume 11, issue 1, pages 1–10; 10.1038/s41598-021-01361-9
Giphil Cho (0), Seungheon Lee (1), Hyojung Lee (2)
(0) Finance·Fishery·Manufacture Industrial Mathematics Center on Big Data, Pusan National University
(1) Department of Mathematics, Pusan National University
(2) Department of Statistics, Kyungpook National University
title Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea
url https://doi.org/10.1038/s41598-021-01361-9