Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea
Abstract Severe fever with thrombocytopenia syndrome (SFTS) is an emerging tick-borne infectious disease in China, Japan, and Korea. This study aimed to estimate the monthly SFTS occurrence and the monthly number of SFTS cases in the geographical area in Korea using epidemiological data including de...
Main Authors: | Giphil Cho, Seungheon Lee, Hyojung Lee |
---|---|
Format: | Article |
Language: | English |
Published: | Nature Portfolio, 2021-11-01 |
Series: | Scientific Reports |
Online Access: | https://doi.org/10.1038/s41598-021-01361-9 |
_version_ | 1818834100750909440 |
---|---|
author | Giphil Cho, Seungheon Lee, Hyojung Lee |
author_facet | Giphil Cho, Seungheon Lee, Hyojung Lee |
author_sort | Giphil Cho |
collection | DOAJ |
description | Abstract Severe fever with thrombocytopenia syndrome (SFTS) is an emerging tick-borne infectious disease in China, Japan, and Korea. This study aimed to estimate the monthly SFTS occurrence and the monthly number of SFTS cases in geographical areas of Korea using epidemiological data including demographic, geographic, and meteorological factors. Important features were chosen through univariate feature selection. Two models using machine learning methods were analyzed: the classification model in machine learning (CMML) and the regression model in machine learning (RMML). We developed a novel model incorporating the CMML results into RMML, defined as modified-RMML. Feature importance was computed to assess each feature's contribution to estimating the number of SFTS cases using modified-RMML. With respect to accuracy, modified-RMML reduced the MSE on the test data by 12.6–52.2% compared with RMML across five machine learning methods. During the period of increasing SFTS cases from May to October, modified-RMML gave more accurate estimates. The feature importance results show that climate factors such as average maximum temperature and precipitation, as well as mountain visitors and the estimate of SFTS occurrence obtained from CMML, had high Gini importance. The novel model incorporating CMML and RMML improves the accuracy of the estimation of SFTS cases. According to the model, climate factors, including temperature and relative humidity, and mountain visitors play important roles in SFTS transmission in Korea. Our findings highlight the need for guidelines for mountain visitors to prevent SFTS transmission. Moreover, the model provides important insights for establishing control interventions that support early identification of SFTS cases. |
first_indexed | 2024-12-19T02:29:27Z |
format | Article |
id | doaj.art-01764c284e954f4a826c5c1d17b14126 |
institution | Directory of Open Access Journals |
issn | 2045-2322 |
language | English |
last_indexed | 2024-12-19T02:29:27Z |
publishDate | 2021-11-01 |
publisher | Nature Portfolio |
record_format | Article |
series | Scientific Reports |
spelling | doaj.art-01764c284e954f4a826c5c1d17b14126 2022-12-21T20:39:42Z eng Nature Portfolio Scientific Reports 2045-2322 2021-11-01 11 1 1 10 10.1038/s41598-021-01361-9 Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea. Giphil Cho (Finance·Fishery·Manufacture Industrial Mathematics Center on Big Data, Pusan National University); Seungheon Lee (Department of Mathematics, Pusan National University); Hyojung Lee (Department of Statistics, Kyungpook National University). https://doi.org/10.1038/s41598-021-01361-9 |
title | Estimating severe fever with thrombocytopenia syndrome transmission using machine learning methods in South Korea |
url | https://doi.org/10.1038/s41598-021-01361-9 |
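
The abstract reports that modified-RMML reduced the test-data MSE by 12.6–52.2% relative to RMML. The record does not give the formula, so the following is only a minimal sketch of how such a figure is commonly computed, assuming the reduction is measured relative to the baseline (RMML) MSE. The function names and the toy numbers are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def mse(y_true, y_pred):
    """Mean squared error between observed and estimated case counts."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return fmean((t - p) ** 2 for t, p in zip(y_true, y_pred))


def mse_reduction_percent(y_true, baseline_pred, modified_pred):
    """Percentage by which the modified model lowers MSE versus the baseline.

    Assumption: reduction = 100 * (MSE_baseline - MSE_modified) / MSE_baseline.
    """
    baseline_mse = mse(y_true, baseline_pred)
    if baseline_mse == 0:
        raise ValueError("baseline MSE is zero; relative reduction is undefined")
    return 100.0 * (baseline_mse - mse(y_true, modified_pred)) / baseline_mse


# Made-up monthly counts for illustration only (not data from the study).
observed = [0, 2, 5, 3]
rmml_estimate = [1, 4, 3, 3]        # hypothetical baseline (RMML) output
modified_estimate = [0, 3, 4, 3]    # hypothetical modified-RMML output

reduction = mse_reduction_percent(observed, rmml_estimate, modified_estimate)
print(f"MSE reduction: {reduction:.1f}%")
```

A positive value means the modified model has the lower test error; a negative value would mean it performed worse than the baseline.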