A New Data Science Model With Supervised Learning and its Application on Pesticide Poisoning Diagnosis in Rural Workers

Bibliographic Details
Main Authors: Jaqueline C. S. Carvalho, Tales C. Pimenta, Alessandra C. P. Silverio, Marcos A. Carvalho, Joao Paulo C. S. Carvalho
Format: Article
Language: English
Published: IEEE 2024-01-01
Series: IEEE Access
Subjects: Data science; decision support system; machine learning; pesticide poisoning diagnosis
Online Access: https://ieeexplore.ieee.org/document/10464294/
author Jaqueline C. S. Carvalho
Tales C. Pimenta
Alessandra C. P. Silverio
Marcos A. Carvalho
Joao Paulo C. S. Carvalho
collection DOAJ
description In a Data Science project, it is essential to determine the relevance of the data and identify patterns that contribute to decision-making based on domain-specific knowledge. Furthermore, a clear definition of methodologies and the creation of documentation to guide a project's development from inception to completion are essential elements. This study presents a Data Science model designed to guide the process, covering data collection through training, with the aim of facilitating knowledge discovery. The work was motivated by deficiencies in existing Data Science methodologies, particularly the lack of practical step-by-step guidance on how to prepare data to reach the production phase. Named "Data Refinement Cycle with Supervised Machine Learning (DRC-SML)", the proposed model was developed based on the emerging needs of a Data Science project aimed at assisting healthcare professionals in diagnosing pesticide poisoning among rural workers. The dataset used in this project resulted from scientific research in which 1027 samples were collected, containing data related to toxicity biomarkers and clinical analyses. We achieved an accuracy of 99.61% with only 27 rules for determining the diagnosis. The results optimized healthcare practices and improved quality of life in rural areas. The project outcomes demonstrated the success of the proposed model.
first_indexed 2024-04-24T18:54:43Z
format Article
id doaj.art-0b4818c923e04bbc8f26878c1a08fc42
institution Directory Open Access Journal
issn 2169-3536
language English
last_indexed 2024-04-24T18:54:43Z
publishDate 2024-01-01
publisher IEEE
record_format Article
series IEEE Access
spelling doaj.art-0b4818c923e04bbc8f26878c1a08fc42 2024-03-26T17:43:59Z eng
IEEE, IEEE Access (ISSN 2169-3536), 2024-01-01, vol. 12, pp. 40871-40882
DOI 10.1109/ACCESS.2024.3375764, document 10464294
A New Data Science Model With Supervised Learning and its Application on Pesticide Poisoning Diagnosis in Rural Workers
Jaqueline C. S. Carvalho (https://orcid.org/0000-0002-6485-430X), Institute of Systems Engineering and Information Technology, Federal University of Itajubá, Itajubá, Brazil
Tales C. Pimenta (https://orcid.org/0000-0002-6485-430X), Institute of Systems Engineering and Information Technology, Federal University of Itajubá, Itajubá, Brazil
Alessandra C. P. Silverio (https://orcid.org/0000-0003-2093-2713), Department of Computer Science, José do Rosário Vellano University, Alfenas, Brazil
Marcos A. Carvalho (https://orcid.org/0000-0002-3546-5815), Institute of Systems Engineering and Information Technology, Federal University of Itajubá, Itajubá, Brazil
Joao Paulo C. S. Carvalho (https://orcid.org/0009-0000-6447-2013), Mathematics and Natural Sciences Division, Brescia University, Owensboro, KY, USA
https://ieeexplore.ieee.org/document/10464294/
Data science; decision support system; machine learning; pesticide poisoning diagnosis
title A New Data Science Model With Supervised Learning and its Application on Pesticide Poisoning Diagnosis in Rural Workers
topic Data science
decision support system
machine learning
pesticide poisoning diagnosis
url https://ieeexplore.ieee.org/document/10464294/