Feature sensitivity criterion-based sampling strategy from the Optimization based on Phylogram Analysis (Fs-OPA) and Cox regression applied to mental disorder datasets.

Digital datasets in several health care facilities, such as hospitals and prehospital services, have accumulated data from thousands of patients for more than a decade. In general, no local team has enough experts with the different skills required to analyze them in their entirety. The integrat...

Bibliographic Details
Main Authors: Fatemeh Gholi Zadeh Kharrat, Newton Shydeo Brandão Miyoshi, Juliana Cobre, João Mazzoncini De Azevedo-Marques, Paulo Mazzoncini de Azevedo-Marques, Alexandre Cláudio Botazzo Delbem
Format: Article
Language: English
Published: Public Library of Science (PLoS) 2020-01-01
Series: PLoS ONE
Online Access: https://doi.org/10.1371/journal.pone.0235147
author Fatemeh Gholi Zadeh Kharrat
Newton Shydeo Brandão Miyoshi
Juliana Cobre
João Mazzoncini De Azevedo-Marques
Paulo Mazzoncini de Azevedo-Marques
Alexandre Cláudio Botazzo Delbem
collection DOAJ
description Digital datasets in several health care facilities, such as hospitals and prehospital services, have accumulated data from thousands of patients for more than a decade. In general, no local team has enough experts with the different skills required to analyze them in their entirety. Integrating those abilities usually takes a relatively long time and is costly. Considering that scenario, this paper proposes a new Feature Sensitivity technique that can automatically deal with a large dataset. It uses a criterion-based sampling strategy from the Optimization based on Phylogram Analysis. Called FS-opa, the new approach seems suitable for dealing with any type of raw data from health centers and for manipulating their entire datasets. Moreover, FS-opa can find the principal features for the construction of inference models without depending on expert knowledge of the problem domain. The selected features can be combined with standard statistical or machine learning methods to perform predictions. The new method can mine entire datasets from scratch. FS-opa was evaluated using a relatively large dataset from electronic health records of mental disorder prehospital services in Brazil. Cox's approach was integrated into FS-opa to generate survival analysis models of the length of stay (LOS) in hospitals, on the assumption that LOS is a relevant aspect that can improve estimates of hospital efficiency and of the quality of patient treatment. Since FS-opa can work with raw datasets, no knowledge from the problem domain was used to obtain the preliminary prediction models found. Results show that FS-opa succeeded in performing a feature sensitivity analysis using only the raw data available. In this way, FS-opa can find the principal features without the bias of an inference model, since the proposed method does not use one. Moreover, the experiments show that FS-opa can provide models with a useful trade-off between representativeness and parsimony. This can benefit further analyses by experts, since they can focus on the aspects that benefit problem modeling.
first_indexed 2024-12-16T09:58:00Z
format Article
id doaj.art-6be37f2d111b4eb9965a809a96cd842f
institution Directory Open Access Journal
issn 1932-6203
language English
last_indexed 2024-12-16T09:58:00Z
publishDate 2020-01-01
publisher Public Library of Science (PLoS)
record_format Article
series PLoS ONE
spelling doaj.art-6be37f2d111b4eb9965a809a96cd842f; 2022-12-21T22:35:51Z; eng; Public Library of Science (PLoS); PLoS ONE; 1932-6203; 2020-01-01; 15(7): e0235147; 10.1371/journal.pone.0235147; https://doi.org/10.1371/journal.pone.0235147
title Feature sensitivity criterion-based sampling strategy from the Optimization based on Phylogram Analysis (Fs-OPA) and Cox regression applied to mental disorder datasets.
url https://doi.org/10.1371/journal.pone.0235147