A synthetic dataset of liver disorder patients

The data in this article include 10,000 synthetic patients with liver disorders, characterized by 70 different variables, including clinical features and patient outcomes such as hospital admission or surgery. Patient data are generated to simulate real patient data as closely as possible, using a p...

Bibliographic Details
Main Authors: Giovanna Nicora, Tommaso Mario Buonocore, Enea Parimbelli
Format: Article
Language: English
Published: Elsevier 2023-04-01
Series: Data in Brief
Subjects: Synthetic patients; Machine learning; Bayesian network; Dataset shift; Causal model
Online Access: http://www.sciencedirect.com/science/article/pii/S2352340923000392
_version_ 1827987266128052224
author Giovanna Nicora
Tommaso Mario Buonocore
Enea Parimbelli
author_facet Giovanna Nicora
Tommaso Mario Buonocore
Enea Parimbelli
author_sort Giovanna Nicora
collection DOAJ
description The data in this article include 10,000 synthetic patients with liver disorders, characterized by 70 different variables, including clinical features and patient outcomes such as hospital admission or surgery. Patient data are generated to simulate real patient data as closely as possible, using a publicly available Bayesian network describing a causal model for liver disorders. By varying the network parameters, we also generated an additional set of 500 patients with characteristics that deviate from the initial patient population. We provide an overview of the synthetic data generation process and the associated scripts for generating the cohorts. This dataset can be useful for training and validating machine learning models, especially under dataset shift between training and testing sets.
first_indexed 2024-04-09T23:44:11Z
format Article
id doaj.art-cfa458e55f4844b9abf819fcf7018aef
institution Directory Open Access Journal
issn 2352-3409
language English
last_indexed 2024-04-09T23:44:11Z
publishDate 2023-04-01
publisher Elsevier
record_format Article
series Data in Brief
spelling doaj.art-cfa458e55f4844b9abf819fcf7018aef
2023-03-18T04:41:33Z
eng
Elsevier
Data in Brief
2352-3409
2023-04-01
47
108921
A synthetic dataset of liver disorder patients
Giovanna Nicora [0]
Tommaso Mario Buonocore [1]
Enea Parimbelli [2]
[0] Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy; enGenome Srl, Italy; Corresponding authors at: Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy.
[1] Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy
[2] Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy; Telfer School of Management, University of Ottawa, Ottawa, ON, Canada; Corresponding authors at: Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy.
The data in this article include 10,000 synthetic patients with liver disorders, characterized by 70 different variables, including clinical features and patient outcomes such as hospital admission or surgery. Patient data are generated to simulate real patient data as closely as possible, using a publicly available Bayesian network describing a causal model for liver disorders. By varying the network parameters, we also generated an additional set of 500 patients with characteristics that deviate from the initial patient population. We provide an overview of the synthetic data generation process and the associated scripts for generating the cohorts. This dataset can be useful for training and validating machine learning models, especially under dataset shift between training and testing sets.
http://www.sciencedirect.com/science/article/pii/S2352340923000392
Synthetic patients
Machine learning
Bayesian network
Dataset shift
Causal model
spellingShingle Giovanna Nicora
Tommaso Mario Buonocore
Enea Parimbelli
A synthetic dataset of liver disorder patients
Data in Brief
Synthetic patients
Machine learning
Bayesian network
Dataset shift
Causal model
title A synthetic dataset of liver disorder patients
title_full A synthetic dataset of liver disorder patients
title_fullStr A synthetic dataset of liver disorder patients
title_full_unstemmed A synthetic dataset of liver disorder patients
title_short A synthetic dataset of liver disorder patients
title_sort synthetic dataset of liver disorder patients
topic Synthetic patients
Machine learning
Bayesian network
Dataset shift
Causal model
url http://www.sciencedirect.com/science/article/pii/S2352340923000392
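
The description above mentions generating patients from a Bayesian network and then varying the network parameters to obtain a shifted cohort. The following is a minimal sketch of that idea, not the authors' scripts: the three-node network, the variable names (alcoholism, steatosis, hospital_admission) and every probability are hypothetical placeholders, whereas the published network has 70 variables. It uses plain forward sampling, in which each node is drawn after its parent.

```python
import random

# Hypothetical toy network, NOT the published liver-disorder network:
#   alcoholism -> steatosis -> hospital_admission
# All probabilities below are made-up illustration values.
BASE_PARAMS = {
    "p_alcoholism": 0.15,
    "p_steatosis_given_alcoholism": {True: 0.60, False: 0.10},
    "p_admission_given_steatosis": {True: 0.40, False: 0.05},
}


def sample_patient(params, rng):
    """Draw one synthetic patient, sampling each node after its parent."""
    alcoholism = rng.random() < params["p_alcoholism"]
    steatosis = rng.random() < params["p_steatosis_given_alcoholism"][alcoholism]
    admission = rng.random() < params["p_admission_given_steatosis"][steatosis]
    return {
        "alcoholism": alcoholism,
        "steatosis": steatosis,
        "hospital_admission": admission,
    }


def sample_cohort(params, n, seed):
    """Generate n patients with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [sample_patient(params, rng) for _ in range(n)]


def prevalence(cohort, variable):
    """Fraction of patients for whom the binary variable is True."""
    return sum(patient[variable] for patient in cohort) / len(cohort)


# Base cohort, sized like the 10,000-patient set in the description.
base = sample_cohort(BASE_PARAMS, 10_000, seed=0)

# Shifted cohort: change one root parameter, keep the rest of the network.
shifted_params = {**BASE_PARAMS, "p_alcoholism": 0.60}
shifted = sample_cohort(shifted_params, 500, seed=1)

for name in ("alcoholism", "steatosis", "hospital_admission"):
    print(name, round(prevalence(base, name), 3), round(prevalence(shifted, name), 3))
```

Because only the root probability changes, the shift propagates to the downstream variables through the unchanged conditional tables, which is one simple way to obtain a test cohort whose distribution differs from the training cohort.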