A synthetic dataset of liver disorder patients
The data in this article include 10,000 synthetic patients with liver disorders, characterized by 70 different variables, including clinical features and patient outcomes such as hospital admission or surgery. Patient data are generated to simulate real patient data as closely as possible, using a p...
Main Authors: | Giovanna Nicora, Tommaso Mario Buonocore, Enea Parimbelli |
---|---|
Format: | Article |
Language: | English |
Published: | Elsevier 2023-04-01 |
Series: | Data in Brief |
Subjects: | Synthetic patients, Machine learning, Bayesian network, Dataset shift, Causal model |
Online Access: | http://www.sciencedirect.com/science/article/pii/S2352340923000392 |
_version_ | 1827987266128052224 |
---|---|
author | Giovanna Nicora, Tommaso Mario Buonocore, Enea Parimbelli |
author_facet | Giovanna Nicora, Tommaso Mario Buonocore, Enea Parimbelli |
author_sort | Giovanna Nicora |
collection | DOAJ |
description | The data in this article include 10,000 synthetic patients with liver disorders, characterized by 70 different variables, including clinical features and patient outcomes such as hospital admission or surgery. Patient data are generated to simulate real patient data as closely as possible, using a publicly available Bayesian network describing a causal model for liver disorders. By varying the network parameters, we also generated an additional set of 500 patients with characteristics that deviate from the initial patient population. We provide an overview of the synthetic data generation process and the associated scripts for generating the cohorts. This dataset can be useful for training and validating machine learning models, especially under dataset shift between training and test sets. |
first_indexed | 2024-04-09T23:44:11Z |
format | Article |
id | doaj.art-cfa458e55f4844b9abf819fcf7018aef |
institution | Directory of Open Access Journals |
issn | 2352-3409 |
language | English |
last_indexed | 2024-04-09T23:44:11Z |
publishDate | 2023-04-01 |
publisher | Elsevier |
record_format | Article |
series | Data in Brief |
spelling | doaj.art-cfa458e55f4844b9abf819fcf7018aef; 2023-03-18T04:41:33Z; eng; Elsevier; Data in Brief; 2352-3409; 2023-04-01; 47; 108921; A synthetic dataset of liver disorder patients; Giovanna Nicora (0), Tommaso Mario Buonocore (1), Enea Parimbelli (2); (0) Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy; enGenome Srl, Italy; Corresponding authors at: Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy. (1) Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy. (2) Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy; Telfer School of Management, University of Ottawa, Ottawa, ON, Canada; Corresponding authors at: Department of Electrical, Computer and Biomedical Engineering, University of Pavia, Italy. The data in this article include 10,000 synthetic patients with liver disorders, characterized by 70 different variables, including clinical features and patient outcomes such as hospital admission or surgery. Patient data are generated to simulate real patient data as closely as possible, using a publicly available Bayesian network describing a causal model for liver disorders. By varying the network parameters, we also generated an additional set of 500 patients with characteristics that deviate from the initial patient population. We provide an overview of the synthetic data generation process and the associated scripts for generating the cohorts. This dataset can be useful for training and validating machine learning models, especially under dataset shift between training and test sets. http://www.sciencedirect.com/science/article/pii/S2352340923000392; Synthetic patients; Machine learning; Bayesian network; Dataset shift; Causal model |
spellingShingle | Giovanna Nicora; Tommaso Mario Buonocore; Enea Parimbelli; A synthetic dataset of liver disorder patients; Data in Brief; Synthetic patients; Machine learning; Bayesian network; Dataset shift; Causal model |
title | A synthetic dataset of liver disorder patients |
title_full | A synthetic dataset of liver disorder patients |
title_fullStr | A synthetic dataset of liver disorder patients |
title_full_unstemmed | A synthetic dataset of liver disorder patients |
title_short | A synthetic dataset of liver disorder patients |
title_sort | synthetic dataset of liver disorder patients |
topic | Synthetic patients, Machine learning, Bayesian network, Dataset shift, Causal model |
url | http://www.sciencedirect.com/science/article/pii/S2352340923000392 |
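
The description in this record mentions generating a main cohort from a Bayesian network and a smaller, shifted cohort by varying the network parameters. The following is a minimal sketch of that general idea using ancestral (forward) sampling, not the authors' scripts. The three-node network, its variable names, all probabilities, and the choice of which parameter to vary are hypothetical and chosen only for illustration; the published network has 70 variables.

```python
import random

# Hypothetical three-node network (NOT the published liver-disorder network):
#   alcohol_use -> fibrosis -> hospital_admission
# All probabilities below are made up for illustration.
BASELINE = {
    "p_alcohol_use": 0.20,
    "p_fibrosis_given_alcohol": {True: 0.60, False: 0.10},
    "p_admission_given_fibrosis": {True: 0.50, False: 0.05},
}

# Shifted parameters: same structure, different prior on the root node.
# Which parameter to vary is an assumption of this sketch.
SHIFTED = dict(BASELINE, p_alcohol_use=0.60)


def sample_patient(params, rng):
    """Draw one patient by ancestral sampling: parents first, then children."""
    alcohol = rng.random() < params["p_alcohol_use"]
    fibrosis = rng.random() < params["p_fibrosis_given_alcohol"][alcohol]
    admission = rng.random() < params["p_admission_given_fibrosis"][fibrosis]
    return {
        "alcohol_use": alcohol,
        "fibrosis": fibrosis,
        "hospital_admission": admission,
    }


def sample_cohort(params, n, seed):
    """Generate n independent patients with a seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [sample_patient(params, rng) for _ in range(n)]


def prevalence(cohort, variable):
    """Fraction of patients in the cohort for whom the variable is True."""
    return sum(patient[variable] for patient in cohort) / len(cohort)


# Cohort sizes follow the record's description: 10,000 baseline, 500 shifted.
baseline_cohort = sample_cohort(BASELINE, 10_000, seed=0)
shifted_cohort = sample_cohort(SHIFTED, 500, seed=1)

for name in ("alcohol_use", "fibrosis", "hospital_admission"):
    print(name, prevalence(baseline_cohort, name), prevalence(shifted_cohort, name))
```

Because the shifted cohort changes only a root prior, the downstream prevalences move while the conditional relationships stay the same, which is one simple way to produce a test set whose distribution differs from the training set.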