An ontology for very large numbers of longitudinal health records to facilitate data mining and machine learning

Despite the extensive experience of the authors working in industry with a variety of electronic health records that worked well in their intended context, none currently available in reasonably large numbers seem to have ontologies and formats that will scale well to very large numbers of detailed...

Bibliographic Details
Main Authors: B. Robson, O.K. Baek
Format: Article
Language: English
Published: Elsevier 2023-01-01
Series: Informatics in Medicine Unlocked
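Subjects: Longitudinal patient record; Ontology; Big data; Artificial intelligence; Machine learning; Deep learning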
Online Access: http://www.sciencedirect.com/science/article/pii/S2352914823000461
_version_ 1827965424821600256
author B. Robson
O.K. Baek
collection DOAJ
description Despite the extensive experience of the authors working in industry with a variety of electronic health records that worked well in their intended context, none currently available in reasonably large numbers seem to have ontologies and formats that will scale well to very large numbers of detailed cradle-to-grave longitudinal health records facilitating knowledge extraction. By that we mean data mining, Deep Learning neural nets and all related analytic and predictive methods for biomedical research and clinical decision support potentially applied to the health records of an entire nation. They are mostly far too complicated to support frequent high-dimensional analysis, which is required because such records will update (or should update) dynamically on a regular basis, will in future include new tests etc. acquired daily by translational medical research, and not least allow public health, research, and diagnostic, vaccine, and drug development teams to respond quickly to emergent epidemics like COVID-19. A Presidential Advisory team call in 2010 for interoperability and ease of data mining for medical records is discussed and the situation seems still not fully resolved. The solution appears to lie between efficient comma separated value files and the ability to embellish these with a moderate degree of more elaborate ontology. One recommendation is made here with discussion and analysis that should guide alternative and future approaches. It combines demographic, comorbidity, genomic, diagnostic, interventional, and outcomes information along with time/date stamping method appropriate to analysis, with facilities for special research studies. By using a “metadata operator”, a suitable balance between a comma separated values file and an ontological structure is possible.
first_indexed 2024-04-09T17:32:49Z
format Article
id doaj.art-b74cfc16ed8242d7ad45295e3b0fb38e
institution Directory Open Access Journal
issn 2352-9148
language English
last_indexed 2024-04-09T17:32:49Z
publishDate 2023-01-01
publisher Elsevier
record_format Article
series Informatics in Medicine Unlocked
spelling doaj.art-b74cfc16ed8242d7ad45295e3b0fb38e | 2023-04-18T04:08:53Z | eng | Elsevier | Informatics in Medicine Unlocked | 2352-9148 | 2023-01-01 | 38 | 101204 | An ontology for very large numbers of longitudinal health records to facilitate data mining and machine learning | B. Robson (Ingine Inc., Cleveland, OH, USA; The Dirac Foundation, Oxfordshire, UK; corresponding author, Ingine Inc., Cleveland, OH, USA) | O.K. Baek (Electronics and Telecommunications Research Institute, South Korea) | http://www.sciencedirect.com/science/article/pii/S2352914823000461 | Longitudinal patient record | Ontology | Big data | Artificial intelligence | Machine learning | Deep learning
title An ontology for very large numbers of longitudinal health records to facilitate data mining and machine learning
title_sort ontology for very large numbers of longitudinal health records to facilitate data mining and machine learning
topic Longitudinal patient record
Ontology
Big data
Artificial intelligence
Machine learning
Deep learning
url http://www.sciencedirect.com/science/article/pii/S2352914823000461