Ontology-Based AI Design Patterns and Constraints in Cancer Registry Data Validation

Bibliographic Details
Main Authors: Nicholas Nicholson, Francesco Giusti, Carmen Martos
Format: Article
Language: English
Published: MDPI AG 2023-12-01
Series: Cancers
Online Access: https://www.mdpi.com/2072-6694/15/24/5812
Description: Data validation in cancer registration is a critical operation but is resource-intensive and has traditionally depended on proprietary software. Ontology-based AI is a novel approach utilising machine reasoning based on axioms formally described in description logic. This is a different approach from deep learning AI techniques but not exclusive of them. The advantage of the ontology approach lies in its ability to address a number of challenges concurrently. The disadvantages relate to computational costs, which increase with language expressivity and the size of data sets, and class containment restrictions imposed by description logics. Both these aspects would benefit from the availability of design patterns, which is the motivation behind this study. We modelled the European cancer registry data validation rules in description logic using a number of design patterns and showed the viability of the approach. Reasoning speeds are a limiting factor for large cancer registry data sets comprising many hundreds of thousands of records, but these can be offset to a certain extent by developing the ontology in a modular way. Data validation is also a highly parallelisable process. Important potential future work in this domain would be to identify and optimise reusable design patterns, paying particular attention to avoiding any unintended reasoning efficiency hotspots.
Keywords: data validation; knowledge representation; ontology-based AI; ontology design patterns; machine reasoning; cancer registries
Author Affiliations:
Nicholas Nicholson: European Commission, Joint Research Centre (JRC), 21027 Ispra, Italy
Francesco Giusti: Belgian Cancer Registry, 1210 Brussels, Belgium
Carmen Martos: Rare Diseases Research Unit, Foundation for the Promotion of Health and Biomedical Research in the Valencian Region (FISABIO), 46020 Valencia, Spain
Citation: Cancers, volume 15, issue 24, article 5812
DOI: 10.3390/cancers15245812
ISSN: 2072-6694
Collection: DOAJ (Directory of Open Access Journals)
Record ID: doaj.art-ed6655fea08341d7b6959373b18e65ba
Record Datestamp: 2023-12-22T13:58:57Z
First Indexed: 2024-03-08T20:56:13Z
Last Indexed: 2024-03-08T20:56:13Z