A review of data abstraction
It is well-known that Artificial Intelligence (AI), and in particular Machine Learning (ML), is not effective without good data preparation, as also pointed out by the recent wave of data-centric AI. Data preparation is the process of gathering, transforming and cleaning raw data prior to processing...
Main Authors: | Gianluca Cima, Marco Console, Maurizio Lenzerini, Antonella Poggi |
---|---|
Format: | Article |
Language: | English |
Published: | Frontiers Media S.A., 2023-06-01 |
Series: | Frontiers in Artificial Intelligence |
Subjects: | knowledge representation, abstraction, automated reasoning, data integration, data preparation |
Online Access: | https://www.frontiersin.org/articles/10.3389/frai.2023.1085754/full |
_version_ | 1827918208925958144 |
---|---|
author | Gianluca Cima, Marco Console, Maurizio Lenzerini, Antonella Poggi |
author_sort | Gianluca Cima |
collection | DOAJ |
description | It is well-known that Artificial Intelligence (AI), and in particular Machine Learning (ML), is not effective without good data preparation, as also pointed out by the recent wave of data-centric AI. Data preparation is the process of gathering, transforming and cleaning raw data prior to processing and analysis. Since data nowadays often reside in distributed and heterogeneous data sources, the first activity of data preparation is collecting data from suitable data sources and data services. It is thus essential that providers describe their data services in a way that makes them compliant with the FAIR guiding principles, i.e., automatically Findable, Accessible, Interoperable, and Reusable. The notion of data abstraction has been introduced precisely to meet this need. Abstraction is a kind of reverse engineering task that automatically provides a semantic characterization of a data service made available by a provider. The goal of this paper is to review the results obtained so far in data abstraction by presenting the formal framework for its definition, reporting on the decidability and complexity of the main theoretical problems concerning abstraction, and discussing open issues and interesting directions for future research. |
first_indexed | 2024-03-13T03:39:04Z |
format | Article |
id | doaj.art-fcbb10d974a04fde8dc02531e5c1b855 |
institution | Directory of Open Access Journals |
issn | 2624-8212 |
language | English |
last_indexed | 2024-03-13T03:39:04Z |
publishDate | 2023-06-01 |
publisher | Frontiers Media S.A. |
record_format | Article |
series | Frontiers in Artificial Intelligence |
title | A review of data abstraction |
topic | knowledge representation, abstraction, automated reasoning, data integration, data preparation |
url | https://www.frontiersin.org/articles/10.3389/frai.2023.1085754/full |