A Column Styled Composable Schema Matcher for Semantic Data-Types
Schema matching is a long-standing challenge in many database-related applications, such as data integration, where two databases with different schemas have to be integrated. With the evolution from databases to big data, schema matching has been enriched with various purposes and applica...
Main Authors: | Xiaofeng Liao, Jordy Bottelier, Zhiming Zhao |
---|---|
Format: | Article |
Language: | English |
Published: | Ubiquity Press, 2019-06-01 |
Series: | Data Science Journal |
Subjects: | Schema Matching, Semantic Data-types, XML, RDF |
Online Access: | https://datascience.codata.org/articles/973 |
_version_ | 1818175774992105472 |
---|---|
author | Xiaofeng Liao, Jordy Bottelier, Zhiming Zhao |
collection | DOAJ |
description | Schema matching is a long-standing challenge in many database-related applications, such as data integration, where two databases with different schemas have to be integrated. With the evolution from databases to big data, schema matching has been enriched with various purposes and application contexts, ranging from data integration, to service integration, to semantic data clouding, to more recent exploratory data analysis over big data. These enriched contexts increase the demand for schema matching between semantic data-types, such as XML and RDF. Existing integration approaches have not dealt with the challenges of defining a relation between XML and other semantic data-types. To address these challenges, this paper studies the problem of schema mapping from XML to RDF in two parts. First, it tests the validity of a single matcher applied in a column-based manner to semantic data-types. Second, it tests the validity of a highly configurable framework that uses hierarchical classification to construct a composable pipeline. We propose and implement a Reconfigurable pipeline for Semi-Automatic Schema Matching (REPSASM), which aims to address the customizability of the matching problem by providing an environment in which a user can create, configure and experiment with their own schema-matching procedure. The experiments performed within this work show that configurability and hierarchical classification improve the matching result, and the paper proposes an algorithm to automatically optimize such a hierarchical pipeline. |
first_indexed | 2024-12-11T20:05:39Z |
format | Article |
id | doaj.art-41cda60a24944021be704eab1a30d8a3 |
institution | Directory Open Access Journal |
issn | 1683-1470 |
language | English |
last_indexed | 2024-12-11T20:05:39Z |
publishDate | 2019-06-01 |
publisher | Ubiquity Press |
record_format | Article |
series | Data Science Journal |
spelling | doaj.art-41cda60a24944021be704eab1a30d8a3; 2022-12-22T00:52:24Z; eng; Ubiquity Press; Data Science Journal; 1683-1470; 2019-06-01; 18; 1; 10.5334/dsj-2019-025; 717; A Column Styled Composable Schema Matcher for Semantic Data-Types; Xiaofeng Liao, Jordy Bottelier, Zhiming Zhao (System and Network Engineering Lab, Informatics Institute, University of Amsterdam, Amsterdam); https://datascience.codata.org/articles/973; Schema Matching; Semantic Data-types; XML; RDF |
title | A Column Styled Composable Schema Matcher for Semantic Data-Types |
topic | Schema Matching; Semantic Data-types; XML; RDF |
url | https://datascience.codata.org/articles/973 |