An approach for instance based schema matching with google similarity and regular expression

Instance-based schema matching is the process of comparing instances from heterogeneous data sources to determine the correspondences between schema attributes. It is an alternative when schema information is unavailable, or is available but of no use for matching p...

Bibliographic Details
Main Authors: Mehdi, Osama, Ibrahim, Hamidah, Affendey, Lilly
Format: Article
Language: English
Published: Zarqa University 2017
Online Access: http://psasir.upm.edu.my/id/eprint/60807/1/An%20approach%20for%20instance%20based%20schema%20matching%20with%20google%20similarity%20and%20regular%20expression.pdf
description Instance-based schema matching is the process of comparing instances from heterogeneous data sources to determine the correspondences between schema attributes. It is an alternative when schema information is unavailable, or is available but of no use for matching purposes. Instance-based schema matching approaches have used different strategies to discover correspondences between schema attributes: neural networks, machine learning, information-theoretic discrepancy, and rule-based methods. Most of these approaches treat all instances, including those with numeric values, as strings, which prevents them from discovering common patterns or performing statistical computations on the numeric instances. As a consequence, matches go unidentified, especially for numeric instances. In this paper, we propose an approach that addresses this limitation. Since we exploit only the instances of the schemas for this task, we rely on strategies that combine the strengths of Google as a source of web semantics and regular expressions for pattern recognition. The results show that our approach finds 1-1 schema matches with high accuracy, in the range of 93%-99% in terms of Precision (P), Recall (R), and F-measure (F). Furthermore, our approach outperformed the previous approaches even though it uses only a sample of the instances, whereas the previous works consider all instances during matching.
first_indexed 2024-03-06T09:39:03Z
id upm.eprints-60807
institution Universiti Putra Malaysia
last_indexed 2024-03-06T09:39:03Z
record_format dspace
spelling upm.eprints-60807 2019-03-27T09:07:43Z http://psasir.upm.edu.my/id/eprint/60807/ Zarqa University 2017 Article PeerReviewed text en
Mehdi, Osama and Ibrahim, Hamidah and Affendey, Lilly (2017) An approach for instance based schema matching with google similarity and regular expression. The International Arab Journal of Information Technology, 14 (5). pp. 1-10. ISSN 1683-3198