On Web Taxonomy Integration

We address the problem of integrating objects from a source taxonomy into a master taxonomy. This problem is not only pervasive on today's web but also important to the emerging semantic web. A straightforward approach to automating this process would be to train a classifier for each category...

Bibliographic Details
Main Authors: Zhang, Dell; Lee, Wee Sun
Format: Article
Language: en_US
Published: 2003
Subjects: web taxonomy integration; classification; support vector machines; transductive learning
Online Access: http://hdl.handle.net/1721.1/3867
author Zhang, Dell
Lee, Wee Sun
collection MIT
description We address the problem of integrating objects from a source taxonomy into a master taxonomy. This problem is not only pervasive on today's web but also important to the emerging semantic web. A straightforward approach to automating this process is to train a classifier for each category in the master taxonomy and then classify objects from the source taxonomy into these categories. In this paper we apply a powerful classification method, the Support Vector Machine (SVM), to this problem. Our key insight is that the source taxonomy data can help build better classifiers in this scenario. It is therefore beneficial to do transductive rather than inductive learning, i.e., to learn to optimize classification performance on a particular set of test examples. Noting that the categorizations of the master and source taxonomies often have some semantic overlap, we propose a new method, Cluster Shrinkage (CS), that further enhances classification by exploiting this implicit knowledge. Our experiments with real-world web data show substantial improvements in the performance of taxonomy integration.
first_indexed 2024-09-23T10:46:24Z
format Article
id mit-1721.1/3867
institution Massachusetts Institute of Technology
language en_US
last_indexed 2024-09-23T10:46:24Z
publishDate 2003
record_format dspace
spelling mit-1721.1/3867 2019-04-12T08:07:46Z Singapore-MIT Alliance (SMA) 2003-12-13T19:41:16Z 2003-12-13T19:41:16Z 2004-01 Article http://hdl.handle.net/1721.1/3867 en_US Computer Science (CS); 106014 bytes application/pdf application/pdf
title On Web Taxonomy Integration
topic web taxonomy integration
classification
support vector machines
transductive learning
url http://hdl.handle.net/1721.1/3867