General Strategy for Querying Web Sources in a Data Federation Environment

Modern database management systems are supporting the inclusion and querying of non-relational sources within a data federation environment via wrappers. Wrapper development for Web sources, however, is a convolution of code with extraction and query planning knowledge and becomes a daunting task. W...


Bibliographic Details
Main Authors: Firat, Aykut; Wu, Lynn; Madnick, Stuart E.
Other Authors: Sloan School of Management
Format: Article
Language: en_US
Published: IGI Global 2011
Online Access: http://hdl.handle.net/1721.1/67341
https://orcid.org/0000-0001-9240-2573
https://orcid.org/0000-0003-0613-5152
collection MIT
description Modern database management systems are supporting the inclusion and querying of non-relational sources within a data federation environment via wrappers. Wrapper development for Web sources, however, is a convolution of code with extraction and query planning knowledge and becomes a daunting task. We use IBM DB2 federation engine to demonstrate the challenges of incorporating Web sources into a data federation. We, then, present a practical and general strategy for the inclusion and querying of Web sources without requiring any changes in the underlying data federation technology. This strategy separates the code and knowledge in wrapper development by introducing a general-purpose capabilities-aware mini query-planner and a data extraction engine. As a result, Web sources can be included in a data federation system faster, and maintained easier.
id mit-1721.1/67341
institution Massachusetts Institute of Technology
spelling mit-1721.1/67341 2022-10-02T04:56:40Z
Title: General Strategy for Querying Web Sources in a Data Federation Environment
Dates: 2011-12-01T18:38:46Z; 2009-01
Type: Article (http://purl.org/eprint/type/JournalArticle)
ISSN: 1533-8010; 1063-8016
Handle: http://hdl.handle.net/1721.1/67341
Citation: Firat, Aykut, Lynn Wu, and Stuart Madnick. “General Strategy for Querying Web Sources in a Data Federation Environment.” Journal of Database Management 20 (2009): 1-18. Web. 1 Dec. 2011. © 2009 IGI Global
DOI: http://dx.doi.org/10.4018/jdm.2009092201
Journal: Journal of Database Management
Rights: Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
File format: application/pdf
Publisher: IGI Global; Idea Group Inc.