Taking the OXPath down the deep web.

Although deep web analysis has been studied extensively, there is no succinct formalism to describe user interactions with AJAX-enabled web applications. Toward this end, we introduce OXPath as a superset of XPath 1.0. Beyond XPath, OXPath is able (1) to fill web forms and trigger DOM events, (2) to...

Bibliographic details
Main authors: Sellers, A, Furche, T, Gottlob, G, Grasso, G, Schallhart, C
Other authors: Ailamaki, A
Format: Journal article
Language: English
Published: ACM 2011
_version_ 1826289932865372160
author Sellers, A
Furche, T
Gottlob, G
Grasso, G
Schallhart, C
author2 Ailamaki, A
collection OXFORD
description Although deep web analysis has been studied extensively, there is no succinct formalism to describe user interactions with AJAX-enabled web applications. Toward this end, we introduce OXPath as a superset of XPath 1.0. Beyond XPath, OXPath is able (1) to fill web forms and trigger DOM events, (2) to access dynamically computed CSS attributes, (3) to navigate between visible form fields, and (4) to mark relevant information for extraction. This way, OXPath expressions can closely simulate the human interaction relevant for navigation rather than rely exclusively on the HTML structure. Thus, they are quite resilient against technical changes. We demonstrate the expressiveness and practical efficacy of OXPath to tackle a group flight planning problem. We use the OXPath implementation and visual interface to access the popular, highly-scripted travel site Kayak. We show how to formulate OXPath expressions to extract all booking information with just a few lines of code.
first_indexed 2024-03-07T02:36:27Z
format Journal article
id oxford-uuid:a8f20bfe-eb17-44e9-a69d-9fe43ca297df
institution University of Oxford
language English
last_indexed 2024-03-07T02:36:27Z
publishDate 2011
publisher ACM
record_format dspace
spelling oxford-uuid:a8f20bfe-eb17-44e9-a69d-9fe43ca297df | 2022-03-27T03:05:07Z | Taking the OXPath down the deep web. | Journal article | http://purl.org/coar/resource_type/c_dcae04bc | uuid:a8f20bfe-eb17-44e9-a69d-9fe43ca297df | English | Symplectic Elements at Oxford | ACM | 2011 | Sellers, A | Furche, T | Gottlob, G | Grasso, G | Schallhart, C | Ailamaki, A | Amer-Yahia, S | Patel, J | Risch, T | Senellart, P | Stoyanovich, J | Although deep web analysis has been studied extensively, there is no succinct formalism to describe user interactions with AJAX-enabled web applications. Toward this end, we introduce OXPath as a superset of XPath 1.0. Beyond XPath, OXPath is able (1) to fill web forms and trigger DOM events, (2) to access dynamically computed CSS attributes, (3) to navigate between visible form fields, and (4) to mark relevant information for extraction. This way, OXPath expressions can closely simulate the human interaction relevant for navigation rather than rely exclusively on the HTML structure. Thus, they are quite resilient against technical changes. We demonstrate the expressiveness and practical efficacy of OXPath to tackle a group flight planning problem. We use the OXPath implementation and visual interface to access the popular, highly-scripted travel site Kayak. We show how to formulate OXPath expressions to extract all booking information with just a few lines of code.
title Taking the OXPath down the deep web.