Visual OXPath: robust wrapping by example.


Bibliographic Details
Main Authors: Kranzdorf, J, Sellers, A, Grasso, G, Schallhart, C, Furche, T
Other Authors: Mille, A, Gandon, F, Misselis, J, Rabinovich, M, Staab, S
Format: Journal article
Language: English
Published: ACM 2012
Description: Good examples are hard to find, particularly in wrapper induction: Picking even one wrong example can spell disaster by yielding overgeneralized or overspecialized wrappers. Such wrappers extract data with low precision or recall, unless adjusted by human experts at significant cost. Visual OXPath is an open-source, visual wrapper induction system that requires minimal examples and eases wrapper refinement: Often it derives the intended wrapper from a single example through sophisticated heuristics that determine the best set of similar examples. To ease wrapper refinement, it offers a list of wrappers ranked by example similarity and robustness. Visual OXPath offers extensive visual feedback for this refinement which can be performed without any knowledge of the underlying wrapper language. Where further refinement by a human wrapper is needed, Visual OXPath profits from being based on OXPath, a declarative wrapper language that extends XPath with a thin layer of features necessary for extraction and page navigation. Copyright is held by the International World Wide Web Conference Committee (IW3C2).
Record ID: oxford-uuid:3371059f-cc08-443c-9e23-af3b01a4f32a
Institution: University of Oxford