Visual oXPath: robust wrapping by example.
Good examples are hard to find, particularly in wrapper induction: Picking even one wrong example can spell disaster by yielding overgeneralized or overspecialized wrappers. Such wrappers extract data with low precision or recall, unless adjusted by human experts at significant cost. Visual OXPath i...
Main Authors: | Kranzdorf, J; Sellers, A; Grasso, G; Schallhart, C; Furche, T |
---|---|
Other Authors: | Mille, A |
Format: | Journal article |
Language: | English |
Published: | ACM, 2012 |
_version_ | 1826266409690202112 |
---|---|
author | Kranzdorf, J Sellers, A Grasso, G Schallhart, C Furche, T |
author2 | Mille, A |
author_facet | Mille, A Kranzdorf, J Sellers, A Grasso, G Schallhart, C Furche, T |
author_sort | Kranzdorf, J |
collection | OXFORD |
description | Good examples are hard to find, particularly in wrapper induction: Picking even one wrong example can spell disaster by yielding overgeneralized or overspecialized wrappers. Such wrappers extract data with low precision or recall, unless adjusted by human experts at significant cost. Visual OXPath is an open-source, visual wrapper induction system that requires minimal examples and eases wrapper refinement: Often it derives the intended wrapper from a single example through sophisticated heuristics that determine the best set of similar examples. To ease wrapper refinement, it offers a list of wrappers ranked by example similarity and robustness. Visual OXPath offers extensive visual feedback for this refinement which can be performed without any knowledge of the underlying wrapper language. Where further refinement by a human wrapper is needed, Visual OXPath profits from being based on OXPath, a declarative wrapper language that extends XPath with a thin layer of features necessary for extraction and page navigation. Copyright is held by the International World Wide Web Conference Committee (IW3C2). |
first_indexed | 2024-03-06T20:38:30Z |
format | Journal article |
id | oxford-uuid:3371059f-cc08-443c-9e23-af3b01a4f32a |
institution | University of Oxford |
language | English |
last_indexed | 2024-03-06T20:38:30Z |
publishDate | 2012 |
publisher | ACM |
record_format | dspace |
spelling | oxford-uuid:3371059f-cc08-443c-9e23-af3b01a4f32a; 2022-03-26T13:20:19Z; Visual oXPath: robust wrapping by example.; Journal article; http://purl.org/coar/resource_type/c_dcae04bc; uuid:3371059f-cc08-443c-9e23-af3b01a4f32a; English; Symplectic Elements at Oxford; ACM; 2012; Kranzdorf, J; Sellers, A; Grasso, G; Schallhart, C; Furche, T; Mille, A; Gandon, F; Misselis, J; Rabinovich, M; Staab, S; Good examples are hard to find, particularly in wrapper induction: Picking even one wrong example can spell disaster by yielding overgeneralized or overspecialized wrappers. Such wrappers extract data with low precision or recall, unless adjusted by human experts at significant cost. Visual OXPath is an open-source, visual wrapper induction system that requires minimal examples and eases wrapper refinement: Often it derives the intended wrapper from a single example through sophisticated heuristics that determine the best set of similar examples. To ease wrapper refinement, it offers a list of wrappers ranked by example similarity and robustness. Visual OXPath offers extensive visual feedback for this refinement which can be performed without any knowledge of the underlying wrapper language. Where further refinement by a human wrapper is needed, Visual OXPath profits from being based on OXPath, a declarative wrapper language that extends XPath with a thin layer of features necessary for extraction and page navigation. Copyright is held by the International World Wide Web Conference Committee (IW3C2). |
title | Visual oXPath: robust wrapping by example. |