RED: Redundancy-driven data extraction from result pages

Data-driven websites are mostly accessed through search interfaces. Such sites follow a common publishing pattern that, surprisingly, has not been fully exploited for unsupervised data extraction yet: the result of a search is presented as a paginated list of result records. Each result record conta...

Bibliographic Details
Main Authors: Guo, J, Crescenzi, V, Furche, T, Grasso, G, Gottlob, G
Format: Conference item
Published: Association for Computing Machinery 2019