Automatic Detection of Webpages that Share the Same Web Template

Template extraction is the process of isolating the template of a given webpage. It is widely used in several disciplines, including webpage development, content extraction, block detection, and webpage indexing. One of the main goals of template extraction is identifying a set of webpages with th...

Bibliographic Details
Main Authors: Julián Alarte, David Insa, Josep Silva, Salvador Tamarit
Format: Article
Language: English
Published: Open Publishing Association 2014-09-01
Series: Electronic Proceedings in Theoretical Computer Science
Online Access: http://arxiv.org/pdf/1409.2590v1