Summary: | Effective communication and interaction between humans and machines
depends heavily on the effectiveness of Web information retrieval. When the WWW
was developed, the Uniform Resource Locator was also created to give access to
resources on other machines on the Web. However, we are still required to
locate the fragment of data of interest within a resource. Moreover, the
information on the Web keeps changing over time. Observing these changes
manually is inefficient because of the time wasted on waiting, so an automated
system is needed. For this reason, the main purpose of this project was to build a
monitoring system for Web information.
One of the objectives was to review the techniques used in published work on
Web information analysis. A summary of these approaches is presented in this
report. This includes the Document Object Model (DOM) of the webpage, which
provides a more flexible interface for accessing the document, and the XPath
model, which supports locating information fragments.
When approaching the real-world problem, a badly coded webpage must first be
fixed by a pretty printer. Although the DOM API was well developed, no free API
was available to support XPath generation. The accuracy of XPath generation is
significant as it directly affects the correctness of data retrieval.
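To illustrate what XPath generation involves, the following is a minimal sketch
and not the generator built in this project. It assumes the page has already
been repaired into well-formed markup, and it uses Python's standard
`xml.etree.ElementTree`, which supports only a limited XPath subset. The helper
name `build_xpath` and the sample page are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_xpath(root, target):
    """Return a positional path from root to target, usable with root.find().

    Raises KeyError if target is not a descendant of root.
    """
    # ElementTree nodes do not store their parent, so build a lookup first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    steps = []
    node = target
    while node is not root:
        parent = parent_of[node]
        # XPath positions are 1-based and count only siblings with the same tag.
        same_tag = [c for c in parent if c.tag == node.tag]
        steps.append(f"{node.tag}[{same_tag.index(node) + 1}]")
        node = parent
    return "/".join(["."] + steps[::-1])


# Hypothetical, already well-formed page used only for illustration.
page = ET.fromstring(
    "<html><body><div><p>news</p></div>"
    "<div><p>price: 10</p><p>price: 12</p></div></body></html>"
)
target = page.findall(".//p")[2]   # the fragment a user wants to monitor
path = build_xpath(page, target)
print(path)                        # ./body[1]/div[2]/p[2]
print(page.find(path).text)        # price: 12
```

Because each step records the element's position among same-named siblings, an
off-by-one error in any step would select a different fragment, which is why
the accuracy of the generated path matters for retrieval.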
The intelligent monitoring system was successfully developed. It is named
“Webmon” and consists of three subsystems: a User Account subsystem to support
user identification, a Monitoring Task subsystem to provide the core
functionality, and a Daemon Server subsystem to support task scheduling.
There is still room for improvement, especially in the area of encoding.
Nevertheless, the successful development is believed to have a great impact on
other Web services.