Natural language processing for web document representation

Bibliographic Details
Main Author: Hoang, Manh Linh.
Other Authors: Chen Lihui
Format: Final Year Project (FYP)
Language: English
Published: 2010
Online Access: http://hdl.handle.net/10356/40681
Description
Summary: The World Wide Web has brought us to an era where information is abundantly available and easily accessible from all over the world. With important decision-making becoming more dependent on these huge resources, there is a growing need for techniques that can produce more comprehensible representations of text documents for users by using Natural Language Processing (NLP). NLP concerns the issues of portraying information in natural (human) language. It can provide human language-like representations of documents to be further processed for data mining purposes. This project aims to study various NLP-based representation methods for web text documents, with a focus on models based on semantic analysis, namely the sentence-level semantic analysis model and the concept-based analysis model. Moreover, an important part of the project is spent on the design and development of an integrated Semantic Analysis Tool (SAT) based on the aforementioned models.