Deriving protein-protein interactions of dengue from literature by using automatic content extraction features

Dengue fever is one of the most severe diseases spread throughout the tropics. Because of the immune responses elicited among the serotypes of dengue virus, it is very difficult to develop vaccines to protect humans from dengue infection. With advances in technology, however, researcher...

Bibliographic Details
Main Author:	Huang, Yizhou
Other Authors:	Rajapakse Jagath Chandana
Format:	Final Year Project (FYP)
Language:	English
Published:	2015
Subjects:	DRNTU::Engineering::Computer science and engineering::Data
Online Access:	http://hdl.handle.net/10356/62840

_version_	1826112246314434560
collection	NTU
description	Dengue fever is one of the most severe diseases spread throughout the tropics. Because of the immune responses elicited among the serotypes of dengue virus, it is very difficult to develop vaccines to protect humans from dengue infection. With advances in technology, however, researchers have focused on genetic structure to develop vaccines. This project aims to regulate Automatic Content Extraction features and use them to derive protein interactions and gene regulation relations through text mining. The human gene list and dengue gene list are downloaded from an online genome mapping repository, while the texts are retrieved from abstracts of biomedical literature. Sentences are then pre-processed for further analysis. Biological knowledge and facts on gene regulation and protein interactions are generated with optimized methods and techniques. In this project, keyword-tag and word-relation-word features are extracted to describe the regulation relations. To investigate the performance of different feature sets, the project uses Stanford Natural Language Processing tools to analyse the semantic structure of sentences. A decision tree classifier is trained on the extracted patterns to perform the prediction. The accuracy based on the keyword-tag and word-relation-word features reached 99.4%. This accuracy is high because the feature sets also contain some features extracted from the testing dataset. To address this problem, more datasets will be used to evaluate the performance.
first_indexed	2024-10-01T03:04:09Z
format	Final Year Project (FYP)
id	ntu-10356/62840
institution	Nanyang Technological University
language	English
last_indexed	2024-10-01T03:04:09Z
publishDate	2015
record_format	dspace
spelling	ntu-10356/62840 2023-03-03T20:37:26Z School of Computer Engineering Centre for Computational Intelligence Bachelor of Engineering (Computer Science) 2015-04-30T02:00:00Z 2015 Final Year Project (FYP) http://hdl.handle.net/10356/62840 en Nanyang Technological University 59 p. application/pdf
