Building a database of cancer genomic data

Bibliographic Details
Main Author: Woo, Sherman Wei Hao
Other Authors: Zheng Jie
Format: Final Year Project (FYP)
Language: English
Published: 2016
Online Access: http://hdl.handle.net/10356/67289
Description
Summary: Cancer is known to be a genetic disease. Because of its inherent complexity, large-scale genomic datasets are necessary to study it. The goal of this project is first to download publicly available cancer data and integrate it into a database. The architecture of the database was designed around the structured nature of the data and convenience of use; accordingly, the design adopts a data warehousing strategy. The database design and queries are discussed, and possible further developments, such as adopting a distributed systems approach and combining SQL and NoSQL capabilities, are briefly mentioned.