Deep learning approaches to predict clinical outcomes of cancer patients from multi-omics data

Bibliographic Details
Main Author: Wu, Lue
Other Authors: Jagath C Rajapakse
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2020
Online Access: https://hdl.handle.net/10356/138003
Description
Summary: Cancer remains a major health concern because of its high incidence and high mortality rate. It is a very complex disease: its causes can involve many different combinations of interactions between biological entities. Accurate prediction of cancer outcomes can support both cancer research and the quality of treatment. A deep neural network is a machine learning method based on artificial neural networks; it tries to learn a mathematical mapping from input to output. Its strength is that it can model complex non-linear relationships, which makes it suitable for a complex disease like cancer. Technological advances increasingly enable multiple biological layers to be probed in parallel, ranging from the genome to the proteome and phospho-proteome. For each patient, many layers of data are available, and we refer to them as multi-omics data. Multi-omics data can reveal complicated interactions between different biological entities, allowing us to learn more about cancer. The difficulty with multi-omics data, however, is the “small n, large p” problem: the data have few samples (n) but very high dimensionality (p). This project tries to address this problem with a deep learning approach, by first building a patient similarity network (PSN) from multi-omics data and then extracting topological features from the network to train the neural network. In addition, it investigates the use of Similarity Network Fusion (SNF) on different biological data types to improve prediction accuracy. The models are trained and tested using data from The Cancer Genome Atlas (TCGA) and the Therapeutically Applicable Research to Generate Effective Treatments (TARGET) program.
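
The summary does not say how the patient similarity network is built or which topological features are extracted, so the following is only an illustrative sketch under stated assumptions. It assumes cosine similarity between standardised patient profiles, a k-nearest-neighbour rule for drawing edges, and three per-patient features (degree, strength, local clustering coefficient). The function names `patient_similarity_network` and `topological_features` and the parameter `k` are hypothetical and do not come from the project. SNF is not implemented here. The sketch shows how a "small n, large p" matrix could be reduced to a few network-derived features per patient before training a neural network.

```python
import numpy as np

def patient_similarity_network(X, k=3):
    """Build an undirected k-nearest-neighbour patient graph (assumed design).

    X has shape (n_patients, n_features). Returns (A, W): A is a binary
    adjacency matrix, W holds non-negative cosine similarities on the edges.
    """
    # Standardise each feature so no single omics variable dominates.
    Xz = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-12)
    # Cosine similarity between every pair of patients.
    unit = Xz / (np.linalg.norm(Xz, axis=1, keepdims=True) + 1e-12)
    S = unit @ unit.T
    n = S.shape[0]
    # Exclude self-similarity when ranking neighbours.
    np.fill_diagonal(S, -np.inf)
    nearest = np.argsort(-S, axis=1)[:, :k]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    # Keep an edge if either patient lists the other as a neighbour.
    A = np.maximum(A, A.T)
    np.fill_diagonal(S, 0.0)
    W = A * np.clip(S, 0.0, None)
    return A, W

def topological_features(A, W):
    """Return an (n_patients, 3) array: degree, strength, local clustering."""
    degree = A.sum(axis=1)
    strength = W.sum(axis=1)
    # diag(A^3) counts closed 3-walks, i.e. twice the triangles at each node.
    triangles = np.diag(A @ A @ A) / 2.0
    possible = degree * (degree - 1) / 2.0
    clustering = np.divide(triangles, possible,
                           out=np.zeros_like(triangles), where=possible > 0)
    return np.column_stack([degree, strength, clustering])

if __name__ == "__main__":
    # Synthetic stand-in for one omics layer: 20 patients, 500 features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(20, 500))
    A, W = patient_similarity_network(X, k=3)
    print(topological_features(A, W).shape)
```

In this sketch each patient is described by three numbers regardless of how many omics variables were measured, which is the sense in which network features sidestep the high dimensionality. A real pipeline would likely use richer features and real TCGA or TARGET data.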