LdClusterView : a visualization for genomics data

Scalable processing of large, heterogeneous, and possibly incomplete or conflicting data makes the analysis of haplotype data a challenging task. Moreover, with genome sequences nearing completion and attention shifting back to research analysis, effective display of genomic sequences has become essent...

Bibliographic Details
Main Author: Gupta, Aakash
Other Authors: Zheng Jie
Format: Final Year Project (FYP)
Language: English
Published: 2017
Subjects: DRNTU::Engineering::Computer science and engineering
Online Access: http://hdl.handle.net/10356/70272
author Gupta, Aakash
author2 Zheng Jie
collection NTU
description Scalable processing of large, heterogeneous, and possibly incomplete or conflicting data makes the analysis of haplotype data a challenging task. Moreover, with genome sequences nearing completion and attention shifting back to research analysis, effective display of genomic sequences has become essential: billions of genomic DNA letters displayed on screen as plain text are cumbersome and difficult to understand. It is therefore of paramount importance to be able to collect and digest the large amount of data about biological systems that is accumulating in the literature. Visualizing the data has proven to aid understanding of it. Researchers also wish to view all facets of genotype and haplotype data, including the spatial distribution of loci along a chromosome, the differing frequencies of haplotypes in different subgroups, and possibly the correlation between occurring haplotypes. This emphasizes the need for a dynamic visualization that can address such complex and huge data sets on many different levels. As a solution, the Singapore Immunology Network (SIgN) aims to provide a customizable and highly user-interactive display of a requested portion of a genome. Apart from kick-starting the project, SIgN aims to release it into the public domain so that collaborators from all over the world can contribute to and expand it. As a foundation, three kinds of plots have been built to better analyse genomic sequences: the Manhattan Plot, the Genes Plot, and the Leaf Nodes Plot.
first_indexed 2024-10-01T05:13:24Z
format Final Year Project (FYP)
id ntu-10356/70272
institution Nanyang Technological University
language English
last_indexed 2024-10-01T05:13:24Z
publishDate 2017
record_format dspace
spelling ntu-10356/70272 2023-03-03T20:34:33Z LdClusterView : a visualization for genomics data Gupta, Aakash Zheng Jie School of Computer Science and Engineering A*STAR DRNTU::Engineering::Computer science and engineering Bachelor of Engineering (Computer Science) 2017-04-18T07:00:45Z 2017-04-18T07:00:45Z 2017 Final Year Project (FYP) http://hdl.handle.net/10356/70272 en Nanyang Technological University 43 p. application/pdf
title LdClusterView : a visualization for genomics data
topic DRNTU::Engineering::Computer science and engineering
url http://hdl.handle.net/10356/70272