Comparative assembly and analysis of different sized genomes using Pacbio sequencing technology

Bibliographic Details
Main Authors: Ridhi Goel, Pooja Raj, C.P. Rajadurai, Dhinoth Kumar
Format: Article
Language: English
Published: Science Planet Inc. 2017-12-01
Series: Canadian Journal of Biotechnology
Online Access: https://www.canadianjbiotech.com/CAN_J_BIOTECH/Archives/v1/Special Issue-Supplement/cjb.2017-a216.pdf
Description
Summary: PacBio is a third-generation sequencing technology based on the single-molecule real-time (SMRT) sequencing platform, which uses zero-mode waveguides (ZMWs). The technology generates very long reads that are well suited to applications such as de novo genome assembly, structural variation detection, full-length transcriptome sequencing, and direct detection of base modifications. PacBio data can be used either alone or in combination with shorter Illumina reads to produce a good assembly. Different algorithms are available to construct a genome from PacBio-only or hybrid datasets. To identify the best possible approach, we carried out a comparative study employing widely accepted assembly tools on E. coli, C. elegans and A. thaliana datasets (PacBio and Illumina paired-end and mate-pair). We performed de novo genome assembly, gene prediction and gene annotation for all possible combinations of datasets (PacBio, Illumina PE and MP) and tools. The study identified the best method, which assembled the 4.6 Mb E. coli genome in a single contig covering ~97% of BUSCO-represented genes. For C. elegans and A. thaliana, we achieved assemblies of 109 Mb and 123 Mb, respectively, with ~80% of BUSCO-represented genes.
ISSN:2560-8304