Unraveling high-throughput demultiplexing techniques across multiple plant species

RNA sequencing (RNA-seq) is essential for understanding biological mechanisms in plant biology. RNA-seq samples are pooled together (multiplexed) for simultaneous sequencing. Traditional demultiplexing methods often rely on expensive barcode matching, leading to collisions—misidentifications of samp...


Bibliographic Details
Main Author: Maitra, Ishani
Other Authors: Marek Mutwil
Format: Final Year Project (FYP)
Language: English
Published: Nanyang Technological University 2024
Subjects: Medicine, Health and Life Sciences; Demultiplexing
Online Access: https://hdl.handle.net/10356/176353
_version_ 1826129230154432512
author Maitra, Ishani
author2 Marek Mutwil
author_facet Marek Mutwil
Maitra, Ishani
author_sort Maitra, Ishani
collection NTU
description RNA sequencing (RNA-seq) is essential for understanding biological mechanisms in plants. RNA-seq samples are commonly pooled (multiplexed) for simultaneous sequencing. Traditional demultiplexing methods often rely on expensive barcode matching, which can lead to collisions: misidentification of samples due to sequencing noise or inaccurate barcode assignment, especially in complex data. We therefore propose a cost-efficient demultiplexing method that can accommodate complex datasets. The method was tested on Arabidopsis thaliana, Brachypodium distachyon, and Oldenlandia corymbosa, with A. thaliana and B. distachyon subjected to dark stress treatment. The samples were pooled in various multiplex combinations. RNA-seq reads were aligned to a reference coding sequence (CDS) genome using HISAT2. A multiplex CDS was built by concatenating the three species’ reference genomes. A strong correlation was observed, suggesting that the multiplex CDS can be used for subsequent comparative analysis. Control read counts were scaled according to the observed linear relationship between O. corymbosa gene read counts in the control and treatment groups within the multiplex ABO samples. Differentially expressed genes (DEGs) were precisely identified using DESeq2 and a proposed differential gene expression analysis on the scaled control read counts. We demonstrate a promising cost-efficient demultiplexing method capable of handling large and complex datasets without the need for barcoding.
first_indexed 2024-10-01T07:37:33Z
format Final Year Project (FYP)
id ntu-10356/176353
institution Nanyang Technological University
language English
last_indexed 2024-10-01T07:37:33Z
publishDate 2024
publisher Nanyang Technological University
record_format dspace
spelling ntu-10356/176353 2024-05-20T15:33:06Z Unraveling high-throughput demultiplexing techniques across multiple plant species Maitra, Ishani Marek Mutwil School of Biological Sciences mutwil@ntu.edu.sg Medicine, Health and Life Sciences Demultiplexing Bachelor's degree 2024-05-17T13:31:29Z 2024-05-17T13:31:29Z 2024 Final Year Project (FYP) Maitra, I. (2024). Unraveling high-throughput demultiplexing techniques across multiple plant species. Final Year Project (FYP), Nanyang Technological University, Singapore.
https://hdl.handle.net/10356/176353 en application/pdf Nanyang Technological University
title Unraveling high-throughput demultiplexing techniques across multiple plant species
title_sort unraveling high throughput demultiplexing techniques across multiple plant species
topic Medicine, Health and Life Sciences
Demultiplexing
url https://hdl.handle.net/10356/176353