Estimation of Gene Expression at Isoform-level from mRNA-Seq Data by Bayesian Hierarchical Modeling

mRNA-Seq is a precise and highly reproducible technique for measuring transcript levels. It yields sequence information for a transcriptome at single-nucleotide resolution, which enables us to determine splice junctions and alternative splicing events with high confidence. Often analysis of mR...


Bibliographic Details
Main Authors: Madhuchhanda Bhattacharjee, Ravi Gupta, Ramana Davuluri
Format: Article
Language: English
Published: Frontiers Media S.A. 2012-11-01
Series: Frontiers in Genetics
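Subjects: Bayesian latent variable modeling; Bayesian t-test; isoform-expression; mRNA-seq; multi-sample comparison; spike-n-slab method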
Online Access: http://journal.frontiersin.org/Journal/10.3389/fgene.2012.00239/full
author Madhuchhanda Bhattacharjee
Ravi Gupta
Ramana Davuluri
collection DOAJ
description mRNA-Seq is a precise and highly reproducible technique for measuring transcript levels. It yields sequence information for a transcriptome at single-nucleotide resolution, which enables us to determine splice junctions and alternative splicing events with high confidence. Analyses of mRNA-Seq data often do not attempt to quantify expression at the isoform level. In this paper our objective is to use mRNA-Seq data to infer expression at the isoform level, where the splicing patterns of a gene are assumed to be known. A Bayesian latent-variable modeling framework is proposed, whose parameterization enables inference at various levels: for example, the expression variability of an isoform across different conditions can be assessed, two-sample comparisons can be carried out (e.g. using a Bayesian t-test), and the simple presence or absence of an isoform can be estimated through the latent variables in the model. We carry out inference on isoform expression under different normalization techniques, since it has recently been shown that one of the most prominent sources of variation in differential calls from mRNA-Seq data is the normalization method used. The statistical framework is developed for multiple isoforms and extends readily to reads mapping to multiple genes; this can be achieved by slight conceptual modifications to the definitions of what we consider a gene and what we consider an exon. Additionally, the proposed framework can be extended, by appropriate modeling of the design matrix, to infer as-yet-unknown novel transcripts. However, such attempts should be made judiciously, since the input data used in the proposed model do not include reads from splice junctions.
first_indexed 2024-04-11T22:57:25Z
format Article
id doaj.art-7bd9348a2d9040d9926d684cb0fa4ded
institution Directory Open Access Journal
issn 1664-8021
language English
last_indexed 2024-04-11T22:57:25Z
publishDate 2012-11-01
publisher Frontiers Media S.A.
record_format Article
series Frontiers in Genetics
spelling doaj.art-7bd9348a2d9040d9926d684cb0fa4ded | 2022-12-22T03:58:19Z | eng | Frontiers Media S.A. | Frontiers in Genetics | 1664-8021 | 2012-11-01 | 3 | 10.3389/fgene.2012.00239 | 21011 | Madhuchhanda Bhattacharjee (Pune University) | Ravi Gupta (The Wistar Institute) | Ramana Davuluri (The Wistar Institute)
title Estimation of Gene Expression at Isoform-level from mRNA-Seq Data by Bayesian Hierarchical Modeling
topic Bayesian latent variable modeling
Bayesian t-test
isoform-expression
mRNA-seq
multi-sample comparison
spike-n-slab method
url http://journal.frontiersin.org/Journal/10.3389/fgene.2012.00239/full