Multivariate Analysis and Visualization of Splicing Correlations in Single-Gene Transcriptomes

Bibliographic Details
Main Authors: Agnew William S, Parmigiani Giovanni, Emerick Mark C
Format: Article
Language: English
Published: BMC 2007-01-01
Series: BMC Bioinformatics
Online Access: http://www.biomedcentral.com/1471-2105/8/16
Description
Summary:

Background: RNA metabolism, through 'combinatorial splicing', can generate enormous structural diversity in the proteome. Alternative domains may interact, however, with unpredictable phenotypic consequences, necessitating integrated RNA-level regulation of molecular composition. Splicing correlations within transcripts of single genes provide valuable clues to functional relationships among molecular domains as well as genomic targets for higher-order splicing regulation.

Results: We present tools to visualize complex splicing patterns in full-length cDNA libraries. Developmental changes in pair-wise correlations are presented vectorially in 'clock plots' and linkage grids. Higher-order correlations are assessed statistically through Monte Carlo analysis of a log-linear model with an empirical-Bayes estimate of the true probabilities of observed and unobserved splice forms. Log-linear coefficients are visualized in a 'spliceprint', a signature of splice correlations in the transcriptome. We present two novel metrics: the linkage change index, which measures the directional change in pair-wise correlation with tissue differentiation, and the accuracy index, a very simple goodness-of-fit metric that is more sensitive than the integrated squared error when applied to sparsely populated tables, and unlike chi-square, does not diverge at low variance. Considerable attention is given to sparse contingency tables, which are inherent to single-gene libraries.

Conclusion: Patterns of splicing correlations are revealed, which span a broad range of interaction order and change in development. The methods have a broad scope of applicability beyond the single gene, including, for example, multiple gene interactions in the complete transcriptome.
ISSN: 1471-2105
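
Illustrative note: the summary refers to pair-wise splicing correlations assessed on sparse contingency tables with Monte Carlo methods. The sketch below is not the authors' log-linear or empirical-Bayes procedure. It is a much simpler stand-in, assuming each cDNA clone is recorded as a pair of 0/1 flags for two alternative splice sites. It computes the phi correlation of the resulting 2x2 table and estimates a permutation p-value by shuffling one site's flags. The function names, the clone encoding, and the default of 2000 shuffles are all assumptions made for illustration.

```python
import random


def table_from_clones(clones):
    """Count the 2x2 table (n11, n10, n01, n00) from (site_a, site_b) 0/1 pairs."""
    n11 = sum(1 for a, b in clones if a and b)
    n10 = sum(1 for a, b in clones if a and not b)
    n01 = sum(1 for a, b in clones if not a and b)
    n00 = sum(1 for a, b in clones if not a and not b)
    return n11, n10, n01, n00


def phi_coefficient(n11, n10, n01, n00):
    """Phi correlation of a 2x2 table of counts; 0.0 if any margin is empty."""
    row1, row0 = n11 + n10, n01 + n00
    col1, col0 = n11 + n01, n10 + n00
    denom = (row1 * row0 * col1 * col0) ** 0.5
    if denom == 0:
        # An empty margin leaves the correlation undefined; 0.0 is a convention here.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


def permutation_p_value(clones, n_shuffles=2000, seed=0):
    """Two-sided permutation p-value for the observed phi (hypothetical helper)."""
    rng = random.Random(seed)
    observed = abs(phi_coefficient(*table_from_clones(clones)))
    a_vals = [a for a, _ in clones]
    b_vals = [b for _, b in clones]
    hits = 0
    for _ in range(n_shuffles):
        # Shuffling one site breaks any linkage while preserving both margins.
        rng.shuffle(b_vals)
        shuffled = list(zip(a_vals, b_vals))
        if abs(phi_coefficient(*table_from_clones(shuffled))) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    # Toy library: 10 clones include both sites, 10 clones include neither.
    library = [(1, 1)] * 10 + [(0, 0)] * 10
    print(phi_coefficient(*table_from_clones(library)))
    print(permutation_p_value(library))
```

Shuffling preserves the marginal inclusion frequencies of each site, so the test targets only the linkage between them. This makes it usable on small, sparsely populated tables where asymptotic chi-square approximations are unreliable.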