Poster

Statistical Tests for Detecting Differential RNA-Transcript Expression from Read Counts

MPS-Authors
Drewe, P
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

Bohnert, R
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

Rätsch, G
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

Citation

Stegle, O., Drewe, P., Bohnert, R., Borgwardt, K., & Rätsch, G. (2010). Statistical Tests for Detecting Differential RNA-Transcript Expression from Read Counts. Poster presented at Joint Cold Spring Harbor Laboratory/Wellcome Trust Conference: Genome Informatics, Hinxton, UK.


Cite as: https://hdl.handle.net/21.11116/0000-0010-5B17-F
Abstract
As a result of the current revolution in sequencing technology, transcriptomes can now be analyzed at an unprecedented level of detail. Established applications include the detection of differentially expressed genes across biological samples and the quantification of the abundances of various RNA transcripts. The next step is to combine these concepts to identify differential expression at the level of transcripts within individual genes. Methods for this fine-grained testing of differential abundance are valuable tools for tackling key biological questions, for example the mechanisms of alternative splicing. Here, we present a statistical testing framework to address this need. Most notably, our method can be applied both when the complete transcript annotation (TA) is available and when it is unknown or incorrect. Our approach is based on a kernel method called Maximum Mean Discrepancy (MMD), which directly tests for differences between the underlying read distributions inferred from the observed reads. In our model, we incorporate the assumption that read counts follow a Poisson distribution and account for biological or technical variability. We show how an existing TA, if available, can be exploited to define a maximally discriminative set of regions within a gene, further increasing the accuracy of the method. We evaluated the proposed approach with and without TA, comparing it to established methods based on transcript quantification. We used simulated read data as well as real reads generated by the Illumina Genome Analyzer for four C. elegans samples (Barberan-Soler et al. 2009). In our analysis, the MMD test identified differential transcript expression considerably better than methods based on transcript quantification (45% vs. 30% at a 1% false positive rate, FPR). Even more striking, in the absence of knowledge about the TA, the MMD test was still able to identify 75% of the true differential cases.
Our method is therefore well suited to analyzing RNA-Seq experiments where other approaches fail, namely when the TA is incomplete or entirely missing. We further investigated the MMD test on the data from Hillier et al. (2009), comparing the results to a second study (Barberan-Soler et al. 2009) of 352 genes with confirmed alternative splicing events in the early developmental stages of C. elegans. Even without making use of the TA, our method detected between 40% and 85% of the transcripts with a log fold change of at least one between developmental stages in Barberan-Soler et al. (2009). This result becomes even more striking when the TA is taken into account.
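
The general idea of an MMD two-sample test can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that each sample is represented by the 1-D start positions of its reads within a gene, and it uses a Gaussian kernel, a fixed bandwidth, and a permutation null, none of which are specified in the abstract. The Poisson noise model, the treatment of biological or technical variability, and the use of TA-derived regions are all omitted. The function names (`gaussian_kernel`, `mmd2_biased`, `mmd_permutation_test`) are hypothetical.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth):
    """Gaussian kernel matrix between two 1-D arrays of read positions."""
    diff = a[:, None] - b[None, :]
    return np.exp(-(diff ** 2) / (2.0 * bandwidth ** 2))


def mmd2_biased(x, y, bandwidth):
    """Biased (V-statistic) estimate of the squared MMD between x and y."""
    kxx = gaussian_kernel(x, x, bandwidth).mean()
    kyy = gaussian_kernel(y, y, bandwidth).mean()
    kxy = gaussian_kernel(x, y, bandwidth).mean()
    return kxx + kyy - 2.0 * kxy


def mmd_permutation_test(x, y, bandwidth=50.0, n_perm=500, seed=0):
    """Return (squared MMD, permutation p-value) for two samples.

    The null distribution is approximated by randomly reassigning the
    pooled reads to the two samples (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    observed = mmd2_biased(x, y, bandwidth)

    pooled = np.concatenate([x, y])
    n_x = len(x)
    exceed = 0
    for _ in range(n_perm):
        shuffled = rng.permutation(pooled)
        stat = mmd2_biased(shuffled[:n_x], shuffled[n_x:], bandwidth)
        if stat >= observed:
            exceed += 1

    # Add-one correction so the p-value is never exactly zero.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Hypothetical gene of length 1000: sample A covers it uniformly,
    # sample B has most of its reads in the first half.
    reads_a = rng.uniform(0, 1000, 200)
    reads_b = np.concatenate([rng.uniform(0, 500, 150),
                              rng.uniform(500, 1000, 50)])
    stat, p = mmd_permutation_test(reads_a, reads_b)
    print(f"MMD^2 = {stat:.4f}, p = {p:.4f}")
```

Because the test compares read distributions directly, a sketch like this needs no transcript annotation as input, which mirrors the annotation-free setting described in the abstract. The bandwidth of 50 positions is an arbitrary choice for illustration.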