About RNAweasel and MFannot


RNAweasel predicts complex structured mitochondrial RNAs, using ERPIN (1) as its search engine. ERPIN's search algorithm is based on RNA secondary structure profiles, which are computed from RNA sequence alignments plus user-defined secondary structure information. Much of its efficiency stems from the definition of precisely delimited structural elements that can be searched individually or in combination, following a defined search order ('search strategy'). It is currently the second-most sensitive search algorithm for structured RNAs, following the outstanding covariance-based Infernal program based on Cove (2). Infernal, however, is currently not suited for RNA molecules longer than 500 nt, and it does not allow the modeling of pseudoknots, which are an essential feature of some structured RNAs.
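To make the idea of a secondary structure profile more concrete, here is a minimal Python sketch. It is not ERPIN's implementation. It assumes an ungapped toy alignment, a dot-bracket structure string without pseudoknots, a uniform background and a pseudocount of 1, and all function names are hypothetical. Unpaired columns get a per-base log-odds score, and paired columns are scored jointly as base pairs.

```python
import math
from collections import Counter

BASES = "ACGU"
PAIRS = [a + b for a in BASES for b in BASES]

def log_odds(counts, symbols, pseudocount=1.0):
    """Log2-odds of each symbol against a uniform background (an assumption)."""
    total = sum(counts[s] for s in symbols) + pseudocount * len(symbols)
    background = 1.0 / len(symbols)
    return {s: math.log2(((counts[s] + pseudocount) / total) / background)
            for s in symbols}

def build_profile(alignment, structure):
    """Build one profile element per unpaired column and per base pair."""
    stack, paired = [], {}
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            paired[stack.pop()] = pos  # opening position -> closing position
    closing = set(paired.values())
    profile = []
    for pos in range(len(structure)):
        if pos in paired:
            i, j = pos, paired[pos]
            counts = Counter(seq[i] + seq[j] for seq in alignment)
            profile.append(((i, j), log_odds(counts, PAIRS)))
        elif pos not in closing:
            counts = Counter(seq[pos] for seq in alignment)
            profile.append(((pos,), log_odds(counts, BASES)))
    return profile

def score(profile, candidate):
    """Sum the log-odds over all elements; unknown symbols count as 0."""
    return sum(scores.get("".join(candidate[p] for p in positions), 0.0)
               for positions, scores in profile)

# Toy training set: a 3-bp stem closing a 4-nt loop (invented for illustration).
structure = "(((....)))"
alignment = ["GGCGAAAGCC", "GCCGAAAGGC", "ACGGAAACGU", "GGAGCAAUCC"]
profile = build_profile(alignment, structure)

print(score(profile, "GGCGAAAGCC"))  # candidate with intact stem
print(score(profile, "GGCGAAAGGG"))  # candidate with two broken base pairs
```

Scoring pairs jointly rather than column by column is what lets such a profile reward compensatory changes (for example G-C replaced by A-U) that a plain sequence profile would penalize.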

The availability of correctly aligned RNA sequences as training sets and the deduction of precise secondary structure definitions are the key prerequisites for using ERPIN. We have therefore developed the RNAweasel tools for the compilation and manipulation of sequence training sets. They include easy visualization and editing of alignments and structure definitions (using GDE; 3), automatic alignment of ERPIN results, normalization of training set sequences, and a reiterative search mode that helps to build training sets starting from just a few initial sequences. The set of optimized intron training sets used in this service is discussed in (4), and a recent application to finding unorthodox trans-spliced group I introns in Trichoplax mtDNA is described in (5). The current version of RNAweasel is based on ERPIN version 5.2.1.
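The reiterative search mode can be pictured as a simple loop: search with the current training set, add any new hits, and repeat until nothing new turns up. The Python sketch below illustrates only that control flow. The `toy_search` function is a hypothetical stand-in (it accepts database sequences within one mismatch of any training sequence) and is not how ERPIN or RNAweasel score hits; in practice new hits would also be aligned and inspected before being added.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def toy_search(training, database, max_mismatches=1):
    """Hypothetical stand-in for a profile search: near-identical sequences hit."""
    return [seq for seq in database
            if any(len(seq) == len(t) and hamming(seq, t) <= max_mismatches
                   for t in training)]

def grow_training_set(seed, database, search, max_rounds=10):
    """Repeat the search, adding new hits, until a round finds nothing new."""
    training = list(seed)
    for _ in range(max_rounds):
        new_hits = [h for h in search(training, database) if h not in training]
        if not new_hits:
            break
        training.extend(new_hits)
    return training

# Invented example data: the third sequence is only reachable in round two.
seed = ["GGCGAAAGCC"]
database = ["GGCGCAAGCC", "GGCGCAAGCU", "AAAAAAAAAA"]
training = grow_training_set(seed, database, toy_search)
print(training)
```

The `max_rounds` cap is a design choice for the sketch: it keeps the loop bounded even if a permissive search keeps pulling in marginal hits.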

RNAweasel searches mitochondrial and plastid genomes for:

We have developed MFannot, an automated annotation tool for mitochondrial and plastid genomes whose output requires little if any manual correction (reducing annotation time from often many days of manual work to a few minutes). It makes intensive use of the RNA/intron detection tools described above, and is particularly helpful with organelle genomes that contain many introns. Intron-exon boundaries are identified by a combination of intron splice rules and exon similarities, and are thus precise in most instances. The output of MFannot is a listing of gene coordinates either in XML format, a format that can be directly loaded into NCBI sequence submission tools, or in masterfile format (computer-parsable as well as human-readable; annotations embedded into the sequence). In its current form, MFannot does not properly process bilaterian mtDNAs, and we do not plan to add this capability.

Results will be provided by email only. We recommend that you view the file attached to the email with an ASCII viewer that is set to a fixed-width font such as Monospace. In case of problems, comments or questions, please contact

Franz.Lang [at] Umontreal.ca


(1) Gautheret D and Lambert A (2001)
      Direct RNA motif definition and identification from multiple sequence alignments using secondary structure profiles.
      J Mol Biol 313:1003-1011

(2) Eddy SR and Durbin R (1994)
      RNA sequence analysis using covariance models.
      Nucleic Acids Res 22:2079-2088

(3) Smith SW, Overbeek R, Woese CR, Gilbert W and Gillevet PM (1994)
      The genetic data environment: an expandable GUI for multiple sequence analysis.
      Comput Appl Biosci 10:671-675

(4) Lang BF, Laforest M-J and Burger G (2007)
      Mitochondrial introns: a critical view.
      Trends Genet 23

(5) Burger G, Yan Y, Javadi P and Lang BF (2009)
      Group I-intron trans-splicing and mRNA editing in mitochondria of placozoan animals.
      Trends Genet 25:381-386


Support has been generously provided by NSERC, Genome Quebec/Canada and the Canada Research Chairs program. Special thanks to Guy Troughton (guytroughton@bigpond.com) for permitting us to use his artistic weasel drawing.