About RNAweasel and MFannot
RNAweasel predicts complex structured mitochondrial RNAs, using ERPIN (1)
as its engine. ERPIN's search algorithm is based on RNA secondary
structure profiles, which are computed from RNA sequence alignments plus
user-defined secondary structure information. Much of its
efficiency stems from the definition of precisely delimited structural
elements that can be searched individually or in combination, by using
a defined search order ('search strategy'). It is currently the
second-most sensitive search algorithm for structured RNAs, following
the outstanding covariance-based Infernal program based on Cove (2).
Infernal, however, is currently not suited for RNA molecules longer
than 500 nt, and does not allow the modeling of pseudoknots, which are
an essential feature of some structured RNAs.
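To make the idea of a secondary structure profile more concrete, the following minimal Python sketch builds log-odds scores from a toy alignment of a hairpin (a 3-bp helix enclosing a 4-nt loop) and scores candidate sequences against it. It is an illustration only, not ERPIN's implementation: the alignment, the pseudocount, the uniform background frequencies, and all function names are assumptions made for this example.

```python
import math
from collections import Counter

BASES = "ACGU"
PAIRS = [a + b for a in BASES for b in BASES]  # all 16 possible base pairs


def strand_profile(columns, pseudo=1.0):
    """Per-column log-odds scores for a single-stranded element."""
    background = 1.0 / len(BASES)  # assumed uniform background
    profile = []
    for col in columns:
        counts = Counter(col)
        total = len(col) + pseudo * len(BASES)
        profile.append({b: math.log2(((counts[b] + pseudo) / total) / background)
                        for b in BASES})
    return profile


def helix_profile(left_cols, right_cols, pseudo=1.0):
    """Per-base-pair log-odds scores; left_cols[i] pairs with right_cols[i]."""
    background = 1.0 / len(PAIRS)  # assumed uniform background
    profile = []
    for lcol, rcol in zip(left_cols, right_cols):
        counts = Counter(l + r for l, r in zip(lcol, rcol))
        total = len(lcol) + pseudo * len(PAIRS)
        profile.append({p: math.log2(((counts[p] + pseudo) / total) / background)
                        for p in PAIRS})
    return profile


def score(seq, helix, loop):
    """Sum helix (pair) and loop (single-base) scores for a hairpin candidate."""
    n, h = len(seq), len(helix)
    total = sum(helix[i][seq[i] + seq[n - 1 - i]] for i in range(h))
    total += sum(loop[j][seq[h + j]] for j in range(len(loop)))
    return total


# Toy training alignment (hypothetical): positions 0-2 pair with 9-7,
# positions 3-6 form the loop.
alignment = ["GGCGAAAGCC", "GCCGAAAGGC", "GGCGCAAGCC", "GACGAAAGUC"]
cols = list(zip(*alignment))
helix = helix_profile(cols[0:3], [cols[9], cols[8], cols[7]])
loop = strand_profile(cols[3:7])

good = score("GGCGAAAGCC", helix, loop)    # pairing preserved
broken = score("GGCGAAAGGG", helix, loop)  # 3' strand can no longer pair
```

In this sketch the candidate that preserves base pairing scores higher than the one whose 3' strand is mutated, because the helix is scored per base pair rather than per nucleotide. Defining the helix and the loop as separate elements is also what would allow them to be searched individually or in a chosen order.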
The availability of correctly aligned RNA sequences as training sets,
and the deduction of precise secondary structure definitions, are the
key prerequisites for using ERPIN. We have
therefore developed the RNAweasel tools for the compilation of
sequence training sets, including easy visualization and editing of
alignments and structure definitions (using GDE; 3), automatic
alignment of ERPIN results, normalization of training set sequences,
and a reiterative search mode that helps to build training sets
starting from just a few initial sequences. The set of optimized intron
training sets used in this service is discussed in (4), and a recent
application to finding unorthodox trans-spliced group I introns in
Trichoplax mtDNA is described in (5). The current version of RNAweasel
is based on ERPIN version 5.2.1.
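The reiterative search mode mentioned above can be pictured as a loop that searches with the current training set, adds any new hits to it, and repeats until nothing new is found. The Python sketch below shows only this control flow. `toy_search` is a hypothetical stand-in for an ERPIN run (it simply accepts database sequences within one mismatch of any training sequence), and the function names, sequences, and threshold are all assumptions made for this example.

```python
def hamming(x, y):
    """Number of mismatches between two equal-length sequences."""
    return sum(a != b for a, b in zip(x, y))


def toy_search(training, database, max_mismatch=1):
    """Hypothetical stand-in for a profile search: return database
    sequences close to at least one training sequence."""
    return [s for s in database
            if any(hamming(s, t) <= max_mismatch for t in training)]


def build_training_set(seeds, database, search, max_rounds=10):
    """Grow a training set from a few seeds by repeated searching."""
    training = list(seeds)
    for _ in range(max_rounds):
        hits = search(training, database)
        new = [h for h in hits if h not in training]
        if not new:          # converged: no additional sequences found
            break
        training.extend(new)
    return training


# Hypothetical example data
database = ["GGCGAAAGCC", "GGCGCAAGCC", "GGCGCAAGUC", "AUAUAUAUAU"]
result = build_training_set(["GGCGAAAGCC"], database, toy_search)
```

Here the third database sequence is two mismatches away from the seed and is only recovered in the second round, after the intermediate sequence has joined the training set. A real workflow would presumably re-align and inspect the new hits between rounds rather than accept them automatically.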
RNAweasel searches mitochondrial and plastid genomes for:
- introns of group I and group II
- RNase P RNA (rnpB)
- 5S rRNA (rrn5) and small subunit rRNA (rns)
We have developed MFannot, an automated annotation tool for
mitochondrial and plastid genomes that requires little if any manual
correction (reducing annotation time from often many days of manual
work to a few minutes). It makes
intense use of the RNA/intron detection tools described above, and is
particularly helpful with organelle genomes that contain lots of
introns. Intron-exon boundaries are identified by a combination of
intron splice rules and exon similarities, and are thus precise in most
instances. MFannot outputs listings of gene coordinates either
in XML format, a format that can be directly loaded into NCBI sequence
submission tools, or in masterfile format (computer-parsable as well as
human-readable; annotations embedded into the sequence). In its current
form, MFannot does not properly process bilaterian mtDNAs, and we do
not plan to add support for them.
Results will be provided by email only.
We recommend that you view the file attached to the email with an ASCII
viewer that is set to a fixed-width font such as Monospace.
In case of problems, comments or questions, please contact
Franz.Lang [at] Umontreal.ca
(1) Gautheret, D., and A. Lambert (2001)
Direct RNA motif
definition and identification from multiple sequence alignments using
secondary structure profiles.
J Mol Biol 313:1003-1011
(2) Eddy SR and Durbin R (1994)
RNA sequence analysis using covariance
Nucleic Acids Res 22: 2079-88
(3) Smith, S. W., R. Overbeek, C. R. Woese, W. Gilbert, and P. M.
The genetic data environment an expandable GUI
multiple sequence analysis.
Comput Appl Biosci 10:671-675
(4) Lang B.F., M-J. Laforest, and G. Burger (2007)
introns: a critical view.
Trends Genet 23:119-25.
(5) Burger G, Yan Y, Javadi P and Lang BF (2009)
Group I-intron trans-splicing and mRNA
editing in mitochondria of placozoan animals.
Trends Genet 25:
Support has been generously provided by NSERC, Genome Quebec/Canada
and the Canadian Research Chair program. Special thanks to Guy
for permitting us to use his
artistic weasel drawing.