The masterfile (MF) integrates a DNA sequence with its feature
annotations, which are embedded at the corresponding positions in the
sequence. The naming rules for genetic elements have
been designed such that they are easily interpretable by a human reader,
but are at the same time computer-readable (the MF format is a superset
of the FASTA format).
We have developed a number of tools that take MF files as input (basic
sequence operations, syntax and logic checking, and extraction of genes and
intergenic regions). One particularly useful tool, developed in collaboration
with NCBI, translates an MF directly into a GenBank record (ASN.1 and GBF
flatfile formats), making it much easier to submit sequences to NCBI's
publicly accessible data repositories.
The current document describes the conventions for the MF annotations
that occur in contig headers, comment lines and feature lines.
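
As a rough illustration of how annotations embedded in a FASTA-like file can be separated from the sequence, the following Python sketch groups the lines of an MF-style text by contig. It is not one of the tools mentioned above. It assumes that contig headers begin with ">" (as in FASTA), that annotation lines (comments and features alike) begin with ";", and that every other line carries sequence, possibly mixed with whitespace or position numbers. The function name split_masterfile and the sample text are hypothetical.

```python
def split_masterfile(text):
    """Group MF-style lines into contigs (illustrative sketch only).

    Assumptions, not taken from the MF specification:
      - a contig header line starts with '>'
      - an annotation line (comment or feature) starts with ';'
      - any other line is sequence; non-letter characters are dropped
    """
    contigs = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = {"header": line[1:].strip(),
                       "annotations": [],
                       "sequence": ""}
            contigs.append(current)
        elif current is None:
            # Lines before the first contig header are ignored in this sketch.
            continue
        elif line.startswith(";"):
            # Record the annotation together with the number of bases read
            # so far, i.e. the position at which it is embedded.
            current["annotations"].append((len(current["sequence"]), line))
        else:
            current["sequence"] += "".join(ch for ch in line if ch.isalpha())
    return contigs


# Hypothetical sample; the annotation text is a placeholder, not real MF syntax.
sample = ">contig1\n; first annotation\nACGT\n; second annotation\nTTAA\n"
for contig in split_masterfile(sample):
    print(contig["header"], len(contig["sequence"]), contig["annotations"])
```

Storing each annotation with the count of bases preceding it preserves the link between an annotation and its position in the sequence, which is the property that distinguishes an MF from a plain FASTA file.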