The OGMP Masterfile Format

The masterfile (MF) integrates a DNA sequence with its feature annotations, which are embedded at the corresponding positions in the sequence. The naming rules for genetic elements are designed to be easily interpreted by a human reader while remaining computer-readable (the MF format is a superset of the FASTA format).
We have developed a number of tools that take MF files as input (basic sequence operations, syntax and logic checking, and extraction of genes and intergenic regions). Particularly useful is one, developed in collaboration with NCBI, that permits translation of an MF directly into a GenBank record (ASN.1 and GBF flatfile formats), making it much easier to submit sequences to NCBI's publicly accessible data repositories.
The current document describes the conventions for the MF annotations that occur in contig headers, comment lines and feature lines.
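
To illustrate how annotations embedded in the sequence can be read by a program, here is a minimal Python sketch. It is not one of the OGMP tools. It assumes, purely for illustration, that contig headers begin with ">" (as in FASTA) and that comment and feature lines begin with ";" (as in older FASTA conventions); the actual MF markers are those defined in this document. The function name split_masterfile and the dictionary keys are hypothetical.

```python
def split_masterfile(text):
    """Split MF-like text into contigs.

    Assumptions (illustrative only, not the official MF rules):
      - a line starting with '>' opens a new contig (FASTA-style header)
      - a line starting with ';' is a comment or feature annotation
      - every other non-empty line carries sequence residues
    """
    contigs = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = {"header": line[1:].strip(), "sequence": "", "annotations": []}
            contigs.append(current)
        elif current is None:
            # Ignore anything that appears before the first contig header.
            continue
        elif line.startswith(";"):
            # Record the annotation together with its 0-based offset, i.e. the
            # number of residues read so far in this contig.
            offset = len(current["sequence"])
            current["annotations"].append((offset, line[1:].strip()))
        else:
            # Keep letters only, dropping whitespace and any digits.
            current["sequence"] += "".join(ch for ch in line if ch.isalpha())
    return contigs


example = """>contig1 example
ACGT
; feature starts here
TTGA
"""
contigs = split_masterfile(example)
```

Because each annotation is stored with the number of residues that precede it, the annotation text stays tied to its position even after the sequence lines are joined into a single string. A plain FASTA file passes through the same function unchanged, with an empty annotation list for each contig.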

Definition of the OGMP masterfile (MF) format

Molecular Sequence Database Definitions

