The Organelle Genome Megasequencing Program (OGMP) shares many of the data management and analysis challenges experienced by other genomics groups: gene identification/inference of function, gene product structural predictions, and integration of molecular, genetic, biochemical and genome organization data. Specific aims of the OGMP informatics efforts include the development of software tools for accessing existing large networked databases; enhancement of existing genome analysis toolsets; development of additional tools for management of large sequencing projects; and provision of comparative genomics analysis tools, with an emphasis on phylogenetics. An overview of the data management infrastructure will be presented. This infrastructure covers the validation of sequence data; assembly using third-party software; and maintenance of sequence feature annotations throughout the numerous assembly cycles up to the finishing process and subsequent submission of the sequences to public data repositories.
The recently initiated Organelle Genome Database Project (GOBASE) addresses the present difficulty, indeed impossibility, of accessing all of the relevant information associated with organelles. Data are often dispersed among a number of sources, are difficult to locate, are incomplete and may also contain errors. In their current disorganized state, organelle genomic data constitute a major underexploited source of information. GOBASE is intended to rectify this situation. The project is currently in a late prototype phase of development; the different components of the system and the ways in which they interact will be presented. The particular design decisions taken and the informatics tools developed specifically for the project will also be discussed. Supported by MRC Canada (SP-34) and CGAT (GO-12323 and GO-12984).