SCaFoS - Output format

Selection, Concatenation and Fusion of Sequences

Output files

In the output directory, and possibly its sub-directories, SCaFoS writes various files depending on how it is used; not all output files are produced on every run, since this depends on the chosen options. In the tables below, out stands for the name of the output directory.

Making of file.otu

out-freq.otu	List of species that appear in at least one aligned file; species are displayed in decreasing order of frequency
out-name.otu	List of species that appear in at least one aligned file; species are displayed in alphabetical order
out-taxa.otu	List of species that appear in at least one aligned file; species are displayed in alphabetical order according to a systematic classification provided by the user
out.log	Information about the genes and species content of each aligned file
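
The two simplest orderings above (decreasing frequency and alphabetical) can be illustrated with a short Python sketch. This is not SCaFoS's own code: the function name, the input dictionary, and the file names are hypothetical, and breaking frequency ties alphabetically is an assumption.

```python
from collections import Counter


def species_orderings(files_to_species):
    """Return (by_frequency, by_name) orderings of species names.

    files_to_species: dict mapping an aligned-file name to the set of
    species found in it (hypothetical input, built by the caller).
    """
    counts = Counter()
    for species in files_to_species.values():
        counts.update(set(species))  # count each species once per file
    # Decreasing frequency; ties broken alphabetically (an assumption).
    by_frequency = sorted(counts, key=lambda name: (-counts[name], name))
    by_name = sorted(counts)
    return by_frequency, by_name


# Hypothetical example: three aligned files and the species they contain.
example = {
    "gene1.ali": {"Homo sapiens", "Mus musculus", "Danio rerio"},
    "gene2.ali": {"Homo sapiens", "Mus musculus"},
    "gene3.ali": {"Homo sapiens"},
}
by_frequency, by_name = species_orderings(example)
print(by_frequency)
print(by_name)
```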

Assembling the datasets

out.li	Output in MUST format of the concatenated sequences, according to the OTUs defined in the OTU file
out.fasta	Output in FASTA format of the concatenated sequences, according to the OTUs defined in the OTU file
out.phylip	Output in PHYLIP format of the concatenated sequences, according to the OTUs defined in the OTU file
out.nex	Output in NEXUS format of the concatenated sequences, according to the OTUs defined in the OTU file, optionally with commands added by the user (caution: these commands are not checked by the program)
out_PAUP.nex	Output in NEXUS format of the concatenated sequences, according to the OTUs defined in the OTU file; commands for PAUP are added to this NEXUS file
out_MB.nex	Output in NEXUS format of the concatenated sequences, according to the OTUs defined in the OTU file; commands for MrBayes are added to this NEXUS file
out.len	Maximal sequence length, number of OTUs and position in the concatenated file, displayed for each aligned sequence file
out.dist	Evolutionary distances between sequences within each OTU and each file. These files can be useful for checking which OTUs were removed because their distances were too divergent. The composition of the sequences is also displayed.
out.outdist	State file that contains a table of percent of missing positions for each OTU within each file
out.stat	State file that contains a table of percent of missing positions for each OTU within each file
out.cmp	State file that contains a table of composition in amino acids or nucleotides for each sequence within each file
out.seq4otu	Within each file, list of the sequences selected for each OTU. The creation of chimeras and the existence of any sequence within an OTU are indicated
out.chim	Description of each sequence fragment used to create chimeras
out.misotu	List of files according to their number of missing OTUs

The state file is particularly useful for deciding which files or which groups should be eliminated because of the amount of missing data they contain.
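
The layout of the state file is not described here, so the following Python sketch does not parse it. Instead, it shows how a comparable percentage of missing positions per OTU could be recomputed from a FASTA file such as out.fasta. The helper names are hypothetical, and treating `?` and `-` as missing positions is an assumption that may need adjusting to the data.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a dict {header: sequence}."""
    records = {}
    current = None
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:]
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return {header: "".join(parts) for header, parts in records.items()}


# Assumption: these characters mark missing positions in the alignment.
MISSING = frozenset("?-")


def percent_missing(sequence, missing=MISSING):
    """Return the percentage of positions in sequence that are missing."""
    if not sequence:
        return 100.0  # an empty sequence is treated as entirely missing
    n_missing = sum(1 for char in sequence if char in missing)
    return 100.0 * n_missing / len(sequence)


# Hypothetical two-OTU alignment, standing in for the content of out.fasta.
fasta_lines = """>OTU_A
ACGT--??
>OTU_B
ACGT
ACGT
""".splitlines()

records = read_fasta(fasta_lines)
table = {otu: percent_missing(seq) for otu, seq in records.items()}
for otu, value in table.items():
    print(f"{otu}\t{value:.1f}")
```

To use it on a real file, the lines can be read with `open("out.fasta")` in place of the inline example.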

This analysis can be extended by creating the subdirectories misgen and missit, into which aligned files are copied according to the maximum number of missing OTUs and the maximum percentage of missing sites, respectively, within each aligned file. These separate aligned sequence files are rebuilt with only the OTUs that meet the selection criteria currently in use. For each misgen_X subdirectory, concatenated files are created together with the corresponding state file and the corresponding OTU file.
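
As an illustration of the first criterion (maximum number of missing OTUs), here is a minimal Python sketch. It is not SCaFoS's implementation: the function name, the input dictionary, and the file and OTU names are hypothetical.

```python
def select_by_missing_otus(otus_per_file, all_otus, max_missing):
    """Return the sorted names of files lacking at most max_missing OTUs.

    otus_per_file: dict mapping a file name to the OTUs present in it.
    all_otus: every OTU expected (for example, those of the OTU file).
    """
    expected = set(all_otus)
    selected = []
    for name, present in otus_per_file.items():
        n_missing = len(expected - set(present))
        if n_missing <= max_missing:
            selected.append(name)
    return sorted(selected)


# Hypothetical data: four expected OTUs and three aligned files.
all_otus = ["OTU_A", "OTU_B", "OTU_C", "OTU_D"]
otus_per_file = {
    "gene1.ali": {"OTU_A", "OTU_B", "OTU_C", "OTU_D"},  # 0 missing
    "gene2.ali": {"OTU_A", "OTU_B", "OTU_C"},           # 1 missing
    "gene3.ali": {"OTU_A"},                             # 3 missing
}

kept = select_by_missing_otus(otus_per_file, all_otus, max_missing=1)
print(kept)
```

The same pattern would apply to the second criterion by replacing the count of missing OTUs with a percentage of missing sites per file.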

Selected files

Depending on the user's choice, the selected files of aligned sequences are placed in the output directory.

Hervé PHILIPPE's Lab