SCaFoS
Selection, Concatenation and Fusion of Sequences

Output files

In the output directory, and possibly in its subdirectories, SCaFoS writes various files depending on how it is used; not all output files are always produced, since this depends on the chosen options. In the following tables, out stands for the name of the output directory.


Making of file.otu

out-freq.otu List of species that appear in at least one aligned file; species are displayed in decreasing order of frequency
out-name.otu List of species that appear in at least one aligned file; species are displayed in alphabetical order
out-taxa.otu List of species that appear in at least one aligned file; species are displayed in alphabetical order according to a systematic classification provided by the user
out.log Information about the genes and species content of each aligned file
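
As an illustration of the two simplest orderings described above, here is a minimal Python sketch. It is not the SCaFoS implementation: the input dictionary, the file names (gene1.ali, ...) and the alphabetical tie-breaking rule are assumptions made for the example, and the actual layout of the .otu files is not reproduced.

```python
from collections import Counter


def species_orderings(files_to_species):
    """Return (by_frequency, by_name, counts) for the species found in the files.

    files_to_species maps a (hypothetical) aligned file name to the set of
    species names found in that file.
    """
    counts = Counter()
    for species in files_to_species.values():
        counts.update(set(species))  # count each species once per file
    by_name = sorted(counts)
    # Decreasing frequency; ties are broken alphabetically (an assumption).
    by_frequency = sorted(counts, key=lambda s: (-counts[s], s))
    return by_frequency, by_name, counts


# Hypothetical content of three aligned files.
example = {
    "gene1.ali": {"Homo sapiens", "Mus musculus", "Danio rerio"},
    "gene2.ali": {"Homo sapiens", "Danio rerio"},
    "gene3.ali": {"Homo sapiens"},
}
by_frequency, by_name, counts = species_orderings(example)
```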


Assembling the datasets

out.li Concatenated sequences in MUST format, built according to the OTUs defined in the OTU file
out.fasta Concatenated sequences in FASTA format, built according to the OTUs defined in the OTU file
out.phylip Concatenated sequences in PHYLIP format, built according to the OTUs defined in the OTU file
out.nex Concatenated sequences in NEXUS format, built according to the OTUs defined in the OTU file, optionally with commands added by the user (caution: these commands are not checked by the program)
out_PAUP.nex Concatenated sequences in NEXUS format, built according to the OTUs defined in the OTU file; commands for PAUP are added to this NEXUS file
out_MB.nex Concatenated sequences in NEXUS format, built according to the OTUs defined in the OTU file; commands for MrBayes are added to this NEXUS file
out.len For each file of aligned sequences, the maximal sequence length, the number of OTUs and the position in the concatenated file
out.dist Evolutionary distances between the sequences within each OTU and each file. This file can be used to check which OTUs are removed because their distances are too divergent. The composition of the sequences is also displayed.
out.outdist State file that contains a table of percent of missing positions for each OTU within each file
out.stat State file that contains a table of percent of missing positions for each OTU within each file
out.cmp State file that contains a table of composition in amino acids or nucleotides for each sequence within each file
out.seq4otu For each file, list of the sequences selected for each OTU. The creation of a chimera and the existence of any sequence within an OTU are indicated
out.chim Description of each sequence fragment used to create a chimera
out.misotu List of files according to their number of missing OTUs
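
To make the relation between the concatenated files and out.len more concrete, the following Python sketch concatenates a few aligned files per OTU and records, for each file, its length, its number of OTUs and its position in the concatenation. This is only a sketch under stated assumptions: the function name, the in-memory data layout and the use of "?" to fill in OTUs that are absent from a file are all hypothetical and are not taken from SCaFoS.

```python
def concatenate(alignments, otus, missing="?"):
    """Concatenate aligned files per OTU and record where each file lands.

    alignments: list of (file_name, {otu: aligned_sequence}) in concatenation
    order; every file is assumed to contain at least one sequence.
    Returns ({otu: concatenated_sequence},
             [(file_name, length, n_otus, start, end), ...]), 1-based positions.
    """
    parts = {otu: [] for otu in otus}
    positions = []
    start = 1
    for name, seqs in alignments:
        length = max(len(s) for s in seqs.values())  # maximal sequence length
        for otu in otus:
            # An absent OTU (or a shorter sequence) is padded with the
            # missing-data symbol; "?" is an assumption of this example.
            parts[otu].append(seqs.get(otu, "").ljust(length, missing))
        positions.append((name, length, len(seqs), start, start + length - 1))
        start += length
    return {otu: "".join(p) for otu, p in parts.items()}, positions


# Hypothetical input: two small aligned files and two OTUs.
alignments = [
    ("gene1", {"A": "ACGT", "B": "AC-T"}),
    ("gene2", {"A": "GG"}),
]
concatenated, positions = concatenate(alignments, ["A", "B"])
```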

The state file is particularly useful for deciding which files or which groups should be eliminated because of their amount of missing data.
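
One possible way to use such a table of missing-data percentages is sketched below in Python. The nested dictionary stands in for the state file, whose real layout is not reproduced here; the function name, the use of mean percentages and the threshold value are assumptions chosen for illustration, not a rule applied by SCaFoS.

```python
def flag_by_missing(table, threshold):
    """Flag files and OTUs whose mean percentage of missing positions is high.

    table: {file_name: {otu: percent_missing}}; an OTU absent from a file is
    treated as 100% missing (an assumption of this sketch).
    Returns (files_to_review, otus_to_review), both sorted.
    """
    otus = sorted({otu for row in table.values() for otu in row})
    file_means = {
        f: sum(row.get(o, 100.0) for o in otus) / len(otus)
        for f, row in table.items()
    }
    otu_means = {
        o: sum(row.get(o, 100.0) for row in table.values()) / len(table)
        for o in otus
    }
    files_to_review = sorted(f for f, m in file_means.items() if m > threshold)
    otus_to_review = sorted(o for o, m in otu_means.items() if m > threshold)
    return files_to_review, otus_to_review


# Hypothetical percentages of missing positions.
table = {
    "gene1": {"A": 0.0, "B": 10.0},
    "gene2": {"A": 20.0, "B": 100.0},
}
files_to_review, otus_to_review = flag_by_missing(table, threshold=50.0)
```

The flagged names are only candidates for removal; the final decision remains with the user.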

This analysis can be complemented by creating the subdirectories misgen and missit, into which aligned files are copied according to the maximum number of missing OTUs and the maximum percentage of missing sites, respectively, within each aligned file. These separate aligned sequence files are rebuilt with only the OTUs that meet the selection criteria currently in use. For each misgen_X subdirectory, concatenated files are created together with the corresponding state file and the corresponding OTU file.
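
The grouping by number of missing OTUs can be pictured with the following Python sketch. It assumes that a misgen_X group contains the files with at most X missing OTUs, which is one reading of "maximum number of missing OTUs"; the function name and the input layout are hypothetical, and no files are actually copied.

```python
def group_by_missing_otus(presence, all_otus, thresholds):
    """Group files by their number of missing OTUs.

    presence: {file_name: set of OTUs present in that file}.
    Returns {X: sorted list of files with at most X missing OTUs}.
    """
    wanted = set(all_otus)
    missing = {f: len(wanted - set(otus)) for f, otus in presence.items()}
    return {
        x: sorted(f for f, m in missing.items() if m <= x)
        for x in thresholds
    }


# Hypothetical presence data for three aligned files and three OTUs.
presence = {
    "g1": {"A", "B", "C"},
    "g2": {"A"},
    "g3": {"A", "B"},
}
groups = group_by_missing_otus(presence, ["A", "B", "C"], thresholds=[0, 1, 2])
```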


Selected files

Depending on the user's choice, the selected files of aligned sequences are placed in the output directory.


Hervé PHILIPPE's Lab