_YACO_ (_Yet Another Contig Ordering_) is a command line program written in [Bash](https://www.gnu.org/software/bash/) to determine both the orientation and order of a set of contigs (generally, a draft genome assembly) according to a complete reference genome. Based on reciprocal BLAST searches, _YACO_ is a simple and practical tool, that generally achieves accurate results with acceptable running times (e.g. less than 10 seconds to process a 5 Mbp draft genome using 12 threads).
_YACO_ (_Yet Another Contig Ordering_) is a command line program written in [Bash](https://www.gnu.org/software/bash/) for orienting and ordering contigs (generally, a draft genome assembly) according to a closed reference genome. Based on reciprocal BLAST searches, _YACO_ is a simple and practical tool, which generally achieves accurate results with acceptable running times (e.g. less than 10 seconds to process a 5 Mbp draft genome using 12 threads).
_YACO_ runs on UNIX, Linux and most OS X operating systems.
...
...
@@ -66,7 +66,7 @@ Run _YACO_ without option to read the following documentation:
## Notes
* First, given a fixed fragment length _w_ (option `-w`; default: 400 bp), _YACO_ partitions each contig (option `-i`) into consecutive fragments _f<sub>i</sub>_, and decomposes the reference sequence(s) (option `-r`) into overlapping fragments _f<sub>r</sub>_ (step _w_ ∕ 2). Next, each set of fragments is searched against the other using _blastn_ (Altschul et al. 1990; Camacho et al. 2008) with tuned parameters (as suggested by Goris et al. 2007). Orthologous fragments are assessed by reciprocal best BLAST hits showing ≥ 30% overall fragment identity on an alignable region ≥ 35% of the fragment length (as suggested by Lee et al. 2016). Every fragment _f<sub>i</sub>_ associated with an orthologous one _f<sub>r</sub>_ is ranked by the position of _f<sub>r</sub>_ within the reference. A contig is localized when a sufficient proportion of its fragments _f<sub>i</sub>_ are ranked (as set by option `-p`, default: ≥ 50%). Every localized contig is replaced by its reverse-complement when most of its ranked fragments _f<sub>i</sub>_ are on the opposite strand relative to their orthologous fragments _f<sub>r</sub>_. Finally, the localized contigs are sorted according to the median rank of their fragments _f<sub>i</sub>_. Oriented and ordered contigs are then written into the specified output file.
* First, given a fixed fragment length _w_ (option `-w`; default: 400 bp), _YACO_ partitions each contig (option `-i`) into consecutive fragments _f<sub>i</sub>_, and decomposes the reference sequence(s) (option `-r`) into overlapping fragments _f<sub>r</sub>_ (step _w_ ∕ 2). Next, each set of fragments is searched against the other using _blastn_ (Altschul et al. 1990; Camacho et al. 2008) with tuned parameters (as suggested by Goris et al. 2007). Orthologous fragments are assessed by reciprocal best BLAST hits showing ≥ 30% overall fragment identity on an alignable region ≥ 35% of the fragment length (as suggested by Lee et al. 2016). Every fragment _f<sub>i</sub>_ associated with an orthologous one _f<sub>r</sub>_ is ranked by the position of _f<sub>r</sub>_ within the reference. A contig is localized when a sufficient proportion of its fragments _f<sub>i</sub>_ is ranked (as set by option `-p`, default: ≥ 50%). Every localized contig is replaced by its reverse-complement when most of its ranked fragments _f<sub>i</sub>_ are on the opposite strand relative to their orthologous fragments _f<sub>r</sub>_. Finally, the localized contigs are sorted according to the median rank of their fragments _f<sub>i</sub>_. Oriented and ordered contigs are then written into the specified output file.
* When the reference file (option `-r`) contains more than one sequence, (reciprocal) BLAST searches are performed against all of them, but the ordering/orienting procedure (see above) is carried out according to only the first one. This approach can be useful to make a better distinction between a reference chromosome and e.g. several reference plasmids.
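
The fragmenting, localizing, orienting and ordering steps described in these notes can be illustrated with a minimal Python sketch. This is not _YACO_'s actual Bash implementation: every function name below is hypothetical, the reciprocal best BLAST hits are assumed to be already computed and passed in as `(rank, same_strand)` pairs, trailing fragments shorter than _w_ are kept, and contigs that are not localized are simply left out. _YACO_'s real behavior on those last two points is not stated here.

```python
from statistics import median

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def consecutive_fragments(seq, w=400):
    """Split a contig into consecutive, non-overlapping fragments of length w.
    Assumption: a shorter trailing fragment is kept."""
    return [seq[i:i + w] for i in range(0, len(seq), w)]


def overlapping_fragments(seq, w=400):
    """Cut a reference into fragments of length w that start every w // 2 bases.
    Assumption: shorter trailing fragments are kept."""
    step = max(1, w // 2)
    return [seq[i:i + w] for i in range(0, len(seq), step)]


def reverse_complement(seq):
    """Return the reverse-complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def order_contigs(contigs, hits, n_fragments, p=0.5):
    """Localize, orient and order contigs.

    contigs:     dict mapping contig name -> sequence
    hits:        dict mapping contig name -> list of (rank, same_strand) pairs,
                 one per fragment with a reciprocal best hit in the reference
    n_fragments: dict mapping contig name -> total number of fragments
    p:           minimum proportion of ranked fragments for a contig to be localized
    """
    placed = []
    for name, seq in contigs.items():
        ranked = hits.get(name, [])
        if not ranked or len(ranked) / n_fragments[name] < p:
            continue  # not localized (assumption: dropped from the output)
        opposite = sum(1 for _, same_strand in ranked if not same_strand)
        if opposite > len(ranked) / 2:
            # most ranked fragments lie on the opposite strand
            seq = reverse_complement(seq)
        placed.append((median(rank for rank, _ in ranked), name, seq))
    placed.sort()  # order by median rank of the ranked fragments
    return [(name, seq) for _, name, seq in placed]
```

In this sketch, a contig whose ranked fragments mostly match the opposite strand is reverse-complemented before being placed, and contigs are emitted in increasing order of the median rank of their ranked fragments.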