1.0

153c3ef0 · Alexis CRISCUOLO · 005aa614 · 153c3ef0
Commit 153c3ef0 authored 3 years ago by Alexis CRISCUOLO
--- a/README.md
+++ b/README.md
@@ -59,6 +59,7 @@ Run _SimiPlot_ without option to read the following documentation:
  -X <int>     x-axis end (default: reference length)
  -y <int>     y-axis start (default: 0)
  -Y <int>     y-axis end (default: 100)
+  -d <int>     dot size factor (default: 1.0)
  -a <real>    aspect ratio (default: 3.0)
  -t <int>     number of threads (default: 2)
  -h           prints this help and exits
@@ -66,17 +67,17 @@ Run _SimiPlot_ without option to read the following documentation:
 ## Notes
-* For each input file, _SimiPlot_ decomposes the nucleotide sequence(s) into overlapping fragments (step = half the fragment length). Fragment length is set by option `-w` (default: reference sequence length divided by 1,000). Each fragment is searched against the reference sequence (option `-r`) using blastn (Altschul et al. 1990; Camacho et al. 2008) with tuned parameters (as suggested by Goris et al. 2007). For each fragment, only the best BLAST hit is considered (E-value threshold = 0.5). All BLAST hits are graphically represented as a scatter plot, where _x_ is the BLAST hit position within the reference, _y_ is the percentage of similarity, and the dot radius is proportional to the length of the aligned part of the fragment.
+* For each non-reference input file, _SimiPlot_ decomposes the nucleotide sequence(s) into overlapping equal-length fragments (step = half the fragment length). Each fragment is searched against the reference sequence (option `-r`) using blastn (Altschul et al. 1990; Camacho et al. 2008) with tuned parameters (as suggested by Goris et al. 2007). For each fragment, only the best BLAST hit is considered (E-value threshold = 0.5). All BLAST hits are graphically represented as a scatter plot, where _x_ is the BLAST hit position within the reference, _y_ is the percentage of similarity, and the dot radius is proportional to the length of the aligned part of the fragment.
 * Each input file should be in FASTA format, not compressed, and may contain one or more nucleotide sequences. At least one input file should be specified.
-* Faster running times can be obtained by using a large number of threads (option `-t`; default: 2; recommended: &geq; 10). 
+* Fragment length can be modified using option `-w`. By default, the fragment length is the reference sequence length divided by 1,000.
-* The smoothing option `-s` can sometimes be useful to reduce variability between neighbor dots, leading to clearer similarity representations.
+* Faster running times can be obtained by using a large number of threads (option `-t`; default: 2; recommended: &geq; 10). 
 * Specific regions can be represented by specifying start and end positions within the reference sequence using options `-x` and `-X`, respectively. By default, the whole reference sequence is represented. The y-axis range can also be modified using options `-y` and `-Y` (default: 0% and 100% similarity, respectively).
-* To obtain convenient figures, the aspect ratio (i.e. width/height) of the scatter plot can be modified using option `-a` (default: 3.0). Dot size can be controlled using option `-d`. Fragment length can also be modified using option `-w`, but at the risk of obtaining a less legible figure.
+* To obtain convenient and more readable figures with clearer similarity representation, the smoothing option `-s` can often be useful to reduce variability between neighboring dots. Another way is to increase the aspect ratio (i.e. width/height) of the scatter plot using option `-a` (default: 3.0). Dot size can also be controlled using option `-d`.
 * A different dot color is used for each input file. The first colors are: (1) red, (2) blue, (3) orange, (4) green, (5) gray, (6) brown, (7) dark green, (8) pink, (9) light blue. To associate a given input file to a specific color, change the input file order.
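
The fragment decomposition described in the notes (equal-length fragments, step = half the fragment length, default length = reference length divided by 1,000) can be illustrated with a minimal Python sketch. This is not _SimiPlot_'s actual code: the function names are hypothetical, and the use of integer division and the dropping of a trailing partial fragment are assumptions made here to keep all fragments the same length.

```python
def default_fragment_length(ref_len):
    """Default fragment length: reference length / 1,000.

    Assumption: integer division, with a floor of 1 for short references.
    """
    return max(1, ref_len // 1000)


def fragment_spans(seq_len, frag_len):
    """Return half-open (start, end) spans of overlapping fragments.

    Consecutive fragments start half a fragment length apart, so each
    fragment overlaps its neighbor by roughly 50%.
    Assumption: a trailing piece shorter than frag_len is discarded.
    """
    step = max(1, frag_len // 2)
    last_start = seq_len - frag_len
    return [(s, s + frag_len) for s in range(0, last_start + 1, step)]


if __name__ == "__main__":
    # Hypothetical 5 Mb reference and a 12 kb input sequence.
    w = default_fragment_length(5_000_000)
    spans = fragment_spans(12_000, w)
    print(w, len(spans), spans[:2])
```

Under these assumptions, each span would be extracted from the input sequence and searched against the reference with blastn; a larger `-w` value yields fewer, longer fragments and therefore fewer dots.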