1.0

902e4958 · Alexis CRISCUOLO · 245214b1 · 902e4958
Commit 902e4958 authored 3 years ago by Alexis CRISCUOLO
--- a/README.md
+++ b/README.md
@@ -82,9 +82,9 @@ Run _GenoMed_ without option to read the following documentation:
 ## Notes
-* In brief, _GenoMed_ uses the tool [_mash_](https://mash.readthedocs.io/en/latest/) to compute all pairwise _p_-distances between genomes, and next transforms them into EI/F81 evolutionary distances (see Criscuolo 2020). To obtain accurate _p_-distance estimates with [_mash_](https://mash.readthedocs.io/en/latest/), the sketch size is defined as the average genome length, and the _k_-mer length _k_ is the interger part (floor) of log<sub>4</sub>&nbsp;(_m_<sup>2</sup>-_m_), where _m_ is the maximum genome length (this optimal estimate of _k_ is derived from Formula 1 in Fofanov et al. 2004). All these pairwise evolutionary distances are finally used to compute the average distance &delta;<sub>_g_</sub> of each genome _g_ to all other ones. The medoid genome is the one that minimizes &delta;<sub>_g_</sub>.
+* In brief, _GenoMed_ uses the tool [_mash_](https://mash.readthedocs.io/en/latest/) to compute all pairwise _p_-distances between genomes, and next transforms them into EI/F81 evolutionary distances (see Criscuolo 2020). To obtain accurate _p_-distance estimates with [_mash_](https://mash.readthedocs.io/en/latest/), the sketch size is defined as the average genome length, and the _k_-mer length _k_ as the integer part (floor) of log<sub>4</sub>&nbsp;(_m_<sup>2</sup>-_m_), where _m_ is the maximum genome length (this optimal estimate of _k_ is derived from Formula 1 in Fofanov et al. 2004). All these pairwise evolutionary distances are finally used to compute the average distance &delta;<sub>_g_</sub> of each genome _g_ to all other ones. The medoid genome is the one that minimizes &delta;<sub>_g_</sub>.
-* The medoid genome inference is assessed by an original bootstrap procedure. The initial set of genome is first sampled with replacement (default: 500 resampling). Next, the medoid genome is determined for each resampled set. Finally, a _p_-value is defined as the proportion of times that each genome was a medoid.
+* The medoid genome inference is assessed by an original bootstrap procedure. The initial set of genome is first sampled with replacement (default: 500 resampling). Next, the medoid genome is determined for each resampled set. Finally, a (kind of) _p_-value is defined as the proportion of times that each genome was a medoid in the resampled sets.
 * All input files (at least 3) should be in FASTA format and non compressed. _GenoMed_ is able to consider many input files summarized using [filename expansion](https://tldp.org/LDP/abs/html/globbingref.html), e.g. `dirname/*.fasta`.
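
The procedure described in the Notes of this hunk (choice of _k_ and sketch size, medoid selection, bootstrap assessment) can be sketched in Python as follows. This is only an illustrative sketch, not the _GenoMed_ implementation: the function names are hypothetical, the distance matrix is assumed to already hold the EI/F81 distances (the _mash_ run and the _p_-distance transformation are left out), and the rounding of the sketch size, the handling of genomes drawn several times in a resampled set, and the tie-breaking rule are all assumptions.

```python
import random


def kmer_length(max_genome_length):
    """k = floor(log4(m^2 - m)), where m is the maximum genome length.

    Uses integer arithmetic: floor(log4(x)) == floor(log2(x)) // 2.
    """
    m = max_genome_length
    x = m * m - m
    return (x.bit_length() - 1) // 2


def sketch_size(genome_lengths):
    """Average genome length (integer division is an assumption)."""
    return sum(genome_lengths) // len(genome_lengths)


def average_distances(dist, sample):
    """delta_g for each genome g in the sample: mean distance to all others.

    dist[g][g] is 0, so summing over the whole sample is equivalent to
    summing over the other members. Assumption: a genome drawn several
    times counts with its multiplicity.
    """
    n = len(sample)
    return {g: sum(dist[g][h] for h in sample) / (n - 1) for g in set(sample)}


def find_medoid(dist, sample):
    """Genome minimizing delta_g (assumption: ties go to the lowest index)."""
    delta = average_distances(dist, sample)
    return min(sorted(delta), key=delta.get)


def bootstrap_support(dist, replicates=500, seed=0):
    """Proportion of resampled sets in which each genome is the medoid."""
    rng = random.Random(seed)
    n = len(dist)
    counts = [0] * n
    for _ in range(replicates):
        # Sample the initial set of genomes with replacement.
        sample = [rng.randrange(n) for _ in range(n)]
        counts[find_medoid(dist, sample)] += 1
    return [c / replicates for c in counts]
```

For example, with a symmetric 4×4 matrix `dist`, `find_medoid(dist, [0, 1, 2, 3])` returns the index of the genome with the smallest average distance to the three others, and `bootstrap_support(dist)` returns one proportion per genome, summing to 1 over all genomes.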