diff --git a/README.md b/README.md
index ba83c9507f9f7e9f17fe5ef694c80c01e16d99d5..06d0c4b053897e6901157f8bea909715260f6af1 100644
--- a/README.md
+++ b/README.md
@@ -2,7 +2,7 @@

 _JolyTree_ (named in memory of [Nicolas Joly](https://research.pasteur.fr/en/member/nicolas-joly/)) is a command line script written in [Bash](https://www.gnu.org/software/bash/) that allows a distance-based phylogenetic tree with branch supports to be quickly inferred from non-aligned genome sequences. _JolyTree_ runs on UNIX, Linux and most OS X operating systems.

-For more details, see the associated publication ([Criscuolo 2019](https://riojournal.com/article/36178/)).
+For more details, see the associated publication (Criscuolo [2019](https://riojournal.com/article/36178/), [2020](https://f1000research.com/articles/9-1309)).

 ## Installation and execution

@@ -139,18 +139,18 @@ Launch _JolyTree_ without option to read the following documentation:
 | none | _d_ = _p_ | `-c 1 -a $a` with any `$a` > 0 | unaccurate estimate when _p_ > 0.1 |
 | Poisson | _d_ = - log<sub>_e_</sub>(1 - _p_) | `-a 0` (use `-c 1` to force Poisson correction) | default transformation of _mash_ |
 | F81/EI | _d_ = -_b_<sub>1</sub> log<sub>_e_</sub>(1 - _p_/_b_<sub>2</sub>) | `-a 0` (use `-c 0` to force F81/EI correction) | formula (4) in [Tamura and Kumar (2002)](https://academic.oup.com/mbe/article/19/10/1727/1258975) |
-| F81/EI + Γ | _d_ = -_ab_<sub>1</sub>[(1 - _p_/_b_<sub>2</sub>)<sup>-1/_a_</sup> - 1] | `-a $a` to set Γ shape parameter (default: 1.5) | formula (3) in Criscuolo (2020; _submitted_) |
+| F81/EI + Γ | _d_ = -_ab_<sub>1</sub>[(1 - _p_/_b_<sub>2</sub>)<sup>-1/_a_</sup> - 1] | `-a $a` to set Γ shape parameter (default: 1.5) | formula (3) in [Criscuolo 2020](https://f1000research.com/articles/9-1309) |

 </sub>
 </div>

-* The F81 corrections estimate pairwise distances based on the Equal-Input (EI) model of nucleotide substitution ([Felsenstein 1981](https://link.springer.com/article/10.1007/BF01734359); [Tajima and Nei 1982](https://link.springer.com/article/10.1007/BF01810830), [1984](https://academic.oup.com/mbe/article/1/3/269/1244029), [Tamura and Kumar 2002](https://academic.oup.com/mbe/article/19/10/1727/1258975)). These transformations were chosen because they can be directly computed from _p_-distances, and take into account putative unequal base frequencies and heterogeneous base composition among lineages (option `-f`; for details about the values _b_<sub>1</sub> and _b_<sub>2</sub>, see e.g. [Tamura and Kumar 2002](https://academic.oup.com/mbe/article/19/10/1727/1258975), [Criscuolo 2019](https://riojournal.com/article/36178/)). Thanks to the use of the supplementary gamma shape parameter (option `-a`), F81/EI gamma distance allows approximating evolutionary distances derived from complex nucleotide substitution models (the default parameter _a_ = 1.5 enables pairwise distances to be conveniently estimated in many cases).
+* The F81 corrections estimate pairwise distances based on the Equal-Input (EI) model of nucleotide substitution ([Felsenstein 1981](https://link.springer.com/article/10.1007/BF01734359); [Tajima and Nei 1982](https://link.springer.com/article/10.1007/BF01810830), [1984](https://academic.oup.com/mbe/article/1/3/269/1244029), [Tamura and Kumar 2002](https://academic.oup.com/mbe/article/19/10/1727/1258975)). These transformations were chosen because they can be directly computed from _p_-distances, and take into account unequal base frequencies and heterogeneous base composition among lineages (option `-f`; for details about the values _b_<sub>1</sub> and _b_<sub>2</sub>, see e.g. [Tamura and Kumar 2002](https://academic.oup.com/mbe/article/19/10/1727/1258975), [Criscuolo 2019](https://riojournal.com/article/36178/)). Thanks to the use of the supplementary gamma shape parameter (option `-a`), F81/EI gamma distance allows approximating evolutionary distances derived from complex nucleotide substitution models (the default parameter _a_ = 1.5 enables pairwise distances to be conveniently estimated in many cases; [Criscuolo 2020](https://f1000research.com/articles/9-1309)).

 * Fast running times will be observed when using multiple threads. Since _JolyTree_ v2.0, almost all steps benefit from a large number of threads, i.e. genome parsing and sketching, _p_-distance estimates, and ratchet-based BME tree search.

 * The verbosity of _JolyTree_ can be reduced by ending the command line by `2>/dev/null`

-* To launch _JolyTree_ on multiple cores on a cluster managed by [SLURM](https://slurm.schedmd.com), edit the file `JolyTree.sh` and read the subsection [3] of the _Installation_ section (approximately line 200).
+* To launch _JolyTree_ on multiple cores on computing facilities by [SLURM](https://slurm.schedmd.com), edit the file `JolyTree.sh` and read the subsection [3] of the _Installation_ section (approximately line 200).

 ## Example

@@ -221,6 +221,8 @@ As the basename was set to 'klebsiella', _JolyTree_ writes in few minutes the fo

 Criscuolo A (2019) A fast alignment-free bioinformatics procedure to infer accurate distance-based phylogenetic trees from genome assemblies. Research Ideas and Outcomes, 5:e36178. [doi:10.3897/rio.5.e36178](https://riojournal.com/article/36178/).

+Criscuolo A (2020) On the transformation of MinHash-based uncorrected distances into proper evolutionary distances for phylogenetic inference. F1000Research, 9:1309. [doi:10.12688/f1000research.26930.1](https://doi.org/10.12688/f1000research.26930.1).
+
 Felsenstein J (1981) Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17(6):368-376. [doi:10.1007/BF01734359](https://link.springer.com/article/10.1007/BF01734359).

 Lefort V, Desper R, Gascuel O (2015) FastME 2.0: a comprehensive, accurate, and fast distance-based phylogeny inference program. Molecular Biology and Evolution, 32(10):2798-2800. [doi:10.1093/molbev/msv150](https://doi.org/10.1093/molbev/msv150).
@@ -263,4 +265,5 @@ Below is a list of some phylogenetic trees inferred using _JolyTree_:

 * [_Campylobacter_ genus](https://figshare.com/articles/Data_Sheet_1_Taking_Control_Campylobacter_jejuni_Binding_to_Fibronectin_Sets_the_Stage_for_Cellular_Adherence_and_Invasion_pdf/12103260) ([Konkel et al. 2020](https://doi.org/10.3389/fmicb.2020.00564.s001))

+* [_Flavobacterium_ genus](https://research.pasteur.fr/wp-content/uploads/2020/10/research_pasteur-flavobacterium-panici-sp-nov-isolated-from-the-rhizosphere-of-the-switchgrass-panicum-virgatum-flavo.pxu-55-scaled.jpg) (Kämpfer et al. [2020a](https://www.microbiologyresearch.org/content/journal/ijsem/10.1099/ijsem.0.004510), [2020b](https://www.microbiologyresearch.org/content/journal/ijsem/10.1099/ijsem.0.004482))