Commit 91b5c520 authored by Alexis CRISCUOLO

2.0

parent e0c7a0db
@@ -124,21 +124,21 @@ Launch _JolyTree_ without option to read the following documentation:
 * In short, _JolyTree_ uses [_mash_](http://mash.readthedocs.io/en/latest/) to decompose each genome into a sketch of _k_-mers (options `-k`, `-q`, `-s`) and quickly estimate the _p_-distance between each pair of genomes. If required, every _p_-distance is transformed into an evolutionary distance (options `-c`, `-a`, `-f`). A ratchet-based optimal phylogenetic tree search is performed from the pairwise evolutionary distances using [_FastME_](http://www.atgc-montpellier.fr/fastme/usersguide.php) (option `-r`). Branch supports are finally estimated using [_REQ_](https://research.pasteur.fr/en/tool/r%ce%b5q-assessing-branch-supports-o%c6%92-a-distance-based-phylogenetic-tree-with-the-rate-o%c6%92-elementary-quartets/).
-* It is not recommended to modify the option `-k`. The optimal value of _k_ is automatically estimated by equation (2) in [Ondov et al. (2016)](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0997-x) from the desired probability _q_ of observing a random _k_-mer (option `-q`). Since _JolyTree_ v2.0, the default probability is _q_ = 0.000000001, as it enables pairwise distances to be conveniently estimated in many cases. Increasing _q_ is not recommended, but can be useful to minimize the number of unknown distances when dealing with distantly-related genome sequences. Lowering _q_ leads to larger _k_-mer size that can be useful when dealing with very closely-related genome sequences.
+* It is not recommended to modify the option `-k`. The optimal value of _k_ is automatically estimated by equation (2) in [Ondov et al. (2016)](https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0997-x) from the desired probability _q_ of observing a random _k_-mer (option `-q`). Since _JolyTree_ v2.0, the default probability is _q_ = 0.000000001, as it enables pairwise distances to be conveniently estimated in many cases. Increasing _q_ is not recommended, but can be useful to minimize the number of unknown distances (i.e. non-overlapping _k_-mer sets) when dealing with distantly-related genome sequences. Lowering _q_ leads to larger _k_-mer size that can be useful when dealing with very closely-related genome sequences.
 * Default sketch size is 25% of the average genome size, which enables pairwise distances to be efficiently estimated with fast running times. Increasing the sketch size (option `-s`) yields more accurate estimates (especially with distantly-related genomes), but requires important disk space to store the sketch files.
-* Lowering the cutoff value for correcting the evolutionary distances (option `-c`) does generally not modify the inferred phylogenetic tree; on the other side, it is generally not recommended to increase this cutoff value.
+* Lowering the cutoff value for correcting the evolutionary distances (option `-c`) does generally not modify the inferred phylogenetic tree; on the other side, it is inadvisable to increase this cutoff value.
 * The option `-c` allows multiple substitutions per character to be accurately estimated when an observed _p_-distance is quite large (e.g. > 0.1; see [Figure 3.1](https://books.google.fr/books?id=3Xc8DwAAQBAJ&pg=PA41) in Nei and Kumar 2000). In such cases, all the _p_-distances _p_ estimated by _mash_ are transformed into proper evolutionary distances _d_. Since _JolyTree_ v2.0, four _p_-distance transformations can be obtained using options `-c` and/or `-a`:
 <div align="center">
-| transformation | formula | options |
-|:------------------------|:-----------------------------------|:-----------------------------------|
-| none | _d_ = _p_ | `-c 1 -a $a` with any `$a` > 0 |
-| Poisson correction | _d_ = - log<sub>_e_</sub>(1 - _p_) | `-c 1 -a 0` |
-| F81/EI correction | _d_ = -_b_<sub>1</sub> log<sub>_e_</sub>(1 - _p_/_b_<sub>2</sub>) | `-a 0` |
-| F81/EI gamma correction | _d_ = -_ab_<sub>1</sub>[(1 - _p_/_b_<sub>2</sub>)<sup>-1/_a_</sup> - 1] | `-a $a` with specified gamma shape parameter |
+| transformation | formula | options | notes |
+|:------------------------|:------------------------------------------------------------------------|:------------------------------------------------------------|:--------------------------------------------------------------------------------------------------|
+| none | _d_ = _p_ | `-c 1 -a $a` with any `$a` > 0 | unaccurate estimate when _p_ > 0.1 |
+| Poisson correction | _d_ = - log<sub>_e_</sub>(1 - _p_) | `-a 0` (use `-c 1` to force Poisson correction) | default _p_-distance transformation of _mash_ |
+| F81/EI correction | _d_ = -_b_<sub>1</sub> log<sub>_e_</sub>(1 - _p_/_b_<sub>2</sub>) | `-a 0` (use `-c 0` to force F81/EI correction) | formula (4) in [Tamura and Kumar (2002)](https://academic.oup.com/mbe/article/19/10/1727/1258975) |
+| F81/EI gamma correction | _d_ = -_ab_<sub>1</sub>[(1 - _p_/_b_<sub>2</sub>)<sup>-1/_a_</sup> - 1] | `-a $a` with specified gamma shape parameter (default: 1.5) | formula (3) in Criscuolo (2020; _submitted_) |
 </div>
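
The formulas touched by this hunk can be tried out numerically with the short Python sketch below. It is an illustration only, not _JolyTree_'s own code, and it rests on several assumptions. The function names are hypothetical. `kmer_size` follows equation (2) of Ondov et al. (2016), _k_ = ⌈log<sub>4</sub>(_n_(1 - _q_)/_q_)⌉, with a genome size _n_ supplied by the caller; which genome size _JolyTree_ actually feeds in is not stated here. The coefficients _b_<sub>1</sub> and _b_<sub>2</sub> are not defined in this excerpt, so they are plain parameters, and the value 0.75 in the usage lines is just the equal-frequency (Jukes-Cantor) case. The gamma formula is printed in the table with a leading minus sign, which would give a negative value for 0 < _p_ < _b_<sub>2</sub>; the sketch assumes the usual positive form _ab_<sub>1</sub>[(1 - _p_/_b_<sub>2</sub>)<sup>-1/_a_</sup> - 1].

```python
import math


def kmer_size(genome_size, q=1e-9, alphabet_size=4):
    """k-mer size from equation (2) of Ondov et al. (2016).

    genome_size (n) is an assumption of this sketch: the caller decides
    which genome length to use.
    """
    return math.ceil(math.log(genome_size * (1.0 - q) / q, alphabet_size))


def poisson(p):
    """Poisson correction: d = -ln(1 - p)."""
    return -math.log(1.0 - p)


def f81_ei(p, b1, b2):
    """F81/EI correction: d = -b1 * ln(1 - p / b2).

    b1 and b2 are taken as given; their definition is not part of this excerpt.
    """
    return -b1 * math.log(1.0 - p / b2)


def f81_ei_gamma(p, b1, b2, a=1.5):
    """F81/EI gamma correction with shape parameter a (default 1.5).

    Assumed positive form: d = a * b1 * ((1 - p / b2) ** (-1 / a) - 1).
    """
    return a * b1 * ((1.0 - p / b2) ** (-1.0 / a) - 1.0)


if __name__ == "__main__":
    # Illustrative values only: a 5 Mb genome and b1 = b2 = 0.75.
    print("k =", kmer_size(5_000_000))
    for p in (0.01, 0.1, 0.3):
        print(p, poisson(p), f81_ei(p, 0.75, 0.75), f81_ei_gamma(p, 0.75, 0.75))
```

For small _p_ all three corrections stay close to _p_ itself, and they diverge as _p_ grows, which is consistent with the note that the uncorrected distance becomes inaccurate when _p_ > 0.1.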
@@ -257,7 +257,7 @@ Below is a list of some phylogenetic trees inferred using _JolyTree_:
 * [_Klebsiella pneumoniae_](https://www.tandfonline.com/na101/home/literatum/publisher/tandf/journals/content/kgmi20/2020/kgmi20.v011.i05/19490976.2020.1748257/20200625/images/medium/kgmi_a_1748257_f0002_c.jpg) ([Huynh et al. 2020](https://doi.org/10.1080/19490976.2020.1748257 ))
-* [_Stromatolite bacterial communities_](https://www.biorxiv.org/content/biorxiv/early/2020/03/14/818625/F5.large.jpg) ([Waterworth et al. 2020](https://doi.org/10.1101/818625))
+* [_Stromatolite_ bacterial communities](https://www.biorxiv.org/content/biorxiv/early/2020/03/14/818625/F5.large.jpg) ([Waterworth et al. 2020](https://doi.org/10.1101/818625))
 * [_Campylobacter_ genus](https://figshare.com/articles/Data_Sheet_1_Taking_Control_Campylobacter_jejuni_Binding_to_Fibronectin_Sets_the_Stage_for_Cellular_Adherence_and_Invasion_pdf/12103260) ([Konkel et al. 2020](https://doi.org/10.3389/fmicb.2020.00564.s001))