Commit 78131bd5 authored by Alexis  CRISCUOLO's avatar Alexis CRISCUOLO

little editing

parent 6d7e92d7
...@@ -19,12 +19,12 @@ bibliography: paper.bib ...@@ -19,12 +19,12 @@ bibliography: paper.bib
# Summary # Summary
``REQ`` is a program for quickly estimating a confidence value at each branch of a distance-based phylogenetic tree. Branch support assessment is commonly based on bootstrap procedures [@felsenstein1985; @makarenkov2010; @lemoine2018]. Unfortunately, as they are based on numerous resampling of aligned characters, such procedures require long running times, despite some recent advances [@minh2013; @hoang2018a; @hoang2018b]. In fact, direct branch support methods were already developed for character-based approaches that optimize maximum-parsimony or maximum-likelihood criteria, in order to achieve faster running times [@bremer1988; @bremer1994; @anisimova2006; @anisimova2011]. However, to our knowledge, no practical implementation of direct branch support methods is currently available for distance-based approaches. *REQ* is a program for quickly estimating a confidence value at each branch of a distance-based phylogenetic tree. Branch support assessment is commonly based on bootstrap procedures [@felsenstein1985; @makarenkov2010; @lemoine2018]. Unfortunately, as they are based on numerous resampling of aligned characters, such procedures require long running times, despite some recent advances [@minh2013; @hoang2018a; @hoang2018b]. In fact, direct branch support methods were already developed for character-based approaches that optimize maximum-parsimony or maximum-likelihood criteria, in order to achieve faster running times [@bremer1988; @bremer1994; @anisimova2006; @anisimova2011]. However, to our knowledge, no practical implementation of direct branch support methods is currently available for distance-based approaches.
Distance-based approaches proceed in two steps: a pairwise evolutionary distance is estimated between each pair of (biological) objects, and, next, an algorithm is used to infer the tree with branch lengths that best fits the evolutionary distance matrix [@pardi2016]. Because of their speed, distance-based methods are widely used for inferring phylogenetic trees. Moreover, as such algorithms only need a distance matrix, they allow phylogenetic analyses to be carried out from a wide range of data types, e.g. DNA-DNA hybridization experiments [krajewski1990], gene orders [@wang2006; @house2014], gene content [@spencer2007], or unaligned genome sequences [@chapus2005; @henz2005; @cohen2012; @garcia2018]. Nevertheless, in such cases, standard bootstrap-based methods can not be used for estimating branch confidence values. Distance-based approaches proceed in two steps: a pairwise evolutionary distance is estimated between each pair of (biological) objects, and, next, an algorithm is used to infer the tree with branch lengths that best fits the evolutionary distance matrix [@pardi2016]. Because of their speed, distance-based methods are widely used for inferring phylogenetic trees. Moreover, as such algorithms only need a distance matrix, they allow phylogenetic analyses to be carried out from a wide range of data types, e.g. DNA-DNA hybridization experiments [krajewski1990], gene orders [@wang2006; @house2014], gene content [@spencer2007], or unaligned genome sequences [@chapus2005; @henz2005; @cohen2012; @garcia2018]. Nevertheless, in such cases, standard bootstrap-based methods can not be used for estimating branch confidence values.
In order to fill this void, the program ``REQ`` was developed. This tool estimates the rate of elementary quartets (REQ) for each branch of a given phylogenetic tree from the associated distance matrix, as described by [@guenoche2001]. This method simply computes the proportion of four-leaf subtrees (i.e. quartets) induced by every internal branch that are supported by the four-point condition applied to the six corresponding pairwise evolutionary distances [@zaretskii1965; @buneman1971]. Therefore, this measure is not based on a random sampling (such as bootstrap-based confidence supports). The closer this measure is to 1, the more the corresponding branch is fully supported by the pairwise evolutionary distances. In order to fill this void, the program *REQ* was developed. This tool estimates the rate of elementary quartets (REQ) for each branch of a given phylogenetic tree from the associated distance matrix, as described by [@guenoche2001]. This method simply computes the proportion of four-leaf subtrees (i.e. quartets) induced by every internal branch that are supported by the four-point condition applied to the six corresponding pairwise evolutionary distances [@zaretskii1965; @buneman1971]. Therefore, this measure is not based on a random sampling (such as bootstrap-based confidence supports). The closer this measure is to 1, the more the corresponding branch is fully supported by the pairwise evolutionary distances.
The program ``REQ`` is available on [GitLab](https://gitlab.pasteur.fr/GIPhy/REQ) under the [licence GNU GPLv3](https://www.gnu.org/licenses/gpl-3.0.en.html). Implemented in Java, ``REQ`` could be used on every operating system with a simple command line. ``REQ`` only needs two input files: a distance matrix file in either PHYLIP lower-triangular or square format, and a phylogenetic tree file in NEWICK format created from the distance matrix by any standard phylogenetic tree reconstruction method, e.g. neighbor-joining [@saitou1987; @studier1988], BioNJ [@gascuel1997], FastME [@desper2002]. Although computing the REQ value for every branch of a phylogenetic tree on *n* leaves requires *O*(*n*<sup>5</sup>) time complexity, ``REQ`` running time is quite fast (e.g. ~5 seconds with *n* = 500 on a standard computer) and could therefore be used with large phylogenetic trees. The program *REQ* is available on [GitLab](https://gitlab.pasteur.fr/GIPhy/REQ) under the [licence GNU GPLv3](https://www.gnu.org/licenses/gpl-3.0.en.html). Implemented in Java, ``REQ`` could be used on every operating system with a simple command line. *REQ* only needs two input files: a distance matrix file in either PHYLIP lower-triangular or square format, and a phylogenetic tree file in NEWICK format created from the distance matrix by any standard phylogenetic tree reconstruction method, e.g. neighbor-joining [@saitou1987; @studier1988], BioNJ [@gascuel1997], FastME [@desper2002]. Although computing the REQ value for every branch of a phylogenetic tree on *n* leaves requires *O*(*n*<sup>5</sup>) time complexity, *REQ* running time is quite fast (e.g. ~5 seconds with *n* = 500 on a standard computer) and could therefore be used with large phylogenetic trees.
# References # References
Markdown is supported
0% or
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment