Julien GUGLIELMINI committed
# Description
When considering a set of aligned sequences, identical character states can be [homoplastic](https://en.wikipedia.org/wiki/Homoplasy) (_i.e._ the result of convergent evolution) or [synapomorphic](https://en.wikipedia.org/wiki/Synapomorphy_and_apomorphy) (acquired by descent).
This script finds character states shared within a multiple sequence alignment, irrespective of the evolutionary history of the sequences.
It takes as input a FASTA file containing aligned sequences and a list of sequence names that match those in the alignment.
The output is a list of the shared character states, each followed by an entropy score: the closer the score is to 1, the more significant the result.
If the sequences you gave as parameters share a common ancestor, then the reported states are synapomorphies. Otherwise, they are homoplasies.
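
To make the notion of a shared character state concrete, here is a minimal sketch that lists the alignment columns in which all the selected sequences carry the same non-gap character. It is not the implementation used by `find_synapomorphies.sh` and it does not compute the entropy score. The function name `shared_columns` is hypothetical, and the sketch assumes that a sequence name is the first word of its FASTA header and that gaps are written as `-`.

```bash
#!/usr/bin/env bash
# Hypothetical illustration, not part of find_synapomorphies.sh.
# Prints "<column>\t<state>" for every column where all selected
# sequences carry the same non-gap character.
shared_columns() {
    local names="$1" fasta="$2"
    awk -v names="$names" '
        BEGIN {
            # Turn the comma-separated list into a lookup table
            n = split(names, list, ",")
            for (i = 1; i <= n; i++) wanted[list[i]] = 1
        }
        # Header line: assume the name is the first word after ">"
        /^>/ { split(substr($0, 2), f, /[ \t]/); current = f[1]; next }
        # Sequence line: concatenate, so wrapped FASTA records are supported
        (current in wanted) { seq[current] = seq[current] $0 }
        END {
            first = list[1]
            for (pos = 1; pos <= length(seq[first]); pos++) {
                state = substr(seq[first], pos, 1)
                shared = (state != "-")
                for (i = 2; i <= n && shared; i++)
                    if (substr(seq[list[i]], pos, 1) != state) shared = 0
                if (shared) print pos "\t" state
            }
        }
    ' "$fasta"
}
```

For example, `shared_columns "seq1,seq2" example.fasta` would print one line per column in which `seq1` and `seq2` agree. The comparison is case-sensitive.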

# Installation
This script is written in Bash and should therefore run on any Unix platform.
Simply download the file, navigate to its location, and make it executable:

```bash
chmod +x find_synapomorphies.sh
```

# Usage

```bash
./find_synapomorphies.sh "seq1,seq2,seq3" <input_file> <output_file>
```

* the input file is a FASTA file containing a multiple sequence alignment
* the names of the sequences of interest are comma-separated and enclosed in a pair of double quotes
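
Because the names must match those in the alignment, it can help to check them before running the script. The following is a minimal sketch of such a check. The helper `check_names` is hypothetical and not part of `find_synapomorphies.sh`. It assumes that a sequence name is the text between `>` and the first whitespace of a header line, which may differ from how the script itself matches names.

```bash
#!/usr/bin/env bash
# Hypothetical helper, not part of find_synapomorphies.sh.
# Returns 0 if every name in the comma-separated list appears as a
# FASTA header in the given file, 1 otherwise.
check_names() {
    local names="$1" fasta="$2" missing=0 name
    local -a wanted
    # Split the comma-separated list into an array
    IFS=',' read -r -a wanted <<< "$names"
    for name in "${wanted[@]}"; do
        # Compare the name with the first word of every header line
        if ! awk -v n="$name" '
                /^>/ { split(substr($0, 2), f, /[ \t]/); if (f[1] == n) found = 1 }
                END  { exit !found }
             ' "$fasta"; then
            echo "not found in $fasta: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It could be combined with the script as `check_names "seq1,seq2" example.fasta && ./find_synapomorphies.sh "seq1,seq2" example.fasta example.output`, so that the script only runs when all names are present.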

# Example

```bash
./find_synapomorphies.sh "seq1,seq2" example.fasta example.output
```