# Description
When considering a set of aligned sequences, identical characters can result from convergent evolution or from direct ancestry; in the latter case, the shared character is called a synapomorphy.
This script aims at finding shared characters inside a multiple sequence alignment, irrespective of the evolution of the sequences.
It takes as input a FASTA file containing aligned sequences, and a list of sequences whose names match those in the alignment.
The output is a list of the shared characters, each followed by an entropy score: the closer the score is to 1, the more significant the result.
If the sequences you gave as parameters share a common ancestor, then the shared characters in the output are synapomorphies.
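
To make the idea of a "shared character" concrete, here is a minimal sketch. It is not the code of `find_synapomorphies.sh`: the function name `shared_columns` and the file `toy_alignment.fasta` are hypothetical, and the entropy score is left out because its formula is not described here. The sketch only reports the alignment columns where all the chosen sequences carry the same non-gap character.

```bash
#!/usr/bin/env bash
# Illustration only: NOT the actual find_synapomorphies.sh, and no entropy score is computed.
# shared_columns NAMES FASTA
#   NAMES: comma-separated sequence names, FASTA: aligned FASTA file.
#   Prints "position<TAB>character" for each column where all the listed
#   sequences have the same non-gap character.
shared_columns() {
    awk -v names="$1" '
        BEGIN {
            n = split(names, list, ",")
            for (i = 1; i <= n; i++) wanted[list[i]] = 1
        }
        /^>/ {
            # Header line: the first word after ">" is taken as the sequence name
            name = substr($1, 2)
            next
        }
        # Keep only the sequences of interest; a sequence may span several lines
        (name in wanted) { seq[name] = seq[name] $0 }
        END {
            first = list[1]
            len = length(seq[first])
            for (pos = 1; pos <= len; pos++) {
                c = substr(seq[first], pos, 1)
                if (c == "-") continue          # skip gaps
                shared = 1
                for (i = 2; i <= n; i++)
                    if (substr(seq[list[i]], pos, 1) != c) { shared = 0; break }
                if (shared) printf "%d\t%s\n", pos, c
            }
        }
    ' "$2"
}

# Toy alignment (made-up data)
printf '>seq1\nACGT-A\n>seq2\nACCT-A\n>seq3\nATGTCA\n' > toy_alignment.fasta

# Columns where seq1 and seq2 agree
shared_columns "seq1,seq2" toy_alignment.fasta
```

With this toy alignment, `seq1` and `seq2` agree on columns 1, 2, 4 and 6; column 5 is ignored because both carry a gap there. Whether such shared characters are synapomorphies still depends on the sequences sharing a common ancestor.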

# Installation
Download the file, navigate to its location, then make it executable:

```bash
chmod +x find_synapomorphies.sh
```

# Usage

```bash
./find_synapomorphies.sh "seq1,seq2,seq3" <input_file> <output_file>
```

* the sequences of interest are comma-separated and enclosed in a pair of double quotes
* the input file is a FASTA file containing a multiple sequence alignment
* the output file receives the list of shared characters, each followed by its entropy score
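
When the sequences of interest are numerous, typing the comma-separated list by hand is error-prone. As a small sketch, assuming the names are stored one per line in a file (here the hypothetical `names.txt`), the list can be built with `paste`:

```bash
# Hypothetical file holding one sequence name per line
printf 'seq1\nseq2\nseq3\n' > names.txt

# Join the lines with commas to get the expected "seq1,seq2,seq3" form
names=$(paste -sd, names.txt)
echo "$names"

# Then pass the list, quoted, as the first argument
# (alignment.fasta and shared_characters.txt are placeholder file names):
# ./find_synapomorphies.sh "$names" alignment.fasta shared_characters.txt
```

Quoting `"$names"` keeps the whole list as a single argument, as required by the usage above.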