Description

When considering a set of aligned sequences, identical character states can be homoplastic (i.e. the result of convergent evolution) or synapomorphic (acquired by descent from a common ancestor). This script finds the character states shared by a chosen group of sequences within a multiple sequence alignment, irrespective of the evolutionary history of those sequences. It takes as input a FASTA file containing aligned sequences and a list of sequence names that match those in the alignment. The output is a list of the shared character states, each followed by an entropy score: the closer the score is to 1, the more significant the result. If the sequences you gave as parameters share a common ancestor, the reported states are synapomorphies; otherwise they are homoplasies.
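To make the idea of a "shared character state" concrete, here is a minimal sketch in Bash and awk. It is not the code of find_synapomorphies.sh: the function name `shared_states` is hypothetical, it assumes that a sequence name is the first word of its FASTA header, and it treats `-` as a gap. It only reports the columns in which all the listed sequences agree. It does not compute the entropy score, whose formula is not described here.

```shell
#!/usr/bin/env bash
# Hypothetical helper, NOT part of find_synapomorphies.sh.
# Prints "position<TAB>state" for every alignment column in which all the
# listed sequences carry the same non-gap character.
shared_states() {
    local names="$1" fasta="$2"
    awk -v names="$names" '
        BEGIN {
            # Split the comma-separated list and remember the wanted names.
            n = split(names, wanted, ",")
            for (i = 1; i <= n; i++) keep[wanted[i]] = 1
        }
        /^>/ {
            # Header line: assume the name is the first word after ">".
            name = substr($1, 2)
            next
        }
        name in keep {
            # Sequence line (possibly wrapped): append to the current record.
            seq[name] = seq[name] $0
        }
        END {
            len = length(seq[wanted[1]])
            for (pos = 1; pos <= len; pos++) {
                state = substr(seq[wanted[1]], pos, 1)
                same = (state != "-")          # gaps are never reported
                for (i = 2; i <= n && same; i++)
                    if (substr(seq[wanted[i]], pos, 1) != state) same = 0
                if (same) printf "%d\t%s\n", pos, state
            }
        }
    ' "$fasta"
}
```

Calling `shared_states "seq1,seq2" example.fasta` lists the 1-based alignment positions at which seq1 and seq2 carry the same character. Deciding whether such a state is specific to the group, rather than shared by the whole alignment, is what a score such as the one reported by the script would be needed for.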

Installation

This script is written in Bash and should therefore run on any Unix-like platform where Bash is available. Download the file, navigate to its location and make it executable:

chmod +x find_synapomorphies.sh

Usage

./find_synapomorphies.sh "seq1,seq2,seq3" <input_file> <output_file>
  • the input file is a FASTA file containing a multiple sequence alignment
  • the names of the sequences of interest are comma-separated and enclosed in a pair of double quotes
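Because the names passed on the command line must match those in the alignment, it can help to check them before running the script. The following is a minimal sketch of such a check. The function `check_names` is hypothetical and not shipped with the script, and it assumes that a sequence name is the first word of its FASTA header.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check, NOT part of find_synapomorphies.sh.
# Returns 0 if every comma-separated name has a matching header in the
# FASTA file; otherwise reports the missing names on stderr and returns 1.
check_names() {
    local names="$1" fasta="$2" name status=0
    local -a list
    IFS=',' read -r -a list <<< "$names"
    for name in "${list[@]}"; do
        # Exact match on the first word of each header line.
        if ! awk -v want="$name" '
                /^>/ && substr($1, 2) == want { found = 1; exit }
                END { exit !found }
             ' "$fasta"; then
            echo "not found in alignment: $name" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example, `check_names "seq1,seq2" example.fasta && ./find_synapomorphies.sh "seq1,seq2" example.fasta example.output` runs the script only when both names are present in the alignment.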

Example

./find_synapomorphies.sh "seq1,seq2" example.fasta example.output