_Inheritance algorithm_ is a command-line program written in [Python](https://www.python.org/) that assigns to each clonal group (CG) an identifier reflecting, as closely as possible, the widely adopted 7-gene sequence type (ST) identifier of the corresponding isolates.
It applies a set of naming rules that prioritize the most abundant ST observed among the isolates of each CG, together with supplementary rules for ties. The algorithm is summarized below, and an example is given in the technical notes PDF file.
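
The core rule (name each CG after its most abundant ST) can be illustrated with a minimal sketch. This is not the code in `InheritanceAlgorithm.py`: the function name `name_clonal_groups`, the input layout (one `(cg, st)` pair per isolate), and the tie-break used here (smallest ST number) are assumptions made for illustration only. The actual supplementary rules are those described in the technical notes. The sketch also ignores the case where the same ST would be the top candidate for several CGs.

```python
from collections import Counter, defaultdict


def name_clonal_groups(isolates):
    """Return {cg: st}, where st is the most abundant ST among the isolates of cg.

    `isolates` is an iterable of (cg, st) pairs, one pair per isolate
    (hypothetical layout, chosen for this sketch).
    """
    # Count how many isolates of each CG carry each ST.
    counts = defaultdict(Counter)
    for cg, st in isolates:
        counts[cg][st] += 1

    names = {}
    for cg, st_counts in counts.items():
        top = max(st_counts.values())
        tied = [st for st, n in st_counts.items() if n == top]
        # ASSUMPTION: ties are broken by taking the smallest ST number.
        # The real tool applies its own supplementary rules here.
        names[cg] = min(tied)
    return names


# Made-up isolates: CG_a is dominated by ST258, CG_b has a tie between ST14 and ST15.
isolates = [
    ("CG_a", 258), ("CG_a", 258), ("CG_a", 11),
    ("CG_b", 15), ("CG_b", 14),
]
names = name_clonal_groups(isolates)
print(names)  # {'CG_a': 258, 'CG_b': 14}
```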
## Installation and execution
Clone this repository with the following command line:
Verify that [Python](https://www.python.org/downloads/) (3.6 or higher) is installed, as well as [Pandas](https://pandas.pydata.org/) (x or higher) and [NetworkX](https://networkx.org/).
Execute the file `InheritanceAlgorithm.py` available inside the _src_ directory with the following command line model:
```bash
python InheritanceAlgorithm.py [options]
```
## Usage
Launch _InheritanceAlgorithm_ with option `-h` to read the following documentation:
```
This tool converts Genbank files into EMBL-like files for submission to ENA

optional arguments:
  -h, --help     show this help message and exit
  -i FILEINPUT   (mandatory) input tab-delimited file containing the CGs associated for each isolate
  -c COLUMN_CG   (mandatory) column(s) of the selected CG(s)
  -o FILEOUTPUT  (mandatory) output file name
```
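
For readers who want to wrap or script the tool, the three options above could be declared with `argparse` roughly as follows. This is a sketch, not the tool's actual source: the `dest` names, the use of `nargs="+"` to accept several CG columns, and the file name `names.txt` are assumptions.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented -i / -c / -o options (sketch)."""
    parser = argparse.ArgumentParser(prog="InheritanceAlgorithm.py")
    parser.add_argument(
        "-i", dest="fileinput", required=True,
        help="(mandatory) input tab-delimited file containing the CGs associated for each isolate",
    )
    # ASSUMPTION: several columns are passed as space-separated values.
    parser.add_argument(
        "-c", dest="column_cg", required=True, nargs="+",
        help="(mandatory) column(s) of the selected CG(s)",
    )
    parser.add_argument(
        "-o", dest="fileoutput", required=True,
        help="(mandatory) output file name",
    )
    return parser


# Parse a hypothetical command line instead of sys.argv.
args = build_parser().parse_args(
    ["-i", "clustering.strain.txt", "-c", "2", "3", "-o", "names.txt"]
)
print(args.fileinput, args.column_cg, args.fileoutput)
```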
## Example
The input file `clustering.strain.txt` in the _example_ directory contains three columns: the first holds the strain identifiers, and the other two hold the associated CGs.
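
A three-column, tab-delimited file of this shape can be read with the standard `csv` module, as in the sketch below. The sample rows are invented and are not the contents of `clustering.strain.txt`. The helper `read_clustering`, the absence of a header row, and the 1-based column numbering are assumptions for illustration.

```python
import csv
import io

# Invented sample in the described layout: strain identifier, then two CG columns.
sample = "strain_1\t1\t1\nstrain_2\t1\t2\nstrain_3\t2\t3\n"


def read_clustering(handle, cg_column):
    """Map each strain identifier (first column) to the CG found in `cg_column`.

    ASSUMPTIONS: the file has no header row and `cg_column` is 1-based.
    """
    reader = csv.reader(handle, delimiter="\t")
    return {row[0]: row[cg_column - 1] for row in reader if row}


# Read the same data twice, once per CG column.
first_cgs = read_clustering(io.StringIO(sample), cg_column=2)
second_cgs = read_clustering(io.StringIO(sample), cg_column=3)
print(first_cgs)
print(second_cgs)
```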