Commit 11b81e74 authored by Alexis CRISCUOLO

1.3b

parent 0c7d6ca8
# C2A / A2C # C2A / A2C
_C2A_ and _A2C_ are command line programs written in [Java](https://docs.oracle.com/javase/8/docs/technotes/guides/language/index.html) that allow translating and back-translating FASTA-formatted codon and amino-acid sequence files, respectively. These tools were implemented to easily infer multiple sequence alignments at the codon level. _C2A_ and _A2C_ are command line programs written in [Java](https://docs.oracle.com/javase/8/docs/technotes/guides/language/index.html) to translate and back-translate FASTA-formatted codon and amino-acid sequence files, respectively. These tools were implemented to easily infer multiple sequence alignments at the codon level.
## Compilation and execution ## Compilation and execution
...@@ -8,6 +8,10 @@ The source codes are inside the _src_ directory and could be compiled and execut ...@@ -8,6 +8,10 @@ The source codes are inside the _src_ directory and could be compiled and execut
#### Building an executable jar file #### Building an executable jar file
Clone this repository with the following command line:
```bash
git clone https://gitlab.pasteur.fr/GIPhy/C2A.A2C.git
```
On computers with [Oracle JDK](http://www.oracle.com/technetwork/java/javase/downloads/index.html) (6 or higher) installed, Java executable jar files could be created. In a command-line window, go to the _src_ directory and type: On computers with [Oracle JDK](http://www.oracle.com/technetwork/java/javase/downloads/index.html) (6 or higher) installed, Java executable jar files could be created. In a command-line window, go to the _src_ directory and type:
```bash ```bash
javac C2A.java A2C.java javac C2A.java A2C.java
...@@ -17,7 +21,7 @@ echo Main-Class: A2C > MANIFEST.MF ...@@ -17,7 +21,7 @@ echo Main-Class: A2C > MANIFEST.MF
jar -cmvf MANIFEST.MF A2C.jar A2C.class jar -cmvf MANIFEST.MF A2C.jar A2C.class
rm MANIFEST.MF C2A.class A2C.class rm MANIFEST.MF C2A.class A2C.class
``` ```
This will create the two executable jar files `C2A.jar` and `A2C.jar` that could be launched with the following command line models: This will create the two executable jar files `C2A.jar` and `A2C.jar` that could be run with the following command line models:
```bash ```bash
java -jar C2A.jar [file] java -jar C2A.jar [file]
java -jar A2C.jar [files] java -jar A2C.jar [files]
...@@ -25,38 +29,48 @@ java -jar A2C.jar [files] ...@@ -25,38 +29,48 @@ java -jar A2C.jar [files]
#### Building a native code binary #### Building a native code binary
On computers with the [GNU compiler GCJ](https://gcc.gnu.org/onlinedocs/gcc-4.2.4/gcj/) installed, binaries could also be built. In a command-line window, go to the _src_ directory, and type: Clone this repository with the following command line:
```bash ```bash
make git clone https://gitlab.pasteur.fr/GIPhy/C2A.A2C.git
``` ```
This will create the two executable binary files `c2a` and `a2c` that could be launched with the following command line models: On computers with [GraalVM](https://www.graalvm.org/downloads/) installed, native executables can be built. In a command-line window, go to the _src_ directory, and type:
```bash ```bash
./c2a [file] javac C2A.java A2C.java
./a2c [files] native-image C2A C2A
native-image A2C A2C
rm C2A.class A2C.class
```
This will create the two native executables `C2A` and `A2C` that can be run with the following command line models:
```bash
./C2A [file]
./A2C [files]
``` ```
## Usage ## Usage
Launch _C2A_ without any option to read the following documentation: Run _C2A_ without any option to read the following documentation:
``` ```
C2A
USAGE: C2A <seq.fna> USAGE: C2A <seq.fna>
where <seq.fna> is a FASTA-formatted codon sequence file. where <seq.fna> is a FASTA-formatted codon sequence file. This will
This will output in stdout the translation (standard output in stdout the translation (standard genetic code) of each
genetic code) of each sequence in the same format. sequence in the same format.
``` ```
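To illustrate what translating with the standard genetic code involves, here is a minimal Java sketch. It is not the actual _C2A_ source: the class name `CodonTranslationSketch`, its methods, the use of `X` for codons containing anything other than A, C, G, T or U, and the silent dropping of a trailing incomplete codon are all assumptions made for this example.

```java
// Illustrative sketch only: NOT the actual C2A implementation.
final class CodonTranslationSketch {

    // Standard genetic code (NCBI translation table 1), with codons
    // enumerated using the base order T, C, A, G at each position.
    private static final String BASES = "TCAG";
    private static final String AMINO_ACIDS =
        "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG";

    // Translates a single codon. Returns 'X' when the codon is not made of
    // three unambiguous bases (an assumption of this sketch).
    static char translateCodon(String codon) {
        if (codon.length() != 3) return 'X';
        int index = 0;
        for (int k = 0; k < 3; k++) {
            char base = Character.toUpperCase(codon.charAt(k));
            if (base == 'U') base = 'T';      // accept RNA input as well
            int b = BASES.indexOf(base);
            if (b < 0) return 'X';
            index = 4 * index + b;            // 16*b1 + 4*b2 + b3
        }
        return AMINO_ACIDS.charAt(index);
    }

    // Translates a nucleotide sequence codon by codon, starting at the
    // first base. A trailing incomplete codon is ignored (assumption).
    static String translate(String nucleotides) {
        StringBuilder protein = new StringBuilder();
        for (int i = 0; i + 3 <= nucleotides.length(); i += 3)
            protein.append(translateCodon(nucleotides.substring(i, i + 3)));
        return protein.toString();
    }

    public static void main(String[] args) {
        System.out.println(translate("ATGGCCTAA")); // prints MA*
    }
}
```

Encoding the code table as a 64-character string keeps the lookup to a single index computation per codon; a `Map<String, Character>` would work equally well for a sketch like this.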
Launch _A2C_ without any option to read the following documentation: Run _A2C_ without any option to read the following documentation:
``` ```
A2C
USAGE: A2C <ali.faa> <seq.fna> USAGE: A2C <ali.faa> <seq.fna>
where <ali.faa> is a FASTA-formatted multiple amino-acid where <ali.faa> is a FASTA-formatted multiple amino acid sequence
sequence alignment file and <seq.fna> a FASTA-formatted alignment file and <seq.fna> a FASTA-formatted file containing the
file containing the associated codon sequences. This associated codon sequences. This will output in stdout the multiple
will output in stdout the multiple back-translated back-translated sequence alignment.
sequence alignment.
``` ```
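The idea behind back-translation can be sketched in a few lines of Java. This is not the actual _A2C_ source: the class name `BackTranslationSketch`, the method `backTranslate`, the use of `-` as the only gap character, and the exception thrown when the codon sequence is too short are assumptions made for this example. Each aligned residue consumes the next codon of the unaligned nucleotide sequence, and each gap is expanded to three gap characters.

```java
// Illustrative sketch only: NOT the actual A2C implementation.
final class BackTranslationSketch {

    // Threads the codons of an unaligned nucleotide sequence onto the
    // corresponding aligned amino-acid sequence.
    static String backTranslate(String alignedAminoAcids, String codons) {
        StringBuilder aligned = new StringBuilder();
        int pos = 0; // index of the next unused codon in 'codons'
        for (int i = 0; i < alignedAminoAcids.length(); i++) {
            if (alignedAminoAcids.charAt(i) == '-') {
                aligned.append("---");       // one residue gap = one codon gap
                continue;
            }
            if (pos + 3 > codons.length())
                throw new IllegalArgumentException("codon sequence too short");
            aligned.append(codons, pos, pos + 3);
            pos += 3;
        }
        return aligned.toString();
    }

    public static void main(String[] args) {
        // M and A are separated by one gap column in the alignment
        System.out.println(backTranslate("M-AK", "ATGGCCAAA")); // prints ATG---GCCAAA
    }
}
```

Note that this sketch never checks that a codon actually encodes the residue it is paired with; it relies only on the order of residues and codons being the same.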
## Example ## Example
...@@ -67,7 +81,7 @@ First, using _C2A_ allows creating the file _seq.faa_ that contains the translat ...@@ -67,7 +81,7 @@ First, using _C2A_ allows creating the file _seq.faa_ that contains the translat
```bash ```bash
C2A seq.fna > seq.faa C2A seq.fna > seq.faa
``` ```
Second, the created _seq.faa_ could be used to infer a multiple amino-acid sequence alignment, which is expected to be more accurate than the one inferred from the initial codon sequences. The directory _src_ contains such an alignment inside the file _ali.faa_. Second, the created _seq.faa_ could be used to infer a multiple amino-acid sequence alignment, which is expected to be more accurate than the one inferred from the initial codon sequences. The directory _example_ contains such an alignment inside the file _ali.faa_.
Finally, using _A2C_ allows creating the file _ali.fna_ by back-translating the amino-acid sequences inside _ali.faa_ with the associated codon sequences inside _seq.fna_: Finally, using _A2C_ allows creating the file _ali.fna_ by back-translating the amino-acid sequences inside _ali.faa_ with the associated codon sequences inside _seq.fna_:
```bash ```bash
......
/* /*
#################################################################### ########################################################################################################
A2C: back-translating a multiple amino-acid sequence alignment into
a multiple codon sequence alignment
Copyright (C) 2015-2018 Alexis Criscuolo A2C: back-translating a multiple amino-acid sequence alignment into a multiple codon sequence alignment
This program is free software: you can redistribute it and/or modify Copyright (C) 2015-2020 Institut Pasteur
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or This program is free software: you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version. (at your option) any later version.
This program is distributed in the hope that it will be useful, but This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even
WITHOUT ANY WARRANTY; without even the implied warranty of the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU License for more details.
General Public License for more details.
You should have received a copy of the GNU General Public License You should have received a copy of the GNU General Public License along with this program. If not, see
along with this program. If not, see <http://www.gnu.org/licenses/>. <http://www.gnu.org/licenses/>.
Contact: Contact:
Institut Pasteur Alexis Criscuolo alexis.criscuolo@pasteur.fr
Bioinformatics and Biostatistics Hub Genome Informatics & Phylogenetics (GIPhy) giphy.pasteur.fr
C3BI, USR 3756 IP CNRS Bioinformatics and Biostatistics Hub research.pasteur.fr/team/hub-giphy
Paris, FRANCE USR 3756 IP CNRS research.pasteur.fr/team/bioinformatics-and-biostatistics-hub
Dpt. Biologie Computationnelle research.pasteur.fr/department/computational-biology
Institut Pasteur, Paris, FRANCE research.pasteur.fr
alexis.criscuolo@pasteur.fr ########################################################################################################
####################################################################
*/ */
import java.io.*; import java.io.*;
import java.util.*; import java.util.*;
public class A2C { public class A2C {
final static String VERSION = "1.3b.201024ac";
static File aafile, ntfile; static File aafile, ntfile;
static BufferedReader in; static BufferedReader in;
static ArrayList<String> fh; static ArrayList<String> fh;
...@@ -40,13 +40,16 @@ public class A2C { ...@@ -40,13 +40,16 @@ public class A2C {
public static void main(String[] args) throws IOException { public static void main(String[] args) throws IOException {
if ( args.length < 2 ) { if ( args.length < 2 ) {
System.out.println(""); System.out.println("");
System.out.println(" USAGE: A2C <ali.faa> <seq.fna>"); System.out.println(""); System.out.println(" A2C v." + VERSION + " Copyright (C) 2015-2020 Institut Pasteur");
System.out.println(" where <ali.faa> is a FASTA-formatted multiple amino-acid"); System.out.println("");
System.out.println(" sequence alignment file and <seq.fna> a FASTA-formatted"); System.out.println(" USAGE: A2C <ali.faa> <seq.fna>");
System.out.println(" file containing the associated codon sequences. This"); System.out.println("");
System.out.println(" will output in stdout the multiple back-translated"); System.out.println(" where <ali.faa> is a FASTA-formatted multiple amino acid sequence");
System.out.println(" sequence alignment."); System.out.println(" alignment file and <seq.fna> a FASTA-formatted file containing the");
System.out.println(""); System.exit(0); System.out.println(" associated codon sequences. This will output in stdout the multiple");
System.out.println(" back-translated sequence alignment.");
System.out.println("");
System.exit(0);
} }
fh = new ArrayList<String>(); aa = new ArrayList<StringBuilder>(); nt = new ArrayList<StringBuilder>(); i = n = -1; fh = new ArrayList<String>(); aa = new ArrayList<StringBuilder>(); nt = new ArrayList<StringBuilder>(); i = n = -1;
if ( ! (aafile=new File(args[0])).exists() ) { System.err.println("file " + args[0] + " does not exist"); System.exit(1); } if ( ! (aafile=new File(args[0])).exists() ) { System.err.println("file " + args[0] + " does not exist"); System.exit(1); }
......
/* /*
#################################################################### ########################################################################################################
C2A: translating a FASTA-formatted codon sequence file into an
amino-acid one
Copyright (C) 2015-2018 Alexis Criscuolo C2A: translating a FASTA-formatted codon sequence file into an amino-acid one
This program is free software: you can redistribute it and/or modify Copyright (C) 2015-2020 Institut Pasteur
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or This program is free software: you can redistribute it and/or modify it under the terms of the GNU
General Public License as published by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version. (at your option) any later version.
This program is distributed in the hope that it will be useful, but This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even
WITHOUT ANY WARRANTY; without even the implied warranty of the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU License for more details.
General Public License for more details.
You should have received a copy of the GNU General Public License You should have received a copy of the GNU General Public License along with this program. If not, see
along with this program. If not, see <http://www.gnu.org/licenses/>. <http://www.gnu.org/licenses/>.
Contact: Contact:
Institut Pasteur Alexis Criscuolo alexis.criscuolo@pasteur.fr
Bioinformatics and Biostatistics Hub Genome Informatics & Phylogenetics (GIPhy) giphy.pasteur.fr
C3BI, USR 3756 IP CNRS Bioinformatics and Biostatistics Hub research.pasteur.fr/team/hub-giphy
Paris, FRANCE USR 3756 IP CNRS research.pasteur.fr/team/bioinformatics-and-biostatistics-hub
Dpt. Biologie Computationnelle research.pasteur.fr/department/computational-biology
Institut Pasteur, Paris, FRANCE research.pasteur.fr
alexis.criscuolo@pasteur.fr ########################################################################################################
####################################################################
*/ */
import java.io.*; import java.io.*;
public class C2A { public class C2A {
final static String VERSION = "1.3b.201024ac";
static BufferedReader in; static BufferedReader in;
static String line, fh; static String line, fh;
static int lgt; static int lgt;
...@@ -36,11 +36,15 @@ public class C2A { ...@@ -36,11 +36,15 @@ public class C2A {
public static void main(String[] args) throws IOException { public static void main(String[] args) throws IOException {
if ( args.length < 1 ) { if ( args.length < 1 ) {
System.out.println(""); System.out.println("");
System.out.println(" USAGE: C2A <seq.fna>"); System.out.println(""); System.out.println(" C2A v." + VERSION + " Copyright (C) 2015-2020 Institut Pasteur");
System.out.println(" where <seq.fna> is a FASTA-formatted codon sequence file."); System.out.println("");
System.out.println(" This will output in stdout the translation (standard"); System.out.println(" USAGE: C2A <seq.fna>");
System.out.println(" genetic code) of each sequence in the same format."); System.out.println("");
System.out.println(""); System.exit(0); System.out.println(" where <seq.fna> is a FASTA-formatted codon sequence file. This will");
System.out.println(" output in stdout the translation (standard genetic code) of each");
System.out.println(" sequence in the same format.");
System.out.println("");
System.exit(0);
} }
try { in = new BufferedReader(new FileReader(new File(args[0]))); sb = new StringBuilder(""); } try { in = new BufferedReader(new FileReader(new File(args[0]))); sb = new StringBuilder(""); }
catch ( FileNotFoundException e ) { System.err.println("file " + args[0] + " does not exist"); System.exit(1); } catch ( FileNotFoundException e ) { System.err.println("file " + args[0] + " does not exist"); System.exit(1); }
......
GCJ=gcj
GCJFLAGS=-fsource=1.6 -march=native -msse2 -O3 -minline-all-stringops -fomit-frame-pointer -momit-leaf-frame-pointer -fstrict-aliasing -fno-store-check -fno-bounds-check -funroll-all-loops -Wall
OTHERFLAGS=-funsafe-math-optimizations -ffast-math
all: C2A A2C
C2A: C2A.java
$(GCJ) $(GCJFLAGS) --main=C2A C2A.java -o c2a
A2C: A2C.java
$(GCJ) $(GCJFLAGS) --main=A2C A2C.java -o a2c