diff --git a/Examples/commands/5-Align.sh b/Examples/commands/5-Align.sh
index 81eb407d93173093bef0ff5f489cac0d336e564b..00e43f813ad6659621d93c0251f093fd7877f671 100644
--- a/Examples/commands/5-Align.sh
+++ b/Examples/commands/5-Align.sh
@@ -1 +1,5 @@
+# run example (after running annotate, pan and core steps)
 PanACoTA align -c Examples/4-corepers/PersGenome_<pangenome-filename>_1.lst -l Examples/2-res-prokka/LSTINFO-list_genomes.lst -n GENO3_1 -d Examples/2-res-prokka -o Examples/5-align
+
+# only run align step of example
+PanACoTA align -c Examples/input_files/align-input/coregenome-example.lst -l Examples/input_files/pan-input/LSTINFO-list_genomes.lst -n GENO3_1 -d Examples/input_files/pan-input -o Examples/5-align
\ No newline at end of file
diff --git a/Examples/input_files/align-input/coregenome-example.lst b/Examples/input_files/align-input/coregenome-example.lst
new file mode 100644
index 0000000000000000000000000000000000000000..cf2b22cbc020f59b8fd7b577b2f90b62a80a0230
--- /dev/null
+++ b/Examples/input_files/align-input/coregenome-example.lst
@@ -0,0 +1,6 @@
+3 GEN4.1111.00001.0001i_00005 GENO.0321.00001.0002i_00005 GENO.1216.00002.0001i_00006
+4 GEN4.1111.00001.0001i_00008 GENO.0321.00001.0002i_00010 GENO.1216.00002.0002b_00009
+5 GEN4.1111.00001.0001b_00009 GENO.0321.00001.0002b_00011 GENO.1216.00002.0002b_00010
+7 GEN4.1111.00001.0001i_00002 GENO.0321.00001.0002b_00003 GENO.1216.00002.0001i_00003
+9 GEN4.1111.00001.0001i_00007 GENO.0321.00001.0002i_00009 GENO.1216.00002.0001b_00008
+11 GEN4.1111.00001.0001i_00004 GENO.0321.00001.0002i_00004 GENO.1216.00002.0001i_00005
diff --git a/Examples/input_files/pan-input/Genes/GEN4.1111.00001.gen b/Examples/input_files/pan-input/Genes/GEN4.1111.00001.gen
new file mode 100644
index 0000000000000000000000000000000000000000..d5c426371e32800c11fc2fb1013a41642a4b0e09
--- /dev/null
+++ b/Examples/input_files/pan-input/Genes/GEN4.1111.00001.gen
@@ -0,0 +1,128 @@
+>GEN4.1111.00001.0001b_00001 831 amiD |
N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAGCAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>GEN4.1111.00001.0001i_00002 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 | NA +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTATGCG +TTTGTGCATATGACGCTGAAAATCGGTACCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTC +GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAG +>GEN4.1111.00001.0001i_00003 804 hpcG | 2-oxo-hept-4-ene-1,7-dioate hydratase | 4.2.1.163 | similar to AA sequence:UniProtKB:P42270 | NA +ATGCTCGATAAACAGACCCATACCCTGATCGCCCAGCGACTTAATCAGGCTGAAAAACAG +CGTGAACAAATTCGCGCAGTGTCGCTGGATTATCCCAACATCACTATTGAAGATGCCTAT +GCCGTACAGCGTGAATGGGTCAATATCAAGATCGCCGAAGGGCGCACGCTCAAAGGCCAC +AAAATCGGCCTGACCTCAAAAGCGATGCAGGCCAGCTCGCAAATCAGCGAACCGGATTAC +GGCGCGCTGCTTGACGATATGTTCTTCCATGACGGCGGCGATATCCCCACCGACCGTTTT 
+ATCGTCCCGCGTATTGAAGTGGAGCTGGCGTTCGTGCTGGCGAAACCGCTGCGCGGCCCT +CACTGCACGCTGTTCGACGTCTACAACGCCACGGATTATGTGATTCCGGCGCTGGAACTG +ATTGACGCCCGCAGCCACAACATCGACCCGGAAACCCAGCGCCCGCGCAAAGTGTTCGAC +ACCATTTCCGACAACGCCGCCAACGCCGGGGTGATCCTCGGTGGTCGCCCCATCAAACCA +GACGAGCTGGATCTGCGCTGGATCTCCGCGCTGCTCTATCGCAACGGCGTGATCGAAGAA +ACCGGCGTCGCCGCAGGCGTGCTGAATCATCCGGCCAACGGCGTGGCGTGGCTGGCGAAC +AAGCTTGCCCCCTACGATGTCCAGCTTGAAGCCGGGCAGATCATCCTCGGCGGCTCGTTC +ACCCGCCCGGTGCCGGCGAGCAAGGGCGACACCTTCCATGTCGATTACGGCAACATGGGC +GCGATCAGTTGCCGGTTTGTGTAA +>GEN4.1111.00001.0001i_00004 1404 NA | hypothetical protein | NA | NA | NA +ATGCCCGTGAATAAGTTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCTGTT +CTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATT +AAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAG +ACCGTGGTCTTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCG +GCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATT +TTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTG +GATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTC +GCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGAC +ATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGAC +GGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAAT +GTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGT +ACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAAC +AGTTATCCTGAAGATCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGAT +TGCCGACAGCTGGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCC +AAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATT +TATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCTGGGATGCTC +ATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAAC +GCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCC +GGCAACATCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAA +CTGCATAATCAACCGCAGCACCTCTTTCTGCGTAATATCAACGTGATGCAAACTTCAGCG +ATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATG +GCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAG 
+AGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTT +TCGCTGCCGAAGCGGGGAGGGTAA +>GEN4.1111.00001.0001i_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 | COG:COG4598 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGTTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCC +GGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCG +CACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>GEN4.1111.00001.0001i_00006 768 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +GTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGG +GTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGAC +TTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATG +TTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAA +GGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGC +GAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAG +CGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGT +GGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTG +GGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCAC +TATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAAC +CATGCCGAGATACGTCGCTGGCGCTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGA +CCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTC +AAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAG 
+>GEN4.1111.00001.0001i_00007 261 NA | hypothetical protein | NA | NA | NA +ATGGCTGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCT +GAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGAT +CGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAA +ATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGTC +GGTTATCGACTCGAAGGTTGA +>GEN4.1111.00001.0001i_00008 582 NA | hypothetical protein | NA | NA | NA +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTCGAGATGATCGTCCCTCAACTA +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>GEN4.1111.00001.0001b_00009 1083 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 | COG:COG1593 +ATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAA +CTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCC +GGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGC +GAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATG +TTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATT +GCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTG +GTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGC +ATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATCGTC +GGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTAT +ACATTATTGTTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATT +TTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCG +ATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGT +ATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGC 
+GCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATG +GCTAAACTGGGCGTCGATCCGGTGCATTTTGGCATTATCATGATCTATAACCTGGCGATT +GGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTC +AAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTG +TTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATG +TAA diff --git a/Examples/input_files/pan-input/Genes/GENO.0321.00001.gen b/Examples/input_files/pan-input/Genes/GENO.0321.00001.gen new file mode 100644 index 0000000000000000000000000000000000000000..7e6ce6accb014d8cee7e5004693b0fe096ccff7f --- /dev/null +++ b/Examples/input_files/pan-input/Genes/GENO.0321.00001.gen @@ -0,0 +1,169 @@ +>GENO.0321.00001.0001b_00001 1668 glnS | Glutamine--tRNA ligase | 6.1.1.18 | similar to AA sequence:UniProtKB:P00962 | COG:COG0008 +ATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTG +GCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTG +CATATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGC +CAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTACGTTGAT +TCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCC +TCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTAATCAATAAAGGCCTGGCC +TATGTTGATGAGCTGACGCCGGAGCAGATCCGTGAATACCGCGGTACGCTGACCGCGCCG +GGTAAAAACAGCCCGTTCCGCGATCGCAGCGTCGAAGAGAACCTCGCGCTATTTGAAAAA +ATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCG +TCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCAT +CAGACTGGCAACAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGAT +GCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGATAACCGTCGTCTG +TACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTCG +CGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGAC +AAACACGTCGAAGGTTGGGACGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGC +GGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGAC +AACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGAAAACGCG +CCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAATTACCCGCAGGGT +GAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAA 
+GTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCAGATTTCCGCGAAGAAGCGAACAAA +CAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGTCATTAAA +GCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGAT +GCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGG +GTTAGCGCAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTG +CCGAATCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATTAGTGATT +AAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGAAAAGCTTTCCAGTTT +GAACGTGAAGGCTACTTCTGCCTCGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTT +AACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAGTAA +>GENO.0321.00001.0001b_00002 831 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGTGGTTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGATATTATCGCGCGCTATAACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTAGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCACGCGAGCAGCAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>GENO.0321.00001.0002b_00003 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 | NA +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCG +TTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTT 
+GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAG +>GENO.0321.00001.0002i_00004 1410 NA | hypothetical protein | NA | NA | NA +ATGAGCATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTT +GCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTC +GATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGAC +GGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACG +ATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGG +TTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTG +ACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGC +CCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATC +GATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAA +ATGGATGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAG +TGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATT +AATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTAT +GACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGA +TCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTC +AAAGCCAAAAACATCACGCCCGGTTTCAGTAAAAATGCGGGTATTGATAACGCAACGATC +GCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGG +ATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAA +TTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATT +TCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACG +CTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGCAATATCAACGTGATGCAAACT +TCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTACGTGGTCAA +TTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAAC +GGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTG +AATTTTTCGCTGCCGAAGCGGGGAGGGTAA +>GENO.0321.00001.0002i_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 | COG:COG4598 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCC 
+GGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGATGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTCAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCTCGCCATGTCTCTTCG +CACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGTGATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>GENO.0321.00001.0002i_00006 735 trmD_1 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +ATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTG +AACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGAC +CGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATT +CACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGA +CGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTG +TGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGG +TCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCC +GTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTT +GCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAA +GTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAG +TCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAA +GAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACAT +GATGGGATGGCATAG +>GENO.0321.00001.0002i_00007 828 trmD_2 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +ATGATGGGATGGCATAGCCGTCATATGGGCTCTCAGAGGAGACGCGCTAATGCGCAGTAC +GTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGG +GTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGAC +TTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATG 
+TTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAA +GGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGC +GAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAG +CGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGT +GGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTG +GGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCAC +TATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAAC +CATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGA +CCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTC +AAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAG +>GENO.0321.00001.0002i_00008 465 NA | hypothetical protein | NA | NA | NA +ATGAAGTTTGTTGCGCCCGAACAGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCG +TTCTGGCCTGATGTGGACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTG +ACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTG +TACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCA +GAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGG +GCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAG +CGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATC +AGCCGGGTGCAGGATGTACCGCACTGTACGGTGGAGCTTATCTGA +>GENO.0321.00001.0002i_00009 261 NA | hypothetical protein | NA | NA | NA +ATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCT +GAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGAT +CGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAA +ATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCC +GGTTATCGACTCGAAGGTTGA +>GENO.0321.00001.0002i_00010 582 NA | hypothetical protein | NA | NA | NA +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAGCTG +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC 
+CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGTATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>GENO.0321.00001.0002b_00011 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 | COG:COG1593 +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTC +TCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCC +GCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC +GGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTACGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGTATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATTGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATT +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCAATCATGGCTAAACTGGGCGTC +GATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTG +ATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/input_files/pan-input/Genes/GENO.1216.00002.gen b/Examples/input_files/pan-input/Genes/GENO.1216.00002.gen new file mode 100644 index 0000000000000000000000000000000000000000..e58bf05f8b869e5db8d0cc48d26f9110b29b1fb7 --- /dev/null 
+++ b/Examples/input_files/pan-input/Genes/GENO.1216.00002.gen @@ -0,0 +1,158 @@ +>GENO.1216.00002.0001b_00001 831 amiD_1 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>GENO.1216.00002.0001i_00002 831 amiD_2 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGG 
+GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>GENO.1216.00002.0001i_00003 474 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 | NA +ATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAGG +GCGCGGTATCGCGAGGACTCTCTCTTCGAGGACATGCCGCACTTTATTGCTGAATGTACT +GAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTG +GCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACC +TGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGC +GCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTCGGGCTGATTAAA +GCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAG +TTACATCCAACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAG +>GENO.1216.00002.0001i_00004 225 pspB | Phage shock protein B | NA | similar to AA sequence:UniProtKB:P0AFM9 | NA +ATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATT +TGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAG +CAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTG +GAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAA +>GENO.1216.00002.0001i_00005 1404 NA | hypothetical protein | NA | NA | NA +ATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTTGCTGTT +CTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATT +AAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAG +ACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCG +GCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATT +TTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTG +GATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGCCCCGTC +GCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGAC +ATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGAC +GGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGTGGAAT +GTCGCGATTCACGACCGCGATATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGT +ACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAAC +AGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGAT 
+TGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCC +AAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCAATT +TATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTC +ATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAAC +GCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCC +GGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAA +CTGCATAATCAACCGCAGCACCTCTTCCTGCGTAATATCAACGTGATGCAAACTTCAGCG +ATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATG +GCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAG +AGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTT +TCGCTGCCGAAGCGGGGAGGGTAA +>GENO.1216.00002.0001i_00006 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 | COG:COG4598 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCC +GGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGAC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCTCTCCGGTGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACT +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGTTCGCCATGTCTCTTCG +CACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>GENO.1216.00002.0001i_00007 792 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +ATGCGCGCGATATATCGGCGATGCGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATG +TTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAATGGCCTGCTGAAC +ATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGCACCGTACCGTGGACGATCGT +CCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCAC 
+GCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGC +AAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGT +GGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCA +ATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTC +GCCCGGTTTATACCGGGAGTTCTGGGGCATGAAGCATCAGCAATCGAAGATTCGTTTGCT +GATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTA +CCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCG +CTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAG +CAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGAT +GGGATGGCATAG +>GENO.1216.00002.0001b_00008 288 NA | hypothetical protein | NA | NA | NA +ATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGC +GCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACA +CACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTG +ACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCC +GGGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA +>GENO.1216.00002.0002b_00009 582 NA | hypothetical protein | NA | NA | NA +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTA +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>GENO.1216.00002.0002b_00010 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 | COG:COG1593 +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT 
+TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGGCTGATCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGTTCGCTC +TCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCC +GCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC +GGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTAGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATCGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCAATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATC +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTGGGCGTC +GATCCGGTGCATTTGGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTG +ATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCATACTGTTCTTACCCCGTCTACTGGGCATCATGTAA +>GENO.1216.00002.0003b_00011 255 NA | hypothetical protein | NA | NA | NA +ATGATAATTAATGGGAAATTAATTAAAGCAAAAGACTTAGCTAAGGCTGCAGGTGTATCT +CGTTCAACAGTGATTAAATATTACGGCATTAGCCGTGAGAATTACGAAAGGGTAGCAACT +GAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATCAGGTTTAAAATGGAAAGAAGTTGCT +GAAAAAATGAACACGACAAAATATAGCGCAATTGCATATTATAGACGATATTTAGCATTA +GAGAAAAACAAATAA +>GENO.1216.00002.0003b_00012 780 NA | IS21 family transposase ISSen3 | NA | similar to AA sequence:ISfinder:ISSen3 | NA +ATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGCCGAACAGCTCCAGCTGGACAGT +CTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGTGGATCAGGAATGGAGCTACATG +GACTTCCTGGAGCACCTGTTACATGAGGAGAAACTGGCCCGGCATCAGCGTAAACAGGCG +ATGTACACGCGGATGGCAGCCTTCCCGGCGGTAAAGACGTTCGAGGAGTACGACTTCACC +TTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCTGCGATCCCTGAGCTTCATAGAG +CGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGTGGGAAAAACGCATCTGGCGATA 
+GCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGTTCGCTTCACAACAGCAGCGGAC
+CTGCTGCTACAGCTGTCCACTTCACAGCGTCAGGGCCGTTACAAAACGACTCTCAATCGT
+GGTGTCATGGCCCCGAAGCTGCTTATCATCGATGAAATAGGTTATCTGCCGTTCAGTCAG
+GAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACGTTACGAGAAGAGCGCGATGATC
+CTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGACGTTCGCCGGTGATGCAGCGCTG
+ACATCGGCGATGCTGGACCGGATCTTACATCACTCACACGTCGTGCAAATAAAAGGGGAA
+AGCTATCGACTGAAGCAGAAACGAAAGGCCGGGGTTATAGCTGAAGCTAATCCTGAGTAA
diff --git a/Examples/input_files/pan-input/LSTINFO-list_genomes.lst b/Examples/input_files/pan-input/LSTINFO-list_genomes.lst
new file mode 100644
index 0000000000000000000000000000000000000000..3b4150bc114612a9d45b684c710f91de4cc0dae1
--- /dev/null
+++ b/Examples/input_files/pan-input/LSTINFO-list_genomes.lst
@@ -0,0 +1,4 @@
+gembase_name orig_name to_annotate gsize nb_conts L90
+GEN4.1111.00001 genome4.fst Examples/input_files/pan-init-files/tmp_files/genome4.fst_prokka-split5N.fna 7134 1 1
+GENO.0321.00001 genome1.fst Examples/input_files/pan-init-files/tmp_files/genome1.fst_prokka-split5N.fna 9808 2 2
+GENO.1216.00002 genome3-chromo.fst-all.fna Examples/input_files/pan-init-files/tmp_files/genome3-chromo.fst-all.fna_prokka-split5N.fna 8817 3 3
diff --git a/Examples/input_files/pan-input/Proteins/GEN4.1111.00001.prt b/Examples/input_files/pan-input/Proteins/GEN4.1111.00001.prt
new file mode 100644
index 0000000000000000000000000000000000000000..841d437809448de6299df1faef13f9da3d26df92
--- /dev/null
+++ b/Examples/input_files/pan-input/Proteins/GEN4.1111.00001.prt
@@ -0,0 +1,52 @@
+>GEN4.1111.00001.0001b_00001 831 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023
+MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL
+ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL
+ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR
+FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR
+VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD
+>GEN4.1111.00001.0001i_00002
381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 | NA +MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYA +FVHMTLKIGTGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>GEN4.1111.00001.0001i_00003 804 hpcG | 2-oxo-hept-4-ene-1,7-dioate hydratase | 4.2.1.163 | similar to AA sequence:UniProtKB:P42270 | NA +MLDKQTHTLIAQRLNQAEKQREQIRAVSLDYPNITIEDAYAVQREWVNIKIAEGRTLKGH +KIGLTSKAMQASSQISEPDYGALLDDMFFHDGGDIPTDRFIVPRIEVELAFVLAKPLRGP +HCTLFDVYNATDYVIPALELIDARSHNIDPETQRPRKVFDTISDNAANAGVILGGRPIKP +DELDLRWISALLYRNGVIEETGVAAGVLNHPANGVAWLANKLAPYDVQLEAGQIILGGSF +TRPVPASKGDTFHVDYGNMGAISCRFV +>GEN4.1111.00001.0001i_00004 1404 NA | hypothetical protein | NA | NA | NA +MPVNKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFADGQ +TVVLPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVTMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD +GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSTYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNIPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM +ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>GEN4.1111.00001.0001i_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 | COG:COG4598 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>GEN4.1111.00001.0001i_00006 768 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGM +LMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDE +RVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPH 
+YTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEF +KTEHAQQQHKHDGMA +>GEN4.1111.00001.0001i_00007 261 NA | hypothetical protein | NA | NA | NA +MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMVGYRLEG +>GEN4.1111.00001.0001i_00008 582 NA | hypothetical protein | NA | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>GEN4.1111.00001.0001b_00009 1083 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 | COG:COG1593 +MNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIAASTSIGGVMVPMSAR +EGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFAGGLVAGVLWGVGCML +VTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVY +TLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSMSITNIPAALSDMILG +ISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGVDPVHFGIIMIYNLAI +GTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITYIPEITLFLPRLLGIM diff --git a/Examples/input_files/pan-input/Proteins/GENO.0321.00001.prt b/Examples/input_files/pan-input/Proteins/GENO.0321.00001.prt new file mode 100644 index 0000000000000000000000000000000000000000..209d0341635283d5ae2f4c84f935bb820e54b15d --- /dev/null +++ b/Examples/input_files/pan-input/Proteins/GENO.0321.00001.prt @@ -0,0 +1,69 @@ +>GENO.0321.00001.0001b_00001 1668 glnS | Glutamine--tRNA ligase | 6.1.1.18 | similar to AA sequence:UniProtKB:P00962 | COG:COG0008 +MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIAQDYQG +QCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSDYFDQLHAYAVELINKGLA +YVDELTPEQIREYRGTLTAPGKNSPFRDRSVEENLALFEKMRTGGFEEGKACLRAKIDMA +SPFIVMRDPVLYRIKFAEHHQTGNKWCIYPMYDFTHCISDALEGITHSLCTLEFQDNRRL +YDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDPRMPTISGLRRR +GYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYPQG +ESEMVTMPNHPNKPEMGSREVPFSGEIWIDRADFREEANKQYKRLVMGKEVRLRNAYVIK 
+AERVEKDAEGNITTIFCTYDADTLSKDPADGRKVKGVIHWVSAAHALPIEIRLYDRLFSV +PNPGAAEDFLSVINPESLVIKQGYGEPSLKAAVAGKAFQFEREGYFCLDSRYATADKLVF +NRTVGLRDTWAKAGE +>GENO.0321.00001.0001b_00002 831 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYNIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGISAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GENO.0321.00001.0002b_00003 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 | NA +MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYA +FVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>GENO.0321.00001.0002i_00004 1410 NA | hypothetical protein | NA | NA | NA +MSMPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFAD +GQTVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNV +TLDVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQ +MDGARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERINCTNGKINWGIGIGLAGSTY +DNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPGFSKNAGIDNATI +AIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQI +SSGNTPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQ +FMARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>GENO.0321.00001.0002i_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 | COG:COG4598 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>GENO.0321.00001.0002i_00006 735 trmD_1 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 
+MFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGMLMMVQPLRDAI +HAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDERVIQTEIDEEW +SIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGME +VPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKH +DGMA +>GENO.0321.00001.0002i_00007 828 trmD_2 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +MMGWHSRHMGSQRRRANAQYVFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRD +FAHDRHRTVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVS +ELATNQKLILVCGRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVL +GHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRR +PELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA +>GENO.0321.00001.0002i_00008 465 NA | hypothetical protein | NA | NA | NA +MKFVAPEQAPEQAEVIKNTPFWPDVDLSEFRSVMRTDGTVTQPRLKQVVLTAISEVNAEL +YDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWARAVLNERYQDYDATASGVK +RGEELAEASGDLWRDARWAISRVQDVPHCTVELI +>GENO.0321.00001.0002i_00009 261 NA | hypothetical protein | NA | NA | NA +MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMAGYRLEG +>GENO.0321.00001.0002i_00010 582 NA | hypothetical protein | NA | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLYYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>GENO.0321.00001.0002b_00011 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 | COG:COG1593 +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA +GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ +GIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM +SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGV +DPVHFGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITY +IPEITLFLPRLLGIM diff --git 
a/Examples/input_files/pan-input/Proteins/GENO.1216.00002.prt b/Examples/input_files/pan-input/Proteins/GENO.1216.00002.prt new file mode 100644 index 0000000000000000000000000000000000000000..7c1048821cbe65e197fbc2c0672bcffe010aa868 --- /dev/null +++ b/Examples/input_files/pan-input/Proteins/GENO.1216.00002.prt @@ -0,0 +1,66 @@ +>GENO.1216.00002.0001b_00001 831 amiD_1 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GENO.1216.00002.0001i_00002 831 amiD_2 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GENO.1216.00002.0001i_00003 474 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 | NA +MPKRRRLPKHYWRSTARINRARYREDSLFEDMPHFIAECTENIREQADLPGLFSKVNEAL +AASGIFPIGGIRSRAHWLDTWQMADGKHDYAFVHMTLKIGAGRSLESRQEVGEMLFGLIK +AHFADLMENRYLALSFEIAELHPTLNYKQNNVHALFK +>GENO.1216.00002.0001i_00004 225 pspB | Phage shock protein B | NA | similar to AA sequence:UniProtKB:P0AFM9 | NA +MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLLQLTDDAQRMRERIQAL +EDILDAEHPNWRER +>GENO.1216.00002.0001i_00005 1404 NA | hypothetical protein | NA | NA | NA +MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFADGQ +TVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD 
+GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSTYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNTPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM +ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>GENO.1216.00002.0001i_00006 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 | COG:COG4598 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSED +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFVRHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>GENO.1216.00002.0001i_00007 792 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +MRAIYRRCVFIGIVSLFPEMFRAITDYGVTGRAVKNGLLNIQSWSPRDFTHDRHRTVDDR +PYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVC +GRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFA +DGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEE +QARLLAEFKTEHAQQQHKHDGMA +>GENO.1216.00002.0001b_00008 288 NA | hypothetical protein | NA | NA | NA +MNNHFGKGLMAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGIL +TRRYGLDKEMVMDFFKENHSGMAVRFFMAGYRLEG +>GENO.1216.00002.0002b_00009 582 NA | hypothetical protein | NA | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>GENO.1216.00002.0002b_00010 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 | COG:COG1593 +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLINFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA +GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ 
+GIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM +SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGV +DPVHLGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITY +IPEIILFLPRLLGIM +>GENO.1216.00002.0003b_00011 255 NA | hypothetical protein | NA | NA | NA +MIINGKLIKAKDLAKAAGVSRSTVIKYYGISRENYERVATERRKLAFELRASGLKWKEVA +EKMNTTKYSAIAYYRRYLALEKNK +>GENO.1216.00002.0003b_00012 780 NA | IS21 family transposase ISSen3 | NA | similar to AA sequence:ISfinder:ISSen3 | NA +MVELQHQRLMVLAEQLQLDSLIGAAPALSQQAVDQEWSYMDFLEHLLHEEKLARHQRKQA +MYTRMAAFPAVKTFEEYDFTFATGAPQKQIQSLRSLSFIERNENIVLLGPSGVGKTHLAI +AMGYEAVRAGIKVRFTTAADLLLQLSTSQRQGRYKTTLNRGVMAPKLLIIDEIGYLPFSQ +EEAKLFFQVIAKRYEKSAMILTSNLPFGQWDQTFAGDAALTSAMLDRILHHSHVVQIKGE +SYRLKQKRKAGVIAEANPE diff --git a/Examples/input_files/pan-input/Proteins/GENO3.All.prt b/Examples/input_files/pan-input/Proteins/GENO3.All.prt new file mode 100644 index 0000000000000000000000000000000000000000..97f24aea3e7de95bd89ac9630961b0e67ea8475c --- /dev/null +++ b/Examples/input_files/pan-input/Proteins/GENO3.All.prt @@ -0,0 +1,187 @@ +>GEN4.1111.00001.0001b_00001 831 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GEN4.1111.00001.0001i_00002 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 | NA +MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYA +FVHMTLKIGTGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>GEN4.1111.00001.0001i_00003 804 hpcG | 2-oxo-hept-4-ene-1,7-dioate hydratase | 4.2.1.163 | similar to AA sequence:UniProtKB:P42270 | NA +MLDKQTHTLIAQRLNQAEKQREQIRAVSLDYPNITIEDAYAVQREWVNIKIAEGRTLKGH 
+KIGLTSKAMQASSQISEPDYGALLDDMFFHDGGDIPTDRFIVPRIEVELAFVLAKPLRGP +HCTLFDVYNATDYVIPALELIDARSHNIDPETQRPRKVFDTISDNAANAGVILGGRPIKP +DELDLRWISALLYRNGVIEETGVAAGVLNHPANGVAWLANKLAPYDVQLEAGQIILGGSF +TRPVPASKGDTFHVDYGNMGAISCRFV +>GEN4.1111.00001.0001i_00004 1404 NA | hypothetical protein | NA | NA | NA +MPVNKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFADGQ +TVVLPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVTMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD +GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSTYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNIPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM +ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>GEN4.1111.00001.0001i_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 | COG:COG4598 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>GEN4.1111.00001.0001i_00006 768 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGM +LMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDE +RVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPH +YTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEF +KTEHAQQQHKHDGMA +>GEN4.1111.00001.0001i_00007 261 NA | hypothetical protein | NA | NA | NA +MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMVGYRLEG +>GEN4.1111.00001.0001i_00008 582 NA | hypothetical protein | NA | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR 
+HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>GEN4.1111.00001.0001b_00009 1083 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 | COG:COG1593 +MNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIAASTSIGGVMVPMSAR +EGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFAGGLVAGVLWGVGCML +VTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVY +TLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSMSITNIPAALSDMILG +ISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGVDPVHFGIIMIYNLAI +GTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITYIPEITLFLPRLLGIM +>GENO.0321.00001.0001b_00001 1668 glnS | Glutamine--tRNA ligase | 6.1.1.18 | similar to AA sequence:UniProtKB:P00962 | COG:COG0008 +MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIAQDYQG +QCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSDYFDQLHAYAVELINKGLA +YVDELTPEQIREYRGTLTAPGKNSPFRDRSVEENLALFEKMRTGGFEEGKACLRAKIDMA +SPFIVMRDPVLYRIKFAEHHQTGNKWCIYPMYDFTHCISDALEGITHSLCTLEFQDNRRL +YDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDPRMPTISGLRRR +GYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYPQG +ESEMVTMPNHPNKPEMGSREVPFSGEIWIDRADFREEANKQYKRLVMGKEVRLRNAYVIK +AERVEKDAEGNITTIFCTYDADTLSKDPADGRKVKGVIHWVSAAHALPIEIRLYDRLFSV +PNPGAAEDFLSVINPESLVIKQGYGEPSLKAAVAGKAFQFEREGYFCLDSRYATADKLVF +NRTVGLRDTWAKAGE +>GENO.0321.00001.0001b_00002 831 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYNIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGISAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GENO.0321.00001.0002b_00003 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 | NA +MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYA 
+FVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>GENO.0321.00001.0002i_00004 1410 NA | hypothetical protein | NA | NA | NA +MSMPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFAD +GQTVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNV +TLDVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQ +MDGARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERINCTNGKINWGIGIGLAGSTY +DNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPGFSKNAGIDNATI +AIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQI +SSGNTPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQ +FMARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>GENO.0321.00001.0002i_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 | COG:COG4598 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>GENO.0321.00001.0002i_00006 735 trmD_1 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +MFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGMLMMVQPLRDAI +HAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDERVIQTEIDEEW +SIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGME +VPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKH +DGMA +>GENO.0321.00001.0002i_00007 828 trmD_2 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +MMGWHSRHMGSQRRRANAQYVFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRD +FAHDRHRTVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVS +ELATNQKLILVCGRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVL +GHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRR +PELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA +>GENO.0321.00001.0002i_00008 465 NA | hypothetical protein | NA | NA | NA 
+MKFVAPEQAPEQAEVIKNTPFWPDVDLSEFRSVMRTDGTVTQPRLKQVVLTAISEVNAEL +YDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWARAVLNERYQDYDATASGVK +RGEELAEASGDLWRDARWAISRVQDVPHCTVELI +>GENO.0321.00001.0002i_00009 261 NA | hypothetical protein | NA | NA | NA +MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMAGYRLEG +>GENO.0321.00001.0002i_00010 582 NA | hypothetical protein | NA | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLYYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>GENO.0321.00001.0002b_00011 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 | COG:COG1593 +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA +GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ +GIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM +SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGV +DPVHFGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITY +IPEITLFLPRLLGIM +>GENO.1216.00002.0001b_00001 831 amiD_1 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GENO.1216.00002.0001i_00002 831 amiD_2 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 | COG:COG3023 +MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR 
+FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GENO.1216.00002.0001i_00003 474 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 | NA +MPKRRRLPKHYWRSTARINRARYREDSLFEDMPHFIAECTENIREQADLPGLFSKVNEAL +AASGIFPIGGIRSRAHWLDTWQMADGKHDYAFVHMTLKIGAGRSLESRQEVGEMLFGLIK +AHFADLMENRYLALSFEIAELHPTLNYKQNNVHALFK +>GENO.1216.00002.0001i_00004 225 pspB | Phage shock protein B | NA | similar to AA sequence:UniProtKB:P0AFM9 | NA +MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLLQLTDDAQRMRERIQAL +EDILDAEHPNWRER +>GENO.1216.00002.0001i_00005 1404 NA | hypothetical protein | NA | NA | NA +MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFADGQ +TVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD +GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSTYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNTPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM +ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>GENO.1216.00002.0001i_00006 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 | COG:COG4598 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSED +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFVRHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>GENO.1216.00002.0001i_00007 792 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 | COG:COG0336 +MRAIYRRCVFIGIVSLFPEMFRAITDYGVTGRAVKNGLLNIQSWSPRDFTHDRHRTVDDR +PYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVC +GRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFA +DGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEE +QARLLAEFKTEHAQQQHKHDGMA 
+>GENO.1216.00002.0001b_00008 288 NA | hypothetical protein | NA | NA | NA +MNNHFGKGLMAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGIL +TRRYGLDKEMVMDFFKENHSGMAVRFFMAGYRLEG +>GENO.1216.00002.0002b_00009 582 NA | hypothetical protein | NA | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>GENO.1216.00002.0002b_00010 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 | COG:COG1593 +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLINFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA +GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ +GIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM +SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGV +DPVHLGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITY +IPEIILFLPRLLGIM +>GENO.1216.00002.0003b_00011 255 NA | hypothetical protein | NA | NA | NA +MIINGKLIKAKDLAKAAGVSRSTVIKYYGISRENYERVATERRKLAFELRASGLKWKEVA +EKMNTTKYSAIAYYRRYLALEKNK +>GENO.1216.00002.0003b_00012 780 NA | IS21 family transposase ISSen3 | NA | similar to AA sequence:ISfinder:ISSen3 | NA +MVELQHQRLMVLAEQLQLDSLIGAAPALSQQAVDQEWSYMDFLEHLLHEEKLARHQRKQA +MYTRMAAFPAVKTFEEYDFTFATGAPQKQIQSLRSLSFIERNENIVLLGPSGVGKTHLAI +AMGYEAVRAGIKVRFTTAADLLLQLSTSQRQGRYKTTLNRGVMAPKLLIIDEIGYLPFSQ +EEAKLFFQVIAKRYEKSAMILTSNLPFGQWDQTFAGDAALTSAMLDRILHHSHVVQIKGE +SYRLKQKRKAGVIAEANPE