diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml index ab50ac48d7faf4cefc048a68684b589f57b411bd..7d84e726a385a324b7719bb18211da56ac046add 100755 --- a/.gitlab-ci.yml +++ b/.gitlab-ci.yml @@ -70,7 +70,11 @@ unit-test-ubuntu16.04: - pip3 install --upgrade pip - pip3 install -r requirements-dev.txt - ./make + - prokka --version + - mmseqs -h | head + - mafft --version - fastme -h + script: - py.test test/test_unit - more test_mmseq_cluster.log diff --git a/Examples/1-res-Annotate/Genes/EXAM.0219.00001.gen b/Examples/1-res-Annotate/Genes/EXAM.0219.00001.gen new file mode 100644 index 0000000000000000000000000000000000000000..b51d618b9eb022c635025692e6e73618f80224f5 --- /dev/null +++ b/Examples/1-res-Annotate/Genes/EXAM.0219.00001.gen @@ -0,0 +1,169 @@ +>EXAM.0219.00001.b0001_00001 1668 glnS | Glutamine--tRNA ligase | 6.1.1.18 | similar to AA sequence:UniProtKB:P00962 +ATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTG +GCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTG +CATATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGC +CAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTACGTTGAT +TCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCC +TCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTAATCAATAAAGGCCTGGCC +TATGTTGATGAGCTGACGCCGGAGCAGATCCGTGAATACCGCGGTACGCTGACCGCGCCG +GGTAAAAACAGCCCGTTCCGCGATCGCAGCGTCGAAGAGAACCTCGCGCTATTTGAAAAA +ATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCG +TCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCAT +CAGACTGGCAACAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGAT +GCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGATAACCGTCGTCTG +TACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTCG +CGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGAC +AAACACGTCGAAGGTTGGGACGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGC +GGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGAC +AACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGAAAACGCG +CCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAATTACCCGCAGGGT 
+GAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAA +GTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCAGATTTCCGCGAAGAAGCGAACAAA +CAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGTCATTAAA +GCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGAT +GCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGG +GTTAGCGCAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTG +CCGAATCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATTAGTGATT +AAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGAAAAGCTTTCCAGTTT +GAACGTGAAGGCTACTTCTGCCTCGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTT +AACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAGTAA +>EXAM.0219.00001.b0001_00002 831 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGTGGTTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGATATTATCGCGCGCTATAACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTAGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCACGCGAGCAGCAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>EXAM.0219.00001.b0002_00003 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCG +TTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTT 
+GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAG +>EXAM.0219.00001.i0002_00004 1410 NA | hypothetical protein | NA | NA +ATGAGCATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTT +GCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTC +GATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGAC +GGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACG +ATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGG +TTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTG +ACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGC +CCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATC +GATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAA +ATGGATGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAG +TGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATT +AATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTAT +GACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGA +TCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTC +AAAGCCAAAAACATCACGCCCGGTTTCAGTAAAAATGCGGGTATTGATAACGCAACGATC +GCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGG +ATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAA +TTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATT +TCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACG +CTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGCAATATCAACGTGATGCAAACT +TCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTACGTGGTCAA +TTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAAC +GGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTG +AATTTTTCGCTGCCGAAGCGGGGAGGGTAA +>EXAM.0219.00001.i0002_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCC +GGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGC 
+GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGATGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTCAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCTCGCCATGTCTCTTCG +CACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGTGATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>EXAM.0219.00001.i0002_00006 735 trmD_1 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +ATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTG +AACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGAC +CGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATT +CACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGA +CGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTG +TGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGG +TCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCC +GTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTT +GCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAA +GTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAG +TCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAA +GAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACAT +GATGGGATGGCATAG +>EXAM.0219.00001.i0002_00007 828 trmD_2 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +ATGATGGGATGGCATAGCCGTCATATGGGCTCTCAGAGGAGACGCGCTAATGCGCAGTAC +GTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGG +GTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGAC +TTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATG +TTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAA 
+GGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGC +GAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAG +CGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGT +GGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTG +GGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCAC +TATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAAC +CATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGA +CCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTC +AAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAG +>EXAM.0219.00001.i0002_00008 465 NA | hypothetical protein | NA | NA +ATGAAGTTTGTTGCGCCCGAACAGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCG +TTCTGGCCTGATGTGGACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTG +ACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTG +TACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCA +GAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGG +GCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAG +CGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATC +AGCCGGGTGCAGGATGTACCGCACTGTACGGTGGAGCTTATCTGA +>EXAM.0219.00001.i0002_00009 261 NA | hypothetical protein | NA | NA +ATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCT +GAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGAT +CGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAA +ATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCC +GGTTATCGACTCGAAGGTTGA +>EXAM.0219.00001.i0002_00010 582 NA | hypothetical protein | NA | NA +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAGCTG +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG 
+CTACTGCAACAGACGGGCGATGAAGTGCTGTATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>EXAM.0219.00001.b0002_00011 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTC +TCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCC +GCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC +GGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTACGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGTATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATTGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATT +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCAATCATGGCTAAACTGGGCGTC +GATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTG +ATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/1-res-Annotate/Genes/EXAM.1216.00002.gen b/Examples/1-res-Annotate/Genes/EXAM.1216.00002.gen new file mode 100644 index 0000000000000000000000000000000000000000..be5612c5b28d275f90fb47566718e6fcbb595233 --- /dev/null +++ b/Examples/1-res-Annotate/Genes/EXAM.1216.00002.gen @@ -0,0 +1,158 @@ 
+>EXAM.1216.00002.b0001_00001 831 amiD_1 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>EXAM.1216.00002.i0001_00002 831 amiD_2 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA 
+>EXAM.1216.00002.i0001_00003 474 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +ATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAGG +GCGCGGTATCGCGAGGACTCTCTCTTCGAGGACATGCCGCACTTTATTGCTGAATGTACT +GAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTG +GCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACC +TGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGC +GCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTCGGGCTGATTAAA +GCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAG +TTACATCCAACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAG +>EXAM.1216.00002.i0001_00004 225 pspB | Phage shock protein B | NA | similar to AA sequence:UniProtKB:P0AFM9 +ATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATT +TGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAG +CAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTG +GAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAA +>EXAM.1216.00002.i0001_00005 1404 NA | hypothetical protein | NA | NA +ATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTTGCTGTT +CTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATT +AAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAG +ACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCG +GCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATT +TTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTG +GATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGCCCCGTC +GCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGAC +ATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGAC +GGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGTGGAAT +GTCGCGATTCACGACCGCGATATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGT +ACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAAC +AGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGAT +TGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCC +AAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCAATT 
+TATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTC +ATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAAC +GCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCC +GGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAA +CTGCATAATCAACCGCAGCACCTCTTCCTGCGTAATATCAACGTGATGCAAACTTCAGCG +ATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATG +GCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAG +AGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTT +TCGCTGCCGAAGCGGGGAGGGTAA +>EXAM.1216.00002.i0001_00006 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCC +GGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGAC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCTCTCCGGTGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACT +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGTTCGCCATGTCTCTTCG +CACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>EXAM.1216.00002.i0001_00007 792 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +ATGCGCGCGATATATCGGCGATGCGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATG +TTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAATGGCCTGCTGAAC +ATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGCACCGTACCGTGGACGATCGT +CCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCAC +GCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGC +AAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGT 
+GGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCA +ATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTC +GCCCGGTTTATACCGGGAGTTCTGGGGCATGAAGCATCAGCAATCGAAGATTCGTTTGCT +GATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTA +CCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCG +CTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAG +CAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGAT +GGGATGGCATAG +>EXAM.1216.00002.b0001_00008 288 NA | hypothetical protein | NA | NA +ATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGC +GCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACA +CACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTG +ACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCC +GGGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA +>EXAM.1216.00002.b0002_00009 582 NA | hypothetical protein | NA | NA +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTA +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>EXAM.1216.00002.b0002_00010 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGGCTGATCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGTTCGCTC 
+TCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCC +GCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC +GGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTAGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATCGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCAATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATC +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTGGGCGTC +GATCCGGTGCATTTGGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTG +ATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCATACTGTTCTTACCCCGTCTACTGGGCATCATGTAA +>EXAM.1216.00002.b0003_00011 255 NA | hypothetical protein | NA | NA +ATGATAATTAATGGGAAATTAATTAAAGCAAAAGACTTAGCTAAGGCTGCAGGTGTATCT +CGTTCAACAGTGATTAAATATTACGGCATTAGCCGTGAGAATTACGAAAGGGTAGCAACT +GAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATCAGGTTTAAAATGGAAAGAAGTTGCT +GAAAAAATGAACACGACAAAATATAGCGCAATTGCATATTATAGACGATATTTAGCATTA +GAGAAAAACAAATAA +>EXAM.1216.00002.b0003_00012 780 NA | Insertion sequence IS5376 putative ATP-binding protein | NA | similar to AA sequence:UniProtKB:Q45619 +ATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGCCGAACAGCTCCAGCTGGACAGT +CTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGTGGATCAGGAATGGAGCTACATG +GACTTCCTGGAGCACCTGTTACATGAGGAGAAACTGGCCCGGCATCAGCGTAAACAGGCG +ATGTACACGCGGATGGCAGCCTTCCCGGCGGTAAAGACGTTCGAGGAGTACGACTTCACC +TTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCTGCGATCCCTGAGCTTCATAGAG +CGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGTGGGAAAAACGCATCTGGCGATA +GCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGTTCGCTTCACAACAGCAGCGGAC 
+CTGCTGCTACAGCTGTCCACTTCACAGCGTCAGGGCCGTTACAAAACGACTCTCAATCGT +GGTGTCATGGCCCCGAAGCTGCTTATCATCGATGAAATAGGTTATCTGCCGTTCAGTCAG +GAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACGTTACGAGAAGAGCGCGATGATC +CTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGACGTTCGCCGGTGATGCAGCGCTG +ACATCGGCGATGCTGGACCGGATCTTACATCACTCACACGTCGTGCAAATAAAAGGGGAA +AGCTATCGACTGAAGCAGAAACGAAAGGCCGGGGTTATAGCTGAAGCTAATCCTGAGTAA diff --git a/Examples/1-res-Annotate/Genes/GEN2.0219.00001.gen b/Examples/1-res-Annotate/Genes/GEN2.0219.00001.gen new file mode 100644 index 0000000000000000000000000000000000000000..63fed11564a754973c279b1a561afd4d445a59a3 --- /dev/null +++ b/Examples/1-res-Annotate/Genes/GEN2.0219.00001.gen @@ -0,0 +1,185 @@ +>GEN2.0219.00001.b0001_00001 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATTATCGGCTCGTCC +GGTTCCGGTAAAAGCACTTTTTTGCGCTGCATTAACTTCCTCGAAAAATCGAGCGAAGGC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAACCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCG +CACGTGATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCAATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>GEN2.0219.00001.b0001_00002 1404 NA | hypothetical protein | NA | NA +ATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCTGTT +CTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCCAGACCGTCGATATT +AAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGGCAG +ACTGTGGTCGTGCCGTCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCG +GCGGGAAAAACGCTGCGGATACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATT 
+TTGCTGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTG +GATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTC +GCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGAC +ATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGAC +GGCGCGCGGATTACGCATAGTCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAAT +GTCGCGATTCATGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGT +ACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCGCCTATGACAAT +AGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGAT +TGCCGACAACTGGTACACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCC +AAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATT +TATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTC +ATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAAC +GCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCC +GGTAACGCCCCCTCATTTGTTGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAA +CTGCATAATCAACCGCAGCACCTCTTTTTGCGTAATATCAACGTGATGCAAACTTCAGCG +ATTGGCCCGGCGTTAAAAATGCATTTTGATTTGCGTAAAGATGTCCGTGGTCAATTTATG +GCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAG +AGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTT +TCGCTGCCGAAGCGGGGAGGGTAA +>GEN2.0219.00001.b0002_00003 225 pspB | Phage shock protein B | NA | similar to AA sequence:UniProtKB:P0AFM9 +ATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATT +TGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAG +CAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTG +GAAGATATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAA +>GEN2.0219.00001.i0002_00004 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCAAGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCACCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCG +TTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTC +GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT 
+CTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAG +>GEN2.0219.00001.i0002_00005 834 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +ATGATGAAAGCGCTACTGTGGCTGGTTGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGC +GAAAAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCG +GCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCG +CTGGCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCA +TTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCTTGGCAT +GCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAG +CTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCC +GCGCAAATTCAGGCATTGATCCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAA +CCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCG +CGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGT +GTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTT +GCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAG +CGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGAT +GCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>GEN2.0219.00001.b0002_00006 1668 glnS | Glutamine--tRNA ligase | 6.1.1.18 | similar to AA sequence:UniProtKB:P00962 +ATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTG +GCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTG +CACATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGC +CAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTACGTTGAT +TCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCC +TCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTCATCAATAAAGGCCTGGCC +TACGTTGATGAGCTGACGCCGGAGCAGATCCGCGAATACCGTGGTACGCTGACCGCGCCG +GGTAAAAACAGCCCGTTCCGCGATCGCAGCGTGGAAGAAAACCTCGCGCTATTTGAAAAA +ATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCG +TCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCAT +CAGACCGGCACGAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGAT +GCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGACAACCGTCGTCTG 
+TACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTCG +CGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGAC +AAACACGTCGAAGGTTGGGATGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGC +GGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGAC +AACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGAAAACGCG +CCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAACTACCCGCAGGGC +GAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAA +GTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCGGATTTCCGCGAAGAAGCGAACAAA +CAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGTCATTAAA +GCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGAT +GCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGG +GTTAGCGTAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTG +CCGAACCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATTAGTGATT +AAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGGAAAGCTTTCCAGTTT +GAACGTGAAGGTTACTTCTGCCTTGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTT +AACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAATAA +>GEN2.0219.00001.b0003_00007 768 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +GTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGG +GTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGAC +TTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATG +TTAATGATGGTGCAACCCTTGCGGGACGCCATTCATGCAGCAAAAGCCGCGGCAGGTGAA +GGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGC +GAGCTGGCCACGAATCAGAAACTCATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAG +CGCGTAATTCAGGCCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGT +GGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTG +GGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCAC +TATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAAC +CATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCACTGGGCCGAACCTGGCTTAGAAGA +CCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTC +AAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAG +>GEN2.0219.00001.i0003_00008 261 NA | hypothetical protein | NA | NA 
+ATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCT +GAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGAT +CGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAA +ATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCT +GGTTATCGACTCGAAGGTTGA +>GEN2.0219.00001.i0003_00009 582 NA | hypothetical protein | NA | NA +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTG +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTACTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGGGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>GEN2.0219.00001.b0003_00010 369 NA | hypothetical protein | NA | NA +ATGATGCGCACTGATGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTTGCGCTGTCGGCA +ATTTCGGAGGTCAACGCAGAGCTGTATGAGTTTCGCAGACGCCAGCAGATGCTGGGGTAT +GCCTCGCTGGCAGAAGTCCCGGCGGAACAACTGGACGGCAAAAGCGAGCGCATTCAGCAC +TATTTCAACGCGGTTTACTGCTGGGCACGCGCCATGCTCAACGAACGTTACCAGGACTAT +GACGCCACGGCATCCGGTGTGAAGCGGGGCGAAGAACTGGCAGAAGCCAGCGGTGATTTG +TGGCGTGACGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCGCCGCACTGCACAGTGGAG +CTTATCTGA +>GEN2.0219.00001.b0004_00011 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGACTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTC +TCTTACACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCC +GCCTCAACCTCTATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC 
+GGTTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATTGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATAGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATC +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATTATGACTAAACTGGGCGTC +GATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATAGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGACGTT +ATCAAACCGTTGATGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCACACTGTTTTTACCCCGTCTACTGGGCATCATGTAA +>GEN2.0219.00001.i0004_00012 465 NA | hypothetical protein | NA | NA +ATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCG +TTCTGGCCTGATGTGAACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTG +ACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTG +TACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCA +GAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGG +GCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAG +CGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATC +AGCCGGGTGCAGGATGCACCGCACTGTACGGTGGAGCTTATCTGA +>GEN2.0219.00001.b0004_00013 1014 xerC | Tyrosine recombinase XerC | NA | similar to AA sequence:UniProtKB:P39776 +ATGGAAACAAATATTACCTGGCAACAATTGATAGATGAATATTTCTTCGCAAAACCTCTG +CGCTCAGCATCTGAATGGAGTTACACCAAAGTCTTCAAATCATTTGTACATTATATGGGG +CCGTTAAGCTGCCCTAATGATGTGACATATCACAAAGTGCTTGCCTGGCGCCGTTTTCTT +TTAAAAGAGAAAAAGCTGTCCGGACGTACCTGGAATAACAAGGTGGCGCATATGCGGGCC +ATCTTTAACTACGGAATACAGCGAGGGTTACTGCACTATGACGAAAATCCGTTTAACAAT +TCGGTAGTTAAACCGGACAAGAAGAGAAAGAAAACGCTCACTCAGGCACAGATTGAGTAT 
+GCCTATCAGATCATGGAGCAGTATGAAAATCAGGAGAATACAGGGCTGGGACTGAAATAT +TCCCGCTGCGCCTTATTTCCTGCATGGTTCTGGCTCACTGTCCTGGATACGCTCTATTAC +ACAGGGATACGTCAGAACCAGTTATTACATATTCGGCTGAATGATGTTGATTTGAGAGAA +GGGCAGATTCGGCTGATTACGGAGGGGTGTAAAAATCACAAAGAACACTATGTGCCGGTG +ATCAGTTTTCTGCGTCCACGGCTGACCTGTTTAATGGAGAAAGCGCAGAGCGAAGGATTG +AAAGGTAATGACCGCCTGTTCAATATTGCACTTTTTACCGGCAAAGATCCCGCCATTGGC +GATGACATGGATTCTCCTCAGGTAAGAGCATTCTTCCGTCGTCTGTCCAAGGAGTGTCAG +TTTGCGATCAGTCCTCATCGTTTCAGACACACGCTGGCCACGGAGATGATGAAAATGCCG +GAACAGAATCTGCATATGGCGCAAAGTGTGCTGGGTCATTCAAACATGAAATCCACGCTG +GAGTATGTGGAGAATGATATTGCAGTGATGGGGAGGGCTCTGGAAGCGCAGTTTATGCAG +ATTAAGGCAGCACATGCCCGAAGCATTTACAGTGGGTTGACAAAGAATAGATAA diff --git a/Examples/1-res-Annotate/Genes/GEN4.1111.00001.gen b/Examples/1-res-Annotate/Genes/GEN4.1111.00001.gen new file mode 100644 index 0000000000000000000000000000000000000000..91d8ef700197e5048f23c5da39581ad95e81c110 --- /dev/null +++ b/Examples/1-res-Annotate/Genes/GEN4.1111.00001.gen @@ -0,0 +1,128 @@ +>GEN4.1111.00001.b0001_00001 831 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAGCAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA 
+>GEN4.1111.00001.i0001_00002 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTATGCG +TTTGTGCATATGACGCTGAAAATCGGTACCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTC +GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAG +>GEN4.1111.00001.i0001_00003 804 hpcG | 2-oxo-hept-4-ene-1,7-dioate hydratase | 4.2.1.163 | similar to AA sequence:UniProtKB:P42270 +ATGCTCGATAAACAGACCCATACCCTGATCGCCCAGCGACTTAATCAGGCTGAAAAACAG +CGTGAACAAATTCGCGCAGTGTCGCTGGATTATCCCAACATCACTATTGAAGATGCCTAT +GCCGTACAGCGTGAATGGGTCAATATCAAGATCGCCGAAGGGCGCACGCTCAAAGGCCAC +AAAATCGGCCTGACCTCAAAAGCGATGCAGGCCAGCTCGCAAATCAGCGAACCGGATTAC +GGCGCGCTGCTTGACGATATGTTCTTCCATGACGGCGGCGATATCCCCACCGACCGTTTT +ATCGTCCCGCGTATTGAAGTGGAGCTGGCGTTCGTGCTGGCGAAACCGCTGCGCGGCCCT +CACTGCACGCTGTTCGACGTCTACAACGCCACGGATTATGTGATTCCGGCGCTGGAACTG +ATTGACGCCCGCAGCCACAACATCGACCCGGAAACCCAGCGCCCGCGCAAAGTGTTCGAC +ACCATTTCCGACAACGCCGCCAACGCCGGGGTGATCCTCGGTGGTCGCCCCATCAAACCA +GACGAGCTGGATCTGCGCTGGATCTCCGCGCTGCTCTATCGCAACGGCGTGATCGAAGAA +ACCGGCGTCGCCGCAGGCGTGCTGAATCATCCGGCCAACGGCGTGGCGTGGCTGGCGAAC +AAGCTTGCCCCCTACGATGTCCAGCTTGAAGCCGGGCAGATCATCCTCGGCGGCTCGTTC +ACCCGCCCGGTGCCGGCGAGCAAGGGCGACACCTTCCATGTCGATTACGGCAACATGGGC +GCGATCAGTTGCCGGTTTGTGTAA +>GEN4.1111.00001.i0001_00004 1404 NA | hypothetical protein | NA | NA +ATGCCCGTGAATAAGTTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCTGTT +CTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATT +AAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAG +ACCGTGGTCTTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCG +GCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATT +TTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTG 
+GATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTC +GCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGAC +ATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGAC +GGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAAT +GTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGT +ACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAAC +AGTTATCCTGAAGATCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGAT +TGCCGACAGCTGGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCC +AAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATT +TATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCTGGGATGCTC +ATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAAC +GCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCC +GGCAACATCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAA +CTGCATAATCAACCGCAGCACCTCTTTCTGCGTAATATCAACGTGATGCAAACTTCAGCG +ATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATG +GCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAG +AGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTT +TCGCTGCCGAAGCGGGGAGGGTAA +>GEN4.1111.00001.i0001_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGTTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCC +GGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCG +CACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTC 
+GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>GEN4.1111.00001.i0001_00006 768 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +GTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGG +GTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGAC +TTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATG +TTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAA +GGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGC +GAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAG +CGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGT +GGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTG +GGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCAC +TATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAAC +CATGCCGAGATACGTCGCTGGCGCTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGA +CCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTC +AAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAG +>GEN4.1111.00001.i0001_00007 261 NA | hypothetical protein | NA | NA +ATGGCTGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCT +GAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGAT +CGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAA +ATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGTC +GGTTATCGACTCGAAGGTTGA +>GEN4.1111.00001.i0001_00008 582 NA | hypothetical protein | NA | NA +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTCGAGATGATCGTCCCTCAACTA +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA 
+>GEN4.1111.00001.b0001_00009 1083 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +ATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAA +CTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCC +GGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGC +GAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATG +TTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATT +GCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTG +GTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGC +ATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATCGTC +GGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTAT +ACATTATTGTTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATT +TTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCG +ATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGT +ATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGC +GCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATG +GCTAAACTGGGCGTCGATCCGGTGCATTTTGGCATTATCATGATCTATAACCTGGCGATT +GGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTC +AAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTG +TTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATG +TAA diff --git a/Examples/1-res-Annotate/LSTINFO-list_genomes.lst b/Examples/1-res-Annotate/LSTINFO-list_genomes.lst new file mode 100644 index 0000000000000000000000000000000000000000..7c51ff2c46ca7362b2d6fb2576eb4fd7a5eebf23 --- /dev/null +++ b/Examples/1-res-Annotate/LSTINFO-list_genomes.lst @@ -0,0 +1,5 @@ +gembase_name orig_name gsize nb_conts L90 +EXAM.0219.00001 genome1.fst 9808 2 2 +EXAM.1216.00002 genome3-chromo.fst-all.fna 8817 3 3 +GEN2.0219.00001 genome2.fst 10711 4 4 +GEN4.1111.00001 genome4.fst 7134 1 1 diff --git a/Examples/1-res-Annotate/LSTINFO/EXAM.0219.00001.lst b/Examples/1-res-Annotate/LSTINFO/EXAM.0219.00001.lst new file mode 100644 index 
0000000000000000000000000000000000000000..2311964a8a8a2534bd269f71ba0f8c3400e480bd --- /dev/null +++ b/Examples/1-res-Annotate/LSTINFO/EXAM.0219.00001.lst @@ -0,0 +1,11 @@ +213 1880 D CDS EXAM.0219.00001.b0001_00001 glnS | Glutamine--tRNA ligase | 6.1.1.18 | similar to AA sequence:UniProtKB:P00962 +1916 2746 D CDS EXAM.0219.00001.b0001_00002 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +16 396 D CDS EXAM.0219.00001.b0002_00003 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +428 1837 D CDS EXAM.0219.00001.i0002_00004 NA | hypothetical protein | NA | NA +1869 2642 D CDS EXAM.0219.00001.i0002_00005 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +2709 3443 D CDS EXAM.0219.00001.i0002_00006 trmD_1 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +3427 4254 D CDS EXAM.0219.00001.i0002_00007 trmD_2 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +4299 4763 D CDS EXAM.0219.00001.i0002_00008 NA | hypothetical protein | NA | NA +4818 5078 D CDS EXAM.0219.00001.i0002_00009 NA | hypothetical protein | NA | NA +5117 5698 D CDS EXAM.0219.00001.i0002_00010 NA | hypothetical protein | NA | NA +5717 7024 D CDS EXAM.0219.00001.b0002_00011 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 diff --git a/Examples/1-res-Annotate/LSTINFO/EXAM.1216.00002.lst b/Examples/1-res-Annotate/LSTINFO/EXAM.1216.00002.lst new file mode 100644 index 0000000000000000000000000000000000000000..8bf48ea4de46727cf88a6a26b61df13a601de0ed --- /dev/null +++ b/Examples/1-res-Annotate/LSTINFO/EXAM.1216.00002.lst @@ -0,0 +1,12 @@ +1 831 D CDS EXAM.1216.00002.b0001_00001 amiD_1 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +880 1710 D CDS 
EXAM.1216.00002.i0001_00002 amiD_2 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +1655 2128 D CDS EXAM.1216.00002.i0001_00003 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +2172 2396 D CDS EXAM.1216.00002.i0001_00004 pspB | Phage shock protein B | NA | similar to AA sequence:UniProtKB:P0AFM9 +2425 3828 D CDS EXAM.1216.00002.i0001_00005 NA | hypothetical protein | NA | NA +3863 4636 D CDS EXAM.1216.00002.i0001_00006 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +4641 5432 D CDS EXAM.1216.00002.i0001_00007 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +5460 5747 D CDS EXAM.1216.00002.b0001_00008 NA | hypothetical protein | NA | NA +1 582 D CDS EXAM.1216.00002.b0002_00009 NA | hypothetical protein | NA | NA +629 1936 D CDS EXAM.1216.00002.b0002_00010 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +26 280 D CDS EXAM.1216.00002.b0003_00011 NA | hypothetical protein | NA | NA +323 1102 D CDS EXAM.1216.00002.b0003_00012 NA | Insertion sequence IS5376 putative ATP-binding protein | NA | similar to AA sequence:UniProtKB:Q45619 diff --git a/Examples/1-res-Annotate/LSTINFO/GEN2.0219.00001.lst b/Examples/1-res-Annotate/LSTINFO/GEN2.0219.00001.lst new file mode 100644 index 0000000000000000000000000000000000000000..5c3a772d94307bed42c76d1adf8ec2c760d4c62a --- /dev/null +++ b/Examples/1-res-Annotate/LSTINFO/GEN2.0219.00001.lst @@ -0,0 +1,13 @@ +1 774 D CDS GEN2.0219.00001.b0001_00001 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +857 2260 D CDS GEN2.0219.00001.b0001_00002 NA | hypothetical protein | NA | NA +28 252 D CDS GEN2.0219.00001.b0002_00003 pspB | Phage shock protein B | NA | similar to AA sequence:UniProtKB:P0AFM9 +301 681 D CDS 
GEN2.0219.00001.i0002_00004 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +764 1597 D CDS GEN2.0219.00001.i0002_00005 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +1628 3295 D CDS GEN2.0219.00001.b0002_00006 glnS | Glutamine--tRNA ligase | 6.1.1.18 | similar to AA sequence:UniProtKB:P00962 +30 797 D CDS GEN2.0219.00001.b0003_00007 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +862 1122 D CDS GEN2.0219.00001.i0003_00008 NA | hypothetical protein | NA | NA +1170 1751 D CDS GEN2.0219.00001.i0003_00009 NA | hypothetical protein | NA | NA +1895 2263 D CDS GEN2.0219.00001.b0003_00010 NA | hypothetical protein | NA | NA +1 1308 D CDS GEN2.0219.00001.b0004_00011 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +1340 1804 D CDS GEN2.0219.00001.i0004_00012 NA | hypothetical protein | NA | NA +1832 2845 D CDS GEN2.0219.00001.b0004_00013 xerC | Tyrosine recombinase XerC | NA | similar to AA sequence:UniProtKB:P39776 diff --git a/Examples/1-res-Annotate/LSTINFO/GEN4.1111.00001.lst b/Examples/1-res-Annotate/LSTINFO/GEN4.1111.00001.lst new file mode 100644 index 0000000000000000000000000000000000000000..7a86a11ffd5632a867c73ee5a267edc8a956a903 --- /dev/null +++ b/Examples/1-res-Annotate/LSTINFO/GEN4.1111.00001.lst @@ -0,0 +1,9 @@ +1 831 D CDS GEN4.1111.00001.b0001_00001 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +861 1241 D CDS GEN4.1111.00001.i0001_00002 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +1271 2074 D CDS GEN4.1111.00001.i0001_00003 hpcG | 2-oxo-hept-4-ene-1,7-dioate hydratase | 4.2.1.163 | similar to AA sequence:UniProtKB:P42270 +2111 3514 D CDS GEN4.1111.00001.i0001_00004 NA | hypothetical protein | NA | NA 
+3535 4308 D CDS GEN4.1111.00001.i0001_00005 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +4327 5094 D CDS GEN4.1111.00001.i0001_00006 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +5138 5398 D CDS GEN4.1111.00001.i0001_00007 NA | hypothetical protein | NA | NA +5413 5994 D CDS GEN4.1111.00001.i0001_00008 NA | hypothetical protein | NA | NA +6052 7134 D CDS GEN4.1111.00001.b0001_00009 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 diff --git a/Examples/1-res-Annotate/Proteins/EXAM.0219.00001.prt b/Examples/1-res-Annotate/Proteins/EXAM.0219.00001.prt new file mode 100644 index 0000000000000000000000000000000000000000..e07e3d351df0e2622d0d41e7bc0876cf14a9fbbc --- /dev/null +++ b/Examples/1-res-Annotate/Proteins/EXAM.0219.00001.prt @@ -0,0 +1,69 @@ +>EXAM.0219.00001.b0001_00001 1668 glnS | Glutamine--tRNA ligase | 6.1.1.18 | similar to AA sequence:UniProtKB:P00962 +MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIAQDYQG +QCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSDYFDQLHAYAVELINKGLA +YVDELTPEQIREYRGTLTAPGKNSPFRDRSVEENLALFEKMRTGGFEEGKACLRAKIDMA +SPFIVMRDPVLYRIKFAEHHQTGNKWCIYPMYDFTHCISDALEGITHSLCTLEFQDNRRL +YDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDPRMPTISGLRRR +GYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYPQG +ESEMVTMPNHPNKPEMGSREVPFSGEIWIDRADFREEANKQYKRLVMGKEVRLRNAYVIK +AERVEKDAEGNITTIFCTYDADTLSKDPADGRKVKGVIHWVSAAHALPIEIRLYDRLFSV +PNPGAAEDFLSVINPESLVIKQGYGEPSLKAAVAGKAFQFEREGYFCLDSRYATADKLVF +NRTVGLRDTWAKAGE +>EXAM.0219.00001.b0001_00002 831 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYNIKPQNVVAHADIAPQRKDDPGPR 
+FPWRELAAQGISAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>EXAM.0219.00001.b0002_00003 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYA +FVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>EXAM.0219.00001.i0002_00004 1410 NA | hypothetical protein | NA | NA +MSMPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFAD +GQTVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNV +TLDVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQ +MDGARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERINCTNGKINWGIGIGLAGSTY +DNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPGFSKNAGIDNATI +AIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQI +SSGNTPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQ +FMARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>EXAM.0219.00001.i0002_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>EXAM.0219.00001.i0002_00006 735 trmD_1 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +MFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGMLMMVQPLRDAI +HAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDERVIQTEIDEEW +SIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGME +VPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKH +DGMA +>EXAM.0219.00001.i0002_00007 828 trmD_2 | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +MMGWHSRHMGSQRRRANAQYVFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRD +FAHDRHRTVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVS 
+ELATNQKLILVCGRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVL +GHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRR +PELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA +>EXAM.0219.00001.i0002_00008 465 NA | hypothetical protein | NA | NA +MKFVAPEQAPEQAEVIKNTPFWPDVDLSEFRSVMRTDGTVTQPRLKQVVLTAISEVNAEL +YDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWARAVLNERYQDYDATASGVK +RGEELAEASGDLWRDARWAISRVQDVPHCTVELI +>EXAM.0219.00001.i0002_00009 261 NA | hypothetical protein | NA | NA +MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMAGYRLEG +>EXAM.0219.00001.i0002_00010 582 NA | hypothetical protein | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLYYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>EXAM.0219.00001.b0002_00011 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA +GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ +GIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM +SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGV +DPVHFGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITY +IPEITLFLPRLLGIM diff --git a/Examples/1-res-Annotate/Proteins/EXAM.1216.00002.prt b/Examples/1-res-Annotate/Proteins/EXAM.1216.00002.prt new file mode 100644 index 0000000000000000000000000000000000000000..82124bdfe988a7543ff47e3597fab93a41155985 --- /dev/null +++ b/Examples/1-res-Annotate/Proteins/EXAM.1216.00002.prt @@ -0,0 +1,66 @@ +>EXAM.1216.00002.b0001_00001 831 amiD_1 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL 
+ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>EXAM.1216.00002.i0001_00002 831 amiD_2 | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>EXAM.1216.00002.i0001_00003 474 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +MPKRRRLPKHYWRSTARINRARYREDSLFEDMPHFIAECTENIREQADLPGLFSKVNEAL +AASGIFPIGGIRSRAHWLDTWQMADGKHDYAFVHMTLKIGAGRSLESRQEVGEMLFGLIK +AHFADLMENRYLALSFEIAELHPTLNYKQNNVHALFK +>EXAM.1216.00002.i0001_00004 225 pspB | Phage shock protein B | NA | similar to AA sequence:UniProtKB:P0AFM9 +MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLLQLTDDAQRMRERIQAL +EDILDAEHPNWRER +>EXAM.1216.00002.i0001_00005 1404 NA | hypothetical protein | NA | NA +MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFADGQ +TVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD +GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSTYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNTPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM +ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>EXAM.1216.00002.i0001_00006 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSED +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV 
+LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFVRHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>EXAM.1216.00002.i0001_00007 792 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +MRAIYRRCVFIGIVSLFPEMFRAITDYGVTGRAVKNGLLNIQSWSPRDFTHDRHRTVDDR +PYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVC +GRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFA +DGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEE +QARLLAEFKTEHAQQQHKHDGMA +>EXAM.1216.00002.b0001_00008 288 NA | hypothetical protein | NA | NA +MNNHFGKGLMAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGIL +TRRYGLDKEMVMDFFKENHSGMAVRFFMAGYRLEG +>EXAM.1216.00002.b0002_00009 582 NA | hypothetical protein | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>EXAM.1216.00002.b0002_00010 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLINFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA +GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ +GIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM +SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGV +DPVHLGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITY +IPEIILFLPRLLGIM +>EXAM.1216.00002.b0003_00011 255 NA | hypothetical protein | NA | NA +MIINGKLIKAKDLAKAAGVSRSTVIKYYGISRENYERVATERRKLAFELRASGLKWKEVA +EKMNTTKYSAIAYYRRYLALEKNK +>EXAM.1216.00002.b0003_00012 780 NA | Insertion sequence IS5376 putative ATP-binding protein | NA | similar to AA sequence:UniProtKB:Q45619 +MVELQHQRLMVLAEQLQLDSLIGAAPALSQQAVDQEWSYMDFLEHLLHEEKLARHQRKQA 
+MYTRMAAFPAVKTFEEYDFTFATGAPQKQIQSLRSLSFIERNENIVLLGPSGVGKTHLAI +AMGYEAVRAGIKVRFTTAADLLLQLSTSQRQGRYKTTLNRGVMAPKLLIIDEIGYLPFSQ +EEAKLFFQVIAKRYEKSAMILTSNLPFGQWDQTFAGDAALTSAMLDRILHHSHVVQIKGE +SYRLKQKRKAGVIAEANPE diff --git a/Examples/1-res-Annotate/Proteins/GEN2.0219.00001.prt b/Examples/1-res-Annotate/Proteins/GEN2.0219.00001.prt new file mode 100644 index 0000000000000000000000000000000000000000..dd6ca8c2299c22eadc436d18081bb8bd09737ed6 --- /dev/null +++ b/Examples/1-res-Annotate/Proteins/GEN2.0219.00001.prt @@ -0,0 +1,77 @@ +>GEN2.0219.00001.b0001_00001 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKSSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWNHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGNPEQVF +GNPQSPRLQQFLKGSLK +>GEN2.0219.00001.b0001_00002 1404 NA | hypothetical protein | NA | NA +MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRQTVDIKDYPADDGIASFKQAFADGQ +TVVVPSGWVCENINAAITIPAGKTLRIQGAVRGNGRGRFILLDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVTMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD +GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSAYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNAPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM +ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>GEN2.0219.00001.b0002_00003 225 pspB | Phage shock protein B | NA | similar to AA sequence:UniProtKB:P0AFM9 +MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLLQLTDDAQRMRERIQAL +EDILDAEHPNWRER +>GEN2.0219.00001.i0002_00004 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +MPHFIAECTENIREQADLPSLFSKVNEALAATGIFPIGGIRSRAHWLDTWQMADGKHDYA +FVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>GEN2.0219.00001.i0002_00005 834 amiD | 
N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +MMKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVS +LATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIE +LENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGP +RFPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQ +RVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GEN2.0219.00001.b0002_00006 1668 glnS | Glutamine--tRNA ligase | 6.1.1.18 | similar to AA sequence:UniProtKB:P00962 +MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIAQDYQG +QCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSDYFDQLHAYAVELINKGLA +YVDELTPEQIREYRGTLTAPGKNSPFRDRSVEENLALFEKMRTGGFEEGKACLRAKIDMA +SPFIVMRDPVLYRIKFAEHHQTGTKWCIYPMYDFTHCISDALEGITHSLCTLEFQDNRRL +YDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDPRMPTISGLRRR +GYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYPQG +ESEMVTMPNHPNKPEMGSREVPFSGEIWIDRADFREEANKQYKRLVMGKEVRLRNAYVIK +AERVEKDAEGNITTIFCTYDADTLSKDPADGRKVKGVIHWVSVAHALPIEIRLYDRLFSV +PNPGAAEDFLSVINPESLVIKQGYGEPSLKAAVAGKAFQFEREGYFCLDSRYATADKLVF +NRTVGLRDTWAKAGE +>GEN2.0219.00001.b0003_00007 768 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 +MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGM +LMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDE +RVIQAEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPH +YTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEF +KTEHAQQQHKHDGMA +>GEN2.0219.00001.i0003_00008 261 NA | hypothetical protein | NA | NA +MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMAGYRLEG +>GEN2.0219.00001.i0003_00009 582 NA | hypothetical protein | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>GEN2.0219.00001.b0003_00010 369 NA | hypothetical protein | NA | NA 
+MMRTDGTVTQPRLKQVALSAISEVNAELYEFRRRQQMLGYASLAEVPAEQLDGKSERIQH +YFNAVYCWARAMLNERYQDYDATASGVKRGEELAEASGDLWRDARWAISRVQDAPHCTVE +LI +>GEN2.0219.00001.b0004_00011 1308 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA +GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ +GIFTAIEASAIAVVYTLLLTIVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM +SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMTKLGV +DPVHFGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEDVIKPLMPFYGAIIGVLLLITY +IPEITLFLPRLLGIM +>GEN2.0219.00001.i0004_00012 465 NA | hypothetical protein | NA | NA +MKFVAPEQAPEQAEVIKNTPFWPDVNLSEFRSVMRTDGTVTQPRLKQVVLTAISEVNAEL +YDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWARAVLNERYQDYDATASGVK +RGEELAEASGDLWRDARWAISRVQDAPHCTVELI +>GEN2.0219.00001.b0004_00013 1014 xerC | Tyrosine recombinase XerC | NA | similar to AA sequence:UniProtKB:P39776 +METNITWQQLIDEYFFAKPLRSASEWSYTKVFKSFVHYMGPLSCPNDVTYHKVLAWRRFL +LKEKKLSGRTWNNKVAHMRAIFNYGIQRGLLHYDENPFNNSVVKPDKKRKKTLTQAQIEY +AYQIMEQYENQENTGLGLKYSRCALFPAWFWLTVLDTLYYTGIRQNQLLHIRLNDVDLRE +GQIRLITEGCKNHKEHYVPVISFLRPRLTCLMEKAQSEGLKGNDRLFNIALFTGKDPAIG +DDMDSPQVRAFFRRLSKECQFAISPHRFRHTLATEMMKMPEQNLHMAQSVLGHSNMKSTL +EYVENDIAVMGRALEAQFMQIKAAHARSIYSGLTKNR diff --git a/Examples/1-res-Annotate/Proteins/GEN4.1111.00001.prt b/Examples/1-res-Annotate/Proteins/GEN4.1111.00001.prt new file mode 100644 index 0000000000000000000000000000000000000000..dccb76f56958d6f413a43f2d33a3ccd3aa04f4df --- /dev/null +++ b/Examples/1-res-Annotate/Proteins/GEN4.1111.00001.prt @@ -0,0 +1,52 @@ +>GEN4.1111.00001.b0001_00001 831 amiD | N-acetylmuramoyl-L-alanine amidase AmiD | 3.5.1.28 | similar to AA sequence:UniProtKB:P75820 +MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL 
+ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GEN4.1111.00001.i0001_00002 381 hpcD | 5-carboxymethyl-2-hydroxymuconate Delta-isomerase | 5.3.3.10 | similar to AA sequence:UniProtKB:Q05354 +MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYA +FVHMTLKIGTGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>GEN4.1111.00001.i0001_00003 804 hpcG | 2-oxo-hept-4-ene-1,7-dioate hydratase | 4.2.1.163 | similar to AA sequence:UniProtKB:P42270 +MLDKQTHTLIAQRLNQAEKQREQIRAVSLDYPNITIEDAYAVQREWVNIKIAEGRTLKGH +KIGLTSKAMQASSQISEPDYGALLDDMFFHDGGDIPTDRFIVPRIEVELAFVLAKPLRGP +HCTLFDVYNATDYVIPALELIDARSHNIDPETQRPRKVFDTISDNAANAGVILGGRPIKP +DELDLRWISALLYRNGVIEETGVAAGVLNHPANGVAWLANKLAPYDVQLEAGQIILGGSF +TRPVPASKGDTFHVDYGNMGAISCRFV +>GEN4.1111.00001.i0001_00004 1404 NA | hypothetical protein | NA | NA +MPVNKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFADGQ +TVVLPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVTMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD +GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSTYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNIPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM +ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>GEN4.1111.00001.i0001_00005 774 hisP | Histidine transport ATP-binding protein HisP | NA | similar to AA sequence:UniProtKB:P02915 +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>GEN4.1111.00001.i0001_00006 768 trmD | tRNA (guanine-N(1)-)-methyltransferase | 2.1.1.228 | similar to AA sequence:UniProtKB:P0A873 
+MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGM +LMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDE +RVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPH +YTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEF +KTEHAQQQHKHDGMA +>GEN4.1111.00001.i0001_00007 261 NA | hypothetical protein | NA | NA +MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMVGYRLEG +>GEN4.1111.00001.i0001_00008 582 NA | hypothetical protein | NA | NA +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>GEN4.1111.00001.b0001_00009 1083 dctM | C4-dicarboxylate TRAP transporter large permease protein DctM | NA | similar to AA sequence:UniProtKB:O07838 +MNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIAASTSIGGVMVPMSAR +EGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFAGGLVAGVLWGVGCML +VTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVY +TLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSMSITNIPAALSDMILG +ISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGVDPVHFGIIMIYNLAI +GTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITYIPEITLFLPRLLGIM diff --git a/Examples/1-res-Annotate/QC_L90-list_genomes.png b/Examples/1-res-Annotate/QC_L90-list_genomes.png new file mode 100644 index 0000000000000000000000000000000000000000..e501adc9b9429b8ce0ac215a40d81c39a5620bef Binary files /dev/null and b/Examples/1-res-Annotate/QC_L90-list_genomes.png differ diff --git a/Examples/1-res-Annotate/QC_nb-contigs-list_genomes.png b/Examples/1-res-Annotate/QC_nb-contigs-list_genomes.png new file mode 100644 index 0000000000000000000000000000000000000000..cee148de2bd0f0ba91c0cf7928a092fd4378ca93 Binary files /dev/null and b/Examples/1-res-Annotate/QC_nb-contigs-list_genomes.png differ diff --git a/Examples/1-res-Annotate/Replicons/EXAM.0219.00001.fna b/Examples/1-res-Annotate/Replicons/EXAM.0219.00001.fna new file mode 100644 index 
0000000000000000000000000000000000000000..3ee2afdd2e0dfe944155dd39bed6ce8300128b12 --- /dev/null +++ b/Examples/1-res-Annotate/Replicons/EXAM.0219.00001.fna @@ -0,0 +1,4 @@ +>EXAM.0219.00001.0001 +ATGTGTAAAAGCCGGAGGGGTTATCTTTTCCCGGCTTTTTATTATCAATTACTCATTAACTCCTGTTCCGTTCTTTTGCGTTTAATCACCGGAATATCTCCGGTATTGTTCAGCGCCCCGGAAATGTTTTTAACCACTGTTCTGCACTCCGTTTATTAAACGCGCTCAGCGCGCGCTCATATATCGCGCGCGCGCGCGCGCATATATATATAATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTGGCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTGCATATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGCCAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTACGTTGATTCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCCTCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTAATCAATAAAGGCCTGGCCTATGTTGATGAGCTGACGCCGGAGCAGATCCGTGAATACCGCGGTACGCTGACCGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCAGCGTCGAAGAGAACCTCGCGCTATTTGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCGTCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCATCAGACTGGCAACAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGATGCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGATAACCGTCGTCTGTACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTCGCGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGACAAACACGTCGAAGGTTGGGACGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGACAACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGAAAACGCGCCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAATTACCCGCAGGGTGAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCAGATTTCCGCGAAGAAGCGAACAAACAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGTCATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGATGCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGGGTTAGCGCAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTGCCGAATCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATTAGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGAAAAGCTTTCCAGTTTGAACGTGAAGGCTACTTCTGCCTCGACAGCC
GCTATGCAACGGCCGATAAGCTGGTCTTTAACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAGTAACGCGCATAGGCGGCCTTCAAGGAGGCGCTAGGCGAATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGTGGTTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGATATTATCGCGCGCTATAACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTAGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCACGCGAGCAGCAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATAGGCGCGCGAGATTACGCGCGCAGTATCGCGC +>EXAM.0219.00001.0002 
+CGCGCATAGCGCGCAATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTTGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGATATGAGGCGCGGCGCTATAAGGCGCTCATGAGCATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGATGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTAATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGGTTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGCAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTACGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGAATTCCGCGGCATATAGCGGCGCGATAGCATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCCGGTAA
AAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGATGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTCAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGTGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACGCTCAGAGGCGCGCGCGCGCTATATACGCGCAGTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCCGTCATATGGGCTCTCAGAGGAGACGCGCTAATGCGCAGTACGTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTT
GCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGCATACGCGCGCGATATATATTCGCGCAGAGGCGCGCATAGCGATGAAGTTTGTTGCGCCCGAACAGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCGTTCTGGCCTGATGTGGACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAGCGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGTACCGCACTGTACGGTGGAGCTTATCTGAACGGCTAGCGCGTAGCGGCAGTCTCGGATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGACGATCGCGCGCAGGAGGAGAGCTCTCTTCTTCTAGAGCATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAGCTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGTATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACGGCTATGCGCGCGCTCGATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCA
AACTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTACGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGTATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATTGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATTGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCAATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/1-res-Annotate/Replicons/EXAM.1216.00002.fna b/Examples/1-res-Annotate/Replicons/EXAM.1216.00002.fna new file mode 100644 index 0000000000000000000000000000000000000000..a6c8c7d61a21ec35944197e0e76c7e01cfd13245 --- /dev/null +++ b/Examples/1-res-Annotate/Replicons/EXAM.1216.00002.fna @@ -0,0 +1,6 @@ +>EXAM.1216.00002.0001 
+ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGAGTCCGGGAGCGGCGAGATTCCGGGCAGAGGAGGCGGCTATAGCGGATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAGGGCGCGGTATCGCGAGGACTCTCTCTTCGAGGACATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTG
TTCGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCCGAGTATTAGCGCGCGGCGGCGCTATAGGCGCGCTATAGGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTGGAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAACGAGAGTCTCGGAGGAGCGGCGCTCTGGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTTGCTGTTCTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGACGGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGATATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTCCTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGCGTCTGCGCGCGCGGAGAGAGGCTCTCAGGACATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCAC
TTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGACGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCTCTCCGGTGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACTTCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGTTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACCGGATGCGCGCGATATATCGGCGATGCGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAATGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGCACCGTACCGTGGACGATCGTCCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGAGTTCTGGGGCATGAAGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCCGGCTCTGCGAGAGAGGAGCGCTCGCATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA +>EXAM.1216.00002.0002 
+ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTACCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACGCGTAGCGGCCGGAATCTTCTCGGAGAGGCGCTTCTCTCTCGGAGATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGATCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGTTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTAGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATCGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCAATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTGGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCATACTGTTCTTACCCCGTCTACTGGGCATCATGTAAACCCGATGGCGCGCAGAGGCGCGAGTTCTGGA +>EXAM.1216.00002.0003 
+CACAGGGCTTAGAGGCGCTATGGCAATGATAATTAATGGGAAATTAATTAAAGCAAAAGACTTAGCTAAGGCTGCAGGTGTATCTCGTTCAACAGTGATTAAATATTACGGCATTAGCCGTGAGAATTACGAAAGGGTAGCAACTGAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATCAGGTTTAAAATGGAAAGAAGTTGCTGAAAAAATGAACACGACAAAATATAGCGCAATTGCATATTATAGACGATATTTAGCATTAGAGAAAAACAAATAACAGGCGCTAAGGCGGCGATCCTAGCGCGCGATCGCGCATGCGATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGCCGAACAGCTCCAGCTGGACAGTCTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGTGGATCAGGAATGGAGCTACATGGACTTCCTGGAGCACCTGTTACATGAGGAGAAACTGGCCCGGCATCAGCGTAAACAGGCGATGTACACGCGGATGGCAGCCTTCCCGGCGGTAAAGACGTTCGAGGAGTACGACTTCACCTTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCTGCGATCCCTGAGCTTCATAGAGCGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGTGGGAAAAACGCATCTGGCGATAGCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGTTCGCTTCACAACAGCAGCGGACCTGCTGCTACAGCTGTCCACTTCACAGCGTCAGGGCCGTTACAAAACGACTCTCAATCGTGGTGTCATGGCCCCGAAGCTGCTTATCATCGATGAAATAGGTTATCTGCCGTTCAGTCAGGAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACGTTACGAGAAGAGCGCGATGATCCTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGACGTTCGCCGGTGATGCAGCGCTGACATCGGCGATGCTGGACCGGATCTTACATCACTCACACGTCGTGCAAATAAAAGGGGAAAGCTATCGACTGAAGCAGAAACGAAAGGCCGGGGTTATAGCTGAAGCTAATCCTGAGTAA diff --git a/Examples/1-res-Annotate/Replicons/GEN2.0219.00001.fna b/Examples/1-res-Annotate/Replicons/GEN2.0219.00001.fna new file mode 100644 index 0000000000000000000000000000000000000000..df3fc3c1d8dfd5565b489da5f2295ddf9eb19bce --- /dev/null +++ b/Examples/1-res-Annotate/Replicons/GEN2.0219.00001.fna @@ -0,0 +1,8 @@ +>GEN2.0219.00001.0001 
+ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATTATCGGCTCGTCCGGTTCCGGTAAAAGCACTTTTTTGCGCTGCATTAACTTCCTCGAAAAATCGAGCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAACCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCGCACGTGATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCAATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAAACGCTAGAGGACGCGCCTCTCAGAGAGCGCGCTCTCTCAGAGAGGCGCGCGCCTCTTTCGCAGAGACCNNCGCTCATGAGCGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCCAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGGCAGACTGTGGTCGTGCCGTCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGATACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCTGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGACGGCGCGCGGATTACGCATAGTCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAATGTCGCGATTCATGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCGCCTATGACAATAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAACTGGTACACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGTAACGCCCCCTCATTTGTTGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTG
CATAATCAACCGCAGCACCTCTTTTTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTTGATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACACTCGCGCATAGAGAGCTCTCAGAGGAGCGCGCGCGCTATAGCGCGC +>GEN2.0219.00001.0002 +CGCGATAATATAGCGCGCGCTCATAGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTGGAAGATATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAACGCCATATTATAGCGCGCCTCATAAGAGCGGCCTATAGCGCGCTANNNATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCAAGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCACCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCAGAGATTCGCGCGTTCAGAGAGGAGCTCTCTCATAGACGCGCGCATATGCGCTCTAGAGAGGCGCGCCTAATGGCGCGCTATGATGAAAGCGCTACTGTGGCTGGTTGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCTTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATCCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAAAGCGCTCTCAGAGAGAGCGCTCTGCAGATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGT
CAGATTATTGATGAAGATCTGGCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTGCACATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGCCAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTACGTTGATTCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCCTCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTCATCAATAAAGGCCTGGCCTACGTTGATGAGCTGACGCCGGAGCAGATCCGCGAATACCGTGGTACGCTGACCGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCAGCGTGGAAGAAAACCTCGCGCTATTTGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCGTCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCATCAGACCGGCACGAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGATGCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGACAACCGTCGTCTGTACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTCGCGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGACAAACACGTCGAAGGTTGGGATGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGACAACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGAAAACGCGCCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAACTACCCGCAGGGCGAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCGGATTTCCGCGAAGAAGCGAACAAACAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGTCATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGATGCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGGGTTAGCGTAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTGCCGAACCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATTAGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGGAAAGCTTTCCAGTTTGAACGTGAAGGTTACTTCTGCCTTGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTTAACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAATAA +>GEN2.0219.00001.0003 
+ACGCGCTATAGGGCTCTCAGAGAGTCTCAGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCATGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAACTCATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGGCCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCACTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGAGATCGCGCATAGCGCGCGGCGAGATCCGCGAGACATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCTGGTTATCGACTCGAAGGTTGAACGCCTCTAGAGCGCTAGAGGCGCGCGCGATATACGCGGCGCGAGACATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTACTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGGGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACCGGATATACGCGCGCGCGCTATATAGCGCGCGCGGCGATATAGCGCATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAAATCATCAGAAATACGCCGTTCTGGCCTGATGTGGACCTGTCGGAGTTTCGCAGCATGATGCGCACTGATGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTTGCGCTGTCGGCAATTTCGGAGGTCAACGCAGAGCTGTATGAGTTTCGCAGACGCCAG
CAGATGCTGGGGTATGCCTCGCTGGCAGAAGTCCCGGCGGAACAACTGGACGGCAAAAGCGAGCGCATTCAGCACTATTTCAACGCGGTTTACTGCTGGGCACGCGCCATGCTCAACGAACGTTACCAGGACTATGACGCCACGGCATCCGGTGTGAAGCGGGGCGAAGAACTGGCAGAAGCCAGCGGTGATTTGTGGCGTGACGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCGCCGCACTGCACAGTGGAGCTTATCTGA +>GEN2.0219.00001.0004 +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGACTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTCTCTTACACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCCGCCTCAACCTCTATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGTTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATTGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACGATAGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATTATGACTAAACTGGGCGTCGATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATAGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGACGTTATCAAACCGTTGATGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCACACTGTTTTTACCCCGTCTACTGGGCATCATGTAAACGCTCATAGGCGGCGCGCGCGCTCTCAGGAATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCGTTCTGGCCTGATGTGAACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAGCGAGGGGAGGAG
CTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCACCGCACTGTACGGTGGAGCTTATCTGACGCTCATAGGCGCGCGCTCATAGCGCGATGGAAACAAATATTACCTGGCAACAATTGATAGATGAATATTTCTTCGCAAAACCTCTGCGCTCAGCATCTGAATGGAGTTACACCAAAGTCTTCAAATCATTTGTACATTATATGGGGCCGTTAAGCTGCCCTAATGATGTGACATATCACAAAGTGCTTGCCTGGCGCCGTTTTCTTTTAAAAGAGAAAAAGCTGTCCGGACGTACCTGGAATAACAAGGTGGCGCATATGCGGGCCATCTTTAACTACGGAATACAGCGAGGGTTACTGCACTATGACGAAAATCCGTTTAACAATTCGGTAGTTAAACCGGACAAGAAGAGAAAGAAAACGCTCACTCAGGCACAGATTGAGTATGCCTATCAGATCATGGAGCAGTATGAAAATCAGGAGAATACAGGGCTGGGACTGAAATATTCCCGCTGCGCCTTATTTCCTGCATGGTTCTGGCTCACTGTCCTGGATACGCTCTATTACACAGGGATACGTCAGAACCAGTTATTACATATTCGGCTGAATGATGTTGATTTGAGAGAAGGGCAGATTCGGCTGATTACGGAGGGGTGTAAAAATCACAAAGAACACTATGTGCCGGTGATCAGTTTTCTGCGTCCACGGCTGACCTGTTTAATGGAGAAAGCGCAGAGCGAAGGATTGAAAGGTAATGACCGCCTGTTCAATATTGCACTTTTTACCGGCAAAGATCCCGCCATTGGCGATGACATGGATTCTCCTCAGGTAAGAGCATTCTTCCGTCGTCTGTCCAAGGAGTGTCAGTTTGCGATCAGTCCTCATCGTTTCAGACACACGCTGGCCACGGAGATGATGAAAATGCCGGAACAGAATCTGCATATGGCGCAAAGTGTGCTGGGTCATTCAAACATGAAATCCACGCTGGAGTATGTGGAGAATGATATTGCAGTGATGGGGAGGGCTCTGGAAGCGCAGTTTATGCAGATTAAGGCAGCACATGCCCGAAGCATTTACAGTGGGTTGACAAAGAATAGATAA diff --git a/Examples/1-res-Annotate/Replicons/GEN4.1111.00001.fna b/Examples/1-res-Annotate/Replicons/GEN4.1111.00001.fna new file mode 100644 index 0000000000000000000000000000000000000000..821ceba742387f28a70b51f06bf5c81ea8e31957 --- /dev/null +++ b/Examples/1-res-Annotate/Replicons/GEN4.1111.00001.fna @@ -0,0 +1,2 @@ +>GEN4.1111.00001.0001 
+ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAGCAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATCGCGCGCATATCGCGCGCGATAGATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTATGCGTTTGTGCATATGACGCTGAAAATCGGTACCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGCATATGCGGCGCTCAGGAGCGCTAGCATGCTCGATAAACAGACCCATACCCTGATCGCCCAGCGACTTAATCAGGCTGAAAAACAGCGTGAACAAATTCGCGCAGTGTCGCTGGATTATCCCAACATCACTATTGAAGATGCCTATGCCGTACAGCGTGAATGGGTCAATATCAAGATCGCCGAAGGGCGCACGCTCAAAGGCCACAAAATCGGCCTGACCTCAAAAGCGATGCAGGCCAGCTCGCAAATCAGCGAACCGGATTACGGCGCGCTGCTTGACGATATGTTCTTCCATGACGGCGGCGATATCCCCACCGACCGTTTTATCGTCCCGCGTATTGAAGTGGAGCTGGCGTTCGTGCTGGCGAAACCGCTGCGCGGCCCTCACTGCACGCTGTTCGACGTCTACAACGCCACGGATTATGTGATTCCGGCGCTGGAACTGATTGACGCCCGCAGCCACAACATCGACCCGGAAACCCAGCGCCCGCGCAAAGTGTTCGACACCATTTCCGACAACGCCGCCAACGCCGGGGTGATCCTCGGTGGTCGCCCCATCAAACCAGACGAGCTGGATCTGCGCTGGATCTCCGCGCTGCTCTATCGCAACGGCGTGATCGAAGAAACCGGCGTCGCCGCAGGCGTGCTGAATCATCCGGCCAACGGCGTGGCGTGGCTGGCGAACAAGCTTGCCCCCTACGATGTCCAGCTTGAAGCCGGGCAGATCATCCTCGGCGGCTCGTTCACCCGCCCG
GTGCCGGCGAGCAAGGGCGACACCTTCCATGTCGATTACGGCAACATGGGCGCGATCAGTTGCCGGTTTGTGTAACCAGATCCGCGCGCGCATATATATCGCGCGCATACGATGCCCGTGAATAAGTTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAGACCGTGGTCTTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGACGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGATCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTGGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCTGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGCAACATCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACAGAGCTCGCGCGATCGCGAATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGTTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGC
GGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACAGATACGCGCATGGCGAGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCCGAGATACGTCGCTGGCGCTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGATAGCGCTAGAGCGATGAATAATCATTTTGGGAAAGGGTTAATGGCTGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGTCGGTTATCGACTCGAAGGTTGACAGATGCGCGATCGATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTCGAGATGATCGTCCCTCAACTACCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACACGG
ATGCGCCATCGGCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATCGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACATTATTGTTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTTGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/1-res-Annotate/discarded-list_genomes.lst b/Examples/1-res-Annotate/discarded-list_genomes.lst new file mode 100644 index 0000000000000000000000000000000000000000..5f702471df1cbfa5cbd91ee5e525d919b478954a --- /dev/null +++ b/Examples/1-res-Annotate/discarded-list_genomes.lst @@ -0,0 +1 @@ +orig_name gsize nb_conts L90 diff --git a/Examples/1-res-Annotate/genomeAPCAT-annotate_list_genomes.log b/Examples/1-res-Annotate/genomeAPCAT-annotate_list_genomes.log new file mode 100644 index 0000000000000000000000000000000000000000..3b96fe66665e961b9b0039cfa111e2a3ce8a9a9b --- /dev/null +++ b/Examples/1-res-Annotate/genomeAPCAT-annotate_list_genomes.log @@ -0,0 +1,9 @@ +[2019-02-12 13:18:41] root :: INFO :: Let's start! 
+[2019-02-12 13:18:41] utils :: INFO :: Reading genomes +[2019-02-12 13:18:41] qc_annote.gseq :: INFO :: Cutting genomes at each stretch of at least 5 'N', and then, calculating genome size, number of contigs and L90. +[2019-02-12 13:18:41] qc_annote.gseq :: INFO :: Generating distribution of L90 and #contigs graphs. +[2019-02-12 13:18:41] utils :: INFO :: 0 genome was discarded. +[2019-02-12 13:18:41] utils :: INFO :: Writing discarded genomes to Examples/1-res-Annotate/discarded-list_genomes.lst +[2019-02-12 13:18:41] qc_annote.gseq :: INFO :: Renaming kept genomes according to their quality. +[2019-02-12 13:18:41] qc_annote.prokka :: INFO :: Annotating all genomes with prokka +[2019-02-12 13:18:54] qc_annote.ffunc :: INFO :: Formatting all genomes diff --git a/Examples/1-res-Annotate/genomeAPCAT-annotate_list_genomes.log.details b/Examples/1-res-Annotate/genomeAPCAT-annotate_list_genomes.log.details new file mode 100644 index 0000000000000000000000000000000000000000..5e298e25772faebc3b7bf006a9c466e7c62c2dc6 --- /dev/null +++ b/Examples/1-res-Annotate/genomeAPCAT-annotate_list_genomes.log.details @@ -0,0 +1,32 @@ +[2019-02-12 13:18:41] root :: INFO :: Let's start! +[2019-02-12 13:18:41] utils :: INFO :: Reading genomes +[2019-02-12 13:18:41] qc_annote.gseq :: INFO :: Cutting genomes at each stretch of at least 5 'N', and then, calculating genome size, number of contigs and L90. +[2019-02-12 13:18:41] qc_annote.gseq :: INFO :: Generating distribution of L90 and #contigs graphs. 
+[2019-02-12 13:18:41] matplotlib :: DEBUG :: $HOME=/Users/aperrin +[2019-02-12 13:18:41] matplotlib :: DEBUG :: CONFIGDIR=/Users/aperrin/.matplotlib +[2019-02-12 13:18:41] matplotlib :: DEBUG :: matplotlib data path: /usr/local/lib/python3.7/site-packages/matplotlib/mpl-data +[2019-02-12 13:18:41] matplotlib :: DEBUG :: loaded rc file /usr/local/lib/python3.7/site-packages/matplotlib/mpl-data/matplotlibrc +[2019-02-12 13:18:41] matplotlib :: DEBUG :: matplotlib version 3.0.0 +[2019-02-12 13:18:41] matplotlib :: DEBUG :: interactive is False +[2019-02-12 13:18:41] matplotlib :: DEBUG :: platform is darwin +[2019-02-12 13:18:41] matplotlib :: DEBUG :: loaded modules: ['sys', 'builtins', '_frozen_importlib', '_imp', '_thread', '_warnings', '_weakref', 'zipimport', '_frozen_importlib_external', '_io', 'marshal', 'posix', 'encodings', 'codecs', '_codecs', 'encodings.aliases', 'encodings.utf_8', '_signal', '__main__', 'encodings.latin_1', 'io', 'abc', '_abc', 'site', 'os', 'stat', '_stat', 'posixpath', 'genericpath', 'os.path', '_collections_abc', '_sitebuiltins', '_bootlocale', '_locale', 'types', 'importlib', 'importlib._bootstrap', 'importlib._bootstrap_external', 'warnings', 'importlib.util', 'importlib.abc', 'importlib.machinery', 'contextlib', 'collections', 'operator', '_operator', 'keyword', 'heapq', '_heapq', 'itertools', 'reprlib', '_collections', 'functools', '_functools', 'mpl_toolkits', 'sphinxcontrib', 'sitecustomize', 're', 'enum', 'sre_compile', '_sre', 'sre_parse', 'sre_constants', 'copyreg', 'pkg_resources', '__future__', 'time', 'zipfile', 'shutil', 'fnmatch', 'errno', 'zlib', 'bz2', '_compression', 'threading', 'traceback', 'linecache', 'tokenize', 'token', '_weakrefset', '_bz2', 'lzma', '_lzma', 'pwd', 'grp', 'struct', '_struct', 'binascii', 'pkgutil', 'weakref', 'platform', 'subprocess', 'signal', '_posixsubprocess', 'select', 'selectors', 'collections.abc', 'math', 'plistlib', 'datetime', '_datetime', 'xml', 'xml.parsers', 'xml.parsers.expat', 
'pyexpat.errors', 'pyexpat.model', 'pyexpat', 'xml.parsers.expat.model', 'xml.parsers.expat.errors', 'email', 'email.parser', 'email.feedparser', 'email.errors', 'email._policybase', 'email.header', 'email.quoprimime', 'string', '_string', 'email.base64mime', 'base64', 'email.charset', 'email.encoders', 'quopri', 'email.utils', 'random', 'hashlib', '_hashlib', '_blake2', '_sha3', 'bisect', '_bisect', '_random', 'socket', '_socket', 'urllib', 'urllib.parse', 'email._parseaddr', 'calendar', 'locale', 'tempfile', 'textwrap', 'inspect', 'dis', 'opcode', '_opcode', 'pkg_resources.extern', 'pkg_resources._vendor', 'pkg_resources.extern.six', 'pkg_resources._vendor.six', 'pkg_resources.extern.six.moves', 'pkg_resources._vendor.six.moves', 'pkg_resources.py31compat', 'pkg_resources.extern.appdirs', 'pkg_resources._vendor.packaging.__about__', 'pkg_resources.extern.packaging', 'pkg_resources.extern.packaging.version', 'pkg_resources.extern.packaging._structures', 'pkg_resources.extern.packaging.specifiers', 'pkg_resources.extern.packaging._compat', 'pkg_resources.extern.packaging.requirements', 'copy', 'pprint', 'pkg_resources.extern.pyparsing', 'pkg_resources.extern.six.moves.urllib', 'pkg_resources.extern.packaging.markers', 'sysconfig', '_osx_support', '_sysconfigdata_m_darwin_darwin', 'email.message', 'uu', 'email._encoded_words', 'email.iterators', 'genomeAPCAT', 'genomeAPCAT.subcommands', 'genomeAPCAT.subcommands.annote', 'genomeAPCAT.subcommands.pangenome', 'genomeAPCAT.subcommands.corepers', 'genomeAPCAT.subcommands.align', 'genomeAPCAT.subcommands.tree', 'argparse', 'gettext', 'genomeAPCAT.utils', 'glob', 'logging', 'atexit', 'logging.handlers', 'pickle', '_compat_pickle', '_pickle', 'queue', '_queue', 'shlex', 'progressbar', 'progressbar.utils', 'python_utils', 'python_utils.time', 'six', 'python_utils.converters', 'python_utils.terminal', 'progressbar.shortcuts', 'progressbar.bar', 'timeit', 'gc', 'progressbar.widgets', 'progressbar.base', 
'progressbar.__about__', 'multiprocessing', 'multiprocessing.context', 'multiprocessing.process', 'multiprocessing.reduction', 'array', '__mp_main__', 'genomeAPCAT.annote_module', 'genomeAPCAT.annote_module.genome_seq_functions', 'numpy', 'numpy._globals', 'numpy.__config__', 'numpy.version', 'numpy._import_tools', 'numpy.add_newdocs', 'numpy.lib', 'numpy.lib.info', 'numpy.lib.type_check', 'numpy.core', 'numpy.core.info', 'numpy.core.multiarray', 'numpy.core.umath', 'numpy.core._internal', 'numpy.compat', 'numpy.compat._inspect', 'numpy.compat.py3k', 'pathlib', 'ntpath', 'ctypes', '_ctypes', 'ctypes._endian', 'numpy.core.numerictypes', 'numbers', 'numpy.core.numeric', 'numpy.core.fromnumeric', 'numpy.core._methods', 'numpy.core.arrayprint', 'numpy.core.defchararray', 'numpy.core.records', 'numpy.core.memmap', 'numpy.core.function_base', 'numpy.core.machar', 'numpy.core.getlimits', 'numpy.core.shape_base', 'numpy.core.einsumfunc', 'numpy.testing', 'unittest', 'unittest.result', 'unittest.util', 'unittest.case', 'difflib', 'unittest.suite', 'unittest.loader', 'unittest.main', 'unittest.runner', 'unittest.signals', 'numpy.testing._private', 'numpy.testing._private.utils', 'numpy.lib.utils', 'numpy.testing._private.decorators', 'numpy.testing._private.nosetester', 'numpy.testing._private.pytesttester', 'numpy.lib.ufunclike', 'numpy.lib.index_tricks', 'numpy.lib.function_base', 'numpy.lib.twodim_base', 'numpy.lib.histograms', 'numpy.matrixlib', 'numpy.matrixlib.defmatrix', 'ast', '_ast', 'numpy.linalg', 'numpy.linalg.info', 'numpy.linalg.linalg', 'numpy.linalg.lapack_lite', 'numpy.linalg._umath_linalg', 'numpy.lib.stride_tricks', 'numpy.lib.mixins', 'numpy.lib.nanfunctions', 'numpy.lib.shape_base', 'numpy.lib.scimath', 'numpy.lib.polynomial', 'numpy.lib.arraysetops', 'numpy.lib.npyio', 'numpy.lib.format', 'numpy.lib._datasource', 'numpy.lib._iotools', 'numpy.lib.financial', 'decimal', '_decimal', 'numpy.lib.arrayterator', 'numpy.lib.arraypad', 'numpy.lib._version', 
'numpy.core._multiarray_tests', 'numpy._distributor_init', 'numpy.fft', 'numpy.fft.info', 'numpy.fft.fftpack', 'numpy.fft.fftpack_lite', 'numpy.fft.helper', 'numpy.polynomial', 'numpy.polynomial.polynomial', 'numpy.polynomial.polyutils', 'numpy.polynomial._polybase', 'numpy.polynomial.chebyshev', 'numpy.polynomial.legendre', 'numpy.polynomial.hermite', 'numpy.polynomial.hermite_e', 'numpy.polynomial.laguerre', 'numpy.random', 'numpy.random.info', 'cython_runtime', 'mtrand', 'numpy.random.mtrand', 'numpy.ctypeslib', 'numpy.ma', 'numpy.ma.core', 'numpy.ma.extras', 'genomeAPCAT.annote_module.prokka_functions', 'genomeAPCAT.annote_module.format_functions', 'matplotlib', 'distutils', 'distutils.version', 'urllib.request', 'http', 'http.client', 'ssl', '_ssl', 'urllib.error', 'urllib.response', '_scproxy', 'matplotlib.cbook', 'gzip', 'matplotlib.cbook.deprecation', 'matplotlib.rcsetup', 'matplotlib.fontconfig_pattern', 'pyparsing', 'matplotlib.colors', 'matplotlib._color_data', 'cycler', 'six.moves', 'matplotlib._version', 'json', 'json.decoder', 'json.scanner', '_json', 'json.encoder', 'dateutil', 'dateutil._version'] +[2019-02-12 13:18:41] matplotlib :: DEBUG :: CACHEDIR=/Users/aperrin/.matplotlib +[2019-02-12 13:18:41] matplotlib.font_manager :: DEBUG :: Using fontManager instance from /Users/aperrin/.matplotlib/fontlist-v300.json +[2019-02-12 13:18:41] matplotlib.pyplot :: DEBUG :: Loaded backend agg version unknown. +[2019-02-12 13:18:41] matplotlib.axes._base :: DEBUG :: update_title_pos +[2019-02-12 13:18:41] matplotlib.font_manager :: DEBUG :: findfont: Matching :family=sans-serif:style=normal:variant=normal:weight=normal:stretch=normal:size=10.0 to DejaVu Sans ('/usr/local/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/ttf/DejaVuSans.ttf') with score of 0.050000. 
+[2019-02-12 13:18:41] matplotlib.font_manager :: DEBUG :: findfont: Matching :family=sans-serif:style=normal:variant=normal:weight=normal:stretch=normal:size=12.0 to DejaVu Sans ('/usr/local/lib/python3.7/site-packages/matplotlib/mpl-data/fonts/ttf/DejaVuSans.ttf') with score of 0.050000. +[2019-02-12 13:18:41] matplotlib.axes._base :: DEBUG :: update_title_pos +[2019-02-12 13:18:41] utils :: INFO :: 0 genome was discarded. +[2019-02-12 13:18:41] utils :: INFO :: Writing discarded genomes to Examples/1-res-Annotate/discarded-list_genomes.lst +[2019-02-12 13:18:41] qc_annote.gseq :: INFO :: Renaming kept genomes according to their quality. +[2019-02-12 13:18:41] qc_annote.prokka :: INFO :: Annotating all genomes with prokka +[2019-02-12 13:18:41] run_prokka :: DETAIL :: Start annotating EXAM.0219.00001 Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna +[2019-02-12 13:18:45] run_prokka :: DETAIL :: End annotating EXAM.0219.00001 Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna +[2019-02-12 13:18:45] run_prokka :: DETAIL :: Start annotating GEN2.0219.00001 Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna +[2019-02-12 13:18:48] run_prokka :: DETAIL :: End annotating GEN2.0219.00001 Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna +[2019-02-12 13:18:48] run_prokka :: DETAIL :: Start annotating EXAM.1216.00002 Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna +[2019-02-12 13:18:51] run_prokka :: DETAIL :: End annotating EXAM.1216.00002 Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna +[2019-02-12 13:18:51] run_prokka :: DETAIL :: Start annotating GEN4.1111.00001 Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna +[2019-02-12 13:18:54] run_prokka :: DETAIL :: End annotating GEN4.1111.00001 Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna +[2019-02-12 13:18:54] qc_annote.ffunc :: INFO :: Formatting all genomes diff --git 
a/Examples/1-res-Annotate/genomeAPCAT-annotate_list_genomes.log.err b/Examples/1-res-Annotate/genomeAPCAT-annotate_list_genomes.log.err new file mode 100644 index 0000000000000000000000000000000000000000..e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 diff --git a/Examples/1-res-Annotate/gff3/EXAM.0219.00001.gff b/Examples/1-res-Annotate/gff3/EXAM.0219.00001.gff new file mode 100644 index 0000000000000000000000000000000000000000..32eed51cd4a2192fefb1cd5e9cc7e9b7f23a9196 --- /dev/null +++ b/Examples/1-res-Annotate/gff3/EXAM.0219.00001.gff @@ -0,0 +1,12 @@ +##gff-version 3 +EXAM.0219.00001.0001 Prodigal:2.6 CDS 213 1880 . + . ID=EXAM.0219.00001.b0001_00001;eC_number=6.1.1.18;Name=glnS;gene=glnS;inference=similar to AA sequence:UniProtKB:P00962;locus_tag=EXAM.0219.00001.b0001_00001;product=Glutamine--tRNA ligase +EXAM.0219.00001.0001 Prodigal:2.6 CDS 1916 2746 . + . ID=EXAM.0219.00001.b0001_00002;eC_number=3.5.1.28;Name=amiD;gene=amiD;inference=similar to AA sequence:UniProtKB:P75820;locus_tag=EXAM.0219.00001.b0001_00002;product=N-acetylmuramoyl-L-alanine amidase AmiD +EXAM.0219.00001.0002 Prodigal:2.6 CDS 16 396 . + . ID=EXAM.0219.00001.b0002_00003;eC_number=5.3.3.10;Name=hpcD;gene=hpcD;inference=similar to AA sequence:UniProtKB:Q05354;locus_tag=EXAM.0219.00001.b0002_00003;product=5-carboxymethyl-2-hydroxymuconate Delta-isomerase +EXAM.0219.00001.0002 Prodigal:2.6 CDS 428 1837 . + . ID=EXAM.0219.00001.i0002_00004;locus_tag=EXAM.0219.00001.i0002_00004;product=hypothetical protein +EXAM.0219.00001.0002 Prodigal:2.6 CDS 1869 2642 . + . ID=EXAM.0219.00001.i0002_00005;Name=hisP;gene=hisP;inference=similar to AA sequence:UniProtKB:P02915;locus_tag=EXAM.0219.00001.i0002_00005;product=Histidine transport ATP-binding protein HisP +EXAM.0219.00001.0002 Prodigal:2.6 CDS 2709 3443 . + . 
ID=EXAM.0219.00001.i0002_00006;eC_number=2.1.1.228;Name=trmD_1;gene=trmD_1;inference=similar to AA sequence:UniProtKB:P0A873;locus_tag=EXAM.0219.00001.i0002_00006;product=tRNA (guanine-N(1)-)-methyltransferase +EXAM.0219.00001.0002 Prodigal:2.6 CDS 3427 4254 . + . ID=EXAM.0219.00001.i0002_00007;eC_number=2.1.1.228;Name=trmD_2;gene=trmD_2;inference=similar to AA sequence:UniProtKB:P0A873;locus_tag=EXAM.0219.00001.i0002_00007;product=tRNA (guanine-N(1)-)-methyltransferase +EXAM.0219.00001.0002 Prodigal:2.6 CDS 4299 4763 . + . ID=EXAM.0219.00001.i0002_00008;locus_tag=EXAM.0219.00001.i0002_00008;product=hypothetical protein +EXAM.0219.00001.0002 Prodigal:2.6 CDS 4818 5078 . + . ID=EXAM.0219.00001.i0002_00009;locus_tag=EXAM.0219.00001.i0002_00009;product=hypothetical protein +EXAM.0219.00001.0002 Prodigal:2.6 CDS 5117 5698 . + . ID=EXAM.0219.00001.i0002_00010;locus_tag=EXAM.0219.00001.i0002_00010;product=hypothetical protein +EXAM.0219.00001.0002 Prodigal:2.6 CDS 5717 7024 . + . ID=EXAM.0219.00001.b0002_00011;Name=dctM;gene=dctM;inference=similar to AA sequence:UniProtKB:O07838;locus_tag=EXAM.0219.00001.b0002_00011;product=C4-dicarboxylate TRAP transporter large permease protein DctM diff --git a/Examples/1-res-Annotate/gff3/EXAM.1216.00002.gff b/Examples/1-res-Annotate/gff3/EXAM.1216.00002.gff new file mode 100644 index 0000000000000000000000000000000000000000..cc778abd422fb59a33703cdeaed7fecfa7a933eb --- /dev/null +++ b/Examples/1-res-Annotate/gff3/EXAM.1216.00002.gff @@ -0,0 +1,13 @@ +##gff-version 3 +EXAM.1216.00002.0001 Prodigal:2.6 CDS 1 831 . + . ID=EXAM.1216.00002.b0001_00001;eC_number=3.5.1.28;Name=amiD_1;gene=amiD_1;inference=similar to AA sequence:UniProtKB:P75820;locus_tag=EXAM.1216.00002.b0001_00001;product=N-acetylmuramoyl-L-alanine amidase AmiD +EXAM.1216.00002.0001 Prodigal:2.6 CDS 880 1710 . + . 
ID=EXAM.1216.00002.i0001_00002;eC_number=3.5.1.28;Name=amiD_2;gene=amiD_2;inference=similar to AA sequence:UniProtKB:P75820;locus_tag=EXAM.1216.00002.i0001_00002;product=N-acetylmuramoyl-L-alanine amidase AmiD +EXAM.1216.00002.0001 Prodigal:2.6 CDS 1655 2128 . + . ID=EXAM.1216.00002.i0001_00003;eC_number=5.3.3.10;Name=hpcD;gene=hpcD;inference=similar to AA sequence:UniProtKB:Q05354;locus_tag=EXAM.1216.00002.i0001_00003;product=5-carboxymethyl-2-hydroxymuconate Delta-isomerase +EXAM.1216.00002.0001 Prodigal:2.6 CDS 2172 2396 . + . ID=EXAM.1216.00002.i0001_00004;Name=pspB;gene=pspB;inference=similar to AA sequence:UniProtKB:P0AFM9;locus_tag=EXAM.1216.00002.i0001_00004;product=Phage shock protein B +EXAM.1216.00002.0001 Prodigal:2.6 CDS 2425 3828 . + . ID=EXAM.1216.00002.i0001_00005;locus_tag=EXAM.1216.00002.i0001_00005;product=hypothetical protein +EXAM.1216.00002.0001 Prodigal:2.6 CDS 3863 4636 . + . ID=EXAM.1216.00002.i0001_00006;Name=hisP;gene=hisP;inference=similar to AA sequence:UniProtKB:P02915;locus_tag=EXAM.1216.00002.i0001_00006;product=Histidine transport ATP-binding protein HisP +EXAM.1216.00002.0001 Prodigal:2.6 CDS 4641 5432 . + . ID=EXAM.1216.00002.i0001_00007;eC_number=2.1.1.228;Name=trmD;gene=trmD;inference=similar to AA sequence:UniProtKB:P0A873;locus_tag=EXAM.1216.00002.i0001_00007;product=tRNA (guanine-N(1)-)-methyltransferase +EXAM.1216.00002.0001 Prodigal:2.6 CDS 5460 5747 . + . ID=EXAM.1216.00002.b0001_00008;locus_tag=EXAM.1216.00002.b0001_00008;product=hypothetical protein +EXAM.1216.00002.0002 Prodigal:2.6 CDS 1 582 . + . ID=EXAM.1216.00002.b0002_00009;locus_tag=EXAM.1216.00002.b0002_00009;product=hypothetical protein +EXAM.1216.00002.0002 Prodigal:2.6 CDS 629 1936 . + . ID=EXAM.1216.00002.b0002_00010;Name=dctM;gene=dctM;inference=similar to AA sequence:UniProtKB:O07838;locus_tag=EXAM.1216.00002.b0002_00010;product=C4-dicarboxylate TRAP transporter large permease protein DctM +EXAM.1216.00002.0003 Prodigal:2.6 CDS 26 280 . + . 
ID=EXAM.1216.00002.b0003_00011;locus_tag=EXAM.1216.00002.b0003_00011;product=hypothetical protein +EXAM.1216.00002.0003 Prodigal:2.6 CDS 323 1102 . + . ID=EXAM.1216.00002.b0003_00012;inference=similar to AA sequence:UniProtKB:Q45619;locus_tag=EXAM.1216.00002.b0003_00012;product=Insertion sequence IS5376 putative ATP-binding protein diff --git a/Examples/1-res-Annotate/gff3/GEN2.0219.00001.gff b/Examples/1-res-Annotate/gff3/GEN2.0219.00001.gff new file mode 100644 index 0000000000000000000000000000000000000000..cfa099cef564bed7d6de95e87e550cb7ebfc234b --- /dev/null +++ b/Examples/1-res-Annotate/gff3/GEN2.0219.00001.gff @@ -0,0 +1,14 @@ +##gff-version 3 +GEN2.0219.00001.0001 Prodigal:2.6 CDS 1 774 . + . ID=GEN2.0219.00001.b0001_00001;Name=hisP;gene=hisP;inference=similar to AA sequence:UniProtKB:P02915;locus_tag=GEN2.0219.00001.b0001_00001;product=Histidine transport ATP-binding protein HisP +GEN2.0219.00001.0001 Prodigal:2.6 CDS 857 2260 . + . ID=GEN2.0219.00001.b0001_00002;locus_tag=GEN2.0219.00001.b0001_00002;product=hypothetical protein +GEN2.0219.00001.0002 Prodigal:2.6 CDS 28 252 . + . ID=GEN2.0219.00001.b0002_00003;Name=pspB;gene=pspB;inference=similar to AA sequence:UniProtKB:P0AFM9;locus_tag=GEN2.0219.00001.b0002_00003;product=Phage shock protein B +GEN2.0219.00001.0002 Prodigal:2.6 CDS 301 681 . + . ID=GEN2.0219.00001.i0002_00004;eC_number=5.3.3.10;Name=hpcD;gene=hpcD;inference=similar to AA sequence:UniProtKB:Q05354;locus_tag=GEN2.0219.00001.i0002_00004;product=5-carboxymethyl-2-hydroxymuconate Delta-isomerase +GEN2.0219.00001.0002 Prodigal:2.6 CDS 764 1597 . + . ID=GEN2.0219.00001.i0002_00005;eC_number=3.5.1.28;Name=amiD;gene=amiD;inference=similar to AA sequence:UniProtKB:P75820;locus_tag=GEN2.0219.00001.i0002_00005;product=N-acetylmuramoyl-L-alanine amidase AmiD +GEN2.0219.00001.0002 Prodigal:2.6 CDS 1628 3295 . + . 
ID=GEN2.0219.00001.b0002_00006;eC_number=6.1.1.18;Name=glnS;gene=glnS;inference=similar to AA sequence:UniProtKB:P00962;locus_tag=GEN2.0219.00001.b0002_00006;product=Glutamine--tRNA ligase +GEN2.0219.00001.0003 Prodigal:2.6 CDS 30 797 . + . ID=GEN2.0219.00001.b0003_00007;eC_number=2.1.1.228;Name=trmD;gene=trmD;inference=similar to AA sequence:UniProtKB:P0A873;locus_tag=GEN2.0219.00001.b0003_00007;product=tRNA (guanine-N(1)-)-methyltransferase +GEN2.0219.00001.0003 Prodigal:2.6 CDS 862 1122 . + . ID=GEN2.0219.00001.i0003_00008;locus_tag=GEN2.0219.00001.i0003_00008;product=hypothetical protein +GEN2.0219.00001.0003 Prodigal:2.6 CDS 1170 1751 . + . ID=GEN2.0219.00001.i0003_00009;locus_tag=GEN2.0219.00001.i0003_00009;product=hypothetical protein +GEN2.0219.00001.0003 Prodigal:2.6 CDS 1895 2263 . + . ID=GEN2.0219.00001.b0003_00010;locus_tag=GEN2.0219.00001.b0003_00010;product=hypothetical protein +GEN2.0219.00001.0004 Prodigal:2.6 CDS 1 1308 . + . ID=GEN2.0219.00001.b0004_00011;Name=dctM;gene=dctM;inference=similar to AA sequence:UniProtKB:O07838;locus_tag=GEN2.0219.00001.b0004_00011;product=C4-dicarboxylate TRAP transporter large permease protein DctM +GEN2.0219.00001.0004 Prodigal:2.6 CDS 1340 1804 . + . ID=GEN2.0219.00001.i0004_00012;locus_tag=GEN2.0219.00001.i0004_00012;product=hypothetical protein +GEN2.0219.00001.0004 Prodigal:2.6 CDS 1832 2845 . + . ID=GEN2.0219.00001.b0004_00013;Name=xerC;gene=xerC;inference=similar to AA sequence:UniProtKB:P39776;locus_tag=GEN2.0219.00001.b0004_00013;product=Tyrosine recombinase XerC diff --git a/Examples/1-res-Annotate/gff3/GEN4.1111.00001.gff b/Examples/1-res-Annotate/gff3/GEN4.1111.00001.gff new file mode 100644 index 0000000000000000000000000000000000000000..501e70cfc46444a97bc73caef44346c66030adf3 --- /dev/null +++ b/Examples/1-res-Annotate/gff3/GEN4.1111.00001.gff @@ -0,0 +1,10 @@ +##gff-version 3 +GEN4.1111.00001.0001 Prodigal:2.6 CDS 1 831 . + . 
ID=GEN4.1111.00001.b0001_00001;eC_number=3.5.1.28;Name=amiD;gene=amiD;inference=similar to AA sequence:UniProtKB:P75820;locus_tag=GEN4.1111.00001.b0001_00001;product=N-acetylmuramoyl-L-alanine amidase AmiD +GEN4.1111.00001.0001 Prodigal:2.6 CDS 861 1241 . + . ID=GEN4.1111.00001.i0001_00002;eC_number=5.3.3.10;Name=hpcD;gene=hpcD;inference=similar to AA sequence:UniProtKB:Q05354;locus_tag=GEN4.1111.00001.i0001_00002;product=5-carboxymethyl-2-hydroxymuconate Delta-isomerase +GEN4.1111.00001.0001 Prodigal:2.6 CDS 1271 2074 . + . ID=GEN4.1111.00001.i0001_00003;eC_number=4.2.1.163;Name=hpcG;gene=hpcG;inference=similar to AA sequence:UniProtKB:P42270;locus_tag=GEN4.1111.00001.i0001_00003;product=2-oxo-hept-4-ene-1,7-dioate hydratase +GEN4.1111.00001.0001 Prodigal:2.6 CDS 2111 3514 . + . ID=GEN4.1111.00001.i0001_00004;locus_tag=GEN4.1111.00001.i0001_00004;product=hypothetical protein +GEN4.1111.00001.0001 Prodigal:2.6 CDS 3535 4308 . + . ID=GEN4.1111.00001.i0001_00005;Name=hisP;gene=hisP;inference=similar to AA sequence:UniProtKB:P02915;locus_tag=GEN4.1111.00001.i0001_00005;product=Histidine transport ATP-binding protein HisP +GEN4.1111.00001.0001 Prodigal:2.6 CDS 4327 5094 . + . ID=GEN4.1111.00001.i0001_00006;eC_number=2.1.1.228;Name=trmD;gene=trmD;inference=similar to AA sequence:UniProtKB:P0A873;locus_tag=GEN4.1111.00001.i0001_00006;product=tRNA (guanine-N(1)-)-methyltransferase +GEN4.1111.00001.0001 Prodigal:2.6 CDS 5138 5398 . + . ID=GEN4.1111.00001.i0001_00007;locus_tag=GEN4.1111.00001.i0001_00007;product=hypothetical protein +GEN4.1111.00001.0001 Prodigal:2.6 CDS 5413 5994 . + . ID=GEN4.1111.00001.i0001_00008;locus_tag=GEN4.1111.00001.i0001_00008;product=hypothetical protein +GEN4.1111.00001.0001 Prodigal:2.6 CDS 6052 7134 . + . 
ID=GEN4.1111.00001.b0001_00009;Name=dctM;gene=dctM;inference=similar to AA sequence:UniProtKB:O07838;locus_tag=GEN4.1111.00001.b0001_00009;product=C4-dicarboxylate TRAP transporter large permease protein DctM diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna new file mode 100644 index 0000000000000000000000000000000000000000..11feaae3628ca13667c0d695bbcacd5a2139310d --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna @@ -0,0 +1,4 @@ +>genome1_1 +ATGTGTAAAAGCCGGAGGGGTTATCTTTTCCCGGCTTTTTATTATCAATTACTCATTAACTCCTGTTCCGTTCTTTTGCGTTTAATCACCGGAATATCTCCGGTATTGTTCAGCGCCCCGGAAATGTTTTTAACCACTGTTCTGCACTCCGTTTATTAAACGCGCTCAGCGCGCGCTCATATATCGCGCGCGCGCGCGCGCATATATATATAATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTGGCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTGCATATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGCCAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTACGTTGATTCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCCTCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTAATCAATAAAGGCCTGGCCTATGTTGATGAGCTGACGCCGGAGCAGATCCGTGAATACCGCGGTACGCTGACCGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCAGCGTCGAAGAGAACCTCGCGCTATTTGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCGTCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCATCAGACTGGCAACAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGATGCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGATAACCGTCGTCTGTACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTCGCGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGACAAACACGTCGAAGGTTGGGACGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGACAACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGAAAACGCGCCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAATTACCCGCAGGGTGAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCAGATTTCCGCGAAGAAGCGAACAAACAGTACAAACGTCTGGTGATGGGCAAAGAAG
TGCGTCTGCGTAATGCCTACGTCATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGATGCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGGGTTAGCGCAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTGCCGAATCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATTAGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGAAAAGCTTTCCAGTTTGAACGTGAAGGCTACTTCTGCCTCGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTTAACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAGTAACGCGCATAGGCGGCCTTCAAGGAGGCGCTAGGCGAATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGTGGTTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGATATTATCGCGCGCTATAACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTAGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCACGCGAGCAGCAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATAGGCGCGCGAGATTACGCGCGCAGTATCGCGC +>genome1_2 
+CGCGCATAGCGCGCAATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTTGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGATATGAGGCGCGGCGCTATAAGGCGCTCATGAGCATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGATGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTAATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGGTTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGCAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTACGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGAATTCCGCGGCATATAGCGGCGCGATAGCATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCCGGTAA
AAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGATGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTCAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGTGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACGCTCAGAGGCGCGCGCGCGCTATATACGCGCAGTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCCGTCATATGGGCTCTCAGAGGAGACGCGCTAATGCGCAGTACGTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTT
GCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGCATACGCGCGCGATATATATTCGCGCAGAGGCGCGCATAGCGATGAAGTTTGTTGCGCCCGAACAGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCGTTCTGGCCTGATGTGGACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAGCGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGTACCGCACTGTACGGTGGAGCTTATCTGAACGGCTAGCGCGTAGCGGCAGTCTCGGATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGACGATCGCGCGCAGGAGGAGAGCTCTCTTCTTCTAGAGCATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAGCTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGTATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACGGCTATGCGCGCGCTCGATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCA
AACTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTACGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGTATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATTGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATTGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCAATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokka.log b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokka.log new file mode 100644 index 0000000000000000000000000000000000000000..6aa806da49c94da5a8444327c8a94d809e958faf --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokka.log @@ -0,0 +1,121 @@ +[13:18:42] This is prokka 1.14-dev +[13:18:42] Written by Torsten Seemann <torsten.seemann@gmail.com> +[13:18:42] Homepage is https://github.com/tseemann/prokka +[13:18:42] Local time is Tue Feb 12 13:18:42 2019 +[13:18:42] You are aperrin +[13:18:42] Operating system is darwin +[13:18:42] You have BioPerl 1.006924 +[13:18:42] System has 8 cores. +[13:18:42] Will use maximum of 1 cores. +[13:18:42] Annotating as >>> Bacteria <<< +[13:18:42] Generating locus_tag from 'Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna' contents. 
+[13:18:42] Setting --locustag GKEGNCBE from MD5 04e07cbef9b8a6b8d550f0d6ed3a8746 +[13:18:42] Creating new output folder: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes +[13:18:42] Running: mkdir -p Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes +[13:18:42] Using filename prefix: EXAM.0219.00001.XXX +[13:18:42] Setting HMMER_NCPU=1 +[13:18:42] Writing log to: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.log +[13:18:42] Command: /usr/local/bin/prokka --outdir Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes --cpus 1 --prefix EXAM.0219.00001 Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna +[13:18:42] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin +[13:18:42] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common +[13:18:42] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin +[13:18:42] Looking for 'aragorn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/aragorn +[13:18:42] Determined aragorn version is 1.2 +[13:18:42] Looking for 'barrnap' - found /usr/local/bin/barrnap +[13:18:42] Determined barrnap version is 0.8 +[13:18:42] Looking for 'blastp' - found /Users/aperrin/Softwares/bin/blastp +[13:18:42] Determined blastp version is 2.3 +[13:18:42] Looking for 'cmpress' - found /usr/local/bin/cmpress +[13:18:42] Determined cmpress version is 1.1 +[13:18:42] Looking for 'cmscan' - found /usr/local/bin/cmscan +[13:18:42] Determined cmscan version is 1.1 +[13:18:42] Looking for 'egrep' - found /usr/bin/egrep +[13:18:42] Looking for 'find' - found /usr/bin/find +[13:18:42] Looking for 'grep' - found /usr/bin/grep +[13:18:42] Looking for 'hmmpress' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/hmmpress +[13:18:42] Determined hmmpress version is 3.1 +[13:18:42] Looking for 'hmmscan' - found /usr/local/bin/hmmscan +[13:18:42] Determined hmmscan 
version is 3.1 +[13:18:42] Looking for 'java' - found /usr/bin/java +[13:18:42] Looking for 'less' - found /usr/bin/less +[13:18:42] Looking for 'makeblastdb' - found /Users/aperrin/Softwares/bin/makeblastdb +[13:18:42] Determined makeblastdb version is 2.3 +[13:18:42] Looking for 'minced' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common/minced +[13:18:42] Determined minced version is 2.0 +[13:18:42] Looking for 'parallel' - found /usr/local/bin/parallel +[13:18:42] Determined parallel version is 20181022 +[13:18:42] Looking for 'prodigal' - found /usr/local/bin/prodigal +[13:18:42] Determined prodigal version is 2.6 +[13:18:42] Looking for 'prokka-genbank_to_fasta_db' - found /Users/aperrin/Softwares/src/prokka/bin/prokka-genbank_to_fasta_db +[13:18:42] Looking for 'sed' - found /usr/bin/sed +[13:18:42] Looking for 'tbl2asn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/tbl2asn +[13:18:42] Determined tbl2asn version is 25.6 +[13:18:42] Using genetic code table 11. +[13:18:42] Loading and checking input file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna +[13:18:42] Wrote 2 contigs totalling 9808 bp. +[13:18:42] Predicting tRNAs and tmRNAs +[13:18:42] Running: aragorn -l -gc11 -w Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.fna +[13:18:42] Found 0 tRNAs +[13:18:42] Predicting Ribosomal RNAs +[13:18:42] Running Barrnap with 1 threads +[13:18:42] Found 0 rRNAs +[13:18:42] Skipping ncRNA search, enable with --rfam if desired. 
+[13:18:42] Total of 0 tRNA + rRNA features +[13:18:42] Searching for CRISPR repeats +[13:18:43] Found 0 CRISPRs +[13:18:43] Predicting coding sequences +[13:18:43] Contigs total 9808 bp, so using meta mode +[13:18:43] Running: prodigal -i Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.fna -c -m -g 11 -p meta -f sco -q +[13:18:43] Found 11 CDS +[13:18:43] Connecting features back to sequences +[13:18:43] Not using genus-specific database. Try --usegenus to enable it. +[13:18:43] Annotating CDS, please be patient. +[13:18:43] Will use 1 CPUs for similarity searching. +[13:18:43] There are still 11 unannotated CDS left (started with 11) +[13:18:43] Will use blast to search against /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot with 1 CPUs +[13:18:43] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/sprot\.faa | parallel --gnu --plain -j 1 --block 1558 --recstart '>' --pipe blastp -query - -db /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot -evalue 1e-06 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/sprot\.blast 2> /dev/null +[13:18:44] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/sprot.faa +[13:18:44] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/sprot.blast +[13:18:44] There are still 4 unannotated CDS left (started with 11) +[13:18:44] Will use hmmer3 to search against /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm with 1 CPUs +[13:18:44] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.faa | parallel --gnu --plain -j 1 --block 459 --recstart '>' --pipe hmmscan --noali --notextw --acc -E 1e-06 --cpu 1 /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm /dev/stdin > 
Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.hmmer3 2> /dev/null +[13:18:44] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/HAMAP.hmm.faa +[13:18:44] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/HAMAP.hmm.hmmer3 +[13:18:44] Labelling remaining 4 proteins as 'hypothetical protein' +[13:18:44] Possible /pseudo 'tRNA (guanine-N(1)-)-methyltransferase' at genome1_2 position 3427 +[13:18:44] Found 6 unique /gene codes. +[13:18:44] Fixed 2 duplicate /gene - trmD_1 trmD_2 +[13:18:44] Fixed 1 colliding /gene names. +[13:18:44] Adding /locus_tag identifiers +[13:18:44] Assigned 11 locus_tags to CDS and RNA features. +[13:18:44] Writing outputs to Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/ +[13:18:44] Generating annotation statistics file +[13:18:44] Generating Genbank and Sequin files +[13:18:44] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka' -Z Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.err -i Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.fsa 2> /dev/null +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/errorsummary.val +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.dr +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fixedproducts +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.ecn +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.val +[13:18:45] Repairing broken .GBK output that tbl2asn 
produces... +[13:18:45] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.gbf > Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.gbk +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gbf +[13:18:45] Output files: +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.txt +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tsv +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fna +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.err +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gff +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.sqn +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tbl +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gbk +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.log +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.ffn +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.faa +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fsa +[13:18:45] Annotation finished successfully. +[13:18:45] Walltime used: 0.05 minutes +[13:18:45] If you use this result please cite the Prokka paper: +[13:18:45] Seemann T (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-9. +[13:18:45] Type 'prokka --citation' for more details. +[13:18:45] Share and enjoy! 
diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.err b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.err new file mode 100644 index 0000000000000000000000000000000000000000..9837b5a1951f8b0fabc962ef6fbdf29c7b38b6fb --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.err @@ -0,0 +1,126 @@ +Discrepancy Report Results + +Summary +FATAL: MISSING_PROTEIN_ID:11 proteins have invalid IDs. +DISC_SOURCE_QUALS_ASNDISC:strain (all present, all same) +DISC_SOURCE_QUALS_ASNDISC:taxname (all present, all same) +DISC_FEATURE_COUNT:CDS: 11 present +DISC_COUNT_NUCLEOTIDES:2 nucleotide Bioseqs are present +FEATURE_LOCATION_CONFLICT:11 features have inconsistent gene locations. +OVERLAPPING_CDS:2 coding regions overlap another coding region with a similar or identical name. +DISC_QUALITY_SCORES:Quality scores are missing on all sequences. +ONCALLER_COMMENT_PRESENT:2 comment descriptors were found (all same) +MISSING_GENOMEASSEMBLY_COMMENTS:2 bioseqs are missing GenomeAssembly structured comments +MOLTYPE_NOT_MRNA:2 molecule types are not set as mRNA. +TECHNIQUE_NOT_TSA:2 technique are not set as TSA +MISSING_STRUCTURED_COMMENT:2 sequences do not include structured comments. +MISSING_PROJECT:13 sequences do not include project. +DISC_INCONSISTENT_MOLINFO_TECH:Molinfo Technique Report (some missing, all same) + + +Detailed Report + +FATAL: DiscRep_ALL:MISSING_PROTEIN_ID::11 proteins have invalid IDs. 
+Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1_1 (length 555) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1_2 (length 276) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_1 (length 126) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_2 (length 469) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_3 (length 257) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_4 (length 244) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_5 (length 275) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_6 (length 154) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_7 (length 86) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_8 (length 193) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_9 (length 435) + +DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::strain (all present, all same) +DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::2 sources have 'strain' for strain +DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::taxname (all present, all same) +DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::2 sources have 'Genus species' for taxname +DiscRep_ALL:DISC_FEATURE_COUNT::CDS: 11 present +DiscRep_ALL:DISC_COUNT_NUCLEOTIDES::2 nucleotide Bioseqs are present +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1 (length 2784) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2 (length 7024) + +DiscRep_ALL:FEATURE_LOCATION_CONFLICT::11 features have inconsistent gene locations. 
+DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS Glutamine--tRNA ligase genome1_1:213-1880 GKEGNCBE_00001 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS N-acetylmuramoyl-L-alanine amidase AmiD genome1_1:1916-2746 GKEGNCBE_00002 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS 5-carboxymethyl-2-hydroxymuconate Delta-isomerase genome1_2:16-396 GKEGNCBE_00003 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS hypothetical protein genome1_2:428-1837 GKEGNCBE_00004 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS Histidine transport ATP-binding protein HisP genome1_2:1869-2642 GKEGNCBE_00005 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS tRNA (guanine-N(1)-)-methyltransferase genome1_2:2709-3443 GKEGNCBE_00006 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS tRNA (guanine-N(1)-)-methyltransferase genome1_2:3427-4254 GKEGNCBE_00007 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS hypothetical protein genome1_2:4299-4763 GKEGNCBE_00008 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist 
+Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS hypothetical protein genome1_2:4818-5078 GKEGNCBE_00009 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS hypothetical protein genome1_2:5117-5698 GKEGNCBE_00010 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS C4-dicarboxylate TRAP transporter large permease protein DctM genome1_2:5717-7024 GKEGNCBE_00011 + +DiscRep_ALL:OVERLAPPING_CDS::2 coding regions overlap another coding region with a similar or identical name. +DiscRep_SUB:OVERLAPPING_CDS::2 coding regions overlap another coding region with a similar or identical name that do not have the appropriate note text +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS tRNA (guanine-N(1)-)-methyltransferase genome1_2:2709-3443 GKEGNCBE_00006 +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:CDS tRNA (guanine-N(1)-)-methyltransferase genome1_2:3427-4254 GKEGNCBE_00007 + +DiscRep_ALL:DISC_QUALITY_SCORES::Quality scores are missing on all sequences. 
+ +DiscRep_ALL:ONCALLER_COMMENT_PRESENT::2 comment descriptors were found (all same) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka + +DiscRep_ALL:MISSING_GENOMEASSEMBLY_COMMENTS::2 bioseqs are missing GenomeAssembly structured comments +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1 (length 2784) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2 (length 7024) + +DiscRep_ALL:MOLTYPE_NOT_MRNA::2 molecule types are not set as mRNA. +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1 (length 2784) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2 (length 7024) + +DiscRep_ALL:TECHNIQUE_NOT_TSA::2 technique are not set as TSA +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1 (length 2784) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2 (length 7024) + +DiscRep_ALL:MISSING_STRUCTURED_COMMENT::2 sequences do not include structured comments. +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1 (length 2784) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2 (length 7024) + +DiscRep_ALL:MISSING_PROJECT::13 sequences do not include project. 
+Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1 (length 2784) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1_1 (length 555) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1_2 (length 276) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2 (length 7024) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_1 (length 126) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_2 (length 469) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_3 (length 257) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_4 (length 244) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_5 (length 275) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_6 (length 154) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_7 (length 86) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_8 (length 193) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2_9 (length 435) + +DiscRep_ALL:DISC_INCONSISTENT_MOLINFO_TECH::Molinfo Technique Report (some missing, all same) +DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::technique (all missing) +DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::2 Molinfos are missing field technique +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_1 (length 2784) +Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001:genome1_2 (length 7024) + diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.faa 
b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.faa new file mode 100644 index 0000000000000000000000000000000000000000..2bb9f621de719842530393507763f132848a7e16 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.faa @@ -0,0 +1,69 @@ +>GKEGNCBE_00001 Glutamine--tRNA ligase +MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIAQDYQG +QCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSDYFDQLHAYAVELINKGLA +YVDELTPEQIREYRGTLTAPGKNSPFRDRSVEENLALFEKMRTGGFEEGKACLRAKIDMA +SPFIVMRDPVLYRIKFAEHHQTGNKWCIYPMYDFTHCISDALEGITHSLCTLEFQDNRRL +YDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDPRMPTISGLRRR +GYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYPQG +ESEMVTMPNHPNKPEMGSREVPFSGEIWIDRADFREEANKQYKRLVMGKEVRLRNAYVIK +AERVEKDAEGNITTIFCTYDADTLSKDPADGRKVKGVIHWVSAAHALPIEIRLYDRLFSV +PNPGAAEDFLSVINPESLVIKQGYGEPSLKAAVAGKAFQFEREGYFCLDSRYATADKLVF +NRTVGLRDTWAKAGE +>GKEGNCBE_00002 N-acetylmuramoyl-L-alanine amidase AmiD +MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYNIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGISAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>GKEGNCBE_00003 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYA +FVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>GKEGNCBE_00004 hypothetical protein +MSMPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFAD +GQTVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNV +TLDVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQ +MDGARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERINCTNGKINWGIGIGLAGSTY +DNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPGFSKNAGIDNATI +AIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQI +SSGNTPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQ 
+FMARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>GKEGNCBE_00005 Histidine transport ATP-binding protein HisP +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>GKEGNCBE_00006 tRNA (guanine-N(1)-)-methyltransferase +MFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGMLMMVQPLRDAI +HAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDERVIQTEIDEEW +SIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGME +VPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKH +DGMA +>GKEGNCBE_00007 tRNA (guanine-N(1)-)-methyltransferase +MMGWHSRHMGSQRRRANAQYVFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRD +FAHDRHRTVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVS +ELATNQKLILVCGRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVL +GHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRR +PELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA +>GKEGNCBE_00008 hypothetical protein +MKFVAPEQAPEQAEVIKNTPFWPDVDLSEFRSVMRTDGTVTQPRLKQVVLTAISEVNAEL +YDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWARAVLNERYQDYDATASGVK +RGEELAEASGDLWRDARWAISRVQDVPHCTVELI +>GKEGNCBE_00009 hypothetical protein +MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMAGYRLEG +>GKEGNCBE_00010 hypothetical protein +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLYYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>GKEGNCBE_00011 C4-dicarboxylate TRAP transporter large permease protein DctM +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA +GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ +GIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM 
+SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGV +DPVHFGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITY +IPEITLFLPRLLGIM diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.ffn b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.ffn new file mode 100644 index 0000000000000000000000000000000000000000..8bf92ebb8dfd9e743a03a529ee7eda731975d893 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.ffn @@ -0,0 +1,169 @@ +>GKEGNCBE_00001 Glutamine--tRNA ligase +ATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTG +GCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTG +CATATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGC +CAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTACGTTGAT +TCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCC +TCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTAATCAATAAAGGCCTGGCC +TATGTTGATGAGCTGACGCCGGAGCAGATCCGTGAATACCGCGGTACGCTGACCGCGCCG +GGTAAAAACAGCCCGTTCCGCGATCGCAGCGTCGAAGAGAACCTCGCGCTATTTGAAAAA +ATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCG +TCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCAT +CAGACTGGCAACAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGAT +GCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGATAACCGTCGTCTG +TACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTCG +CGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGAC +AAACACGTCGAAGGTTGGGACGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGC +GGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGAC +AACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGAAAACGCG +CCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAATTACCCGCAGGGT +GAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAA +GTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCAGATTTCCGCGAAGAAGCGAACAAA +CAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGTCATTAAA +GCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGAT 
+GCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGG +GTTAGCGCAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTG +CCGAATCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATTAGTGATT +AAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGAAAAGCTTTCCAGTTT +GAACGTGAAGGCTACTTCTGCCTCGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTT +AACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAGTAA +>GKEGNCBE_00002 N-acetylmuramoyl-L-alanine amidase AmiD +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGTGGTTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGATATTATCGCGCGCTATAACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTAGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCACGCGAGCAGCAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>GKEGNCBE_00003 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCG +TTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTT +GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAG +>GKEGNCBE_00004 hypothetical protein +ATGAGCATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTT +GCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTC +GATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGAC 
+GGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACG +ATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGG +TTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTG +ACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGC +CCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATC +GATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAA +ATGGATGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAG +TGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATT +AATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTAT +GACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGA +TCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTC +AAAGCCAAAAACATCACGCCCGGTTTCAGTAAAAATGCGGGTATTGATAACGCAACGATC +GCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGG +ATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAA +TTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATT +TCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACG +CTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGCAATATCAACGTGATGCAAACT +TCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTACGTGGTCAA +TTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAAC +GGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTG +AATTTTTCGCTGCCGAAGCGGGGAGGGTAA +>GKEGNCBE_00005 Histidine transport ATP-binding protein HisP +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCC +GGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGATGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTCAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA 
+GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCTCGCCATGTCTCTTCG +CACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGTGATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>GKEGNCBE_00006 tRNA (guanine-N(1)-)-methyltransferase +ATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTG +AACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGAC +CGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATT +CACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGA +CGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTG +TGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGG +TCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCC +GTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTT +GCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAA +GTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAG +TCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAA +GAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACAT +GATGGGATGGCATAG +>GKEGNCBE_00007 tRNA (guanine-N(1)-)-methyltransferase +ATGATGGGATGGCATAGCCGTCATATGGGCTCTCAGAGGAGACGCGCTAATGCGCAGTAC +GTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGG +GTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGAC +TTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATG +TTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAA +GGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGC +GAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAG +CGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGT +GGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTG +GGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCAC +TATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAAC +CATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGA +CCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTC +AAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAG +>GKEGNCBE_00008 hypothetical protein 
+ATGAAGTTTGTTGCGCCCGAACAGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCG +TTCTGGCCTGATGTGGACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTG +ACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTG +TACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCA +GAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGG +GCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAG +CGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATC +AGCCGGGTGCAGGATGTACCGCACTGTACGGTGGAGCTTATCTGA +>GKEGNCBE_00009 hypothetical protein +ATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCT +GAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGAT +CGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAA +ATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCC +GGTTATCGACTCGAAGGTTGA +>GKEGNCBE_00010 hypothetical protein +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAGCTG +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGTATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>GKEGNCBE_00011 C4-dicarboxylate TRAP transporter large permease protein DctM +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTC +TCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCC +GCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC 
+GGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTACGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGTATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATTGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATT +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCAATCATGGCTAAACTGGGCGTC +GATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTG +ATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fna b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fna new file mode 100644 index 0000000000000000000000000000000000000000..325b3340782ba0db58b1693fab267c634919957a --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fna @@ -0,0 +1,167 @@ +>genome1_1 +ATGTGTAAAAGCCGGAGGGGTTATCTTTTCCCGGCTTTTTATTATCAATTACTCATTAAC +TCCTGTTCCGTTCTTTTGCGTTTAATCACCGGAATATCTCCGGTATTGTTCAGCGCCCCG +GAAATGTTTTTAACCACTGTTCTGCACTCCGTTTATTAAACGCGCTCAGCGCGCGCTCAT +ATATCGCGCGCGCGCGCGCGCATATATATATAATGAGTGAGGCTGAAGCCCGCCCGACTA +ACTTTATTCGTCAGATTATTGATGAAGATCTGGCGAGTGGTAAACATACCACTGTCCATA +CCCGTTTTCCGCCGGAGCCGAATGGCTATCTGCATATCGGCCACGCGAAATCTATCTGCC +TGAACTTTGGCATCGCGCAAGATTATCAGGGCCAGTGCAACCTGCGTTTCGATGACACCA +ACCCGGTAAAAGAAGATATCGAGTACGTTGATTCGATCAAAAACGACGTCGAGTGGTTAG +GCTTTCACTGGTCTGGCGATATTCGCTACTCCTCCGATTACTTTGACCAACTGCACGCCT +ATGCGGTCGAGCTAATCAATAAAGGCCTGGCCTATGTTGATGAGCTGACGCCGGAGCAGA 
+TCCGTGAATACCGCGGTACGCTGACCGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCA +GCGTCGAAGAGAACCTCGCGCTATTTGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTA +AAGCCTGTCTGCGCGCTAAAATCGACATGGCGTCGCCGTTTATCGTGATGCGCGATCCGG +TGCTGTATCGCATTAAATTCGCCGAGCATCATCAGACTGGCAACAAGTGGTGCATCTATC +CGATGTACGACTTTACTCACTGCATCAGCGATGCGCTGGAAGGCATTACTCATTCTCTGT +GTACGCTGGAGTTCCAGGATAACCGTCGTCTGTACGACTGGGTGCTGGACAACATCACCA +TTCCGGTTCACCCGCGCCAGTACGAATTCTCGCGCCTGAATCTGGAATACACCGTGATGT +CCAAGCGTAAGCTGAACCTGCTGGTGACCGACAAACACGTCGAAGGTTGGGACGATCCGC +GTATGCCGACTATTTCCGGTCTGCGCCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGT +TCTGCAAACGCATCGGCGTCACCAAGCAGGACAACACTATTGAGATGGCGTCGCTGGAAT +CCTGCATTCGCGAAGATCTGAACGAAAACGCGCCGCGCGCGATGGCGGTAATCGATCCGG +TAAAACTGGTTATCGAAAATTACCCGCAGGGTGAGAGCGAAATGGTTACCATGCCTAACC +ATCCGAATAAACCGGAGATGGGCAGCCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCG +ATCGCGCAGATTTCCGCGAAGAAGCGAACAAACAGTACAAACGTCTGGTGATGGGCAAAG +AAGTGCGTCTGCGTAATGCCTACGTCATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAG +GGAATATCACCACCATCTTCTGTACCTATGATGCTGATACGCTGAGTAAAGATCCGGCTG +ACGGGCGTAAAGTGAAAGGCGTAATCCACTGGGTTAGCGCAGCACATGCGCTGCCGATTG +AAATTCGTCTCTACGACCGTCTGTTCAGCGTGCCGAATCCGGGCGCCGCGGAGGACTTCC +TGTCTGTTATCAACCCCGAATCATTAGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGA +AAGCGGCGGTAGCAGGAAAAGCTTTCCAGTTTGAACGTGAAGGCTACTTCTGCCTCGACA +GCCGCTATGCAACGGCCGATAAGCTGGTCTTTAACCGCACCGTGGGCCTGCGTGATACCT +GGGCGAAAGCGGGCGAGTAACGCGCATAGGCGGCCTTCAAGGAGGCGCTAGGCGAATGAA +AGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGG +AATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCC +GCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGAC +GTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGG +CGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGT +CAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAA +TCGTGGTTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAAT +TCAGGCATTGATTCCGTTAGCGAAGGATATTATCGCGCGCTATAACATCAAACCGCAGAA +TGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCC +GTGGCGCGAGCTGGCGGCGCAGGGGATTAGCGCCTGGCCTGACGCCCAGCGTGTGGCGTT 
+TTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACT +CTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCACGCGAGCAGCAGCGGGTGAT +TATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAAC +GCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATAGGCGCGC +GAGATTACGCGCGCAGTATCGCGC +>genome1_2 +CGCGCATAGCGCGCAATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAG +GCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTC +CCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGT +AAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAG +AGCCGTCAGGAAGTTGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTG +ATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAAT +TACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGATATGAGGCGCGGCGCTATAA +GGCGCTCATGAGCATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTC +TGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGA +GACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTT +CGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGC +GATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCG +GGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCA +CAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGG +CTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCT +CATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCA +TAACCAAATGGATGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGC +CATTGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGA +ACGCATTAATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAG +CACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATAT +TACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCG +CAATGTCAAAGCCAAAAACATCACGCCCGGTTTCAGTAAAAATGCGGGTATTGATAACGC +AACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAG +TGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAA +CTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCAT +TCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCG +TGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGCAATATCAACGTGAT 
+GCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTACG +TGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAA +TGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGA +AGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGAATTCCGCGGCATATAGCGGC +GCGATAGCATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTC +ATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCG +GCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGA +GCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGATGGGC +AGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGT +TTCAGCACTTCAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGA +TTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGA +AGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAAC +AGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATG +AACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAAC +TGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCTCGCCATG +TCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGTGATCCGGAGC +AGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAAT +AACGCTCAGAGGCGCGCGCGCGCTATATACGCGCAGTGTTTATTGGCATAGTTAGCCTGT +TTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAG +GCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCG +TGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGG +ACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGC +CTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTA +TTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACG +AAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGA +TTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAG +ATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGG +GGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTT +TGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTC +TGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGC +ATAAACATGATGGGATGGCATAGCCGTCATATGGGCTCTCAGAGGAGACGCGCTAATGCG +CAGTACGTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGAT 
+TACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCT +CGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCG +GGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCA +GGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGC +GTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTA +GATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTC +AGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGG +GTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGT +CCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCG +GGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTT +AGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCG +GAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGCATA +CGCGCGCGATATATATTCGCGCAGAGGCGCGCATAGCGATGAAGTTTGTTGCGCCCGAAC +AGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCGTTCTGGCCTGATGTGGACCTGT +CGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGG +TCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGC +AGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCG +AGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGC +GTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAGCGAGGGGAGGAGCTGGCGGAGG +CCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGTACCGC +ACTGTACGGTGGAGCTTATCTGAACGGCTAGCGCGTAGCGGCAGTCTCGGATGAATAATC +ATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATG +CGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGT +TCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCT +ATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGG +TTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGACGATCGCGCGCAGGAGGAGAGC +TCTCTTCTTCTAGAGCATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCG +CTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGAT +GATCGTCCCTCAGCTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGT +GCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGC +CACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCC +CTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATA 
+TGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGC +GCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGTATTACCGCCAGGC +GGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCAC +GGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACG +GCTATGCGCGCGCTCGATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTT +TGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAG +CACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTT +CTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGAT +CATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAA +ACTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTC +CGGATCGGCGATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCG +CGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAAT +GTTAATTCCGCCCACCACGGCTTTTATCCTTTACGCGCTGGCAAGCGGGGGAACATCGAT +TGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCT +GGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGG +TATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATTGT +CGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTA +TACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGAT +TTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGC +GATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGG +TATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGG +CGCATTTATGGATATTGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCAATCAT +GGCTAAACTGGGCGTCGATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGAT +TGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGT +CAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCT +GTTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCAT +GTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fsa b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fsa new file mode 100644 index 0000000000000000000000000000000000000000..ca5e180abfba3f98289887814c4d89b20e9e1bb6 --- /dev/null +++ 
b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fsa @@ -0,0 +1,167 @@ +>genome1_1 [gcode=11] [organism=Genus species] [strain=strain] +ATGTGTAAAAGCCGGAGGGGTTATCTTTTCCCGGCTTTTTATTATCAATTACTCATTAAC +TCCTGTTCCGTTCTTTTGCGTTTAATCACCGGAATATCTCCGGTATTGTTCAGCGCCCCG +GAAATGTTTTTAACCACTGTTCTGCACTCCGTTTATTAAACGCGCTCAGCGCGCGCTCAT +ATATCGCGCGCGCGCGCGCGCATATATATATAATGAGTGAGGCTGAAGCCCGCCCGACTA +ACTTTATTCGTCAGATTATTGATGAAGATCTGGCGAGTGGTAAACATACCACTGTCCATA +CCCGTTTTCCGCCGGAGCCGAATGGCTATCTGCATATCGGCCACGCGAAATCTATCTGCC +TGAACTTTGGCATCGCGCAAGATTATCAGGGCCAGTGCAACCTGCGTTTCGATGACACCA +ACCCGGTAAAAGAAGATATCGAGTACGTTGATTCGATCAAAAACGACGTCGAGTGGTTAG +GCTTTCACTGGTCTGGCGATATTCGCTACTCCTCCGATTACTTTGACCAACTGCACGCCT +ATGCGGTCGAGCTAATCAATAAAGGCCTGGCCTATGTTGATGAGCTGACGCCGGAGCAGA +TCCGTGAATACCGCGGTACGCTGACCGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCA +GCGTCGAAGAGAACCTCGCGCTATTTGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTA +AAGCCTGTCTGCGCGCTAAAATCGACATGGCGTCGCCGTTTATCGTGATGCGCGATCCGG +TGCTGTATCGCATTAAATTCGCCGAGCATCATCAGACTGGCAACAAGTGGTGCATCTATC +CGATGTACGACTTTACTCACTGCATCAGCGATGCGCTGGAAGGCATTACTCATTCTCTGT +GTACGCTGGAGTTCCAGGATAACCGTCGTCTGTACGACTGGGTGCTGGACAACATCACCA +TTCCGGTTCACCCGCGCCAGTACGAATTCTCGCGCCTGAATCTGGAATACACCGTGATGT +CCAAGCGTAAGCTGAACCTGCTGGTGACCGACAAACACGTCGAAGGTTGGGACGATCCGC +GTATGCCGACTATTTCCGGTCTGCGCCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGT +TCTGCAAACGCATCGGCGTCACCAAGCAGGACAACACTATTGAGATGGCGTCGCTGGAAT +CCTGCATTCGCGAAGATCTGAACGAAAACGCGCCGCGCGCGATGGCGGTAATCGATCCGG +TAAAACTGGTTATCGAAAATTACCCGCAGGGTGAGAGCGAAATGGTTACCATGCCTAACC +ATCCGAATAAACCGGAGATGGGCAGCCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCG +ATCGCGCAGATTTCCGCGAAGAAGCGAACAAACAGTACAAACGTCTGGTGATGGGCAAAG +AAGTGCGTCTGCGTAATGCCTACGTCATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAG +GGAATATCACCACCATCTTCTGTACCTATGATGCTGATACGCTGAGTAAAGATCCGGCTG +ACGGGCGTAAAGTGAAAGGCGTAATCCACTGGGTTAGCGCAGCACATGCGCTGCCGATTG +AAATTCGTCTCTACGACCGTCTGTTCAGCGTGCCGAATCCGGGCGCCGCGGAGGACTTCC +TGTCTGTTATCAACCCCGAATCATTAGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGA 
+AAGCGGCGGTAGCAGGAAAAGCTTTCCAGTTTGAACGTGAAGGCTACTTCTGCCTCGACA +GCCGCTATGCAACGGCCGATAAGCTGGTCTTTAACCGCACCGTGGGCCTGCGTGATACCT +GGGCGAAAGCGGGCGAGTAACGCGCATAGGCGGCCTTCAAGGAGGCGCTAGGCGAATGAA +AGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGG +AATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCC +GCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGAC +GTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGG +CGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGT +CAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAA +TCGTGGTTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAAT +TCAGGCATTGATTCCGTTAGCGAAGGATATTATCGCGCGCTATAACATCAAACCGCAGAA +TGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCC +GTGGCGCGAGCTGGCGGCGCAGGGGATTAGCGCCTGGCCTGACGCCCAGCGTGTGGCGTT +TTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACT +CTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCACGCGAGCAGCAGCGGGTGAT +TATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAAC +GCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATAGGCGCGC +GAGATTACGCGCGCAGTATCGCGC +>genome1_2 [gcode=11] [organism=Genus species] [strain=strain] +CGCGCATAGCGCGCAATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAG +GCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTC +CCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGT +AAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAG +AGCCGTCAGGAAGTTGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTG +ATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAAT +TACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGATATGAGGCGCGGCGCTATAA +GGCGCTCATGAGCATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTC +TGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGA +GACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTT +CGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGC +GATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCG +GGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCA 
+CAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGG +CTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCT +CATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCA +TAACCAAATGGATGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGC +CATTGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGA +ACGCATTAATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAG +CACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATAT +TACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCG +CAATGTCAAAGCCAAAAACATCACGCCCGGTTTCAGTAAAAATGCGGGTATTGATAACGC +AACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAG +TGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAA +CTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCAT +TCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCG +TGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGCAATATCAACGTGAT +GCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTACG +TGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAA +TGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGA +AGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGAATTCCGCGGCATATAGCGGC +GCGATAGCATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTC +ATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCG +GCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGA +GCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGATGGGC +AGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGT +TTCAGCACTTCAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGA +TTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGA +AGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAAC +AGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATG +AACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAAC +TGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCTCGCCATG +TCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGTGATCCGGAGC +AGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAAT +AACGCTCAGAGGCGCGCGCGCGCTATATACGCGCAGTGTTTATTGGCATAGTTAGCCTGT 
+TTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAG +GCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCG +TGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGG +ACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGC +CTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTA +TTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACG +AAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGA +TTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAG +ATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGG +GGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTT +TGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTC +TGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGC +ATAAACATGATGGGATGGCATAGCCGTCATATGGGCTCTCAGAGGAGACGCGCTAATGCG +CAGTACGTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGAT +TACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCT +CGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCG +GGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCA +GGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGC +GTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTA +GATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTC +AGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGG +GTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGT +CCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCG +GGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTT +AGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCG +GAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGCATA +CGCGCGCGATATATATTCGCGCAGAGGCGCGCATAGCGATGAAGTTTGTTGCGCCCGAAC +AGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCGTTCTGGCCTGATGTGGACCTGT +CGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGG +TCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGC +AGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCG +AGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGC 
+GTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAGCGAGGGGAGGAGCTGGCGGAGG +CCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGTACCGC +ACTGTACGGTGGAGCTTATCTGAACGGCTAGCGCGTAGCGGCAGTCTCGGATGAATAATC +ATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATG +CGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGT +TCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCT +ATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGG +TTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGACGATCGCGCGCAGGAGGAGAGC +TCTCTTCTTCTAGAGCATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCG +CTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGAT +GATCGTCCCTCAGCTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGT +GCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGC +CACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCC +CTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATA +TGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGC +GCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGTATTACCGCCAGGC +GGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCAC +GGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACG +GCTATGCGCGCGCTCGATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTT +TGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAG +CACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTT +CTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGAT +CATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAA +ACTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTC +CGGATCGGCGATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCG +CGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAAT +GTTAATTCCGCCCACCACGGCTTTTATCCTTTACGCGCTGGCAAGCGGGGGAACATCGAT +TGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCT +GGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGG +TATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATTGT +CGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTA +TACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGAT 
+TTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGC +GATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGG +TATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGG +CGCATTTATGGATATTGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCAATCAT +GGCTAAACTGGGCGTCGATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGAT +TGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGT +CAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCT +GTTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCAT +GTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gbk b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gbk new file mode 100644 index 0000000000000000000000000000000000000000..027b74e27c977b2871b77d3534abbb8400722fd0 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gbk @@ -0,0 +1,346 @@ +LOCUS genome1_1 2784 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. 
+FEATURES Location/Qualifiers + source 1..2784 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 213..1880 + /gene="glnS" + /locus_tag="GKEGNCBE_00001" + /EC_number="6.1.1.18" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P00962" + /codon_start=1 + /transl_table=11 + /product="Glutamine--tRNA ligase" + /translation="MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGH + AKSICLNFGIAQDYQGQCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSD + YFDQLHAYAVELINKGLAYVDELTPEQIREYRGTLTAPGKNSPFRDRSVEENLALFEK + MRTGGFEEGKACLRAKIDMASPFIVMRDPVLYRIKFAEHHQTGNKWCIYPMYDFTHCI + SDALEGITHSLCTLEFQDNRRLYDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNL + LVTDKHVEGWDDPRMPTISGLRRRGYTAASIREFCKRIGVTKQDNTIEMASLESCIRE + DLNENAPRAMAVIDPVKLVIENYPQGESEMVTMPNHPNKPEMGSREVPFSGEIWIDRA + DFREEANKQYKRLVMGKEVRLRNAYVIKAERVEKDAEGNITTIFCTYDADTLSKDPAD + GRKVKGVIHWVSAAHALPIEIRLYDRLFSVPNPGAAEDFLSVINPESLVIKQGYGEPS + LKAAVAGKAFQFEREGYFCLDSRYATADKLVFNRTVGLRDTWAKAGE" + CDS 1916..2746 + /gene="amiD" + /locus_tag="GKEGNCBE_00002" + /EC_number="3.5.1.28" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P75820" + /codon_start=1 + /transl_table=11 + /product="N-acetylmuramoyl-L-alanine amidase AmiD" + /translation="MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRI + KVLVIHYTAENFDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGV + SFWRGATRLNDTSIGIELENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYNIKP + QNVVAHADIAPQRKDDPGPRFPWRELAAQGISAWPDAQRVAFYLAGRAPYTPVDTATV + LALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD + " +ORIGIN + 1 atgtgtaaaa gccggagggg ttatcttttc ccggcttttt attatcaatt actcattaac + 61 tcctgttccg ttcttttgcg tttaatcacc ggaatatctc cggtattgtt cagcgccccg + 121 gaaatgtttt taaccactgt tctgcactcc gtttattaaa cgcgctcagc gcgcgctcat + 181 atatcgcgcg cgcgcgcgcg catatatata taatgagtga ggctgaagcc cgcccgacta + 241 actttattcg tcagattatt gatgaagatc tggcgagtgg taaacatacc actgtccata + 301 cccgttttcc gccggagccg aatggctatc tgcatatcgg ccacgcgaaa 
tctatctgcc + 361 tgaactttgg catcgcgcaa gattatcagg gccagtgcaa cctgcgtttc gatgacacca + 421 acccggtaaa agaagatatc gagtacgttg attcgatcaa aaacgacgtc gagtggttag + 481 gctttcactg gtctggcgat attcgctact cctccgatta ctttgaccaa ctgcacgcct + 541 atgcggtcga gctaatcaat aaaggcctgg cctatgttga tgagctgacg ccggagcaga + 601 tccgtgaata ccgcggtacg ctgaccgcgc cgggtaaaaa cagcccgttc cgcgatcgca + 661 gcgtcgaaga gaacctcgcg ctatttgaaa aaatgcgtac cggcggtttt gaagagggta + 721 aagcctgtct gcgcgctaaa atcgacatgg cgtcgccgtt tatcgtgatg cgcgatccgg + 781 tgctgtatcg cattaaattc gccgagcatc atcagactgg caacaagtgg tgcatctatc + 841 cgatgtacga ctttactcac tgcatcagcg atgcgctgga aggcattact cattctctgt + 901 gtacgctgga gttccaggat aaccgtcgtc tgtacgactg ggtgctggac aacatcacca + 961 ttccggttca cccgcgccag tacgaattct cgcgcctgaa tctggaatac accgtgatgt + 1021 ccaagcgtaa gctgaacctg ctggtgaccg acaaacacgt cgaaggttgg gacgatccgc + 1081 gtatgccgac tatttccggt ctgcgccgtc gcggctatac cgcggcttct attcgtgagt + 1141 tctgcaaacg catcggcgtc accaagcagg acaacactat tgagatggcg tcgctggaat + 1201 cctgcattcg cgaagatctg aacgaaaacg cgccgcgcgc gatggcggta atcgatccgg + 1261 taaaactggt tatcgaaaat tacccgcagg gtgagagcga aatggttacc atgcctaacc + 1321 atccgaataa accggagatg ggcagccgtg aagtgccgtt tagcggtgag atctggatcg + 1381 atcgcgcaga tttccgcgaa gaagcgaaca aacagtacaa acgtctggtg atgggcaaag + 1441 aagtgcgtct gcgtaatgcc tacgtcatta aagcggagcg cgtagagaag gatgccgaag + 1501 ggaatatcac caccatcttc tgtacctatg atgctgatac gctgagtaaa gatccggctg + 1561 acgggcgtaa agtgaaaggc gtaatccact gggttagcgc agcacatgcg ctgccgattg + 1621 aaattcgtct ctacgaccgt ctgttcagcg tgccgaatcc gggcgccgcg gaggacttcc + 1681 tgtctgttat caaccccgaa tcattagtga ttaagcaggg gtatggcgag ccgtcgctga + 1741 aagcggcggt agcaggaaaa gctttccagt ttgaacgtga aggctacttc tgcctcgaca + 1801 gccgctatgc aacggccgat aagctggtct ttaaccgcac cgtgggcctg cgtgatacct + 1861 gggcgaaagc gggcgagtaa cgcgcatagg cggccttcaa ggaggcgcta ggcgaatgaa + 1921 agcgctactg tggctggtgg gtctcgcgtt gctgttaaca ggctgcgcga gcgaaaaagg + 1981 aattatcgat aaagagggat 
atcagcttga tacccgacat cgggcgcagg cggcctatcc + 2041 gcgcattaaa gtcctggtga ttcactatac ggcggaaaac tttgacgttt cgctggcgac + 2101 gttaacgggt cgcaacgtca gttcgcatta cctgattccc gcaaccccgc cattatatgg + 2161 cggtaaaccg cgcatctggc aactggtgcc ggaacaggat caggcctggc atgcgggcgt + 2221 cagtttctgg cgaggcgcca cgcgtctcaa tgatacgtct attggcattg agctggaaaa + 2281 tcgtggttgg cgaatgtccg gcggggtgaa atctttcgcg ccgtttgaat ccgcgcaaat + 2341 tcaggcattg attccgttag cgaaggatat tatcgcgcgc tataacatca aaccgcagaa + 2401 tgtggtggcc catgcggata tcgcgccgca gcgtaaagac gatcccggcc cgcgcttccc + 2461 gtggcgcgag ctggcggcgc aggggattag cgcctggcct gacgcccagc gtgtggcgtt + 2521 ttatctggct ggacgcgcgc cgtatacgcc agtcgatacc gcaacggtgc ttgcgttact + 2581 ctcgcgctat ggctatgaag tcaaagccga tatgacggca cgcgagcagc agcgggtgat + 2641 tatggcgttc cagatgcact tccgtccggc gcaatggaac ggtatcgcag atgccgaaac + 2701 gcaggcgatt gccgaagcat tactggagaa gtacggccag gattaacgcg ataggcgcgc + 2761 gagattacgc gcgcagtatc gcgc +// +LOCUS genome1_2 7024 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. 
+FEATURES Location/Qualifiers + source 1..7024 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 16..396 + /gene="hpcD" + /locus_tag="GKEGNCBE_00003" + /EC_number="5.3.3.10" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:Q05354" + /codon_start=1 + /transl_table=11 + /product="5-carboxymethyl-2-hydroxymuconate + Delta-isomerase" + /translation="MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRA + HWLDTWQMADGKHDYAFVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLA + LSFEIAELHPTLNYKQNNVHALFK" + CDS 428..1837 + /locus_tag="GKEGNCBE_00004" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MSMPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKD + YPADDGIASFKQAFADGQTVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFI + LQDGCQVVGEQGGSLHNVTLDVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLII + DDITVTHANYAILRQGFHNQMDGARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIE + RINCTNGKINWGIGIGLAGSTYDNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFV + IRNVKAKNITPGFSKNAGIDNATIAIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLS + IPQNFKLNAIRLDNRQVAYKLRGIQISSGNTPSFVAITNVRMTRATLELHNQPQHLFL + RNINVMQTSAIGPALKMHFDLRKDVRGQFMARQDTLLSLANVHAINENGQSSVDIDRI + NHQTVNVEAVNFSLPKRGG" + CDS 1869..2642 + /gene="hisP" + /locus_tag="GKEGNCBE_00005" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P02915" + /codon_start=1 + /transl_table=11 + /product="Histidine transport ATP-binding protein HisP" + /translation="MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGK + STFLRCINFLEKPSEGAIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFN + LWSHMTVLENVMEAPIQVLGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQR + VSIARALAMEPDVLLFDEPTSALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHV + SSHVIFLHQGKIEEEGDPEQVFGNPQSPRLQQFLKGSLK" + CDS 2709..3443 + /gene="trmD_1" + /locus_tag="GKEGNCBE_00006" + /EC_number="2.1.1.228" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P0A873" + /codon_start=1 + 
/transl_table=11 + /product="tRNA (guanine-N(1)-)-methyltransferase" + /translation="MFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYG + GGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCG + RYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSF + ADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLAL + TEEQARLLAEFKTEHAQQQHKHDGMA" + CDS 3427..4254 + /gene="trmD_2" + /locus_tag="GKEGNCBE_00007" + /EC_number="2.1.1.228" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P0A873" + /codon_start=1 + /transl_table=11 + /product="tRNA (guanine-N(1)-)-methyltransferase" + /translation="MMGWHSRHMGSQRRRANAQYVFIGIVSLFPEMFRAITDYGVTGR + AVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGA + KVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDERVIQTEIDEEWSIGDYVLSG + GELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLS + GNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA" + CDS 4299..4763 + /locus_tag="GKEGNCBE_00008" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MKFVAPEQAPEQAEVIKNTPFWPDVDLSEFRSVMRTDGTVTQPR + LKQVVLTAISEVNAELYDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWAR + AVLNERYQDYDATASGVKRGEELAEASGDLWRDARWAISRVQDVPHCTVELI" + CDS 4818..5078 + /locus_tag="GKEGNCBE_00009" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLS + AWEAGILTRRYGLDKEMVMDFFKENHSGMAVRFFMAGYRLEG" + CDS 5117..5698 + /locus_tag="GKEGNCBE_00010" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYP + ADAAELLESLVLEHGGAPLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDY + LGQNENPYTGQQYVLESRHIYDLKVMQIDPLEAPDLIWLLQQTGDEVLYYRQAVAYYA + SCRQTVTEGGNHAFTGFEDYFNQIVDFLGLHSC" + CDS 5717..7024 + /gene="dctM" + /locus_tag="GKEGNCBE_00011" + 
/inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:O07838" + /codon_start=1 + /transl_table=11 + /product="C4-dicarboxylate TRAP transporter large permease + protein DctM" + /translation="MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFD + ISMFATAQKMFSSLDSFALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSY + TNIVGNMMFGAISGSAIAASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPT + TAFILYALASGGTSIAALFAGGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMAL + KVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSIL + LQTVVMTGVIMFLLATSSAMSFSMSITNIPAALSDMILGISANKLVILLVITVFLLII + GAFMDIGPAILIFTPILLPIMAKLGVDPVHFGIIMIYNLAIGTITPPVGSGLYVGASV + GKVKVEEVIKPLLPFYGAIIGVLLLITYIPEITLFLPRLLGIM" +ORIGIN + 1 cgcgcatagc gcgcaatgcc gcactttatt gctgaatgta ctgaaaatat tcgcgagcag + 61 gctgatttac ccggcctgtt cagcaaggta aacgaggcgc tggccgccag cgggattttc + 121 cccatcggcg gtatccgcag tcgcgcccac tggctggata cctggcagat ggctgacggt + 181 aagcatgatt acgcgtttgt gcatatgacg ctgaaaatcg gcgccgggcg cagcctggag + 241 agccgtcagg aagttggcga aatgctgttt gggctgatta aagcccactt cgccgacctg + 301 atggagaacc gctatctggc gctgtcgttt gagattgccg agctacatcc gacgctcaat + 361 tacaaacaaa acaacgtaca cgcgttattt aaatagccga tatgaggcgc ggcgctataa + 421 ggcgctcatg agcatgcccg cgactaaatt ctcccgacgt accctcctga cggcaggttc + 481 tgcgcttgct gttcttcctt ttctgcgcgc cttgccggta caggcgcgtg aacctcgcga + 541 gaccgtcgat attaaggatt atccggcgga tgacggtatc gcctcgttca aacaggcctt + 601 cgccgacgga cagaccgtgg tcgtaccgcc aggatgggtg tgtgaaaata tcaatgcggc + 661 gataacgatt ccggcgggaa aaacgctgcg ggtacagggc gcggtgcgtg ggaatggccg + 721 gggacggttt attttgcagg acgggtgtca ggtggtgggg gagcagggcg gcagtctgca + 781 caatgtgacg ctggatgttc gcgggtcgga ctgtgtgatt aaaggcgtgg cgatgagcgg + 841 ctttggcccc gtcgcgcaaa ttttcatcgg tggtaaggaa ccgcaggtga tgcgtaatct + 901 cattatcgat gacatcaccg ttacccacgc caactacgcc attctccgcc agggatttca + 961 taaccaaatg gatggcgcgc ggattacgca tagccgcttt agcgatttac agggggacgc + 1021 cattgagtgg aatgtcgcga ttcacgaccg cgacatcctg atttccgatc atgtcatcga + 1081 
acgcattaat tgtaccaatg gcaaaatcaa ctgggggatc ggcatcgggc tggcgggtag + 1141 cacctatgac aacagttatc ctgaagacca ggcagtaaaa aactttgtgg tggccaatat + 1201 taccggatct gattgccgac agcttgtgca cgtagaaaat ggcaaacatt tcgtcattcg + 1261 caatgtcaaa gccaaaaaca tcacgcccgg tttcagtaaa aatgcgggta ttgataacgc + 1321 aacgatcgca atttatggct gtgataattt cgtcattgat aatattgata tgacgaatag + 1381 tgccgggatg ctcatcggct atggcgtcgt taaaggaaaa tacctgtcaa ttccgcaaaa + 1441 ctttaaatta aacgctattc ggttggataa tcgccaggtt gcttataaat tacgcggcat + 1501 tcaaatttcc tccggcaaca ccccctcttt tgtcgccatc accaatgtac ggatgacgcg + 1561 tgctacgctg gaactgcata atcaaccgca gcacctcttt ctgcgcaata tcaacgtgat + 1621 gcaaacttca gcgattggcc cggcgttaaa aatgcatttc gatttgcgta aagatgtacg + 1681 tggtcaattt atggcccgcc aggacacgct gctttccctc gctaatgttc atgccatcaa + 1741 tgaaaacggg cagagttccg tggatatcga caggattaat caccaaaccg tgaatgtcga + 1801 agcagtgaat ttttcgctgc cgaagcgggg agggtaacga attccgcggc atatagcggc + 1861 gcgatagcat gtcagaaaat aaattacacg ttatcgattt gcacaaacgc tacggcggtc + 1921 atgaagtgct gaaaggggta tcgctgcagg cccgcgccgg agatgtgatt agcatcatcg + 1981 gctcgtccgg ctccggtaaa agcacttttt tgcgctgtat taacttcctc gaaaaaccga + 2041 gcgaaggcgc gattatcgtg aacggtcaga acattaatct ggtgcgcgac aaagatgggc + 2101 agctcaaagt ggcggataaa aatcagctac gcttgttgcg tacccgcctg acgatggtgt + 2161 ttcagcactt caacctctgg agccacatga cggtgctgga aaatgtgatg gaagcgccga + 2221 ttcaggtact gggattaagc aagcacgacg cgcgcgagcg ggcgttgaaa tatctggcga + 2281 aggtggggat tgatgagcgc gctcagggca aatatcccgt ccatctctcc ggcggccaac + 2341 agcagcgcgt ttctattgcg cgcgcgctgg cgatggaacc tgacgtttta ctgttcgatg + 2401 aacccacatc ggcgctcgat cctgaactgg tcggcgaagt gttgcgcatc atgcaacaac + 2461 tggcggaaga aggcaaaacg atggtggtgg tcacgcatga aatgggcttc gctcgccatg + 2521 tctcttcgca cgttattttt ctgcatcagg ggaaaattga agaagagggt gatccggagc + 2581 aggtgttcgg caatccgcaa agcccgcgtt tacagcaatt cctgaaaggc tcgctgaaat + 2641 aacgctcaga ggcgcgcgcg cgctatatac gcgcagtgtt tattggcata gttagcctgt + 2701 ttcctgaaat gttccgcgca 
attaccgatt acggggtaac tggccgggca gtaaaaaaag + 2761 gcctgctgaa catccaaagc tggagtcctc gcgacttcgc gcatgaccgg caccgtaccg + 2821 tggacgaccg tccttacggc ggcggaccgg ggatgttaat gatggtgcaa cccttgcggg + 2881 acgccattca cgcagcaaaa gccgcggcag gtgaaggcgc taaagtgatt tatctgtcgc + 2941 ctcagggacg caagcttgat caagcgggcg ttagcgagct ggccacgaat cagaagctta + 3001 ttctggtgtg tggtcgctac gaaggcgtag atgagcgcgt aattcagacc gaaattgacg + 3061 aagaatggtc aattggcgat tacgttctca gcggtggcga actaccggca atgacgctga + 3121 ttgactccgt cgcccggttt ataccggggg ttctggggca tgaggcatca gcaatcgaag + 3181 attcgtttgc tgatgggttg ctggattgtc cgcactatac gcgccctgaa gtgttagagg + 3241 ggatggaagt accgccagta ttgctgtcgg gaaaccatgc tgagatacgt cgctggcgtt + 3301 tgaaacagtc gctgggccga acctggctta gaagacctga acttctggaa aacctggctc + 3361 tgactgaaga gcaagcaagg ttgctggcgg agttcaaaac agaacacgca caacagcagc + 3421 ataaacatga tgggatggca tagccgtcat atgggctctc agaggagacg cgctaatgcg + 3481 cagtacgtgt ttattggcat agttagcctg tttcctgaaa tgttccgcgc aattaccgat + 3541 tacggggtaa ctggccgggc agtaaaaaaa ggcctgctga acatccaaag ctggagtcct + 3601 cgcgacttcg cgcatgaccg gcaccgtacc gtggacgacc gtccttacgg cggcggaccg + 3661 gggatgttaa tgatggtgca acccttgcgg gacgccattc acgcagcaaa agccgcggca + 3721 ggtgaaggcg ctaaagtgat ttatctgtcg cctcagggac gcaagcttga tcaagcgggc + 3781 gttagcgagc tggccacgaa tcagaagctt attctggtgt gtggtcgcta cgaaggcgta + 3841 gatgagcgcg taattcagac cgaaattgac gaagaatggt caattggcga ttacgttctc + 3901 agcggtggcg aactaccggc aatgacgctg attgactccg tcgcccggtt tataccgggg + 3961 gttctggggc atgaggcatc agcaatcgaa gattcgtttg ctgatgggtt gctggattgt + 4021 ccgcactata cgcgccctga agtgttagag gggatggaag taccgccagt attgctgtcg + 4081 ggaaaccatg ctgagatacg tcgctggcgt ttgaaacagt cgctgggccg aacctggctt + 4141 agaagacctg aacttctgga aaacctggct ctgactgaag agcaagcaag gttgctggcg + 4201 gagttcaaaa cagaacacgc acaacagcag cataaacatg atgggatggc atagcgcata + 4261 cgcgcgcgat atatattcgc gcagaggcgc gcatagcgat gaagtttgtt gcgcccgaac + 4321 aggcaccgga acaggcggag gtcatcaaaa atacgccgtt 
ctggcctgat gtggacctgt + 4381 cggaatttcg cagtgtgatg cgcactgacg gcacggtgac gcagccgcgt ttaaagcagg + 4441 tcgtgctgac ggcgatctct gaggttaacg ctgagctgta cgacttccgc aaccgtcagc + 4501 agatgctggg ctggcggaca cttgctgagg ttcccgcaga aatgctggac ggtaaaagcg + 4561 agcgtatccg gcactaccac aacgctgttt tttgctgggc gcgcgctgtg cttaatgagc + 4621 gttatcagga ctatgacgcc acggcgtcag gcgtgaagcg aggggaggag ctggcggagg + 4681 ccagcggcga tctgtggcgt gatgcccgct gggccatcag ccgggtgcag gatgtaccgc + 4741 actgtacggt ggagcttatc tgaacggcta gcgcgtagcg gcagtctcgg atgaataatc + 4801 attttgggaa agggttaatg gccgggttgc acgcgccata tgcatatagc gcgcatcatg + 4861 cggtgaattt ctgttctgag tataaacgtg gctttgtatt gggttttaca caccgtatgt + 4921 tcgaaaagac cggcgatcgt caacttagcg cgtgggaggc tggaattctg acgcgtcgct + 4981 atggtctgga taaagaaatg gtgatggatt tctttaaaga gaatcattcc gggatggcgg + 5041 ttcgcttctt tatggccggt tatcgactcg aaggttgacg atcgcgcgca ggaggagagc + 5101 tctcttcttc tagagcatgt ctacgcttct ctatttgcac ggattcaaca gttcccctcg + 5161 ctcggcaaaa gcgtgccagc taaaaaactg gctggcggag cgtcatccgc atgttgagat + 5221 gatcgtccct cagctgccgc cgtatcctgc cgatgcggcg gagttgctgg aatctctcgt + 5281 gcttgagcat ggcggtgcgc cattagggct ggtaggatcg tcgctgggtg gttattacgc + 5341 cacctggctg tcgcaatgtt ttatgctgcc ggctgtggtg gtgaatcccg ccgtgcggcc + 5401 ctttgaatta ctgaccgact atctcggtca gaacgagaac ccctacaccg ggcagcaata + 5461 tgtgctagag tctcgccata tttatgatct taaagtcatg cagattgacc cgctggaagc + 5521 gccggacctg atctggctac tgcaacagac gggcgatgaa gtgctgtatt accgccaggc + 5581 ggtggcatat tacgcctcct gccgtcagac agtgaccgag ggtggtaatc acgcattcac + 5641 gggcttcgaa gattatttca accagattgt cgattttctt ggactgcaca gttgctgacg + 5701 gctatgcgcg cgctcgatga ttgaccctat ttttgcgtcc tgtacgctaa ttgccgtctt + 5761 tgttgtttta ctggccatgg gcgcgcctat cgggatctgc atcgttatcg cctctttcag + 5821 caccatgatg ctggtactgc ctttcgatat ttcgatgttc gccaccgcgc aaaaaatgtt + 5881 ctccagcctg gacagttttg ccttgctggc cgtgccgttc ttcgttttgt ccggggtgat + 5941 catgaatagc gggggaattg ccgcccggct ggtcaatttt gccaaactgt ttactggcaa + 6001 
actgcccggc tcgctctctt ataccaacat cgtcggcaat atgatgttcg gtgcaatttc + 6061 cggatcggcg attgccgcct caacctccat cggcggcgtg atggtgccga tgagcgcgcg + 6121 cgaaggttac gatcgcggct ttgcggccgc ggtgaatatc gcctccgcgc cgacgggaat + 6181 gttaattccg cccaccacgg cttttatcct ttacgcgctg gcaagcgggg gaacatcgat + 6241 tgccgctctg ttcgccggcg gtctggtcgc gggagtgctg tggggcgttg gctgtatgct + 6301 ggtcacgctg gtggtcgcta agcgtcgaaa ttatcgggtt ttcttcaccg tccaaaaagg + 6361 tatggcgcta aaagttgccg ttgaggccat tcccagcctg ctgctgatcg tgattattgt + 6421 cggcggcatt gtgcagggga ttttcaccgc cattgaagcc tccgcgattg ccgtggtgta + 6481 tacgttattg ctgacgatgg tgttttaccg cacgctgaaa attaaggatt tgccttcgat + 6541 tttgctccag acagtggtaa tgaccggggt catcatgttc ctgctggcaa cctcttcggc + 6601 gatgtccttc tcgatgtcga tcaccaatat tcctgcggcg ctgagcgata tgatcctcgg + 6661 tatttccgcc aataaactgg ttatcctgtt agtcattacc gtctttttgt tgattatcgg + 6721 cgcatttatg gatattggtc cggccattct gatttttacc ccgattctgc tgccaatcat + 6781 ggctaaactg ggcgtcgatc cggtgcattt cggcattatc atgatctata acctggcgat + 6841 tggcaccatt acgccgccag ttggcagtgg tttatatgtc ggggcgagcg tcggtaaggt + 6901 caaagttgag gaagtgatta aaccgttgct gcctttttac ggcgcgatta tcggcgttct + 6961 gttattaatt acctacattc cggaaatcac actgttctta ccccgtctac tgggcatcat + 7021 gtaa +// diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gff b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gff new file mode 100644 index 0000000000000000000000000000000000000000..71be830b6d848e2d4a9641eb01096ab1ddf42428 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gff @@ -0,0 +1,182 @@ +##gff-version 3 +##sequence-region genome1_1 1 2784 +##sequence-region genome1_2 1 7024 +genome1_1 Prodigal:2.6 CDS 213 1880 . 
+ 0 ID=GKEGNCBE_00001;eC_number=6.1.1.18;Name=glnS;dbxref=COG:COG0008;gene=glnS;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P00962;locus_tag=GKEGNCBE_00001;product=Glutamine--tRNA ligase +genome1_1 Prodigal:2.6 CDS 1916 2746 . + 0 ID=GKEGNCBE_00002;eC_number=3.5.1.28;Name=amiD;dbxref=COG:COG3023;gene=amiD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P75820;locus_tag=GKEGNCBE_00002;product=N-acetylmuramoyl-L-alanine amidase AmiD +genome1_2 Prodigal:2.6 CDS 16 396 . + 0 ID=GKEGNCBE_00003;eC_number=5.3.3.10;Name=hpcD;gene=hpcD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:Q05354;locus_tag=GKEGNCBE_00003;product=5-carboxymethyl-2-hydroxymuconate Delta-isomerase +genome1_2 Prodigal:2.6 CDS 428 1837 . + 0 ID=GKEGNCBE_00004;inference=ab initio prediction:Prodigal:2.6;locus_tag=GKEGNCBE_00004;product=hypothetical protein +genome1_2 Prodigal:2.6 CDS 1869 2642 . + 0 ID=GKEGNCBE_00005;Name=hisP;dbxref=COG:COG4598;gene=hisP;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P02915;locus_tag=GKEGNCBE_00005;product=Histidine transport ATP-binding protein HisP +genome1_2 Prodigal:2.6 CDS 2709 3443 . + 0 ID=GKEGNCBE_00006;eC_number=2.1.1.228;Name=trmD_1;dbxref=COG:COG0336;gene=trmD_1;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P0A873;locus_tag=GKEGNCBE_00006;product=tRNA (guanine-N(1)-)-methyltransferase +genome1_2 Prodigal:2.6 CDS 3427 4254 . + 0 ID=GKEGNCBE_00007;eC_number=2.1.1.228;Name=trmD_2;dbxref=COG:COG0336;gene=trmD_2;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P0A873;locus_tag=GKEGNCBE_00007;product=tRNA (guanine-N(1)-)-methyltransferase +genome1_2 Prodigal:2.6 CDS 4299 4763 . + 0 ID=GKEGNCBE_00008;inference=ab initio prediction:Prodigal:2.6;locus_tag=GKEGNCBE_00008;product=hypothetical protein +genome1_2 Prodigal:2.6 CDS 4818 5078 . 
+ 0 ID=GKEGNCBE_00009;inference=ab initio prediction:Prodigal:2.6;locus_tag=GKEGNCBE_00009;product=hypothetical protein +genome1_2 Prodigal:2.6 CDS 5117 5698 . + 0 ID=GKEGNCBE_00010;inference=ab initio prediction:Prodigal:2.6;locus_tag=GKEGNCBE_00010;product=hypothetical protein +genome1_2 Prodigal:2.6 CDS 5717 7024 . + 0 ID=GKEGNCBE_00011;Name=dctM;dbxref=COG:COG1593;gene=dctM;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:O07838;locus_tag=GKEGNCBE_00011;product=C4-dicarboxylate TRAP transporter large permease protein DctM +##FASTA +>genome1_1 +ATGTGTAAAAGCCGGAGGGGTTATCTTTTCCCGGCTTTTTATTATCAATTACTCATTAAC +TCCTGTTCCGTTCTTTTGCGTTTAATCACCGGAATATCTCCGGTATTGTTCAGCGCCCCG +GAAATGTTTTTAACCACTGTTCTGCACTCCGTTTATTAAACGCGCTCAGCGCGCGCTCAT +ATATCGCGCGCGCGCGCGCGCATATATATATAATGAGTGAGGCTGAAGCCCGCCCGACTA +ACTTTATTCGTCAGATTATTGATGAAGATCTGGCGAGTGGTAAACATACCACTGTCCATA +CCCGTTTTCCGCCGGAGCCGAATGGCTATCTGCATATCGGCCACGCGAAATCTATCTGCC +TGAACTTTGGCATCGCGCAAGATTATCAGGGCCAGTGCAACCTGCGTTTCGATGACACCA +ACCCGGTAAAAGAAGATATCGAGTACGTTGATTCGATCAAAAACGACGTCGAGTGGTTAG +GCTTTCACTGGTCTGGCGATATTCGCTACTCCTCCGATTACTTTGACCAACTGCACGCCT +ATGCGGTCGAGCTAATCAATAAAGGCCTGGCCTATGTTGATGAGCTGACGCCGGAGCAGA +TCCGTGAATACCGCGGTACGCTGACCGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCA +GCGTCGAAGAGAACCTCGCGCTATTTGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTA +AAGCCTGTCTGCGCGCTAAAATCGACATGGCGTCGCCGTTTATCGTGATGCGCGATCCGG +TGCTGTATCGCATTAAATTCGCCGAGCATCATCAGACTGGCAACAAGTGGTGCATCTATC +CGATGTACGACTTTACTCACTGCATCAGCGATGCGCTGGAAGGCATTACTCATTCTCTGT +GTACGCTGGAGTTCCAGGATAACCGTCGTCTGTACGACTGGGTGCTGGACAACATCACCA +TTCCGGTTCACCCGCGCCAGTACGAATTCTCGCGCCTGAATCTGGAATACACCGTGATGT +CCAAGCGTAAGCTGAACCTGCTGGTGACCGACAAACACGTCGAAGGTTGGGACGATCCGC +GTATGCCGACTATTTCCGGTCTGCGCCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGT +TCTGCAAACGCATCGGCGTCACCAAGCAGGACAACACTATTGAGATGGCGTCGCTGGAAT +CCTGCATTCGCGAAGATCTGAACGAAAACGCGCCGCGCGCGATGGCGGTAATCGATCCGG +TAAAACTGGTTATCGAAAATTACCCGCAGGGTGAGAGCGAAATGGTTACCATGCCTAACC 
+ATCCGAATAAACCGGAGATGGGCAGCCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCG +ATCGCGCAGATTTCCGCGAAGAAGCGAACAAACAGTACAAACGTCTGGTGATGGGCAAAG +AAGTGCGTCTGCGTAATGCCTACGTCATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAG +GGAATATCACCACCATCTTCTGTACCTATGATGCTGATACGCTGAGTAAAGATCCGGCTG +ACGGGCGTAAAGTGAAAGGCGTAATCCACTGGGTTAGCGCAGCACATGCGCTGCCGATTG +AAATTCGTCTCTACGACCGTCTGTTCAGCGTGCCGAATCCGGGCGCCGCGGAGGACTTCC +TGTCTGTTATCAACCCCGAATCATTAGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGA +AAGCGGCGGTAGCAGGAAAAGCTTTCCAGTTTGAACGTGAAGGCTACTTCTGCCTCGACA +GCCGCTATGCAACGGCCGATAAGCTGGTCTTTAACCGCACCGTGGGCCTGCGTGATACCT +GGGCGAAAGCGGGCGAGTAACGCGCATAGGCGGCCTTCAAGGAGGCGCTAGGCGAATGAA +AGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGG +AATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCC +GCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGAC +GTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGG +CGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGT +CAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAA +TCGTGGTTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAAT +TCAGGCATTGATTCCGTTAGCGAAGGATATTATCGCGCGCTATAACATCAAACCGCAGAA +TGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCC +GTGGCGCGAGCTGGCGGCGCAGGGGATTAGCGCCTGGCCTGACGCCCAGCGTGTGGCGTT +TTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACT +CTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCACGCGAGCAGCAGCGGGTGAT +TATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAAC +GCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATAGGCGCGC +GAGATTACGCGCGCAGTATCGCGC +>genome1_2 +CGCGCATAGCGCGCAATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAG +GCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTC +CCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGT +AAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAG +AGCCGTCAGGAAGTTGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTG +ATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAAT +TACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGATATGAGGCGCGGCGCTATAA 
+GGCGCTCATGAGCATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTC +TGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGA +GACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTT +CGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGC +GATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCG +GGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCA +CAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGG +CTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCT +CATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCA +TAACCAAATGGATGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGC +CATTGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGA +ACGCATTAATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAG +CACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATAT +TACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCG +CAATGTCAAAGCCAAAAACATCACGCCCGGTTTCAGTAAAAATGCGGGTATTGATAACGC +AACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAG +TGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAA +CTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCAT +TCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCG +TGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGCAATATCAACGTGAT +GCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTACG +TGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAA +TGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGA +AGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGAATTCCGCGGCATATAGCGGC +GCGATAGCATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTC +ATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCG +GCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGA +GCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGATGGGC +AGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGT +TTCAGCACTTCAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGA +TTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGA +AGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAAC 
+AGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATG +AACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAAC +TGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCTCGCCATG +TCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGTGATCCGGAGC +AGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAAT +AACGCTCAGAGGCGCGCGCGCGCTATATACGCGCAGTGTTTATTGGCATAGTTAGCCTGT +TTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAG +GCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCG +TGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGG +ACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGC +CTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTA +TTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACG +AAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGA +TTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAG +ATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGG +GGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTT +TGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTC +TGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGC +ATAAACATGATGGGATGGCATAGCCGTCATATGGGCTCTCAGAGGAGACGCGCTAATGCG +CAGTACGTGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGAT +TACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCT +CGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCG +GGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCA +GGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGC +GTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTA +GATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTC +AGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGG +GTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGT +CCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCG +GGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTT +AGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCG +GAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGCATA 
+CGCGCGCGATATATATTCGCGCAGAGGCGCGCATAGCGATGAAGTTTGTTGCGCCCGAAC +AGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCGTTCTGGCCTGATGTGGACCTGT +CGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGG +TCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGC +AGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCG +AGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGC +GTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAGCGAGGGGAGGAGCTGGCGGAGG +CCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGTACCGC +ACTGTACGGTGGAGCTTATCTGAACGGCTAGCGCGTAGCGGCAGTCTCGGATGAATAATC +ATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATG +CGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGT +TCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCT +ATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGG +TTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGACGATCGCGCGCAGGAGGAGAGC +TCTCTTCTTCTAGAGCATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCG +CTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGAT +GATCGTCCCTCAGCTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGT +GCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGC +CACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCC +CTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATA +TGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGC +GCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGTATTACCGCCAGGC +GGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCAC +GGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACG +GCTATGCGCGCGCTCGATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTT +TGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAG +CACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTT +CTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGAT +CATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAA +ACTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTC +CGGATCGGCGATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCG +CGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAAT 
+GTTAATTCCGCCCACCACGGCTTTTATCCTTTACGCGCTGGCAAGCGGGGGAACATCGAT +TGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCT +GGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGG +TATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATTGT +CGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTA +TACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGAT +TTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGC +GATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGG +TATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGG +CGCATTTATGGATATTGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCAATCAT +GGCTAAACTGGGCGTCGATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGAT +TGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGT +CAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCT +GTTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCAT +GTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.log b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.log new file mode 100644 index 0000000000000000000000000000000000000000..6aa806da49c94da5a8444327c8a94d809e958faf --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.log @@ -0,0 +1,121 @@ +[13:18:42] This is prokka 1.14-dev +[13:18:42] Written by Torsten Seemann <torsten.seemann@gmail.com> +[13:18:42] Homepage is https://github.com/tseemann/prokka +[13:18:42] Local time is Tue Feb 12 13:18:42 2019 +[13:18:42] You are aperrin +[13:18:42] Operating system is darwin +[13:18:42] You have BioPerl 1.006924 +[13:18:42] System has 8 cores. +[13:18:42] Will use maximum of 1 cores. +[13:18:42] Annotating as >>> Bacteria <<< +[13:18:42] Generating locus_tag from 'Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna' contents. 
+[13:18:42] Setting --locustag GKEGNCBE from MD5 04e07cbef9b8a6b8d550f0d6ed3a8746 +[13:18:42] Creating new output folder: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes +[13:18:42] Running: mkdir -p Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes +[13:18:42] Using filename prefix: EXAM.0219.00001.XXX +[13:18:42] Setting HMMER_NCPU=1 +[13:18:42] Writing log to: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.log +[13:18:42] Command: /usr/local/bin/prokka --outdir Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes --cpus 1 --prefix EXAM.0219.00001 Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna +[13:18:42] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin +[13:18:42] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common +[13:18:42] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin +[13:18:42] Looking for 'aragorn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/aragorn +[13:18:42] Determined aragorn version is 1.2 +[13:18:42] Looking for 'barrnap' - found /usr/local/bin/barrnap +[13:18:42] Determined barrnap version is 0.8 +[13:18:42] Looking for 'blastp' - found /Users/aperrin/Softwares/bin/blastp +[13:18:42] Determined blastp version is 2.3 +[13:18:42] Looking for 'cmpress' - found /usr/local/bin/cmpress +[13:18:42] Determined cmpress version is 1.1 +[13:18:42] Looking for 'cmscan' - found /usr/local/bin/cmscan +[13:18:42] Determined cmscan version is 1.1 +[13:18:42] Looking for 'egrep' - found /usr/bin/egrep +[13:18:42] Looking for 'find' - found /usr/bin/find +[13:18:42] Looking for 'grep' - found /usr/bin/grep +[13:18:42] Looking for 'hmmpress' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/hmmpress +[13:18:42] Determined hmmpress version is 3.1 +[13:18:42] Looking for 'hmmscan' - found /usr/local/bin/hmmscan +[13:18:42] Determined hmmscan 
version is 3.1 +[13:18:42] Looking for 'java' - found /usr/bin/java +[13:18:42] Looking for 'less' - found /usr/bin/less +[13:18:42] Looking for 'makeblastdb' - found /Users/aperrin/Softwares/bin/makeblastdb +[13:18:42] Determined makeblastdb version is 2.3 +[13:18:42] Looking for 'minced' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common/minced +[13:18:42] Determined minced version is 2.0 +[13:18:42] Looking for 'parallel' - found /usr/local/bin/parallel +[13:18:42] Determined parallel version is 20181022 +[13:18:42] Looking for 'prodigal' - found /usr/local/bin/prodigal +[13:18:42] Determined prodigal version is 2.6 +[13:18:42] Looking for 'prokka-genbank_to_fasta_db' - found /Users/aperrin/Softwares/src/prokka/bin/prokka-genbank_to_fasta_db +[13:18:42] Looking for 'sed' - found /usr/bin/sed +[13:18:42] Looking for 'tbl2asn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/tbl2asn +[13:18:42] Determined tbl2asn version is 25.6 +[13:18:42] Using genetic code table 11. +[13:18:42] Loading and checking input file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna +[13:18:42] Wrote 2 contigs totalling 9808 bp. +[13:18:42] Predicting tRNAs and tmRNAs +[13:18:42] Running: aragorn -l -gc11 -w Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.fna +[13:18:42] Found 0 tRNAs +[13:18:42] Predicting Ribosomal RNAs +[13:18:42] Running Barrnap with 1 threads +[13:18:42] Found 0 rRNAs +[13:18:42] Skipping ncRNA search, enable with --rfam if desired. 
+[13:18:42] Total of 0 tRNA + rRNA features +[13:18:42] Searching for CRISPR repeats +[13:18:43] Found 0 CRISPRs +[13:18:43] Predicting coding sequences +[13:18:43] Contigs total 9808 bp, so using meta mode +[13:18:43] Running: prodigal -i Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.fna -c -m -g 11 -p meta -f sco -q +[13:18:43] Found 11 CDS +[13:18:43] Connecting features back to sequences +[13:18:43] Not using genus-specific database. Try --usegenus to enable it. +[13:18:43] Annotating CDS, please be patient. +[13:18:43] Will use 1 CPUs for similarity searching. +[13:18:43] There are still 11 unannotated CDS left (started with 11) +[13:18:43] Will use blast to search against /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot with 1 CPUs +[13:18:43] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/sprot\.faa | parallel --gnu --plain -j 1 --block 1558 --recstart '>' --pipe blastp -query - -db /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot -evalue 1e-06 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/sprot\.blast 2> /dev/null +[13:18:44] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/sprot.faa +[13:18:44] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/sprot.blast +[13:18:44] There are still 4 unannotated CDS left (started with 11) +[13:18:44] Will use hmmer3 to search against /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm with 1 CPUs +[13:18:44] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.faa | parallel --gnu --plain -j 1 --block 459 --recstart '>' --pipe hmmscan --noali --notextw --acc -E 1e-06 --cpu 1 /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm /dev/stdin > 
Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.hmmer3 2> /dev/null +[13:18:44] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/HAMAP.hmm.faa +[13:18:44] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/HAMAP.hmm.hmmer3 +[13:18:44] Labelling remaining 4 proteins as 'hypothetical protein' +[13:18:44] Possible /pseudo 'tRNA (guanine-N(1)-)-methyltransferase' at genome1_2 position 3427 +[13:18:44] Found 6 unique /gene codes. +[13:18:44] Fixed 2 duplicate /gene - trmD_1 trmD_2 +[13:18:44] Fixed 1 colliding /gene names. +[13:18:44] Adding /locus_tag identifiers +[13:18:44] Assigned 11 locus_tags to CDS and RNA features. +[13:18:44] Writing outputs to Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/ +[13:18:44] Generating annotation statistics file +[13:18:44] Generating Genbank and Sequin files +[13:18:44] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka' -Z Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.err -i Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.fsa 2> /dev/null +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/errorsummary.val +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.dr +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fixedproducts +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.ecn +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.val +[13:18:45] Repairing broken .GBK output that tbl2asn 
produces... +[13:18:45] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.gbf > Examples\/1\-res\-Annotate\/tmp_files\/genome1\.fst\-split5N\.fna\-prokkaRes\/EXAM\.0219\.00001\.gbk +[13:18:45] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gbf +[13:18:45] Output files: +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.txt +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tsv +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fna +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.err +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gff +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.sqn +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tbl +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.gbk +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.log +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.ffn +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.faa +[13:18:45] Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.fsa +[13:18:45] Annotation finished successfully. +[13:18:45] Walltime used: 0.05 minutes +[13:18:45] If you use this result please cite the Prokka paper: +[13:18:45] Seemann T (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-9. +[13:18:45] Type 'prokka --citation' for more details. +[13:18:45] Share and enjoy! 
diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.sqn b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.sqn new file mode 100644 index 0000000000000000000000000000000000000000..3090e4316b98bf27b0d8f2e8d65566383ec6e0da --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.sqn @@ -0,0 +1,1042 @@ +Seq-entry ::= set { + class genbank , + seq-set { + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + day 12 } } , + seq-set { + seq { + id { + local + str "genome1_1" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 2784 , + seq-data + iupacna "ATGTGTAAAAGCCGGAGGGGTTATCTTTTCCCGGCTTTTTATTATCAATTACTCA +TTAACTCCTGTTCCGTTCTTTTGCGTTTAATCACCGGAATATCTCCGGTATTGTTCAGCGCCCCGGAAATGTTTTTAA +CCACTGTTCTGCACTCCGTTTATTAAACGCGCTCAGCGCGCGCTCATATATCGCGCGCGCGCGCGCGCATATATATAT +AATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTGGCGAGTGGTAAACATAC +CACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTGCATATCGGCCACGCGAAATCTATCTGCCTGAACTT +TGGCATCGCGCAAGATTATCAGGGCCAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTA +CGTTGATTCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCCTCCGATTACTT +TGACCAACTGCACGCCTATGCGGTCGAGCTAATCAATAAAGGCCTGGCCTATGTTGATGAGCTGACGCCGGAGCAGAT +CCGTGAATACCGCGGTACGCTGACCGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCAGCGTCGAAGAGAACCTCGC 
+GCTATTTGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCGTCGCC +GTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCATCAGACTGGCAACAAGTGGTGCAT +CTATCCGATGTACGACTTTACTCACTGCATCAGCGATGCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTT +CCAGGATAACCGTCGTCTGTACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTC +GCGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGACAAACACGTCGAAGGTTG +GGACGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAA +ACGCATCGGCGTCACCAAGCAGGACAACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGA +AAACGCGCCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAATTACCCGCAGGGTGAGAGCGAAAT +GGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCGA +TCGCGCAGATTTCCGCGAAGAAGCGAACAAACAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGC +CTACGTCATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGATGCTGA +TACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGGGTTAGCGCAGCACATGCGCTGCC +GATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTGCCGAATCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAA +CCCCGAATCATTAGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGAAAAGCTTTCCAGTT +TGAACGTGAAGGCTACTTCTGCCTCGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTTAACCGCACCGTGGGCCT +GCGTGATACCTGGGCGAAAGCGGGCGAGTAACGCGCATAGGCGGCCTTCAAGGAGGCGCTAGGCGAATGAAAGCGCTA +CTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATAAAGAGGGATATCAG +CTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTT +GACGTTTCGCTGGCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGC +GGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCC +ACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGTGGTTGGCGAATGTCCGGCGGGGTGAAATCTTTC +GCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGATATTATCGCGCGCTATAACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTG +GCGGCGCAGGGGATTAGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCA +GTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCACGCGAGCAG 
+CAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCG +ATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATAGGCGCGCGAGATTACGCGCGCAGTATCGCGC" } } , + seq { + id { + local + str "genome1_1_1" } , + descr { + title "Glutamine--tRNA ligase [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 555 , + seq-data + ncbieaa "MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIA +QDYQGQCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSDYFDQLHAYAVELINKGLAYVDELTPEQIREY +RGTLTAPGKNSPFRDRSVEENLALFEKMRTGGFEEGKACLRAKIDMASPFIVMRDPVLYRIKFAEHHQTGNKWCIYPM +YDFTHCISDALEGITHSLCTLEFQDNRRLYDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDP +RMPTISGLRRRGYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYPQGESEMVTM +PNHPNKPEMGSREVPFSGEIWIDRADFREEANKQYKRLVMGKEVRLRNAYVIKAERVEKDAEGNITTIFCTYDADTLS +KDPADGRKVKGVIHWVSAAHALPIEIRLYDRLFSVPNPGAAEDFLSVINPESLVIKQGYGEPSLKAAVAGKAFQFERE +GYFCLDSRYATADKLVFNRTVGLRDTWAKAGE" } , + annot { + { + data + ftable { + { + id + local + id 3 , + data + prot { + name { + "Glutamine--tRNA ligase" } , + ec { + "6.1.1.18" } } , + location + int { + from 0 , + to 554 , + id + local + str "genome1_1_1" } } } } } } , + seq { + id { + local + str "genome1_1_2" } , + descr { + title "N-acetylmuramoyl-L-alanine amidase AmiD [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 276 , + seq-data + ncbieaa "MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAEN +FDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIELENRGWRMSGGVKS +FAPFESAQIQALIPLAKDIIARYNIKPQNVVAHADIAPQRKDDPGPRFPWRELAAQGISAWPDAQRVAFYLAGRAPYT +PVDTATVLALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD" } , + annot { + { + data + ftable { + { + id + local + id 4 , + data + prot { + name { + "N-acetylmuramoyl-L-alanine amidase AmiD" } , + ec { + "3.5.1.28" } } , + location + int { + from 0 , + to 275 , + id + local + str "genome1_1_2" } } } } } } } , + 
annot { + { + data + ftable { + { + id + local + id 1 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_1_1" , + location + int { + from 212 , + to 1879 , + strand plus , + id + local + str "genome1_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P00962" } } , + xref { + { + data + gene { + locus "glnS" , + locus-tag "GKEGNCBE_00001" } } } } , + { + id + local + id 2 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_1_2" , + location + int { + from 1915 , + to 2745 , + strand plus , + id + local + str "genome1_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P75820" } } , + xref { + { + data + gene { + locus "amiD" , + locus-tag "GKEGNCBE_00002" } } } } } } } } , + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + day 12 } } , + seq-set { + seq { + id { + local + str "genome1_2" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 7024 , + seq-data + iupacna "CGCGCATAGCGCGCAATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCG +AGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTA 
+TCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGC +TGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTTGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACT +TCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAAC +AAAACAACGTACACGCGTTATTTAAATAGCCGATATGAGGCGCGGCGCTATAAGGCGCTCATGAGCATGCCCGCGACT +AAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAG +GCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTC +GCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGA +AAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTG +GGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATG +AGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGAC +ATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGATGGCGCGCGGATTACGCAT +AGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGAT +CATGTCATCGAACGCATTAATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTAT +GACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTT +GTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGGTTTCAGTAAAAAT +GCGGGTATTGATAACGCAACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGT +GCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATT +CGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCC +ATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGCAATATCAAC +GTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTACGTGGTCAATTTATG +GCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGAC +AGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGAATTCCGCGG +CATATAGCGGCGCGATAGCATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGT +GCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCAC +TTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATCTGGT 
+GCGCGACAAAGATGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTT +TCAGCACTTCAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAG +CAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCC +CGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTT +CGATGAACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGG +CAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGG +GAAAATTGAAGAAGAGGGTGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGG +CTCGCTGAAATAACGCTCAGAGGCGCGCGCGCGCTATATACGCGCAGTGTTTATTGGCATAGTTAGCCTGTTTCCTGA +AATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAG +TCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGAT +GGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCC +TCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTA +CGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGG +CGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAAT +CGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACC +GCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAG +AAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGC +ACAACAGCAGCATAAACATGATGGGATGGCATAGCCGTCATATGGGCTCTCAGAGGAGACGCGCTAATGCGCAGTACG +TGTTTATTGGCATAGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAA +AAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTC +CTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAG +GTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGA +ATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAAT +GGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATAC +CGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGC 
+GCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTT +TGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAA +GGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGCATACGCGCGC +GATATATATTCGCGCAGAGGCGCGCATAGCGATGAAGTTTGTTGCGCCCGAACAGGCACCGGAACAGGCGGAGGTCAT +CAAAAATACGCCGTTCTGGCCTGATGTGGACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTGACGCA +GCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGCA +GATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCA +CAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAA +GCGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGT +ACCGCACTGTACGGTGGAGCTTATCTGAACGGCTAGCGCGTAGCGGCAGTCTCGGATGAATAATCATTTTGGGAAAGG +GTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGG +CTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCT +GACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTT +CTTTATGGCCGGTTATCGACTCGAAGGTTGACGATCGCGCGCAGGAGGAGAGCTCTCTTCTTCTAGAGCATGTCTACG +CTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGT +CATCCGCATGTTGAGATGATCGTCCCTCAGCTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTG +CTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGT +TTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAG +AACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTG +GAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGTATTACCGCCAGGCGGTGGCATATTAC +GCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTC +GATTTTCTTGGACTGCACAGTTGCTGACGGCTATGCGCGCGCTCGATGATTGACCCTATTTTTGCGTCCTGTACGCTA +ATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATG +ATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTG +CTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCC 
+AAACTGTTTACTGGCAAACTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCC +GGATCGGCGATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGC +TTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTACGCG +CTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGT +ATGCTGGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGTATGGCGCTAAAA +GTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATTGTCGGCGGCATTGTGCAGGGGATTTTCACCGCC +ATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGAT +TTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCC +TTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATC +CTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATTGGTCCGGCCATTCTGATTTTTACCCCG +ATTCTGCTGCCAATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATT +GGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTGATT +AAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCACACTGTTC +TTACCCCGTCTACTGGGCATCATGTAA" } } , + seq { + id { + local + str "genome1_2_1" } , + descr { + title "5-carboxymethyl-2-hydroxymuconate Delta-isomerase [Genus + species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 126 , + seq-data + ncbieaa "MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADG +KHDYAFVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNNVHALFK" } , + annot { + { + data + ftable { + { + id + local + id 14 , + data + prot { + name { + "5-carboxymethyl-2-hydroxymuconate Delta-isomerase" } , + ec { + "5.3.3.10" } } , + location + int { + from 0 , + to 125 , + id + local + str "genome1_2_1" } } } } } } , + seq { + id { + local + str "genome1_2_2" } , + descr { + title "hypothetical protein GKEGNCBE_00004 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 469 , + seq-data + ncbieaa 
"MSMPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFK +QAFADGQTVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTLDVRGSDCVIKG +VAMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMDGARITHSRFSDLQGDAIEWNVAIHDRDIL +ISDHVIERINCTNGKINWGIGIGLAGSTYDNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPGF +SKNAGIDNATIAIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISSGNTPS +FVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFMARQDTLLSLANVHAINENGQSSV +DIDRINHQTVNVEAVNFSLPKRGG" } , + annot { + { + data + ftable { + { + id + local + id 15 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 468 , + id + local + str "genome1_2_2" } } } } } } , + seq { + id { + local + str "genome1_2_3" } , + descr { + title "Histidine transport ATP-binding protein HisP [Genus + species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 257 , + seq-data + ncbieaa "MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLE +KPSEGAIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQVLGLSKHDARERAL +KYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPTSALDPELVGEVLRIMQQLAEEGKTMVVVTHE +MGFARHVSSHVIFLHQGKIEEEGDPEQVFGNPQSPRLQQFLKGSLK" } , + annot { + { + data + ftable { + { + id + local + id 16 , + data + prot { + name { + "Histidine transport ATP-binding protein HisP" } } , + location + int { + from 0 , + to 256 , + id + local + str "genome1_2_3" } } } } } } , + seq { + id { + local + str "genome1_2_4" } , + descr { + title "tRNA (guanine-N(1)-)-methyltransferase [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 244 , + seq-data + ncbieaa "MFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGMLMMVQP +LRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDERVIQTEIDEEWSIGDYVLSGGELP +AMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPE +LLENLALTEEQARLLAEFKTEHAQQQHKHDGMA" } , + annot { + { + data + ftable { + { + id + 
local + id 17 , + data + prot { + name { + "tRNA (guanine-N(1)-)-methyltransferase" } , + ec { + "2.1.1.228" } } , + location + int { + from 0 , + to 243 , + id + local + str "genome1_2_4" } } } } } } , + seq { + id { + local + str "genome1_2_5" } , + descr { + title "tRNA (guanine-N(1)-)-methyltransferase [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 275 , + seq-data + ncbieaa "MMGWHSRHMGSQRRRANAQYVFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQS +WSPRDFAHDRHRTVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCG +RYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGME +VPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA" } , + annot { + { + data + ftable { + { + id + local + id 18 , + data + prot { + name { + "tRNA (guanine-N(1)-)-methyltransferase" } , + ec { + "2.1.1.228" } } , + location + int { + from 0 , + to 274 , + id + local + str "genome1_2_5" } } } } } } , + seq { + id { + local + str "genome1_2_6" } , + descr { + title "hypothetical protein GKEGNCBE_00008 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 154 , + seq-data + ncbieaa "MKFVAPEQAPEQAEVIKNTPFWPDVDLSEFRSVMRTDGTVTQPRLKQVVLTAISE +VNAELYDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWARAVLNERYQDYDATASGVKRGEELAEASGDLW +RDARWAISRVQDVPHCTVELI" } , + annot { + { + data + ftable { + { + id + local + id 19 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 153 , + id + local + str "genome1_2_6" } } } } } } , + seq { + id { + local + str "genome1_2_7" } , + descr { + title "hypothetical protein GKEGNCBE_00009 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 86 , + seq-data + ncbieaa "MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRY +GLDKEMVMDFFKENHSGMAVRFFMAGYRLEG" } , + annot { + { + data + ftable { + { + id + local + id 
20 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 85 , + id + local + str "genome1_2_7" } } } } } } , + seq { + id { + local + str "genome1_2_8" } , + descr { + title "hypothetical protein GKEGNCBE_00010 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 193 , + seq-data + ncbieaa "MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLV +LEHGGAPLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESRHIYDLKVMQIDPL +EAPDLIWLLQQTGDEVLYYRQAVAYYASCRQTVTEGGNHAFTGFEDYFNQIVDFLGLHSC" } , + annot { + { + data + ftable { + { + id + local + id 21 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 192 , + id + local + str "genome1_2_8" } } } } } } , + seq { + id { + local + str "genome1_2_9" } , + descr { + title "C4-dicarboxylate TRAP transporter large permease protein + DctM [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 435 , + seq-data + ncbieaa "MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMF +SSLDSFALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIAASTSIGGVMVPMS +AREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFAGGLVAGVLWGVGCMLVTLVVAKRRNYRVFFT +VQKGMALKVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFL +LATSSAMSFSMSITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGVDPVHFGI +IMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITYIPEITLFLPRLLGIM" } , + annot { + { + data + ftable { + { + id + local + id 22 , + data + prot { + name { + "C4-dicarboxylate TRAP transporter large permease + protein DctM" } } , + location + int { + from 0 , + to 434 , + id + local + str "genome1_2_9" } } } } } } } , + annot { + { + data + ftable { + { + id + local + id 5 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_2_1" , + location + int { + from 15 , + to 395 , + strand plus , 
+ id + local + str "genome1_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:Q05354" } } , + xref { + { + data + gene { + locus "hpcD" , + locus-tag "GKEGNCBE_00003" } } } } , + { + id + local + id 6 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_2_2" , + location + int { + from 427 , + to 1836 , + strand plus , + id + local + str "genome1_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "GKEGNCBE_00004" } } } } , + { + id + local + id 7 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_2_3" , + location + int { + from 1868 , + to 2641 , + strand plus , + id + local + str "genome1_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P02915" } } , + xref { + { + data + gene { + locus "hisP" , + locus-tag "GKEGNCBE_00005" } } } } , + { + id + local + id 8 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_2_4" , + location + int { + from 2708 , + to 3442 , + strand plus , + id + local + str "genome1_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P0A873" } } , + xref { + { + data + gene { + locus "trmD_1" , + locus-tag "GKEGNCBE_00006" } } } } , + { + id + local + id 9 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_2_5" , + location + int { + from 3426 , + to 4253 , + strand plus , + id + local + str "genome1_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P0A873" } } , + xref { + { + 
data + gene { + locus "trmD_2" , + locus-tag "GKEGNCBE_00007" } } } } , + { + id + local + id 10 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_2_6" , + location + int { + from 4298 , + to 4762 , + strand plus , + id + local + str "genome1_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "GKEGNCBE_00008" } } } } , + { + id + local + id 11 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_2_7" , + location + int { + from 4817 , + to 5077 , + strand plus , + id + local + str "genome1_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "GKEGNCBE_00009" } } } } , + { + id + local + id 12 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_2_8" , + location + int { + from 5116 , + to 5697 , + strand plus , + id + local + str "genome1_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "GKEGNCBE_00010" } } } } , + { + id + local + id 13 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome1_2_9" , + location + int { + from 5716 , + to 7023 , + strand plus , + id + local + str "genome1_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:O07838" } } , + xref { + { + data + gene { + locus "dctM" , + locus-tag "GKEGNCBE_00011" } } } } } } } } } } diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tbl b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tbl new file mode 100644 index 0000000000000000000000000000000000000000..bdd6ca8975f47bbc5faaa1b68bca20be27923aef --- 
/dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tbl @@ -0,0 +1,71 @@ +>Feature genome1_1 +213 1880 CDS + EC_number 6.1.1.18 + dbxref COG:COG0008 + gene glnS + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P00962 + locus_tag GKEGNCBE_00001 + product Glutamine--tRNA ligase +1916 2746 CDS + EC_number 3.5.1.28 + dbxref COG:COG3023 + gene amiD + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P75820 + locus_tag GKEGNCBE_00002 + product N-acetylmuramoyl-L-alanine amidase AmiD +>Feature genome1_2 +16 396 CDS + EC_number 5.3.3.10 + gene hpcD + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:Q05354 + locus_tag GKEGNCBE_00003 + product 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +428 1837 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag GKEGNCBE_00004 + product hypothetical protein +1869 2642 CDS + dbxref COG:COG4598 + gene hisP + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P02915 + locus_tag GKEGNCBE_00005 + product Histidine transport ATP-binding protein HisP +2709 3443 CDS + EC_number 2.1.1.228 + dbxref COG:COG0336 + gene trmD_1 + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P0A873 + locus_tag GKEGNCBE_00006 + product tRNA (guanine-N(1)-)-methyltransferase +3427 4254 CDS + EC_number 2.1.1.228 + dbxref COG:COG0336 + gene trmD_2 + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P0A873 + locus_tag GKEGNCBE_00007 + product tRNA (guanine-N(1)-)-methyltransferase +4299 4763 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag GKEGNCBE_00008 + product hypothetical protein +4818 5078 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag GKEGNCBE_00009 + product hypothetical protein +5117 5698 CDS + inference ab initio prediction:Prodigal:2.6 + 
locus_tag GKEGNCBE_00010 + product hypothetical protein +5717 7024 CDS + dbxref COG:COG1593 + gene dctM + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:O07838 + locus_tag GKEGNCBE_00011 + product C4-dicarboxylate TRAP transporter large permease protein DctM diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tsv b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tsv new file mode 100644 index 0000000000000000000000000000000000000000..f71d90c382bab07dddd2ad815ac182c57bae9b46 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.tsv @@ -0,0 +1,12 @@ +locus_tag ftype length_bp gene EC_number COG product +GKEGNCBE_00001 CDS 1668 glnS 6.1.1.18 COG0008 Glutamine--tRNA ligase +GKEGNCBE_00002 CDS 831 amiD 3.5.1.28 COG3023 N-acetylmuramoyl-L-alanine amidase AmiD +GKEGNCBE_00003 CDS 381 hpcD 5.3.3.10 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +GKEGNCBE_00004 CDS 1410 hypothetical protein +GKEGNCBE_00005 CDS 774 hisP COG4598 Histidine transport ATP-binding protein HisP +GKEGNCBE_00006 CDS 735 trmD_1 2.1.1.228 COG0336 tRNA (guanine-N(1)-)-methyltransferase +GKEGNCBE_00007 CDS 828 trmD_2 2.1.1.228 COG0336 tRNA (guanine-N(1)-)-methyltransferase +GKEGNCBE_00008 CDS 465 hypothetical protein +GKEGNCBE_00009 CDS 261 hypothetical protein +GKEGNCBE_00010 CDS 582 hypothetical protein +GKEGNCBE_00011 CDS 1308 dctM COG1593 C4-dicarboxylate TRAP transporter large permease protein DctM diff --git a/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.txt b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.txt new file mode 100644 index 0000000000000000000000000000000000000000..9cd2780b0722866f0837937f638f91238d44ab84 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome1.fst-split5N.fna-prokkaRes/EXAM.0219.00001.txt @@ -0,0 +1,4 @@ +organism: 
Genus species strain +contigs: 2 +bases: 9808 +CDS: 11 diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna new file mode 100644 index 0000000000000000000000000000000000000000..b7998dca8c972be06796a2c1298a23b43c0e2006 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna @@ -0,0 +1,8 @@ +>this_is_genome_1 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATTATCGGCTCGTCCGGTTCCGGTAAAAGCACTTTTTTGCGCTGCATTAACTTCCTCGAAAAATCGAGCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAACCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCGCACGTGATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCAATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAAACGCTAGAGGACGCGCCTCTCAGAGAGCGCGCTCTCTCAGAGAGGCGCGCGCCTCTTTCGCAGAGACCNNCGCTCATGAGCGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCCAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGGCAGACTGTGGTCGTGCCGTCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGATACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCTGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGACGGCGCGCGGATTACGCATAGTCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAATGTCGCGATTCATGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCGCCTATGACAATAGTTATCCTGAAGA
CCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAACTGGTACACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGTAACGCCCCCTCATTTGTTGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTTTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTTGATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACACTCGCGCATAGAGAGCTCTCAGAGGAGCGCGCGCGCTATAGCGCGC +>this_is_genome_2 +CGCGATAATATAGCGCGCGCTCATAGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTGGAAGATATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAACGCCATATTATAGCGCGCCTCATAAGAGCGGCCTATAGCGCGCTANNNATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCAAGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCACCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCAGAGATTCGCGCGTTCAGAGAGGAGCTCTCTCATAGACGCGCGCATATGCGCTCTAGAGAGGCGCGCCTAATGGCGCGCTATGATGAAAGCGCTACTGTGGCTGGTTGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCTTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATCCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCC
CATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAAAGCGCTCTCAGAGAGAGCGCTCTGCAGATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTGGCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTGCACATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGCCAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTACGTTGATTCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCCTCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTCATCAATAAAGGCCTGGCCTACGTTGATGAGCTGACGCCGGAGCAGATCCGCGAATACCGTGGTACGCTGACCGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCAGCGTGGAAGAAAACCTCGCGCTATTTGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCGTCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCATCAGACCGGCACGAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGATGCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGACAACCGTCGTCTGTACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTCGCGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGACAAACACGTCGAAGGTTGGGATGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGACAACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGAAAACGCGCCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAACTACCCGCAGGGCGAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCGGATTTCCGCGAAGAAGCGAACAAACAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGTCATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGATGCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGGGTTAGCGTAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTGCCGAACCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATTAGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGGAAAGCTTTCCAGTTTGAACGTGAAGGTTACTTCTGCCTTGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTTAACCGCACCGTGGG
CCTGCGTGATACCTGGGCGAAAGCGGGCGAATAA +>this_is_genome_3 +ACGCGCTATAGGGCTCTCAGAGAGTCTCAGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCATGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAACTCATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGGCCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCACTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGAGATCGCGCATAGCGCGCGGCGAGATCCGCGAGACATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCTGGTTATCGACTCGAAGGTTGAACGCCTCTAGAGCGCTAGAGGCGCGCGCGATATACGCGGCGCGAGACATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTACTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGGGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACCGGATATACGCGCGCGCGCTATATAGCGCGCGCGGCGATATAGCGCATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAAATCATCAGAAATACGCCGTTCTGGCCTGATGTGGACCTGTCGGAGTTTCGCAGCATGATGCGCACTGATGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTTGCG
CTGTCGGCAATTTCGGAGGTCAACGCAGAGCTGTATGAGTTTCGCAGACGCCAGCAGATGCTGGGGTATGCCTCGCTGGCAGAAGTCCCGGCGGAACAACTGGACGGCAAAAGCGAGCGCATTCAGCACTATTTCAACGCGGTTTACTGCTGGGCACGCGCCATGCTCAACGAACGTTACCAGGACTATGACGCCACGGCATCCGGTGTGAAGCGGGGCGAAGAACTGGCAGAAGCCAGCGGTGATTTGTGGCGTGACGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCGCCGCACTGCACAGTGGAGCTTATCTGA +>this_is_genome_4 +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGACTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTCTCTTACACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCCGCCTCAACCTCTATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGTTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATTGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACGATAGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATTATGACTAAACTGGGCGTCGATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATAGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGACGTTATCAAACCGTTGATGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCACACTGTTTTTACCCCGTCTACTGGGCATCATGTAAACGCTCATAGGCGGCGCGCGCGCTCTCAGGAATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCGTTCTGGCCTGATGTGAACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGC
GTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAGCGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCACCGCACTGTACGGTGGAGCTTATCTGACGCTCATAGGCGCGCGCTCATAGCGCGATGGAAACAAATATTACCTGGCAACAATTGATAGATGAATATTTCTTCGCAAAACCTCTGCGCTCAGCATCTGAATGGAGTTACACCAAAGTCTTCAAATCATTTGTACATTATATGGGGCCGTTAAGCTGCCCTAATGATGTGACATATCACAAAGTGCTTGCCTGGCGCCGTTTTCTTTTAAAAGAGAAAAAGCTGTCCGGACGTACCTGGAATAACAAGGTGGCGCATATGCGGGCCATCTTTAACTACGGAATACAGCGAGGGTTACTGCACTATGACGAAAATCCGTTTAACAATTCGGTAGTTAAACCGGACAAGAAGAGAAAGAAAACGCTCACTCAGGCACAGATTGAGTATGCCTATCAGATCATGGAGCAGTATGAAAATCAGGAGAATACAGGGCTGGGACTGAAATATTCCCGCTGCGCCTTATTTCCTGCATGGTTCTGGCTCACTGTCCTGGATACGCTCTATTACACAGGGATACGTCAGAACCAGTTATTACATATTCGGCTGAATGATGTTGATTTGAGAGAAGGGCAGATTCGGCTGATTACGGAGGGGTGTAAAAATCACAAAGAACACTATGTGCCGGTGATCAGTTTTCTGCGTCCACGGCTGACCTGTTTAATGGAGAAAGCGCAGAGCGAAGGATTGAAAGGTAATGACCGCCTGTTCAATATTGCACTTTTTACCGGCAAAGATCCCGCCATTGGCGATGACATGGATTCTCCTCAGGTAAGAGCATTCTTCCGTCGTCTGTCCAAGGAGTGTCAGTTTGCGATCAGTCCTCATCGTTTCAGACACACGCTGGCCACGGAGATGATGAAAATGCCGGAACAGAATCTGCATATGGCGCAAAGTGTGCTGGGTCATTCAAACATGAAATCCACGCTGGAGTATGTGGAGAATGATATTGCAGTGATGGGGAGGGCTCTGGAAGCGCAGTTTATGCAGATTAAGGCAGCACATGCCCGAAGCATTTACAGTGGGTTGACAAAGAATAGATAA diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokka.log b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokka.log new file mode 100644 index 0000000000000000000000000000000000000000..4e555a8670fed1a2ffe74b977bb3d5ef3b2fa5ba --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokka.log @@ -0,0 +1,119 @@ +[13:18:45] This is prokka 1.14-dev +[13:18:45] Written by Torsten Seemann <torsten.seemann@gmail.com> +[13:18:45] Homepage is https://github.com/tseemann/prokka +[13:18:45] Local time is Tue Feb 12 13:18:45 2019 +[13:18:45] You are aperrin +[13:18:45] Operating system is darwin +[13:18:45] You have BioPerl 1.006924 +[13:18:45] System has 8 cores. +[13:18:45] Will use maximum of 1 cores. 
+[13:18:45] Annotating as >>> Bacteria <<< +[13:18:45] Generating locus_tag from 'Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna' contents. +[13:18:45] Setting --locustag OHLPOIIB from MD5 8159822bce92b47adcc120629558b9c6 +[13:18:45] Creating new output folder: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes +[13:18:45] Running: mkdir -p Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes +[13:18:45] Using filename prefix: GEN2.0219.00001.XXX +[13:18:45] Setting HMMER_NCPU=1 +[13:18:45] Writing log to: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.log +[13:18:45] Command: /usr/local/bin/prokka --outdir Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes --cpus 1 --prefix GEN2.0219.00001 Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna +[13:18:45] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin +[13:18:45] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common +[13:18:45] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin +[13:18:45] Looking for 'aragorn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/aragorn +[13:18:45] Determined aragorn version is 1.2 +[13:18:45] Looking for 'barrnap' - found /usr/local/bin/barrnap +[13:18:45] Determined barrnap version is 0.8 +[13:18:45] Looking for 'blastp' - found /Users/aperrin/Softwares/bin/blastp +[13:18:45] Determined blastp version is 2.3 +[13:18:45] Looking for 'cmpress' - found /usr/local/bin/cmpress +[13:18:45] Determined cmpress version is 1.1 +[13:18:45] Looking for 'cmscan' - found /usr/local/bin/cmscan +[13:18:45] Determined cmscan version is 1.1 +[13:18:45] Looking for 'egrep' - found /usr/bin/egrep +[13:18:45] Looking for 'find' - found /usr/bin/find +[13:18:45] Looking for 'grep' - found /usr/bin/grep +[13:18:45] Looking for 'hmmpress' - found 
/Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/hmmpress +[13:18:45] Determined hmmpress version is 3.1 +[13:18:45] Looking for 'hmmscan' - found /usr/local/bin/hmmscan +[13:18:45] Determined hmmscan version is 3.1 +[13:18:45] Looking for 'java' - found /usr/bin/java +[13:18:45] Looking for 'less' - found /usr/bin/less +[13:18:45] Looking for 'makeblastdb' - found /Users/aperrin/Softwares/bin/makeblastdb +[13:18:45] Determined makeblastdb version is 2.3 +[13:18:45] Looking for 'minced' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common/minced +[13:18:45] Determined minced version is 2.0 +[13:18:45] Looking for 'parallel' - found /usr/local/bin/parallel +[13:18:45] Determined parallel version is 20181022 +[13:18:45] Looking for 'prodigal' - found /usr/local/bin/prodigal +[13:18:45] Determined prodigal version is 2.6 +[13:18:45] Looking for 'prokka-genbank_to_fasta_db' - found /Users/aperrin/Softwares/src/prokka/bin/prokka-genbank_to_fasta_db +[13:18:45] Looking for 'sed' - found /usr/bin/sed +[13:18:45] Looking for 'tbl2asn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/tbl2asn +[13:18:45] Determined tbl2asn version is 25.6 +[13:18:45] Using genetic code table 11. +[13:18:45] Loading and checking input file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna +[13:18:45] Wrote 4 contigs totalling 10711 bp. +[13:18:45] Predicting tRNAs and tmRNAs +[13:18:45] Running: aragorn -l -gc11 -w Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.fna +[13:18:45] Found 0 tRNAs +[13:18:45] Predicting Ribosomal RNAs +[13:18:45] Running Barrnap with 1 threads +[13:18:45] Found 0 rRNAs +[13:18:45] Skipping ncRNA search, enable with --rfam if desired. 
+[13:18:45] Total of 0 tRNA + rRNA features +[13:18:45] Searching for CRISPR repeats +[13:18:46] Found 0 CRISPRs +[13:18:46] Predicting coding sequences +[13:18:46] Contigs total 10711 bp, so using meta mode +[13:18:46] Running: prodigal -i Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.fna -c -m -g 11 -p meta -f sco -q +[13:18:46] Found 13 CDS +[13:18:46] Connecting features back to sequences +[13:18:46] Not using genus-specific database. Try --usegenus to enable it. +[13:18:46] Annotating CDS, please be patient. +[13:18:46] Will use 1 CPUs for similarity searching. +[13:18:46] There are still 13 unannotated CDS left (started with 13) +[13:18:46] Will use blast to search against /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot with 1 CPUs +[13:18:46] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/sprot\.faa | parallel --gnu --plain -j 1 --block 1697 --recstart '>' --pipe blastp -query - -db /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot -evalue 1e-06 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/sprot\.blast 2> /dev/null +[13:18:47] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/sprot.faa +[13:18:47] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/sprot.blast +[13:18:47] There are still 5 unannotated CDS left (started with 13) +[13:18:47] Will use hmmer3 to search against /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm with 1 CPUs +[13:18:47] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.faa | parallel --gnu --plain -j 1 --block 521 --recstart '>' --pipe hmmscan --noali --notextw --acc -E 1e-06 --cpu 1 /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm /dev/stdin > 
Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.hmmer3 2> /dev/null +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/HAMAP.hmm.faa +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/HAMAP.hmm.hmmer3 +[13:18:48] Labelling remaining 5 proteins as 'hypothetical protein' +[13:18:48] Found 8 unique /gene codes. +[13:18:48] Fixed 0 colliding /gene names. +[13:18:48] Adding /locus_tag identifiers +[13:18:48] Assigned 13 locus_tags to CDS and RNA features. +[13:18:48] Writing outputs to Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/ +[13:18:48] Generating annotation statistics file +[13:18:48] Generating Genbank and Sequin files +[13:18:48] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka' -Z Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.err -i Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.fsa 2> /dev/null +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/errorsummary.val +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.dr +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fixedproducts +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.ecn +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.val +[13:18:48] Repairing broken .GBK output that tbl2asn produces... 
+[13:18:48] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.gbf > Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.gbk +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gbf +[13:18:48] Output files: +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gff +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tbl +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.sqn +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fna +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tsv +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.err +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.txt +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fsa +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.faa +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.log +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.ffn +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gbk +[13:18:48] Annotation finished successfully. +[13:18:48] Walltime used: 0.05 minutes +[13:18:48] If you use this result please cite the Prokka paper: +[13:18:48] Seemann T (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-9. +[13:18:48] Type 'prokka --citation' for more details. +[13:18:48] Share and enjoy! 
diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.err b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.err new file mode 100644 index 0000000000000000000000000000000000000000..d2f6f3262c91bdc696d481b4c91e4962e933b261 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.err @@ -0,0 +1,146 @@ +Discrepancy Report Results + +Summary +FATAL: MISSING_PROTEIN_ID:13 proteins have invalid IDs. +DISC_SOURCE_QUALS_ASNDISC:strain (all present, all same) +DISC_SOURCE_QUALS_ASNDISC:taxname (all present, all same) +DISC_FEATURE_COUNT:CDS: 13 present +DISC_COUNT_NUCLEOTIDES:4 nucleotide Bioseqs are present +FEATURE_LOCATION_CONFLICT:13 features have inconsistent gene locations. +DISC_QUALITY_SCORES:Quality scores are missing on all sequences. +ONCALLER_COMMENT_PRESENT:4 comment descriptors were found (all same) +MISSING_GENOMEASSEMBLY_COMMENTS:4 bioseqs are missing GenomeAssembly structured comments +MOLTYPE_NOT_MRNA:4 molecule types are not set as mRNA. +TECHNIQUE_NOT_TSA:4 technique are not set as TSA +MISSING_STRUCTURED_COMMENT:4 sequences do not include structured comments. +MISSING_PROJECT:17 sequences do not include project. +DISC_INCONSISTENT_MOLINFO_TECH:Molinfo Technique Report (some missing, all same) + + +Detailed Report + +FATAL: DiscRep_ALL:MISSING_PROTEIN_ID::13 proteins have invalid IDs. 
+Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1_1 (length 257) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1_2 (length 467) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2_1 (length 74) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2_2 (length 126) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2_3 (length 277) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2_4 (length 555) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3_1 (length 255) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3_2 (length 86) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3_3 (length 193) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3_4 (length 122) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4_1 (length 435) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4_2 (length 154) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4_3 (length 337) + +DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::strain (all present, all same) +DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::4 sources have 'strain' for strain +DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::taxname (all present, all same) +DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::4 sources have 'Genus species' for taxname +DiscRep_ALL:DISC_FEATURE_COUNT::CDS: 13 present +DiscRep_ALL:DISC_COUNT_NUCLEOTIDES::4 nucleotide Bioseqs are present 
+Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1 (length 2308, 2 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2 (length 3295, 3 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3 (length 2263) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4 (length 2845) + +DiscRep_ALL:FEATURE_LOCATION_CONFLICT::13 features have inconsistent gene locations. +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS Histidine transport ATP-binding protein HisP this_is_genome_1:1-774 OHLPOIIB_00001 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS hypothetical protein this_is_genome_1:857-2260 OHLPOIIB_00002 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS Phage shock protein B this_is_genome_2:28-252 OHLPOIIB_00003 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS 5-carboxymethyl-2-hydroxymuconate Delta-isomerase this_is_genome_2:301-681 OHLPOIIB_00004 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS N-acetylmuramoyl-L-alanine amidase AmiD this_is_genome_2:764-1597 OHLPOIIB_00005 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS Glutamine--tRNA ligase 
this_is_genome_2:1628-3295 OHLPOIIB_00006 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS tRNA (guanine-N(1)-)-methyltransferase this_is_genome_3:30-797 OHLPOIIB_00007 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS hypothetical protein this_is_genome_3:862-1122 OHLPOIIB_00008 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS hypothetical protein this_is_genome_3:1170-1751 OHLPOIIB_00009 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS hypothetical protein this_is_genome_3:1895-2263 OHLPOIIB_00010 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS C4-dicarboxylate TRAP transporter large permease protein DctM this_is_genome_4:1-1308 OHLPOIIB_00011 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS hypothetical protein this_is_genome_4:1340-1804 OHLPOIIB_00012 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:CDS Tyrosine recombinase XerC this_is_genome_4:1832-2845 OHLPOIIB_00013 + +DiscRep_ALL:DISC_QUALITY_SCORES::Quality scores are missing on all sequences. 
+ +DiscRep_ALL:ONCALLER_COMMENT_PRESENT::4 comment descriptors were found (all same) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka + +DiscRep_ALL:MISSING_GENOMEASSEMBLY_COMMENTS::4 bioseqs are missing GenomeAssembly structured comments +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1 (length 2308, 2 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2 (length 3295, 3 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3 (length 2263) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4 (length 2845) + +DiscRep_ALL:MOLTYPE_NOT_MRNA::4 molecule types are not set as mRNA. 
+Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1 (length 2308, 2 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2 (length 3295, 3 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3 (length 2263) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4 (length 2845) + +DiscRep_ALL:TECHNIQUE_NOT_TSA::4 technique are not set as TSA +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1 (length 2308, 2 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2 (length 3295, 3 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3 (length 2263) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4 (length 2845) + +DiscRep_ALL:MISSING_STRUCTURED_COMMENT::4 sequences do not include structured comments. +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1 (length 2308, 2 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2 (length 3295, 3 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3 (length 2263) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4 (length 2845) + +DiscRep_ALL:MISSING_PROJECT::17 sequences do not include project. 
+Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1 (length 2308, 2 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1_1 (length 257) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1_2 (length 467) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2 (length 3295, 3 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2_1 (length 74) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2_2 (length 126) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2_3 (length 277) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2_4 (length 555) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3 (length 2263) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3_1 (length 255) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3_2 (length 86) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3_3 (length 193) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3_4 (length 122) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4 (length 2845) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4_1 (length 435) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4_2 (length 154) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4_3 (length 
337) + +DiscRep_ALL:DISC_INCONSISTENT_MOLINFO_TECH::Molinfo Technique Report (some missing, all same) +DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::technique (all missing) +DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::4 Molinfos are missing field technique +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_1 (length 2308, 2 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_2 (length 3295, 3 other) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_3 (length 2263) +Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001:this_is_genome_4 (length 2845) + diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.faa b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.faa new file mode 100644 index 0000000000000000000000000000000000000000..e1d7e1891450800eb67bf2f829dd0f52874e135b --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.faa @@ -0,0 +1,77 @@ +>OHLPOIIB_00001 Histidine transport ATP-binding protein HisP +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKSSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWNHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGNPEQVF +GNPQSPRLQQFLKGSLK +>OHLPOIIB_00002 hypothetical protein +MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRQTVDIKDYPADDGIASFKQAFADGQ +TVVVPSGWVCENINAAITIPAGKTLRIQGAVRGNGRGRFILLDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVTMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD +GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSAYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNAPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM 
+ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>OHLPOIIB_00003 Phage shock protein B +MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLLQLTDDAQRMRERIQAL +EDILDAEHPNWRER +>OHLPOIIB_00004 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +MPHFIAECTENIREQADLPSLFSKVNEALAATGIFPIGGIRSRAHWLDTWQMADGKHDYA +FVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>OHLPOIIB_00005 N-acetylmuramoyl-L-alanine amidase AmiD +MMKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVS +LATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIE +LENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGP +RFPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQ +RVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>OHLPOIIB_00006 Glutamine--tRNA ligase +MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIAQDYQG +QCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSDYFDQLHAYAVELINKGLA +YVDELTPEQIREYRGTLTAPGKNSPFRDRSVEENLALFEKMRTGGFEEGKACLRAKIDMA +SPFIVMRDPVLYRIKFAEHHQTGTKWCIYPMYDFTHCISDALEGITHSLCTLEFQDNRRL +YDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDPRMPTISGLRRR +GYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYPQG +ESEMVTMPNHPNKPEMGSREVPFSGEIWIDRADFREEANKQYKRLVMGKEVRLRNAYVIK +AERVEKDAEGNITTIFCTYDADTLSKDPADGRKVKGVIHWVSVAHALPIEIRLYDRLFSV +PNPGAAEDFLSVINPESLVIKQGYGEPSLKAAVAGKAFQFEREGYFCLDSRYATADKLVF +NRTVGLRDTWAKAGE +>OHLPOIIB_00007 tRNA (guanine-N(1)-)-methyltransferase +MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGM +LMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDE +RVIQAEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPH +YTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEF +KTEHAQQQHKHDGMA +>OHLPOIIB_00008 hypothetical protein +MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMAGYRLEG +>OHLPOIIB_00009 hypothetical protein +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY 
+FNQIVDFLGLHSC +>OHLPOIIB_00010 hypothetical protein +MMRTDGTVTQPRLKQVALSAISEVNAELYEFRRRQQMLGYASLAEVPAEQLDGKSERIQH +YFNAVYCWARAMLNERYQDYDATASGVKRGEELAEASGDLWRDARWAISRVQDAPHCTVE +LI +>OHLPOIIB_00011 C4-dicarboxylate TRAP transporter large permease protein DctM +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA +GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ +GIFTAIEASAIAVVYTLLLTIVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM +SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMTKLGV +DPVHFGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEDVIKPLMPFYGAIIGVLLLITY +IPEITLFLPRLLGIM +>OHLPOIIB_00012 hypothetical protein +MKFVAPEQAPEQAEVIKNTPFWPDVNLSEFRSVMRTDGTVTQPRLKQVVLTAISEVNAEL +YDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWARAVLNERYQDYDATASGVK +RGEELAEASGDLWRDARWAISRVQDAPHCTVELI +>OHLPOIIB_00013 Tyrosine recombinase XerC +METNITWQQLIDEYFFAKPLRSASEWSYTKVFKSFVHYMGPLSCPNDVTYHKVLAWRRFL +LKEKKLSGRTWNNKVAHMRAIFNYGIQRGLLHYDENPFNNSVVKPDKKRKKTLTQAQIEY +AYQIMEQYENQENTGLGLKYSRCALFPAWFWLTVLDTLYYTGIRQNQLLHIRLNDVDLRE +GQIRLITEGCKNHKEHYVPVISFLRPRLTCLMEKAQSEGLKGNDRLFNIALFTGKDPAIG +DDMDSPQVRAFFRRLSKECQFAISPHRFRHTLATEMMKMPEQNLHMAQSVLGHSNMKSTL +EYVENDIAVMGRALEAQFMQIKAAHARSIYSGLTKNR diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.ffn b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.ffn new file mode 100644 index 0000000000000000000000000000000000000000..dfb0f1dbe14cb5dee51af28c1445f24e73a962d1 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.ffn @@ -0,0 +1,185 @@ +>OHLPOIIB_00001 Histidine transport ATP-binding protein HisP +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATTATCGGCTCGTCC +GGTTCCGGTAAAAGCACTTTTTTGCGCTGCATTAACTTCCTCGAAAAATCGAGCGAAGGC 
+GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAACCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCG +CACGTGATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCAATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>OHLPOIIB_00002 hypothetical protein +ATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCTGTT +CTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCCAGACCGTCGATATT +AAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGGCAG +ACTGTGGTCGTGCCGTCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCG +GCGGGAAAAACGCTGCGGATACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATT +TTGCTGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTG +GATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTC +GCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGAC +ATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGAC +GGCGCGCGGATTACGCATAGTCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAAT +GTCGCGATTCATGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGT +ACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCGCCTATGACAAT +AGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGAT +TGCCGACAACTGGTACACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCC +AAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATT +TATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTC +ATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAAC +GCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCC +GGTAACGCCCCCTCATTTGTTGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAA +CTGCATAATCAACCGCAGCACCTCTTTTTGCGTAATATCAACGTGATGCAAACTTCAGCG +ATTGGCCCGGCGTTAAAAATGCATTTTGATTTGCGTAAAGATGTCCGTGGTCAATTTATG 
+GCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAG +AGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTT +TCGCTGCCGAAGCGGGGAGGGTAA +>OHLPOIIB_00003 Phage shock protein B +ATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATT +TGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAG +CAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTG +GAAGATATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAA +>OHLPOIIB_00004 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCAAGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCACCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCG +TTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTC +GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAG +>OHLPOIIB_00005 N-acetylmuramoyl-L-alanine amidase AmiD +ATGATGAAAGCGCTACTGTGGCTGGTTGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGC +GAAAAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCG +GCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCG +CTGGCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCA +TTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCTTGGCAT +GCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAG +CTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCC +GCGCAAATTCAGGCATTGATCCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAA +CCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCG +CGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGT +GTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTT +GCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAG +CGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGAT +GCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>OHLPOIIB_00006 Glutamine--tRNA ligase +ATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTG +GCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTG 
+CACATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGC +CAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTACGTTGAT +TCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCC +TCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTCATCAATAAAGGCCTGGCC +TACGTTGATGAGCTGACGCCGGAGCAGATCCGCGAATACCGTGGTACGCTGACCGCGCCG +GGTAAAAACAGCCCGTTCCGCGATCGCAGCGTGGAAGAAAACCTCGCGCTATTTGAAAAA +ATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGACATGGCG +TCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCAT +CAGACCGGCACGAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGAT +GCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGACAACCGTCGTCTG +TACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGAATTCTCG +CGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGAC +AAACACGTCGAAGGTTGGGATGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGC +GGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGAC +AACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGAAAACGCG +CCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAACTACCCGCAGGGC +GAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAA +GTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCGGATTTCCGCGAAGAAGCGAACAAA +CAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGTCATTAAA +GCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACCTATGAT +GCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGG +GTTAGCGTAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTG +CCGAACCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATTAGTGATT +AAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGGAAAGCTTTCCAGTTT +GAACGTGAAGGTTACTTCTGCCTTGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTT +AACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAATAA +>OHLPOIIB_00007 tRNA (guanine-N(1)-)-methyltransferase +GTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGG +GTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGAC +TTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATG +TTAATGATGGTGCAACCCTTGCGGGACGCCATTCATGCAGCAAAAGCCGCGGCAGGTGAA +GGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGC 
+GAGCTGGCCACGAATCAGAAACTCATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAG +CGCGTAATTCAGGCCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGT +GGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTG +GGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCAC +TATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAAC +CATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCACTGGGCCGAACCTGGCTTAGAAGA +CCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTC +AAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAG +>OHLPOIIB_00008 hypothetical protein +ATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCT +GAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGAT +CGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAA +ATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCT +GGTTATCGACTCGAAGGTTGA +>OHLPOIIB_00009 hypothetical protein +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTG +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTACTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGGGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>OHLPOIIB_00010 hypothetical protein +ATGATGCGCACTGATGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTTGCGCTGTCGGCA +ATTTCGGAGGTCAACGCAGAGCTGTATGAGTTTCGCAGACGCCAGCAGATGCTGGGGTAT +GCCTCGCTGGCAGAAGTCCCGGCGGAACAACTGGACGGCAAAAGCGAGCGCATTCAGCAC +TATTTCAACGCGGTTTACTGCTGGGCACGCGCCATGCTCAACGAACGTTACCAGGACTAT +GACGCCACGGCATCCGGTGTGAAGCGGGGCGAAGAACTGGCAGAAGCCAGCGGTGATTTG +TGGCGTGACGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCGCCGCACTGCACAGTGGAG +CTTATCTGA +>OHLPOIIB_00011 C4-dicarboxylate TRAP transporter large permease protein DctM +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC 
+ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGACTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTC +TCTTACACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCC +GCCTCAACCTCTATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC +GGTTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATTGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATAGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATC +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATTATGACTAAACTGGGCGTC +GATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATAGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGACGTT +ATCAAACCGTTGATGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCACACTGTTTTTACCCCGTCTACTGGGCATCATGTAA +>OHLPOIIB_00012 hypothetical protein +ATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAGGTCATCAAAAATACGCCG +TTCTGGCCTGATGTGAACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTG +ACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTG +TACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCA +GAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCACAACGCTGTTTTTTGCTGG +GCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGCGTGAAG +CGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATC +AGCCGGGTGCAGGATGCACCGCACTGTACGGTGGAGCTTATCTGA +>OHLPOIIB_00013 Tyrosine recombinase XerC +ATGGAAACAAATATTACCTGGCAACAATTGATAGATGAATATTTCTTCGCAAAACCTCTG +CGCTCAGCATCTGAATGGAGTTACACCAAAGTCTTCAAATCATTTGTACATTATATGGGG 
+CCGTTAAGCTGCCCTAATGATGTGACATATCACAAAGTGCTTGCCTGGCGCCGTTTTCTT +TTAAAAGAGAAAAAGCTGTCCGGACGTACCTGGAATAACAAGGTGGCGCATATGCGGGCC +ATCTTTAACTACGGAATACAGCGAGGGTTACTGCACTATGACGAAAATCCGTTTAACAAT +TCGGTAGTTAAACCGGACAAGAAGAGAAAGAAAACGCTCACTCAGGCACAGATTGAGTAT +GCCTATCAGATCATGGAGCAGTATGAAAATCAGGAGAATACAGGGCTGGGACTGAAATAT +TCCCGCTGCGCCTTATTTCCTGCATGGTTCTGGCTCACTGTCCTGGATACGCTCTATTAC +ACAGGGATACGTCAGAACCAGTTATTACATATTCGGCTGAATGATGTTGATTTGAGAGAA +GGGCAGATTCGGCTGATTACGGAGGGGTGTAAAAATCACAAAGAACACTATGTGCCGGTG +ATCAGTTTTCTGCGTCCACGGCTGACCTGTTTAATGGAGAAAGCGCAGAGCGAAGGATTG +AAAGGTAATGACCGCCTGTTCAATATTGCACTTTTTACCGGCAAAGATCCCGCCATTGGC +GATGACATGGATTCTCCTCAGGTAAGAGCATTCTTCCGTCGTCTGTCCAAGGAGTGTCAG +TTTGCGATCAGTCCTCATCGTTTCAGACACACGCTGGCCACGGAGATGATGAAAATGCCG +GAACAGAATCTGCATATGGCGCAAAGTGTGCTGGGTCATTCAAACATGAAATCCACGCTG +GAGTATGTGGAGAATGATATTGCAGTGATGGGGAGGGCTCTGGAAGCGCAGTTTATGCAG +ATTAAGGCAGCACATGCCCGAAGCATTTACAGTGGGTTGACAAAGAATAGATAA diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fna b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fna new file mode 100644 index 0000000000000000000000000000000000000000..500883173662b59ed235adfba0c1c25579abf859 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fna @@ -0,0 +1,184 @@ +>this_is_genome_1 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATTATCGGCTCGTCC +GGTTCCGGTAAAAGCACTTTTTTGCGCTGCATTAACTTCCTCGAAAAATCGAGCGAAGGC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAACCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA 
+GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCG +CACGTGATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCAATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAAACGCTA +GAGGACGCGCCTCTCAGAGAGCGCGCTCTCTCAGAGAGGCGCGCGCCTCTTTCGCAGAGA +CCNNCGCTCATGAGCGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGG +TTCCGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCG +CCAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGC +CTTCGCCGACGGGCAGACTGTGGTCGTGCCGTCAGGATGGGTGTGTGAAAATATCAATGC +GGCGATAACGATTCCGGCGGGAAAAACGCTGCGGATACAGGGCGCGGTGCGTGGGAATGG +CCGGGGACGGTTTATTTTGCTGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCT +GCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAG +CGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAA +TCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATT +TCATAACCAAATGGACGGCGCGCGGATTACGCATAGTCGCTTTAGCGATTTGCAGGGGGA +CGCCATTGAGTGGAATGTCGCGATTCATGACCGCGACATCCTGATTTCCGATCATGTCAT +CGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGG +TAGCGCCTATGACAATAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAA +TATTACCGGATCTGATTGCCGACAACTGGTACACGTAGAAAATGGCAAACATTTCGTCAT +TCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAA +CGCAACGATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAA +TAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCA +AAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGG +CATTCAAATTTCCTCCGGTAACGCCCCCTCATTTGTTGCCATCACCAATGTACGGATGAC +GCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTTTGCGTAATATCAACGT +GATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTTGATTTGCGTAAAGATGT +CCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCAT +CAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGT +CGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACACTCGCGCATAGAGAGCTC +TCAGAGGAGCGCGCGCGCTATAGCGCGC +>this_is_genome_2 +CGCGATAATATAGCGCGCGCTCATAGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACC +ATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGT +CGGGGAGAACTGTCGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAA 
+CGTATGCGCGAGCGCATTCAGGCGCTGGAAGATATTCTTGATGCAGAGCATCCGAACTGG +AGAGAGCGCTAACGCCATATTATAGCGCGCCTCATAAGAGCGGCCTATAGCGCGCTANNN +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCAAGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCACCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCG +TTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTC +GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAGCCAGAGATTCGCGCGTTCAGAGAGGAGCTCTCTCATAGA +CGCGCGCATATGCGCTCTAGAGAGGCGCGCCTAATGGCGCGCTATGATGAAAGCGCTACT +GTGGCTGGTTGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGA +TAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAA +AGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGG +TCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACC +GCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCTTGGCATGCGGGCGTCAGTTTCTG +GCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTG +GCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATT +GATCCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGC +CCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGA +GCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGC +TGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTA +TGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTT +CCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGAT +TGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAAAGCGCTCTCAGAGAGAGCGC +TCTGCAGATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGA +AGATCTGGCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGG +CTATCTGCACATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTA +TCAGGGCCAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTA +CGTTGATTCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCG +CTACTCCTCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTCATCAATAAAGG +CCTGGCCTACGTTGATGAGCTGACGCCGGAGCAGATCCGCGAATACCGTGGTACGCTGAC +CGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCAGCGTGGAAGAAAACCTCGCGCTATT 
+TGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGA +CATGGCGTCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGA +GCATCATCAGACCGGCACGAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCAT +CAGCGATGCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGACAACCG +TCGTCTGTACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGA +ATTCTCGCGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGT +GACCGACAAACACGTCGAAGGTTGGGATGATCCGCGTATGCCGACTATTTCCGGTCTGCG +CCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAA +GCAGGACAACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGA +AAACGCGCCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAACTACCC +GCAGGGCGAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAG +CCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCGGATTTCCGCGAAGAAGC +GAACAAACAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGT +CATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTAC +CTATGATGCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAAT +CCACTGGGTTAGCGTAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTT +CAGCGTGCCGAACCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATT +AGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGGAAAGCTTT +CCAGTTTGAACGTGAAGGTTACTTCTGCCTTGACAGCCGCTATGCAACGGCCGATAAGCT +GGTCTTTAACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAATAA +>this_is_genome_3 +ACGCGCTATAGGGCTCTCAGAGAGTCTCAGTGTTTATTGGCATCGTTAGCCTGTTTCCTG +AAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGC +TGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACG +ACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCA +TTCATGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGG +GACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAACTCATTCTGG +TGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGGCCGAAATTGACGAAGAAT +GGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACT +CCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGT +TTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGG +AAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAAC +AGTCACTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTG 
+AAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAAC +ATGATGGGATGGCATAGCGAGATCGCGCATAGCGCGCGGCGAGATCCGCGAGACATGAAT +AATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCAT +CATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGT +ATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGT +CGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATG +GCGGTTCGCTTCTTTATGGCTGGTTATCGACTCGAAGGTTGAACGCCTCTAGAGCGCTAG +AGGCGCGCGCGATATACGCGGCGCGAGACATGTCTACGCTTCTCTATTTGCACGGATTCA +ACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATC +CGCATGTTGAGATGATCGTCCCTCAACTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGC +TGGAATCTCTCGTACTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGG +GTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATC +CCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACA +CCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTG +ACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGG +ATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGGGGTA +ATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGC +ACAGTTGCTGACCGGATATACGCGCGCGCGCTATATAGCGCGCGCGGCGATATAGCGCAT +GAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAAATCATCAGAAATACGCCGTT +CTGGCCTGATGTGGACCTGTCGGAGTTTCGCAGCATGATGCGCACTGATGGCACGGTGAC +GCAGCCGCGTTTAAAGCAGGTTGCGCTGTCGGCAATTTCGGAGGTCAACGCAGAGCTGTA +TGAGTTTCGCAGACGCCAGCAGATGCTGGGGTATGCCTCGCTGGCAGAAGTCCCGGCGGA +ACAACTGGACGGCAAAAGCGAGCGCATTCAGCACTATTTCAACGCGGTTTACTGCTGGGC +ACGCGCCATGCTCAACGAACGTTACCAGGACTATGACGCCACGGCATCCGGTGTGAAGCG +GGGCGAAGAACTGGCAGAAGCCAGCGGTGATTTGTGGCGTGACGCCCGCTGGGCCATCAG +CCGGGTGCAGGATGCGCCGCACTGCACAGTGGAGCTTATCTGA +>this_is_genome_4 +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGACTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTC +TCTTACACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCC 
+GCCTCAACCTCTATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC +GGTTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATTGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATAGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATC +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATTATGACTAAACTGGGCGTC +GATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATAGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGACGTT +ATCAAACCGTTGATGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCACACTGTTTTTACCCCGTCTACTGGGCATCATGTAAACGCTCATAGGC +GGCGCGCGCGCTCTCAGGAATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGA +GGTCATCAAAAATACGCCGTTCTGGCCTGATGTGAACCTGTCGGAATTTCGCAGTGTGAT +GCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTC +TGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGAC +ACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCA +CAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGC +CACGGCGTCAGGCGTGAAGCGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCG +TGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCACCGCACTGTACGGTGGAGCTTAT +CTGACGCTCATAGGCGCGCGCTCATAGCGCGATGGAAACAAATATTACCTGGCAACAATT +GATAGATGAATATTTCTTCGCAAAACCTCTGCGCTCAGCATCTGAATGGAGTTACACCAA +AGTCTTCAAATCATTTGTACATTATATGGGGCCGTTAAGCTGCCCTAATGATGTGACATA +TCACAAAGTGCTTGCCTGGCGCCGTTTTCTTTTAAAAGAGAAAAAGCTGTCCGGACGTAC +CTGGAATAACAAGGTGGCGCATATGCGGGCCATCTTTAACTACGGAATACAGCGAGGGTT +ACTGCACTATGACGAAAATCCGTTTAACAATTCGGTAGTTAAACCGGACAAGAAGAGAAA +GAAAACGCTCACTCAGGCACAGATTGAGTATGCCTATCAGATCATGGAGCAGTATGAAAA +TCAGGAGAATACAGGGCTGGGACTGAAATATTCCCGCTGCGCCTTATTTCCTGCATGGTT 
+CTGGCTCACTGTCCTGGATACGCTCTATTACACAGGGATACGTCAGAACCAGTTATTACA +TATTCGGCTGAATGATGTTGATTTGAGAGAAGGGCAGATTCGGCTGATTACGGAGGGGTG +TAAAAATCACAAAGAACACTATGTGCCGGTGATCAGTTTTCTGCGTCCACGGCTGACCTG +TTTAATGGAGAAAGCGCAGAGCGAAGGATTGAAAGGTAATGACCGCCTGTTCAATATTGC +ACTTTTTACCGGCAAAGATCCCGCCATTGGCGATGACATGGATTCTCCTCAGGTAAGAGC +ATTCTTCCGTCGTCTGTCCAAGGAGTGTCAGTTTGCGATCAGTCCTCATCGTTTCAGACA +CACGCTGGCCACGGAGATGATGAAAATGCCGGAACAGAATCTGCATATGGCGCAAAGTGT +GCTGGGTCATTCAAACATGAAATCCACGCTGGAGTATGTGGAGAATGATATTGCAGTGAT +GGGGAGGGCTCTGGAAGCGCAGTTTATGCAGATTAAGGCAGCACATGCCCGAAGCATTTA +CAGTGGGTTGACAAAGAATAGATAA diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fsa b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fsa new file mode 100644 index 0000000000000000000000000000000000000000..cca31356970595150a952c269dce73d8f53e0e0d --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fsa @@ -0,0 +1,184 @@ +>this_is_genome_1 [gcode=11] [organism=Genus species] [strain=strain] +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATTATCGGCTCGTCC +GGTTCCGGTAAAAGCACTTTTTTGCGCTGCATTAACTTCCTCGAAAAATCGAGCGAAGGC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAACCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCG +CACGTGATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCAATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAAACGCTA +GAGGACGCGCCTCTCAGAGAGCGCGCTCTCTCAGAGAGGCGCGCGCCTCTTTCGCAGAGA 
+CCNNCGCTCATGAGCGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGG +TTCCGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCG +CCAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGC +CTTCGCCGACGGGCAGACTGTGGTCGTGCCGTCAGGATGGGTGTGTGAAAATATCAATGC +GGCGATAACGATTCCGGCGGGAAAAACGCTGCGGATACAGGGCGCGGTGCGTGGGAATGG +CCGGGGACGGTTTATTTTGCTGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCT +GCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAG +CGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAA +TCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATT +TCATAACCAAATGGACGGCGCGCGGATTACGCATAGTCGCTTTAGCGATTTGCAGGGGGA +CGCCATTGAGTGGAATGTCGCGATTCATGACCGCGACATCCTGATTTCCGATCATGTCAT +CGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGG +TAGCGCCTATGACAATAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAA +TATTACCGGATCTGATTGCCGACAACTGGTACACGTAGAAAATGGCAAACATTTCGTCAT +TCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAA +CGCAACGATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAA +TAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCA +AAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGG +CATTCAAATTTCCTCCGGTAACGCCCCCTCATTTGTTGCCATCACCAATGTACGGATGAC +GCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTTTGCGTAATATCAACGT +GATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTTGATTTGCGTAAAGATGT +CCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCAT +CAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGT +CGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACACTCGCGCATAGAGAGCTC +TCAGAGGAGCGCGCGCGCTATAGCGCGC +>this_is_genome_2 [gcode=11] [organism=Genus species] [strain=strain] +CGCGATAATATAGCGCGCGCTCATAGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACC +ATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGT +CGGGGAGAACTGTCGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAA +CGTATGCGCGAGCGCATTCAGGCGCTGGAAGATATTCTTGATGCAGAGCATCCGAACTGG +AGAGAGCGCTAACGCCATATTATAGCGCGCCTCATAAGAGCGGCCTATAGCGCGCTANNN +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCAAGC 
+CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCACCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCG +TTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTC +GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAGCCAGAGATTCGCGCGTTCAGAGAGGAGCTCTCTCATAGA +CGCGCGCATATGCGCTCTAGAGAGGCGCGCCTAATGGCGCGCTATGATGAAAGCGCTACT +GTGGCTGGTTGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGA +TAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAA +AGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGG +TCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACC +GCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCTTGGCATGCGGGCGTCAGTTTCTG +GCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTG +GCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATT +GATCCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGC +CCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGA +GCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGC +TGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTA +TGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTT +CCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGAT +TGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAAAGCGCTCTCAGAGAGAGCGC +TCTGCAGATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGA +AGATCTGGCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGG +CTATCTGCACATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTA +TCAGGGCCAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTA +CGTTGATTCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCG +CTACTCCTCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTCATCAATAAAGG +CCTGGCCTACGTTGATGAGCTGACGCCGGAGCAGATCCGCGAATACCGTGGTACGCTGAC +CGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCAGCGTGGAAGAAAACCTCGCGCTATT +TGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGA +CATGGCGTCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGA +GCATCATCAGACCGGCACGAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCAT 
+CAGCGATGCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGACAACCG +TCGTCTGTACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGA +ATTCTCGCGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGT +GACCGACAAACACGTCGAAGGTTGGGATGATCCGCGTATGCCGACTATTTCCGGTCTGCG +CCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAA +GCAGGACAACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGA +AAACGCGCCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAACTACCC +GCAGGGCGAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAG +CCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCGGATTTCCGCGAAGAAGC +GAACAAACAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGT +CATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTAC +CTATGATGCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAAT +CCACTGGGTTAGCGTAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTT +CAGCGTGCCGAACCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATT +AGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGGAAAGCTTT +CCAGTTTGAACGTGAAGGTTACTTCTGCCTTGACAGCCGCTATGCAACGGCCGATAAGCT +GGTCTTTAACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAATAA +>this_is_genome_3 [gcode=11] [organism=Genus species] [strain=strain] +ACGCGCTATAGGGCTCTCAGAGAGTCTCAGTGTTTATTGGCATCGTTAGCCTGTTTCCTG +AAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGC +TGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACG +ACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCA +TTCATGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGG +GACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAACTCATTCTGG +TGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGGCCGAAATTGACGAAGAAT +GGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACT +CCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGT +TTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGG +AAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAAC +AGTCACTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTG +AAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAAC +ATGATGGGATGGCATAGCGAGATCGCGCATAGCGCGCGGCGAGATCCGCGAGACATGAAT 
+AATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCAT +CATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGT +ATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGT +CGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATG +GCGGTTCGCTTCTTTATGGCTGGTTATCGACTCGAAGGTTGAACGCCTCTAGAGCGCTAG +AGGCGCGCGCGATATACGCGGCGCGAGACATGTCTACGCTTCTCTATTTGCACGGATTCA +ACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATC +CGCATGTTGAGATGATCGTCCCTCAACTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGC +TGGAATCTCTCGTACTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGG +GTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATC +CCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACA +CCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTG +ACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGG +ATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGGGGTA +ATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGC +ACAGTTGCTGACCGGATATACGCGCGCGCGCTATATAGCGCGCGCGGCGATATAGCGCAT +GAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAAATCATCAGAAATACGCCGTT +CTGGCCTGATGTGGACCTGTCGGAGTTTCGCAGCATGATGCGCACTGATGGCACGGTGAC +GCAGCCGCGTTTAAAGCAGGTTGCGCTGTCGGCAATTTCGGAGGTCAACGCAGAGCTGTA +TGAGTTTCGCAGACGCCAGCAGATGCTGGGGTATGCCTCGCTGGCAGAAGTCCCGGCGGA +ACAACTGGACGGCAAAAGCGAGCGCATTCAGCACTATTTCAACGCGGTTTACTGCTGGGC +ACGCGCCATGCTCAACGAACGTTACCAGGACTATGACGCCACGGCATCCGGTGTGAAGCG +GGGCGAAGAACTGGCAGAAGCCAGCGGTGATTTGTGGCGTGACGCCCGCTGGGCCATCAG +CCGGGTGCAGGATGCGCCGCACTGCACAGTGGAGCTTATCTGA +>this_is_genome_4 [gcode=11] [organism=Genus species] [strain=strain] +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGACTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTC +TCTTACACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCC +GCCTCAACCTCTATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC 
+GGTTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATTGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATAGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATC +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATTATGACTAAACTGGGCGTC +GATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATAGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGACGTT +ATCAAACCGTTGATGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCACACTGTTTTTACCCCGTCTACTGGGCATCATGTAAACGCTCATAGGC +GGCGCGCGCGCTCTCAGGAATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGA +GGTCATCAAAAATACGCCGTTCTGGCCTGATGTGAACCTGTCGGAATTTCGCAGTGTGAT +GCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTC +TGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGAC +ACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCA +CAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGC +CACGGCGTCAGGCGTGAAGCGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCG +TGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCACCGCACTGTACGGTGGAGCTTAT +CTGACGCTCATAGGCGCGCGCTCATAGCGCGATGGAAACAAATATTACCTGGCAACAATT +GATAGATGAATATTTCTTCGCAAAACCTCTGCGCTCAGCATCTGAATGGAGTTACACCAA +AGTCTTCAAATCATTTGTACATTATATGGGGCCGTTAAGCTGCCCTAATGATGTGACATA +TCACAAAGTGCTTGCCTGGCGCCGTTTTCTTTTAAAAGAGAAAAAGCTGTCCGGACGTAC +CTGGAATAACAAGGTGGCGCATATGCGGGCCATCTTTAACTACGGAATACAGCGAGGGTT +ACTGCACTATGACGAAAATCCGTTTAACAATTCGGTAGTTAAACCGGACAAGAAGAGAAA +GAAAACGCTCACTCAGGCACAGATTGAGTATGCCTATCAGATCATGGAGCAGTATGAAAA +TCAGGAGAATACAGGGCTGGGACTGAAATATTCCCGCTGCGCCTTATTTCCTGCATGGTT +CTGGCTCACTGTCCTGGATACGCTCTATTACACAGGGATACGTCAGAACCAGTTATTACA 
+TATTCGGCTGAATGATGTTGATTTGAGAGAAGGGCAGATTCGGCTGATTACGGAGGGGTG +TAAAAATCACAAAGAACACTATGTGCCGGTGATCAGTTTTCTGCGTCCACGGCTGACCTG +TTTAATGGAGAAAGCGCAGAGCGAAGGATTGAAAGGTAATGACCGCCTGTTCAATATTGC +ACTTTTTACCGGCAAAGATCCCGCCATTGGCGATGACATGGATTCTCCTCAGGTAAGAGC +ATTCTTCCGTCGTCTGTCCAAGGAGTGTCAGTTTGCGATCAGTCCTCATCGTTTCAGACA +CACGCTGGCCACGGAGATGATGAAAATGCCGGAACAGAATCTGCATATGGCGCAAAGTGT +GCTGGGTCATTCAAACATGAAATCCACGCTGGAGTATGTGGAGAATGATATTGCAGTGAT +GGGGAGGGCTCTGGAAGCGCAGTTTATGCAGATTAAGGCAGCACATGCCCGAAGCATTTA +CAGTGGGTTGACAAAGAATAGATAA diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gbk b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gbk new file mode 100644 index 0000000000000000000000000000000000000000..6150ab31320117628112981efd3c7fdb6042456c --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gbk @@ -0,0 +1,415 @@ +LOCUS this_is_genome_1 2308 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. 
+FEATURES Location/Qualifiers + source 1..2308 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 1..774 + /gene="hisP" + /locus_tag="OHLPOIIB_00001" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P02915" + /codon_start=1 + /transl_table=11 + /product="Histidine transport ATP-binding protein HisP" + /translation="MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGK + STFLRCINFLEKSSEGAIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFN + LWNHMTVLENVMEAPIQVLGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQR + VSIARALAMEPDVLLFDEPTSALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHV + SSHVIFLHQGKIEEEGNPEQVFGNPQSPRLQQFLKGSLK" + CDS 857..2260 + /locus_tag="OHLPOIIB_00002" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRQTVDIKDYP + ADDGIASFKQAFADGQTVVVPSGWVCENINAAITIPAGKTLRIQGAVRGNGRGRFILL + DGCQVVGEQGGSLHNVTLDVRGSDCVIKGVTMSGFGPVAQIFIGGKEPQVMRNLIIDD + ITVTHANYAILRQGFHNQMDGARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERI + DCTNGKINWGIGIGLAGSAYDNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIR + NVKAKNITPDFSKNAGIDNATIAIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIP + QNFKLNAIRLDNRQVAYKLRGIQISSGNAPSFVAITNVRMTRATLELHNQPQHLFLRN + INVMQTSAIGPALKMHFDLRKDVRGQFMARQDTLLSLANVHAINENGQSSVDIDRINH + QTVNVEAVNFSLPKRGG" +ORIGIN + 1 atgtcagaaa ataaattaca cgttatcgat ttgcacaaac gctacggcgg tcatgaagtg + 61 ctgaaagggg tatcgctgca ggcccgcgcc ggagatgtga ttagcattat cggctcgtcc + 121 ggttccggta aaagcacttt tttgcgctgc attaacttcc tcgaaaaatc gagcgaaggc + 181 gcgattatcg tgaacggtca gaacattaat ctggtgcgcg acaaagacgg gcagctcaaa + 241 gtggcggata aaaatcagct acgcttgttg cgtacccgcc tgacgatggt gtttcagcac + 301 tttaacctct ggaaccacat gacggtgctg gaaaatgtga tggaagcgcc gattcaggta + 361 ctgggattaa gcaagcacga cgcgcgcgag cgggcgttga aatatctggc gaaggtgggg + 421 attgatgagc gcgctcaggg caaatatccc gtccatctct ccggcggcca acagcagcgc + 481 gtttctattg cgcgcgcgct ggcgatggaa cctgacgttt tactgttcga tgaacccaca + 541 
tcggcgctcg atcctgaact ggtcggcgaa gtgttgcgca tcatgcaaca actggcggaa + 601 gaaggcaaaa cgatggtggt ggtcacgcat gaaatgggct tcgcccgcca tgtctcttcg + 661 cacgtgattt ttctgcatca ggggaaaatt gaagaagagg gcaatccgga gcaggtgttc + 721 ggcaatccgc aaagcccgcg tttacagcaa ttcctgaaag gctcgctgaa ataaacgcta + 781 gaggacgcgc ctctcagaga gcgcgctctc tcagagaggc gcgcgcctct ttcgcagaga + 841 ccnncgctca tgagcgatgc ccgcgactaa attctcccga cgtaccctcc tgacggcagg + 901 ttccgcgctt gctgttcttc cttttctgcg cgccttgccg gtacaggcgc gtgaacctcg + 961 ccagaccgtc gatattaagg attatccggc ggatgacggt atcgcctcgt tcaaacaggc + 1021 cttcgccgac gggcagactg tggtcgtgcc gtcaggatgg gtgtgtgaaa atatcaatgc + 1081 ggcgataacg attccggcgg gaaaaacgct gcggatacag ggcgcggtgc gtgggaatgg + 1141 ccggggacgg tttattttgc tggacgggtg tcaggtggtg ggggagcagg gcggcagtct + 1201 gcacaatgtg acgctggatg ttcgcgggtc ggactgtgtg attaaaggcg tgacgatgag + 1261 cggctttggc cccgtcgcgc aaattttcat cggcggtaag gaaccgcagg tgatgcgtaa + 1321 tctcattatc gatgacatca ccgttaccca cgccaactac gccattctcc gccagggatt + 1381 tcataaccaa atggacggcg cgcggattac gcatagtcgc tttagcgatt tgcaggggga + 1441 cgccattgag tggaatgtcg cgattcatga ccgcgacatc ctgatttccg atcatgtcat + 1501 cgaacgcatt gattgtacca atggcaaaat caactggggg atcggcatcg ggctggcggg + 1561 tagcgcctat gacaatagtt atcctgaaga ccaggcagta aaaaactttg tggtggccaa + 1621 tattaccgga tctgattgcc gacaactggt acacgtagaa aatggcaaac atttcgtcat + 1681 tcgcaatgtc aaagccaaaa acatcacgcc cgatttcagt aaaaatgcgg gtattgataa + 1741 cgcaacgatc gccatttatg gctgtgataa tttcgtcatt gataatattg atatgacgaa + 1801 tagtgccggg atgctcatcg gctatggcgt cgttaaagga aaatacctgt caattccgca + 1861 aaactttaaa ttaaacgcta ttcggttgga taatcgccag gttgcttata aattacgcgg + 1921 cattcaaatt tcctccggta acgccccctc atttgttgcc atcaccaatg tacggatgac + 1981 gcgtgctacg ctggaactgc ataatcaacc gcagcacctc tttttgcgta atatcaacgt + 2041 gatgcaaact tcagcgattg gcccggcgtt aaaaatgcat tttgatttgc gtaaagatgt + 2101 ccgtggtcaa tttatggccc gccaggacac gctgctttcc ctcgctaatg ttcatgccat + 2161 caatgaaaac gggcagagtt ccgtggatat 
cgacaggatt aatcaccaaa ccgtgaatgt + 2221 cgaagcagtg aatttttcgc tgccgaagcg gggagggtaa cactcgcgca tagagagctc + 2281 tcagaggagc gcgcgcgcta tagcgcgc +// +LOCUS this_is_genome_2 3295 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. +FEATURES Location/Qualifiers + source 1..3295 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 28..252 + /gene="pspB" + /locus_tag="OHLPOIIB_00003" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P0AFM9" + /codon_start=1 + /transl_table=11 + /product="Phage shock protein B" + /translation="MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLL + QLTDDAQRMRERIQALEDILDAEHPNWRER" + CDS 301..681 + /gene="hpcD" + /locus_tag="OHLPOIIB_00004" + /EC_number="5.3.3.10" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:Q05354" + /codon_start=1 + /transl_table=11 + /product="5-carboxymethyl-2-hydroxymuconate + Delta-isomerase" + /translation="MPHFIAECTENIREQADLPSLFSKVNEALAATGIFPIGGIRSRA + HWLDTWQMADGKHDYAFVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLA + LSFEIAELHPTLNYKQNNVHALFK" + CDS 764..1597 + /gene="amiD" + /locus_tag="OHLPOIIB_00005" + /EC_number="3.5.1.28" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P75820" + /codon_start=1 + /transl_table=11 + /product="N-acetylmuramoyl-L-alanine amidase AmiD" + /translation="MMKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPR + IKVLVIHYTAENFDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAG + VSFWRGATRLNDTSIGIELENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIK + PQNVVAHADIAPQRKDDPGPRFPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTAT + VLALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQ + D" + CDS 1628..3295 + /gene="glnS" + /locus_tag="OHLPOIIB_00006" + /EC_number="6.1.1.18" + 
/inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P00962" + /codon_start=1 + /transl_table=11 + /product="Glutamine--tRNA ligase" + /translation="MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGH + AKSICLNFGIAQDYQGQCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSD + YFDQLHAYAVELINKGLAYVDELTPEQIREYRGTLTAPGKNSPFRDRSVEENLALFEK + MRTGGFEEGKACLRAKIDMASPFIVMRDPVLYRIKFAEHHQTGTKWCIYPMYDFTHCI + SDALEGITHSLCTLEFQDNRRLYDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNL + LVTDKHVEGWDDPRMPTISGLRRRGYTAASIREFCKRIGVTKQDNTIEMASLESCIRE + DLNENAPRAMAVIDPVKLVIENYPQGESEMVTMPNHPNKPEMGSREVPFSGEIWIDRA + DFREEANKQYKRLVMGKEVRLRNAYVIKAERVEKDAEGNITTIFCTYDADTLSKDPAD + GRKVKGVIHWVSVAHALPIEIRLYDRLFSVPNPGAAEDFLSVINPESLVIKQGYGEPS + LKAAVAGKAFQFEREGYFCLDSRYATADKLVFNRTVGLRDTWAKAGE" +ORIGIN + 1 cgcgataata tagcgcgcgc tcatagcatg agcgcgctat ttctggccat cccgttaacc + 61 atttttgtgt tgtttgtgtt accgatttgg ctgtggctgc attacagcaa ccgcgccggt + 121 cggggagaac tgtcgcaaag cgagcagcaa cgcttactgc aactcacaga cgacgcgcaa + 181 cgtatgcgcg agcgcattca ggcgctggaa gatattcttg atgcagagca tccgaactgg + 241 agagagcgct aacgccatat tatagcgcgc ctcataagag cggcctatag cgcgctannn + 301 atgccgcact ttattgctga atgtactgaa aatattcgcg agcaggctga tttaccaagc + 361 ctgttcagca aggtaaacga ggcgctggcc gccaccggga ttttccccat cggcggtatc + 421 cgcagtcgcg cccactggct ggatacctgg cagatggctg acggtaagca tgattacgcg + 481 tttgtgcata tgacgctgaa aatcggcgcc gggcgcagcc tggagagccg tcaggaagtc + 541 ggcgaaatgc tgtttgggct gattaaagcc cacttcgccg acctgatgga gaaccgctat + 601 ctggcgctgt cgtttgagat tgccgagtta catccaacgc tcaattacaa acaaaacaac + 661 gtacacgcgt tatttaaata gccagagatt cgcgcgttca gagaggagct ctctcataga + 721 cgcgcgcata tgcgctctag agaggcgcgc ctaatggcgc gctatgatga aagcgctact + 781 gtggctggtt ggtctcgcgt tgctgttaac aggctgcgcg agcgaaaaag gaattatcga + 841 taaagaggga tatcagcttg atacccgaca tcgggcgcag gcggcctatc cgcgcattaa + 901 agtcctggtg attcactata cggcggaaaa ctttgacgtt tcgctggcga cgttaacggg + 961 tcgcaacgtc agttcgcatt acctgattcc cgcaaccccg ccattatatg 
gcggtaaacc + 1021 gcgcatctgg caactggtgc cggaacagga tcaggcttgg catgcgggcg tcagtttctg + 1081 gcgaggcgcc acgcgtctca atgatacgtc tattggcatt gagctggaaa atcgcggctg + 1141 gcgaatgtcc ggcggggtga aatctttcgc gccgtttgaa tccgcgcaaa ttcaggcatt + 1201 gatcccgtta gcgaaggaca ttatcgcgcg ctatgacatc aaaccgcaga atgtggtggc + 1261 ccatgcggat atcgcgccgc agcgtaaaga cgatcccggc ccgcgcttcc cgtggcgcga + 1321 gctggcggca caggggattg gcgcctggcc tgacgcccag cgtgtggcgt tttatctggc + 1381 tggacgcgcg ccgtatacgc cagtcgatac cgcaacggtg cttgcgttac tctcgcgcta + 1441 tggctatgaa gtcaaagccg atatgacggc gcgcgagcaa cagcgggtga ttatggcgtt + 1501 ccagatgcac ttccgtccgg cgcaatggaa cggtatcgca gatgccgaaa cgcaggcgat + 1561 tgccgaagca ttactggaga agtacggcca ggattaacaa agcgctctca gagagagcgc + 1621 tctgcagatg agtgaggctg aagcccgccc gactaacttt attcgtcaga ttattgatga + 1681 agatctggcg agtggtaaac ataccactgt ccatacccgt tttccgccgg agccgaatgg + 1741 ctatctgcac atcggccacg cgaaatctat ctgcctgaac tttggcatcg cgcaagatta + 1801 tcagggccag tgcaacctgc gtttcgatga caccaacccg gtaaaagaag atatcgagta + 1861 cgttgattcg atcaaaaacg acgtcgagtg gttaggcttt cactggtctg gcgatattcg + 1921 ctactcctcc gattactttg accaactgca cgcctatgcg gtcgagctca tcaataaagg + 1981 cctggcctac gttgatgagc tgacgccgga gcagatccgc gaataccgtg gtacgctgac + 2041 cgcgccgggt aaaaacagcc cgttccgcga tcgcagcgtg gaagaaaacc tcgcgctatt + 2101 tgaaaaaatg cgtaccggcg gttttgaaga gggtaaagcc tgtctgcgcg ctaaaatcga + 2161 catggcgtcg ccgtttatcg tgatgcgcga tccggtgctg tatcgcatta aattcgccga + 2221 gcatcatcag accggcacga agtggtgcat ctatccgatg tacgacttta ctcactgcat + 2281 cagcgatgcg ctggaaggca ttactcattc tctgtgtacg ctggagttcc aggacaaccg + 2341 tcgtctgtac gactgggtgc tggacaacat caccattccg gttcacccgc gccagtacga + 2401 attctcgcgc ctgaatctgg aatacaccgt gatgtccaag cgtaagctga acctgctggt + 2461 gaccgacaaa cacgtcgaag gttgggatga tccgcgtatg ccgactattt ccggtctgcg + 2521 ccgtcgcggc tataccgcgg cttctattcg tgagttctgc aaacgcatcg gcgtcaccaa + 2581 gcaggacaac actattgaga tggcgtcgct ggaatcctgc attcgcgaag atctgaacga + 2641 aaacgcgccg 
cgcgcgatgg cggtaatcga tccggtaaaa ctggttatcg aaaactaccc + 2701 gcagggcgag agcgaaatgg ttaccatgcc taaccatccg aataaaccgg agatgggcag + 2761 ccgtgaagtg ccgtttagcg gtgagatctg gatcgatcgc gcggatttcc gcgaagaagc + 2821 gaacaaacag tacaaacgtc tggtgatggg caaagaagtg cgtctgcgta atgcctacgt + 2881 cattaaagcg gagcgcgtag agaaggatgc cgaagggaat atcaccacca tcttctgtac + 2941 ctatgatgct gatacgctga gtaaagatcc ggctgacggg cgtaaagtga aaggcgtaat + 3001 ccactgggtt agcgtagcac atgcgctgcc gattgaaatt cgtctctacg accgtctgtt + 3061 cagcgtgccg aacccgggcg ccgcggagga cttcctgtct gttatcaacc ccgaatcatt + 3121 agtgattaag caggggtatg gcgagccgtc gctgaaagcg gcggtagcag ggaaagcttt + 3181 ccagtttgaa cgtgaaggtt acttctgcct tgacagccgc tatgcaacgg ccgataagct + 3241 ggtctttaac cgcaccgtgg gcctgcgtga tacctgggcg aaagcgggcg aataa +// +LOCUS this_is_genome_3 2263 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. 
+FEATURES Location/Qualifiers + source 1..2263 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 30..797 + /gene="trmD" + /locus_tag="OHLPOIIB_00007" + /EC_number="2.1.1.228" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P0A873" + /codon_start=1 + /transl_table=11 + /product="tRNA (guanine-N(1)-)-methyltransferase" + /translation="MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHD + RHRTVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSEL + ATNQKLILVCGRYEGVDERVIQAEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVL + GHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWL + RRPELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA" + CDS 862..1122 + /locus_tag="OHLPOIIB_00008" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLS + AWEAGILTRRYGLDKEMVMDFFKENHSGMAVRFFMAGYRLEG" + CDS 1170..1751 + /locus_tag="OHLPOIIB_00009" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYP + ADAAELLESLVLEHGGAPLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDY + LGQNENPYTGQQYVLESRHIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYA + SCRQTVTEGGNHAFTGFEDYFNQIVDFLGLHSC" + CDS 1895..2263 + /locus_tag="OHLPOIIB_00010" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MMRTDGTVTQPRLKQVALSAISEVNAELYEFRRRQQMLGYASLA + EVPAEQLDGKSERIQHYFNAVYCWARAMLNERYQDYDATASGVKRGEELAEASGDLWR + DARWAISRVQDAPHCTVELI" +ORIGIN + 1 acgcgctata gggctctcag agagtctcag tgtttattgg catcgttagc ctgtttcctg + 61 aaatgttccg cgcaattacc gattacgggg taactggccg ggcagtaaaa aaaggcctgc + 121 tgaacatcca aagctggagt cctcgcgact tcgcgcatga ccggcaccgt accgtggacg + 181 accgtcctta cggcggcgga ccggggatgt taatgatggt gcaacccttg cgggacgcca + 241 ttcatgcagc aaaagccgcg gcaggtgaag gcgctaaagt 
gatttatctg tcgcctcagg + 301 gacgcaagct tgatcaagcg ggcgttagcg agctggccac gaatcagaaa ctcattctgg + 361 tgtgtggtcg ctacgaaggc gtagatgagc gcgtaattca ggccgaaatt gacgaagaat + 421 ggtcaattgg cgattacgtt ctcagcggtg gcgaactacc ggcaatgacg ctgattgact + 481 ccgtcgcccg gtttataccg ggggttctgg ggcatgaggc atcagcaatc gaagattcgt + 541 ttgctgatgg gttgctggat tgtccgcact atacgcgccc tgaagtgtta gaggggatgg + 601 aagtaccgcc agtattgctg tcgggaaacc atgctgagat acgtcgctgg cgtttgaaac + 661 agtcactggg ccgaacctgg cttagaagac ctgaacttct ggaaaacctg gctctgactg + 721 aagagcaagc aaggttgctg gcggagttca aaacagaaca cgcacaacag cagcataaac + 781 atgatgggat ggcatagcga gatcgcgcat agcgcgcggc gagatccgcg agacatgaat + 841 aatcattttg ggaaagggtt aatggccggg ttgcacgcgc catatgcata tagcgcgcat + 901 catgcggtga atttctgttc tgagtataaa cgtggctttg tattgggttt tacacaccgt + 961 atgttcgaaa agaccggcga tcgtcaactt agcgcgtggg aggccggaat tctgacgcgt + 1021 cgctatggtc tggataaaga aatggtgatg gatttcttta aagagaatca ttccgggatg + 1081 gcggttcgct tctttatggc tggttatcga ctcgaaggtt gaacgcctct agagcgctag + 1141 aggcgcgcgc gatatacgcg gcgcgagaca tgtctacgct tctctatttg cacggattca + 1201 acagttcccc tcgctcggca aaagcgtgcc agctaaaaaa ctggctggcg gagcgtcatc + 1261 cgcatgttga gatgatcgtc cctcaactgc cgccgtatcc tgccgatgcg gcggagttgc + 1321 tggaatctct cgtacttgag catggcggtg cgccattagg gctggtagga tcgtcgctgg + 1381 gtggttatta cgccacctgg ctgtcgcaat gttttatgct gccggctgtg gtggtgaatc + 1441 ccgccgtgcg gccctttgaa ttactgaccg actatctcgg tcagaacgag aacccctaca + 1501 ccgggcagca atatgtgcta gagtctcgcc atatttatga tcttaaagtc atgcagattg + 1561 acccgctgga agcgccggac ctgatctggc tactgcaaca gacgggcgat gaagtgctgg + 1621 attaccgcca ggcggtggca tattacgcct cctgccgtca gacagtgacc gaggggggta + 1681 atcacgcatt cacgggcttc gaagattatt tcaaccagat tgtcgatttt cttggactgc + 1741 acagttgctg accggatata cgcgcgcgcg ctatatagcg cgcgcggcga tatagcgcat + 1801 gaagtttgtt gcgccagaac aggcaccgga acaggcggaa atcatcagaa atacgccgtt + 1861 ctggcctgat gtggacctgt cggagtttcg cagcatgatg cgcactgatg gcacggtgac + 1921 gcagccgcgt 
ttaaagcagg ttgcgctgtc ggcaatttcg gaggtcaacg cagagctgta + 1981 tgagtttcgc agacgccagc agatgctggg gtatgcctcg ctggcagaag tcccggcgga + 2041 acaactggac ggcaaaagcg agcgcattca gcactatttc aacgcggttt actgctgggc + 2101 acgcgccatg ctcaacgaac gttaccagga ctatgacgcc acggcatccg gtgtgaagcg + 2161 gggcgaagaa ctggcagaag ccagcggtga tttgtggcgt gacgcccgct gggccatcag + 2221 ccgggtgcag gatgcgccgc actgcacagt ggagcttatc tga +// +LOCUS this_is_genome_4 2845 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. +FEATURES Location/Qualifiers + source 1..2845 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 1..1308 + /gene="dctM" + /locus_tag="OHLPOIIB_00011" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:O07838" + /codon_start=1 + /transl_table=11 + /product="C4-dicarboxylate TRAP transporter large permease + protein DctM" + /translation="MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFD + ISMFATAQKMFSSLDSFALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSY + TNIVGNMMFGAISGSAIAASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPT + TAFILYALASGGTSIAALFAGGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMAL + KVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVYTLLLTIVFYRTLKIKDLPSIL + LQTVVMTGVIMFLLATSSAMSFSMSITNIPAALSDMILGISANKLVILLVITVFLLII + GAFMDIGPAILIFTPILLPIMTKLGVDPVHFGIIMIYNLAIGTITPPVGSGLYVGASV + GKVKVEDVIKPLMPFYGAIIGVLLLITYIPEITLFLPRLLGIM" + CDS 1340..1804 + /locus_tag="OHLPOIIB_00012" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MKFVAPEQAPEQAEVIKNTPFWPDVNLSEFRSVMRTDGTVTQPR + LKQVVLTAISEVNAELYDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWAR + AVLNERYQDYDATASGVKRGEELAEASGDLWRDARWAISRVQDAPHCTVELI" + CDS 1832..2845 + /gene="xerC" + /locus_tag="OHLPOIIB_00013" + /inference="ab initio 
prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P39776" + /codon_start=1 + /transl_table=11 + /product="Tyrosine recombinase XerC" + /translation="METNITWQQLIDEYFFAKPLRSASEWSYTKVFKSFVHYMGPLSC + PNDVTYHKVLAWRRFLLKEKKLSGRTWNNKVAHMRAIFNYGIQRGLLHYDENPFNNSV + VKPDKKRKKTLTQAQIEYAYQIMEQYENQENTGLGLKYSRCALFPAWFWLTVLDTLYY + TGIRQNQLLHIRLNDVDLREGQIRLITEGCKNHKEHYVPVISFLRPRLTCLMEKAQSE + GLKGNDRLFNIALFTGKDPAIGDDMDSPQVRAFFRRLSKECQFAISPHRFRHTLATEM + MKMPEQNLHMAQSVLGHSNMKSTLEYVENDIAVMGRALEAQFMQIKAAHARSIYSGLT + KNR" +ORIGIN + 1 atgattgacc ctatttttgc gtcctgtacg ctaattgccg tctttgttgt tttactggcc + 61 atgggcgcgc ctatcgggat ctgcatcgtt atcgcctctt tcagcaccat gatgctggta + 121 ctgcctttcg atatttcgat gttcgccacc gcgcaaaaaa tgttctccag cctggacagt + 181 tttgccttgc tggccgtgcc gttcttcgtt ttgtccgggg tgatcatgaa tagcggggga + 241 attgccgccc gactggtcaa ttttgccaaa ctgtttactg gcaaactgcc cggctcgctc + 301 tcttacacca acatcgtcgg caatatgatg ttcggtgcaa tttccggatc ggcgattgcc + 361 gcctcaacct ctatcggcgg cgtgatggtg ccgatgagcg cgcgcgaagg ttacgatcgc + 421 ggttttgcgg ccgcggtgaa tatcgcctcc gcgccgacgg gaatgttaat tccgcccacc + 481 acggctttta tcctttatgc gctggcaagc gggggaacat cgattgccgc tctgttcgcc + 541 ggcggtctgg tcgcgggagt gctgtggggc gttggctgta tgctggtcac gctggtggtc + 601 gctaagcgtc gaaattatcg ggttttcttc accgtccaaa aaggcatggc gctaaaagtt + 661 gccgttgagg ccattcccag cctgttactg atcgtgatta ttgtcggcgg cattgtgcag + 721 gggattttca ccgccattga agcctccgcg attgccgtgg tgtatacgtt attgctgacg + 781 atagtgtttt accgcacgct gaaaattaag gatttgcctt cgattttgct ccagacagtg + 841 gtaatgaccg gggtcatcat gttcctgctg gcaacctctt cggcgatgtc cttctcgatg + 901 tcgatcacca atattcctgc ggcgctgagc gatatgatcc tcggtatttc cgccaataaa + 961 ctggttatcc tgttagtcat taccgtcttt ttgttgatta tcggcgcatt tatggatatc + 1021 ggtccggcca ttctgatttt taccccgatt ctgctgccga ttatgactaa actgggcgtc + 1081 gatccggtgc atttcggcat tatcatgatc tataacctgg cgataggcac cattacgccg + 1141 ccagttggca gtggtttata tgtcggggcg agcgtcggta aggtcaaagt tgaggacgtt + 1201 atcaaaccgt 
tgatgccttt ttacggcgcg attatcggcg ttctgttatt aattacctac + 1261 attccggaaa tcacactgtt tttaccccgt ctactgggca tcatgtaaac gctcataggc + 1321 ggcgcgcgcg ctctcaggaa tgaagtttgt tgcgccagaa caggcaccgg aacaggcgga + 1381 ggtcatcaaa aatacgccgt tctggcctga tgtgaacctg tcggaatttc gcagtgtgat + 1441 gcgcactgac ggcacggtga cgcagccgcg tttaaagcag gtcgtgctga cggcgatctc + 1501 tgaggttaac gctgagctgt acgacttccg caaccgtcag cagatgctgg gctggcggac + 1561 acttgctgag gttcccgcag aaatgctgga cggtaaaagc gagcgtatcc ggcactacca + 1621 caacgctgtt ttttgctggg cgcgcgctgt gcttaatgag cgttatcagg actatgacgc + 1681 cacggcgtca ggcgtgaagc gaggggagga gctggcggag gccagcggcg atctgtggcg + 1741 tgatgcccgc tgggccatca gccgggtgca ggatgcaccg cactgtacgg tggagcttat + 1801 ctgacgctca taggcgcgcg ctcatagcgc gatggaaaca aatattacct ggcaacaatt + 1861 gatagatgaa tatttcttcg caaaacctct gcgctcagca tctgaatgga gttacaccaa + 1921 agtcttcaaa tcatttgtac attatatggg gccgttaagc tgccctaatg atgtgacata + 1981 tcacaaagtg cttgcctggc gccgttttct tttaaaagag aaaaagctgt ccggacgtac + 2041 ctggaataac aaggtggcgc atatgcgggc catctttaac tacggaatac agcgagggtt + 2101 actgcactat gacgaaaatc cgtttaacaa ttcggtagtt aaaccggaca agaagagaaa + 2161 gaaaacgctc actcaggcac agattgagta tgcctatcag atcatggagc agtatgaaaa + 2221 tcaggagaat acagggctgg gactgaaata ttcccgctgc gccttatttc ctgcatggtt + 2281 ctggctcact gtcctggata cgctctatta cacagggata cgtcagaacc agttattaca + 2341 tattcggctg aatgatgttg atttgagaga agggcagatt cggctgatta cggaggggtg + 2401 taaaaatcac aaagaacact atgtgccggt gatcagtttt ctgcgtccac ggctgacctg + 2461 tttaatggag aaagcgcaga gcgaaggatt gaaaggtaat gaccgcctgt tcaatattgc + 2521 actttttacc ggcaaagatc ccgccattgg cgatgacatg gattctcctc aggtaagagc + 2581 attcttccgt cgtctgtcca aggagtgtca gtttgcgatc agtcctcatc gtttcagaca + 2641 cacgctggcc acggagatga tgaaaatgcc ggaacagaat ctgcatatgg cgcaaagtgt + 2701 gctgggtcat tcaaacatga aatccacgct ggagtatgtg gagaatgata ttgcagtgat + 2761 ggggagggct ctggaagcgc agtttatgca gattaaggca gcacatgccc gaagcattta + 2821 cagtgggttg acaaagaata gataa +// diff 
--git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gff b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gff new file mode 100644 index 0000000000000000000000000000000000000000..401a3a9388641ab76096af266eba40dd541e3030 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gff @@ -0,0 +1,203 @@ +##gff-version 3 +##sequence-region this_is_genome_1 1 2308 +##sequence-region this_is_genome_2 1 3295 +##sequence-region this_is_genome_3 1 2263 +##sequence-region this_is_genome_4 1 2845 +this_is_genome_1 Prodigal:2.6 CDS 1 774 . + 0 ID=OHLPOIIB_00001;Name=hisP;dbxref=COG:COG4598;gene=hisP;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P02915;locus_tag=OHLPOIIB_00001;product=Histidine transport ATP-binding protein HisP +this_is_genome_1 Prodigal:2.6 CDS 857 2260 . + 0 ID=OHLPOIIB_00002;inference=ab initio prediction:Prodigal:2.6;locus_tag=OHLPOIIB_00002;product=hypothetical protein +this_is_genome_2 Prodigal:2.6 CDS 28 252 . + 0 ID=OHLPOIIB_00003;Name=pspB;gene=pspB;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P0AFM9;locus_tag=OHLPOIIB_00003;product=Phage shock protein B +this_is_genome_2 Prodigal:2.6 CDS 301 681 . + 0 ID=OHLPOIIB_00004;eC_number=5.3.3.10;Name=hpcD;gene=hpcD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:Q05354;locus_tag=OHLPOIIB_00004;product=5-carboxymethyl-2-hydroxymuconate Delta-isomerase +this_is_genome_2 Prodigal:2.6 CDS 764 1597 . + 0 ID=OHLPOIIB_00005;eC_number=3.5.1.28;Name=amiD;dbxref=COG:COG3023;gene=amiD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P75820;locus_tag=OHLPOIIB_00005;product=N-acetylmuramoyl-L-alanine amidase AmiD +this_is_genome_2 Prodigal:2.6 CDS 1628 3295 . 
+ 0 ID=OHLPOIIB_00006;eC_number=6.1.1.18;Name=glnS;dbxref=COG:COG0008;gene=glnS;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P00962;locus_tag=OHLPOIIB_00006;product=Glutamine--tRNA ligase +this_is_genome_3 Prodigal:2.6 CDS 30 797 . + 0 ID=OHLPOIIB_00007;eC_number=2.1.1.228;Name=trmD;dbxref=COG:COG0336;gene=trmD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P0A873;locus_tag=OHLPOIIB_00007;product=tRNA (guanine-N(1)-)-methyltransferase +this_is_genome_3 Prodigal:2.6 CDS 862 1122 . + 0 ID=OHLPOIIB_00008;inference=ab initio prediction:Prodigal:2.6;locus_tag=OHLPOIIB_00008;product=hypothetical protein +this_is_genome_3 Prodigal:2.6 CDS 1170 1751 . + 0 ID=OHLPOIIB_00009;inference=ab initio prediction:Prodigal:2.6;locus_tag=OHLPOIIB_00009;product=hypothetical protein +this_is_genome_3 Prodigal:2.6 CDS 1895 2263 . + 0 ID=OHLPOIIB_00010;inference=ab initio prediction:Prodigal:2.6;locus_tag=OHLPOIIB_00010;product=hypothetical protein +this_is_genome_4 Prodigal:2.6 CDS 1 1308 . + 0 ID=OHLPOIIB_00011;Name=dctM;dbxref=COG:COG1593;gene=dctM;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:O07838;locus_tag=OHLPOIIB_00011;product=C4-dicarboxylate TRAP transporter large permease protein DctM +this_is_genome_4 Prodigal:2.6 CDS 1340 1804 . + 0 ID=OHLPOIIB_00012;inference=ab initio prediction:Prodigal:2.6;locus_tag=OHLPOIIB_00012;product=hypothetical protein +this_is_genome_4 Prodigal:2.6 CDS 1832 2845 . 
+ 0 ID=OHLPOIIB_00013;Name=xerC;dbxref=COG:COG4974;gene=xerC;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P39776;locus_tag=OHLPOIIB_00013;product=Tyrosine recombinase XerC +##FASTA +>this_is_genome_1 +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATTATCGGCTCGTCC +GGTTCCGGTAAAAGCACTTTTTTGCGCTGCATTAACTTCCTCGAAAAATCGAGCGAAGGC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAACCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCG +CACGTGATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCAATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAAACGCTA +GAGGACGCGCCTCTCAGAGAGCGCGCTCTCTCAGAGAGGCGCGCGCCTCTTTCGCAGAGA +CCNNCGCTCATGAGCGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGG +TTCCGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCG +CCAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGC +CTTCGCCGACGGGCAGACTGTGGTCGTGCCGTCAGGATGGGTGTGTGAAAATATCAATGC +GGCGATAACGATTCCGGCGGGAAAAACGCTGCGGATACAGGGCGCGGTGCGTGGGAATGG +CCGGGGACGGTTTATTTTGCTGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCT +GCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAG +CGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAA +TCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATT +TCATAACCAAATGGACGGCGCGCGGATTACGCATAGTCGCTTTAGCGATTTGCAGGGGGA +CGCCATTGAGTGGAATGTCGCGATTCATGACCGCGACATCCTGATTTCCGATCATGTCAT +CGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGG +TAGCGCCTATGACAATAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAA +TATTACCGGATCTGATTGCCGACAACTGGTACACGTAGAAAATGGCAAACATTTCGTCAT 
+TCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAA +CGCAACGATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAA +TAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCA +AAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGG +CATTCAAATTTCCTCCGGTAACGCCCCCTCATTTGTTGCCATCACCAATGTACGGATGAC +GCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTTTGCGTAATATCAACGT +GATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTTGATTTGCGTAAAGATGT +CCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCAT +CAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGT +CGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACACTCGCGCATAGAGAGCTC +TCAGAGGAGCGCGCGCGCTATAGCGCGC +>this_is_genome_2 +CGCGATAATATAGCGCGCGCTCATAGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACC +ATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGT +CGGGGAGAACTGTCGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAA +CGTATGCGCGAGCGCATTCAGGCGCTGGAAGATATTCTTGATGCAGAGCATCCGAACTGG +AGAGAGCGCTAACGCCATATTATAGCGCGCCTCATAAGAGCGGCCTATAGCGCGCTANNN +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCAAGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCACCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCG +TTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTC +GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAGCCAGAGATTCGCGCGTTCAGAGAGGAGCTCTCTCATAGA +CGCGCGCATATGCGCTCTAGAGAGGCGCGCCTAATGGCGCGCTATGATGAAAGCGCTACT +GTGGCTGGTTGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGA +TAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAA +AGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGG +TCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACC +GCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCTTGGCATGCGGGCGTCAGTTTCTG +GCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTG +GCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATT +GATCCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGC 
+CCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGA +GCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGC +TGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTA +TGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTT +CCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGAT +TGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAAAGCGCTCTCAGAGAGAGCGC +TCTGCAGATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGA +AGATCTGGCGAGTGGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGG +CTATCTGCACATCGGCCACGCGAAATCTATCTGCCTGAACTTTGGCATCGCGCAAGATTA +TCAGGGCCAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAAGATATCGAGTA +CGTTGATTCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCG +CTACTCCTCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTCATCAATAAAGG +CCTGGCCTACGTTGATGAGCTGACGCCGGAGCAGATCCGCGAATACCGTGGTACGCTGAC +CGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCAGCGTGGAAGAAAACCTCGCGCTATT +TGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGA +CATGGCGTCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGA +GCATCATCAGACCGGCACGAAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCAT +CAGCGATGCGCTGGAAGGCATTACTCATTCTCTGTGTACGCTGGAGTTCCAGGACAACCG +TCGTCTGTACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAGTACGA +ATTCTCGCGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGT +GACCGACAAACACGTCGAAGGTTGGGATGATCCGCGTATGCCGACTATTTCCGGTCTGCG +CCGTCGCGGCTATACCGCGGCTTCTATTCGTGAGTTCTGCAAACGCATCGGCGTCACCAA +GCAGGACAACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAAGATCTGAACGA +AAACGCGCCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAACTACCC +GCAGGGCGAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAG +CCGTGAAGTGCCGTTTAGCGGTGAGATCTGGATCGATCGCGCGGATTTCCGCGAAGAAGC +GAACAAACAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGTCTGCGTAATGCCTACGT +CATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTAC +CTATGATGCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAAT +CCACTGGGTTAGCGTAGCACATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTT +CAGCGTGCCGAACCCGGGCGCCGCGGAGGACTTCCTGTCTGTTATCAACCCCGAATCATT +AGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGGAAAGCTTT 
+CCAGTTTGAACGTGAAGGTTACTTCTGCCTTGACAGCCGCTATGCAACGGCCGATAAGCT +GGTCTTTAACCGCACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAATAA +>this_is_genome_3 +ACGCGCTATAGGGCTCTCAGAGAGTCTCAGTGTTTATTGGCATCGTTAGCCTGTTTCCTG +AAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGC +TGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACG +ACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCA +TTCATGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGG +GACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAACTCATTCTGG +TGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGGCCGAAATTGACGAAGAAT +GGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACT +CCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGT +TTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGG +AAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAAC +AGTCACTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTG +AAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAAC +ATGATGGGATGGCATAGCGAGATCGCGCATAGCGCGCGGCGAGATCCGCGAGACATGAAT +AATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCAT +CATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGT +ATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGT +CGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATG +GCGGTTCGCTTCTTTATGGCTGGTTATCGACTCGAAGGTTGAACGCCTCTAGAGCGCTAG +AGGCGCGCGCGATATACGCGGCGCGAGACATGTCTACGCTTCTCTATTTGCACGGATTCA +ACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATC +CGCATGTTGAGATGATCGTCCCTCAACTGCCGCCGTATCCTGCCGATGCGGCGGAGTTGC +TGGAATCTCTCGTACTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGG +GTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATC +CCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACA +CCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTG +ACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGG +ATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGGGGTA +ATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGC +ACAGTTGCTGACCGGATATACGCGCGCGCGCTATATAGCGCGCGCGGCGATATAGCGCAT 
+GAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAAATCATCAGAAATACGCCGTT +CTGGCCTGATGTGGACCTGTCGGAGTTTCGCAGCATGATGCGCACTGATGGCACGGTGAC +GCAGCCGCGTTTAAAGCAGGTTGCGCTGTCGGCAATTTCGGAGGTCAACGCAGAGCTGTA +TGAGTTTCGCAGACGCCAGCAGATGCTGGGGTATGCCTCGCTGGCAGAAGTCCCGGCGGA +ACAACTGGACGGCAAAAGCGAGCGCATTCAGCACTATTTCAACGCGGTTTACTGCTGGGC +ACGCGCCATGCTCAACGAACGTTACCAGGACTATGACGCCACGGCATCCGGTGTGAAGCG +GGGCGAAGAACTGGCAGAAGCCAGCGGTGATTTGTGGCGTGACGCCCGCTGGGCCATCAG +CCGGGTGCAGGATGCGCCGCACTGCACAGTGGAGCTTATCTGA +>this_is_genome_4 +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGACTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTC +TCTTACACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCC +GCCTCAACCTCTATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC +GGTTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATTGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATAGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATC +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATTATGACTAAACTGGGCGTC +GATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATAGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGACGTT +ATCAAACCGTTGATGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCACACTGTTTTTACCCCGTCTACTGGGCATCATGTAAACGCTCATAGGC +GGCGCGCGCGCTCTCAGGAATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGA +GGTCATCAAAAATACGCCGTTCTGGCCTGATGTGAACCTGTCGGAATTTCGCAGTGTGAT 
+GCGCACTGACGGCACGGTGACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTC +TGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGTCAGCAGATGCTGGGCTGGCGGAC +ACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCACTACCA +CAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGC +CACGGCGTCAGGCGTGAAGCGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCG +TGATGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCACCGCACTGTACGGTGGAGCTTAT +CTGACGCTCATAGGCGCGCGCTCATAGCGCGATGGAAACAAATATTACCTGGCAACAATT +GATAGATGAATATTTCTTCGCAAAACCTCTGCGCTCAGCATCTGAATGGAGTTACACCAA +AGTCTTCAAATCATTTGTACATTATATGGGGCCGTTAAGCTGCCCTAATGATGTGACATA +TCACAAAGTGCTTGCCTGGCGCCGTTTTCTTTTAAAAGAGAAAAAGCTGTCCGGACGTAC +CTGGAATAACAAGGTGGCGCATATGCGGGCCATCTTTAACTACGGAATACAGCGAGGGTT +ACTGCACTATGACGAAAATCCGTTTAACAATTCGGTAGTTAAACCGGACAAGAAGAGAAA +GAAAACGCTCACTCAGGCACAGATTGAGTATGCCTATCAGATCATGGAGCAGTATGAAAA +TCAGGAGAATACAGGGCTGGGACTGAAATATTCCCGCTGCGCCTTATTTCCTGCATGGTT +CTGGCTCACTGTCCTGGATACGCTCTATTACACAGGGATACGTCAGAACCAGTTATTACA +TATTCGGCTGAATGATGTTGATTTGAGAGAAGGGCAGATTCGGCTGATTACGGAGGGGTG +TAAAAATCACAAAGAACACTATGTGCCGGTGATCAGTTTTCTGCGTCCACGGCTGACCTG +TTTAATGGAGAAAGCGCAGAGCGAAGGATTGAAAGGTAATGACCGCCTGTTCAATATTGC +ACTTTTTACCGGCAAAGATCCCGCCATTGGCGATGACATGGATTCTCCTCAGGTAAGAGC +ATTCTTCCGTCGTCTGTCCAAGGAGTGTCAGTTTGCGATCAGTCCTCATCGTTTCAGACA +CACGCTGGCCACGGAGATGATGAAAATGCCGGAACAGAATCTGCATATGGCGCAAAGTGT +GCTGGGTCATTCAAACATGAAATCCACGCTGGAGTATGTGGAGAATGATATTGCAGTGAT +GGGGAGGGCTCTGGAAGCGCAGTTTATGCAGATTAAGGCAGCACATGCCCGAAGCATTTA +CAGTGGGTTGACAAAGAATAGATAA diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.log b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.log new file mode 100644 index 0000000000000000000000000000000000000000..4e555a8670fed1a2ffe74b977bb3d5ef3b2fa5ba --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.log @@ -0,0 +1,119 @@ +[13:18:45] This is prokka 1.14-dev +[13:18:45] Written by Torsten Seemann <torsten.seemann@gmail.com> +[13:18:45] 
Homepage is https://github.com/tseemann/prokka +[13:18:45] Local time is Tue Feb 12 13:18:45 2019 +[13:18:45] You are aperrin +[13:18:45] Operating system is darwin +[13:18:45] You have BioPerl 1.006924 +[13:18:45] System has 8 cores. +[13:18:45] Will use maximum of 1 cores. +[13:18:45] Annotating as >>> Bacteria <<< +[13:18:45] Generating locus_tag from 'Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna' contents. +[13:18:45] Setting --locustag OHLPOIIB from MD5 8159822bce92b47adcc120629558b9c6 +[13:18:45] Creating new output folder: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes +[13:18:45] Running: mkdir -p Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes +[13:18:45] Using filename prefix: GEN2.0219.00001.XXX +[13:18:45] Setting HMMER_NCPU=1 +[13:18:45] Writing log to: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.log +[13:18:45] Command: /usr/local/bin/prokka --outdir Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes --cpus 1 --prefix GEN2.0219.00001 Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna +[13:18:45] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin +[13:18:45] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common +[13:18:45] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin +[13:18:45] Looking for 'aragorn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/aragorn +[13:18:45] Determined aragorn version is 1.2 +[13:18:45] Looking for 'barrnap' - found /usr/local/bin/barrnap +[13:18:45] Determined barrnap version is 0.8 +[13:18:45] Looking for 'blastp' - found /Users/aperrin/Softwares/bin/blastp +[13:18:45] Determined blastp version is 2.3 +[13:18:45] Looking for 'cmpress' - found /usr/local/bin/cmpress +[13:18:45] Determined cmpress version is 1.1 +[13:18:45] Looking for 'cmscan' - found /usr/local/bin/cmscan +[13:18:45] Determined cmscan 
version is 1.1 +[13:18:45] Looking for 'egrep' - found /usr/bin/egrep +[13:18:45] Looking for 'find' - found /usr/bin/find +[13:18:45] Looking for 'grep' - found /usr/bin/grep +[13:18:45] Looking for 'hmmpress' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/hmmpress +[13:18:45] Determined hmmpress version is 3.1 +[13:18:45] Looking for 'hmmscan' - found /usr/local/bin/hmmscan +[13:18:45] Determined hmmscan version is 3.1 +[13:18:45] Looking for 'java' - found /usr/bin/java +[13:18:45] Looking for 'less' - found /usr/bin/less +[13:18:45] Looking for 'makeblastdb' - found /Users/aperrin/Softwares/bin/makeblastdb +[13:18:45] Determined makeblastdb version is 2.3 +[13:18:45] Looking for 'minced' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common/minced +[13:18:45] Determined minced version is 2.0 +[13:18:45] Looking for 'parallel' - found /usr/local/bin/parallel +[13:18:45] Determined parallel version is 20181022 +[13:18:45] Looking for 'prodigal' - found /usr/local/bin/prodigal +[13:18:45] Determined prodigal version is 2.6 +[13:18:45] Looking for 'prokka-genbank_to_fasta_db' - found /Users/aperrin/Softwares/src/prokka/bin/prokka-genbank_to_fasta_db +[13:18:45] Looking for 'sed' - found /usr/bin/sed +[13:18:45] Looking for 'tbl2asn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/tbl2asn +[13:18:45] Determined tbl2asn version is 25.6 +[13:18:45] Using genetic code table 11. +[13:18:45] Loading and checking input file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna +[13:18:45] Wrote 4 contigs totalling 10711 bp. +[13:18:45] Predicting tRNAs and tmRNAs +[13:18:45] Running: aragorn -l -gc11 -w Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.fna +[13:18:45] Found 0 tRNAs +[13:18:45] Predicting Ribosomal RNAs +[13:18:45] Running Barrnap with 1 threads +[13:18:45] Found 0 rRNAs +[13:18:45] Skipping ncRNA search, enable with --rfam if desired. 
+[13:18:45] Total of 0 tRNA + rRNA features +[13:18:45] Searching for CRISPR repeats +[13:18:46] Found 0 CRISPRs +[13:18:46] Predicting coding sequences +[13:18:46] Contigs total 10711 bp, so using meta mode +[13:18:46] Running: prodigal -i Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.fna -c -m -g 11 -p meta -f sco -q +[13:18:46] Found 13 CDS +[13:18:46] Connecting features back to sequences +[13:18:46] Not using genus-specific database. Try --usegenus to enable it. +[13:18:46] Annotating CDS, please be patient. +[13:18:46] Will use 1 CPUs for similarity searching. +[13:18:46] There are still 13 unannotated CDS left (started with 13) +[13:18:46] Will use blast to search against /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot with 1 CPUs +[13:18:46] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/sprot\.faa | parallel --gnu --plain -j 1 --block 1697 --recstart '>' --pipe blastp -query - -db /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot -evalue 1e-06 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/sprot\.blast 2> /dev/null +[13:18:47] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/sprot.faa +[13:18:47] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/sprot.blast +[13:18:47] There are still 5 unannotated CDS left (started with 13) +[13:18:47] Will use hmmer3 to search against /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm with 1 CPUs +[13:18:47] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.faa | parallel --gnu --plain -j 1 --block 521 --recstart '>' --pipe hmmscan --noali --notextw --acc -E 1e-06 --cpu 1 /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm /dev/stdin > 
Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.hmmer3 2> /dev/null +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/HAMAP.hmm.faa +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/HAMAP.hmm.hmmer3 +[13:18:48] Labelling remaining 5 proteins as 'hypothetical protein' +[13:18:48] Found 8 unique /gene codes. +[13:18:48] Fixed 0 colliding /gene names. +[13:18:48] Adding /locus_tag identifiers +[13:18:48] Assigned 13 locus_tags to CDS and RNA features. +[13:18:48] Writing outputs to Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/ +[13:18:48] Generating annotation statistics file +[13:18:48] Generating Genbank and Sequin files +[13:18:48] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka' -Z Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.err -i Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.fsa 2> /dev/null +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/errorsummary.val +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.dr +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fixedproducts +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.ecn +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.val +[13:18:48] Repairing broken .GBK output that tbl2asn produces... 
+[13:18:48] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.gbf > Examples\/1\-res\-Annotate\/tmp_files\/genome2\.fst\-split5N\.fna\-prokkaRes\/GEN2\.0219\.00001\.gbk +[13:18:48] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gbf +[13:18:48] Output files: +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gff +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tbl +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.sqn +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fna +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tsv +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.err +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.txt +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.fsa +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.faa +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.log +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.ffn +[13:18:48] Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.gbk +[13:18:48] Annotation finished successfully. +[13:18:48] Walltime used: 0.05 minutes +[13:18:48] If you use this result please cite the Prokka paper: +[13:18:48] Seemann T (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-9. +[13:18:48] Type 'prokka --citation' for more details. +[13:18:48] Share and enjoy! 
diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.sqn b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.sqn new file mode 100644 index 0000000000000000000000000000000000000000..bf085e79297a2ed59f80a72c9597a16cf084ff9b --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.sqn @@ -0,0 +1,1318 @@ +Seq-entry ::= set { + class genbank , + seq-set { + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + day 12 } } , + seq-set { + seq { + id { + local + str "this_is_genome_1" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 2308 , + seq-data + iupacna "ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATG +AAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATTATCGGCTCGTCCGGTTCCGGTAAAA +GCACTTTTTTGCGCTGCATTAACTTCCTCGAAAAATCGAGCGAAGGCGCGATTATCGTGAACGGTCAGAACATTAATC +TGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGG +TGTTTCAGCACTTTAACCTCTGGAACCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGAT +TAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAAT +ATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTAC +TGTTCGATGAACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAG +AAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCGCACGTGATTTTTCTGCATC 
+AGGGGAAAATTGAAGAAGAGGGCAATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGA +AAGGCTCGCTGAAATAAACGCTAGAGGACGCGCCTCTCAGAGAGCGCGCTCTCTCAGAGAGGCGCGCGCCTCTTTCGC +AGAGACCNNCGCTCATGAGCGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCT +GTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCCAGACCGTCGATATTAAGGATTATCCGGCG +GATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGGCAGACTGTGGTCGTGCCGTCAGGATGGGTGTGTGAA +AATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGATACAGGGCGCGGTGCGTGGGAATGGCCGGGGA +CGGTTTATTTTGCTGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGC +GGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAA +CCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTT +CATAACCAAATGGACGGCGCGCGGATTACGCATAGTCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAATGTC +GCGATTCATGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGG +GGGATCGGCATCGGGCTGGCGGGTAGCGCCTATGACAATAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTG +GCCAATATTACCGGATCTGATTGCCGACAACTGGTACACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAA +GCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATTTATGGCTGTGATAAT +TTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTG +TCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAA +ATTTCCTCCGGTAACGCCCCCTCATTTGTTGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAAT +CAACCGCAGCACCTCTTTTTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTT +GATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATC +AATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCG +CTGCCGAAGCGGGGAGGGTAACACTCGCGCATAGAGAGCTCTCAGAGGAGCGCGCGCGCTATAGCGCGC" } } , + seq { + id { + local + str "this_is_genome_1_1" } , + descr { + title "Histidine transport ATP-binding protein HisP [Genus + species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 257 , + seq-data + ncbieaa 
"MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLE +KSSEGAIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWNHMTVLENVMEAPIQVLGLSKHDARERAL +KYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPTSALDPELVGEVLRIMQQLAEEGKTMVVVTHE +MGFARHVSSHVIFLHQGKIEEEGNPEQVFGNPQSPRLQQFLKGSLK" } , + annot { + { + data + ftable { + { + id + local + id 3 , + data + prot { + name { + "Histidine transport ATP-binding protein HisP" } } , + location + int { + from 0 , + to 256 , + id + local + str "this_is_genome_1_1" } } } } } } , + seq { + id { + local + str "this_is_genome_1_2" } , + descr { + title "hypothetical protein OHLPOIIB_00002 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 467 , + seq-data + ncbieaa "MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRQTVDIKDYPADDGIASFKQA +FADGQTVVVPSGWVCENINAAITIPAGKTLRIQGAVRGNGRGRFILLDGCQVVGEQGGSLHNVTLDVRGSDCVIKGVT +MSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMDGARITHSRFSDLQGDAIEWNVAIHDRDILIS +DHVIERIDCTNGKINWGIGIGLAGSAYDNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSK +NAGIDNATIAIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISSGNAPSFV +AITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFMARQDTLLSLANVHAINENGQSSVDI +DRINHQTVNVEAVNFSLPKRGG" } , + annot { + { + data + ftable { + { + id + local + id 4 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 466 , + id + local + str "this_is_genome_1_2" } } } } } } } , + annot { + { + data + ftable { + { + id + local + id 1 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_1_1" , + location + int { + from 0 , + to 773 , + strand plus , + id + local + str "this_is_genome_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P02915" } } , + xref { + { + data + gene { + locus "hisP" , + locus-tag "OHLPOIIB_00001" } } } } , + { + id + local + id 2 , + 
data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_1_2" , + location + int { + from 856 , + to 2259 , + strand plus , + id + local + str "this_is_genome_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "OHLPOIIB_00002" } } } } } } } } , + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + day 12 } } , + seq-set { + seq { + id { + local + str "this_is_genome_2" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 3295 , + seq-data + iupacna "CGCGATAATATAGCGCGCGCTCATAGCATGAGCGCGCTATTTCTGGCCATCCCGT +TAACCATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGT +CGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTGGAAG +ATATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAACGCCATATTATAGCGCGCCTCATAAGAGCGGCCTATA +GCGCGCTANNNATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCAAGCCTGTTCA +GCAAGGTAAACGAGGCGCTGGCCGCCACCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATA +CCTGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGG +AGAGCCGTCAGGAAGTCGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATC +TGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAAT +AGCCAGAGATTCGCGCGTTCAGAGAGGAGCTCTCTCATAGACGCGCGCATATGCGCTCTAGAGAGGCGCGCCTAATGG 
+CGCGCTATGATGAAAGCGCTACTGTGGCTGGTTGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATT +ATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATT +CACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGTCGCAACGTCAGTTCGCATTACCTGATTCCC +GCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCTTGGCATGCGGGC +GTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATG +TCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATCCCGTTAGCGAAGGACATTATC +GCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCG +CGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCT +GGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCC +GATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATC +GCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAAAGCGCTCTCAGAGAG +AGCGCTCTGCAGATGAGTGAGGCTGAAGCCCGCCCGACTAACTTTATTCGTCAGATTATTGATGAAGATCTGGCGAGT +GGTAAACATACCACTGTCCATACCCGTTTTCCGCCGGAGCCGAATGGCTATCTGCACATCGGCCACGCGAAATCTATC +TGCCTGAACTTTGGCATCGCGCAAGATTATCAGGGCCAGTGCAACCTGCGTTTCGATGACACCAACCCGGTAAAAGAA +GATATCGAGTACGTTGATTCGATCAAAAACGACGTCGAGTGGTTAGGCTTTCACTGGTCTGGCGATATTCGCTACTCC +TCCGATTACTTTGACCAACTGCACGCCTATGCGGTCGAGCTCATCAATAAAGGCCTGGCCTACGTTGATGAGCTGACG +CCGGAGCAGATCCGCGAATACCGTGGTACGCTGACCGCGCCGGGTAAAAACAGCCCGTTCCGCGATCGCAGCGTGGAA +GAAAACCTCGCGCTATTTGAAAAAATGCGTACCGGCGGTTTTGAAGAGGGTAAAGCCTGTCTGCGCGCTAAAATCGAC +ATGGCGTCGCCGTTTATCGTGATGCGCGATCCGGTGCTGTATCGCATTAAATTCGCCGAGCATCATCAGACCGGCACG +AAGTGGTGCATCTATCCGATGTACGACTTTACTCACTGCATCAGCGATGCGCTGGAAGGCATTACTCATTCTCTGTGT +ACGCTGGAGTTCCAGGACAACCGTCGTCTGTACGACTGGGTGCTGGACAACATCACCATTCCGGTTCACCCGCGCCAG +TACGAATTCTCGCGCCTGAATCTGGAATACACCGTGATGTCCAAGCGTAAGCTGAACCTGCTGGTGACCGACAAACAC +GTCGAAGGTTGGGATGATCCGCGTATGCCGACTATTTCCGGTCTGCGCCGTCGCGGCTATACCGCGGCTTCTATTCGT +GAGTTCTGCAAACGCATCGGCGTCACCAAGCAGGACAACACTATTGAGATGGCGTCGCTGGAATCCTGCATTCGCGAA +GATCTGAACGAAAACGCGCCGCGCGCGATGGCGGTAATCGATCCGGTAAAACTGGTTATCGAAAACTACCCGCAGGGC 
+GAGAGCGAAATGGTTACCATGCCTAACCATCCGAATAAACCGGAGATGGGCAGCCGTGAAGTGCCGTTTAGCGGTGAG +ATCTGGATCGATCGCGCGGATTTCCGCGAAGAAGCGAACAAACAGTACAAACGTCTGGTGATGGGCAAAGAAGTGCGT +CTGCGTAATGCCTACGTCATTAAAGCGGAGCGCGTAGAGAAGGATGCCGAAGGGAATATCACCACCATCTTCTGTACC +TATGATGCTGATACGCTGAGTAAAGATCCGGCTGACGGGCGTAAAGTGAAAGGCGTAATCCACTGGGTTAGCGTAGCA +CATGCGCTGCCGATTGAAATTCGTCTCTACGACCGTCTGTTCAGCGTGCCGAACCCGGGCGCCGCGGAGGACTTCCTG +TCTGTTATCAACCCCGAATCATTAGTGATTAAGCAGGGGTATGGCGAGCCGTCGCTGAAAGCGGCGGTAGCAGGGAAA +GCTTTCCAGTTTGAACGTGAAGGTTACTTCTGCCTTGACAGCCGCTATGCAACGGCCGATAAGCTGGTCTTTAACCGC +ACCGTGGGCCTGCGTGATACCTGGGCGAAAGCGGGCGAATAA" } } , + seq { + id { + local + str "this_is_genome_2_1" } , + descr { + title "Phage shock protein B [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 74 , + seq-data + ncbieaa "MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLLQLTDDAQRMRE +RIQALEDILDAEHPNWRER" } , + annot { + { + data + ftable { + { + id + local + id 9 , + data + prot { + name { + "Phage shock protein B" } } , + location + int { + from 0 , + to 73 , + id + local + str "this_is_genome_2_1" } } } } } } , + seq { + id { + local + str "this_is_genome_2_2" } , + descr { + title "5-carboxymethyl-2-hydroxymuconate Delta-isomerase [Genus + species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 126 , + seq-data + ncbieaa "MPHFIAECTENIREQADLPSLFSKVNEALAATGIFPIGGIRSRAHWLDTWQMADG +KHDYAFVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNNVHALFK" } , + annot { + { + data + ftable { + { + id + local + id 10 , + data + prot { + name { + "5-carboxymethyl-2-hydroxymuconate Delta-isomerase" } , + ec { + "5.3.3.10" } } , + location + int { + from 0 , + to 125 , + id + local + str "this_is_genome_2_2" } } } } } } , + seq { + id { + local + str "this_is_genome_2_3" } , + descr { + title "N-acetylmuramoyl-L-alanine amidase AmiD [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } 
, + inst { + repr raw , + mol aa , + length 277 , + seq-data + ncbieaa "MMKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAE +NFDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIELENRGWRMSGGVK +SFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPRFPWRELAAQGIGAWPDAQRVAFYLAGRAPY +TPVDTATVLALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD" } , + annot { + { + data + ftable { + { + id + local + id 11 , + data + prot { + name { + "N-acetylmuramoyl-L-alanine amidase AmiD" } , + ec { + "3.5.1.28" } } , + location + int { + from 0 , + to 276 , + id + local + str "this_is_genome_2_3" } } } } } } , + seq { + id { + local + str "this_is_genome_2_4" } , + descr { + title "Glutamine--tRNA ligase [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 555 , + seq-data + ncbieaa "MSEAEARPTNFIRQIIDEDLASGKHTTVHTRFPPEPNGYLHIGHAKSICLNFGIA +QDYQGQCNLRFDDTNPVKEDIEYVDSIKNDVEWLGFHWSGDIRYSSDYFDQLHAYAVELINKGLAYVDELTPEQIREY +RGTLTAPGKNSPFRDRSVEENLALFEKMRTGGFEEGKACLRAKIDMASPFIVMRDPVLYRIKFAEHHQTGTKWCIYPM +YDFTHCISDALEGITHSLCTLEFQDNRRLYDWVLDNITIPVHPRQYEFSRLNLEYTVMSKRKLNLLVTDKHVEGWDDP +RMPTISGLRRRGYTAASIREFCKRIGVTKQDNTIEMASLESCIREDLNENAPRAMAVIDPVKLVIENYPQGESEMVTM +PNHPNKPEMGSREVPFSGEIWIDRADFREEANKQYKRLVMGKEVRLRNAYVIKAERVEKDAEGNITTIFCTYDADTLS +KDPADGRKVKGVIHWVSVAHALPIEIRLYDRLFSVPNPGAAEDFLSVINPESLVIKQGYGEPSLKAAVAGKAFQFERE +GYFCLDSRYATADKLVFNRTVGLRDTWAKAGE" } , + annot { + { + data + ftable { + { + id + local + id 12 , + data + prot { + name { + "Glutamine--tRNA ligase" } , + ec { + "6.1.1.18" } } , + location + int { + from 0 , + to 554 , + id + local + str "this_is_genome_2_4" } } } } } } } , + annot { + { + data + ftable { + { + id + local + id 5 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_2_1" , + location + int { + from 27 , + to 251 , + strand plus , + id + local + str "this_is_genome_2" } , + qual { + { + qual "inference" , + val "ab initio 
prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P0AFM9" } } , + xref { + { + data + gene { + locus "pspB" , + locus-tag "OHLPOIIB_00003" } } } } , + { + id + local + id 6 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_2_2" , + location + int { + from 300 , + to 680 , + strand plus , + id + local + str "this_is_genome_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:Q05354" } } , + xref { + { + data + gene { + locus "hpcD" , + locus-tag "OHLPOIIB_00004" } } } } , + { + id + local + id 7 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_2_3" , + location + int { + from 763 , + to 1596 , + strand plus , + id + local + str "this_is_genome_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P75820" } } , + xref { + { + data + gene { + locus "amiD" , + locus-tag "OHLPOIIB_00005" } } } } , + { + id + local + id 8 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_2_4" , + location + int { + from 1627 , + to 3294 , + strand plus , + id + local + str "this_is_genome_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P00962" } } , + xref { + { + data + gene { + locus "glnS" , + locus-tag "OHLPOIIB_00006" } } } } } } } } , + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + 
str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + day 12 } } , + seq-set { + seq { + id { + local + str "this_is_genome_3" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 2263 , + seq-data + iupacna "ACGCGCTATAGGGCTCTCAGAGAGTCTCAGTGTTTATTGGCATCGTTAGCCTGTT +TCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAG +CTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTT +AATGATGGTGCAACCCTTGCGGGACGCCATTCATGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCT +GTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAACTCATTCTGGTGTGTGG +TCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGGCCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAG +CGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATC +AGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGA +AGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCACTGGGCCGAACCTG +GCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGA +ACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGAGATCGCGCATAGCGCGCGGCGAGATCCGCGAGACA +TGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATT +TCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTA +GCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATC +ATTCCGGGATGGCGGTTCGCTTCTTTATGGCTGGTTATCGACTCGAAGGTTGAACGCCTCTAGAGCGCTAGAGGCGCG +CGCGATATACGCGGCGCGAGACATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGC +GTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTGCCGCCGTATCCTGC +CGATGCGGCGGAGTTGCTGGAATCTCTCGTACTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGG +TGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGA 
+ATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTA +TGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGT +GCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGGGGTAATCACGCATTCAC +GGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACCGGATATACGCGCGCGCGC +TATATAGCGCGCGCGGCGATATAGCGCATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAAATCATCAGA +AATACGCCGTTCTGGCCTGATGTGGACCTGTCGGAGTTTCGCAGCATGATGCGCACTGATGGCACGGTGACGCAGCCG +CGTTTAAAGCAGGTTGCGCTGTCGGCAATTTCGGAGGTCAACGCAGAGCTGTATGAGTTTCGCAGACGCCAGCAGATG +CTGGGGTATGCCTCGCTGGCAGAAGTCCCGGCGGAACAACTGGACGGCAAAAGCGAGCGCATTCAGCACTATTTCAAC +GCGGTTTACTGCTGGGCACGCGCCATGCTCAACGAACGTTACCAGGACTATGACGCCACGGCATCCGGTGTGAAGCGG +GGCGAAGAACTGGCAGAAGCCAGCGGTGATTTGTGGCGTGACGCCCGCTGGGCCATCAGCCGGGTGCAGGATGCGCCG +CACTGCACAGTGGAGCTTATCTGA" } } , + seq { + id { + local + str "this_is_genome_3_1" } , + descr { + title "tRNA (guanine-N(1)-)-methyltransferase [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 255 , + seq-data + ncbieaa "MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYG +GGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDERVIQAEIDEEWSI +GDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQ +SLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA" } , + annot { + { + data + ftable { + { + id + local + id 17 , + data + prot { + name { + "tRNA (guanine-N(1)-)-methyltransferase" } , + ec { + "2.1.1.228" } } , + location + int { + from 0 , + to 254 , + id + local + str "this_is_genome_3_1" } } } } } } , + seq { + id { + local + str "this_is_genome_3_2" } , + descr { + title "hypothetical protein OHLPOIIB_00008 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 86 , + seq-data + ncbieaa "MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRY +GLDKEMVMDFFKENHSGMAVRFFMAGYRLEG" } , + annot { + { + data + ftable { + { + 
id + local + id 18 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 85 , + id + local + str "this_is_genome_3_2" } } } } } } , + seq { + id { + local + str "this_is_genome_3_3" } , + descr { + title "hypothetical protein OHLPOIIB_00009 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 193 , + seq-data + ncbieaa "MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLV +LEHGGAPLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESRHIYDLKVMQIDPL +EAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDYFNQIVDFLGLHSC" } , + annot { + { + data + ftable { + { + id + local + id 19 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 192 , + id + local + str "this_is_genome_3_3" } } } } } } , + seq { + id { + local + str "this_is_genome_3_4" } , + descr { + title "hypothetical protein OHLPOIIB_00010 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 122 , + seq-data + ncbieaa "MMRTDGTVTQPRLKQVALSAISEVNAELYEFRRRQQMLGYASLAEVPAEQLDGKS +ERIQHYFNAVYCWARAMLNERYQDYDATASGVKRGEELAEASGDLWRDARWAISRVQDAPHCTVELI" } , + annot { + { + data + ftable { + { + id + local + id 20 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 121 , + id + local + str "this_is_genome_3_4" } } } } } } } , + annot { + { + data + ftable { + { + id + local + id 13 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_3_1" , + location + int { + from 29 , + to 796 , + strand plus , + id + local + str "this_is_genome_3" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P0A873" } } , + xref { + { + data + gene { + locus "trmD" , + locus-tag "OHLPOIIB_00007" } } } } , + { + id + local + id 14 , + data + cdregion { 
+ frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_3_2" , + location + int { + from 861 , + to 1121 , + strand plus , + id + local + str "this_is_genome_3" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "OHLPOIIB_00008" } } } } , + { + id + local + id 15 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_3_3" , + location + int { + from 1169 , + to 1750 , + strand plus , + id + local + str "this_is_genome_3" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "OHLPOIIB_00009" } } } } , + { + id + local + id 16 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_3_4" , + location + int { + from 1894 , + to 2262 , + strand plus , + id + local + str "this_is_genome_3" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "OHLPOIIB_00010" } } } } } } } } , + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + day 12 } } , + seq-set { + seq { + id { + local + str "this_is_genome_4" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 2845 , + seq-data + iupacna 
"ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTAC +TGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATA +TTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTT +TGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGACTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGC +CCGGCTCGCTCTCTTACACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCGATTGCCGCCTCAA +CCTCTATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGTTTTGCGGCCGCGGTGAATATCG +CCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGA +TTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTCG +CTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCA +GCCTGTTACTGATCGTGATTATTGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCG +TGGTGTATACGTTATTGCTGACGATAGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGA +CAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATGTCGATCACCAATA +TTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTT +TGTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATTATGACTA +AACTGGGCGTCGATCCGGTGCATTTCGGCATTATCATGATCTATAACCTGGCGATAGGCACCATTACGCCGCCAGTTG +GCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGACGTTATCAAACCGTTGATGCCTTTTTACG +GCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCACACTGTTTTTACCCCGTCTACTGGGCATCA +TGTAAACGCTCATAGGCGGCGCGCGCGCTCTCAGGAATGAAGTTTGTTGCGCCAGAACAGGCACCGGAACAGGCGGAG +GTCATCAAAAATACGCCGTTCTGGCCTGATGTGAACCTGTCGGAATTTCGCAGTGTGATGCGCACTGACGGCACGGTG +ACGCAGCCGCGTTTAAAGCAGGTCGTGCTGACGGCGATCTCTGAGGTTAACGCTGAGCTGTACGACTTCCGCAACCGT +CAGCAGATGCTGGGCTGGCGGACACTTGCTGAGGTTCCCGCAGAAATGCTGGACGGTAAAAGCGAGCGTATCCGGCAC +TACCACAACGCTGTTTTTTGCTGGGCGCGCGCTGTGCTTAATGAGCGTTATCAGGACTATGACGCCACGGCGTCAGGC +GTGAAGCGAGGGGAGGAGCTGGCGGAGGCCAGCGGCGATCTGTGGCGTGATGCCCGCTGGGCCATCAGCCGGGTGCAG +GATGCACCGCACTGTACGGTGGAGCTTATCTGACGCTCATAGGCGCGCGCTCATAGCGCGATGGAAACAAATATTACC +TGGCAACAATTGATAGATGAATATTTCTTCGCAAAACCTCTGCGCTCAGCATCTGAATGGAGTTACACCAAAGTCTTC 
+AAATCATTTGTACATTATATGGGGCCGTTAAGCTGCCCTAATGATGTGACATATCACAAAGTGCTTGCCTGGCGCCGT +TTTCTTTTAAAAGAGAAAAAGCTGTCCGGACGTACCTGGAATAACAAGGTGGCGCATATGCGGGCCATCTTTAACTAC +GGAATACAGCGAGGGTTACTGCACTATGACGAAAATCCGTTTAACAATTCGGTAGTTAAACCGGACAAGAAGAGAAAG +AAAACGCTCACTCAGGCACAGATTGAGTATGCCTATCAGATCATGGAGCAGTATGAAAATCAGGAGAATACAGGGCTG +GGACTGAAATATTCCCGCTGCGCCTTATTTCCTGCATGGTTCTGGCTCACTGTCCTGGATACGCTCTATTACACAGGG +ATACGTCAGAACCAGTTATTACATATTCGGCTGAATGATGTTGATTTGAGAGAAGGGCAGATTCGGCTGATTACGGAG +GGGTGTAAAAATCACAAAGAACACTATGTGCCGGTGATCAGTTTTCTGCGTCCACGGCTGACCTGTTTAATGGAGAAA +GCGCAGAGCGAAGGATTGAAAGGTAATGACCGCCTGTTCAATATTGCACTTTTTACCGGCAAAGATCCCGCCATTGGC +GATGACATGGATTCTCCTCAGGTAAGAGCATTCTTCCGTCGTCTGTCCAAGGAGTGTCAGTTTGCGATCAGTCCTCAT +CGTTTCAGACACACGCTGGCCACGGAGATGATGAAAATGCCGGAACAGAATCTGCATATGGCGCAAAGTGTGCTGGGT +CATTCAAACATGAAATCCACGCTGGAGTATGTGGAGAATGATATTGCAGTGATGGGGAGGGCTCTGGAAGCGCAGTTT +ATGCAGATTAAGGCAGCACATGCCCGAAGCATTTACAGTGGGTTGACAAAGAATAGATAA" } } , + seq { + id { + local + str "this_is_genome_4_1" } , + descr { + title "C4-dicarboxylate TRAP transporter large permease protein + DctM [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 435 , + seq-data + ncbieaa "MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMF +SSLDSFALLAVPFFVLSGVIMNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIAASTSIGGVMVPMS +AREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFAGGLVAGVLWGVGCMLVTLVVAKRRNYRVFFT +VQKGMALKVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVYTLLLTIVFYRTLKIKDLPSILLQTVVMTGVIMFL +LATSSAMSFSMSITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMTKLGVDPVHFGI +IMIYNLAIGTITPPVGSGLYVGASVGKVKVEDVIKPLMPFYGAIIGVLLLITYIPEITLFLPRLLGIM" } , + annot { + { + data + ftable { + { + id + local + id 24 , + data + prot { + name { + "C4-dicarboxylate TRAP transporter large permease + protein DctM" } } , + location + int { + from 0 , + to 434 , + id + local + str "this_is_genome_4_1" } } } } } } , + seq { + id { + local + str "this_is_genome_4_2" } , + descr 
{ + title "hypothetical protein OHLPOIIB_00012 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 154 , + seq-data + ncbieaa "MKFVAPEQAPEQAEVIKNTPFWPDVNLSEFRSVMRTDGTVTQPRLKQVVLTAISE +VNAELYDFRNRQQMLGWRTLAEVPAEMLDGKSERIRHYHNAVFCWARAVLNERYQDYDATASGVKRGEELAEASGDLW +RDARWAISRVQDAPHCTVELI" } , + annot { + { + data + ftable { + { + id + local + id 25 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 153 , + id + local + str "this_is_genome_4_2" } } } } } } , + seq { + id { + local + str "this_is_genome_4_3" } , + descr { + title "Tyrosine recombinase XerC [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 337 , + seq-data + ncbieaa "METNITWQQLIDEYFFAKPLRSASEWSYTKVFKSFVHYMGPLSCPNDVTYHKVLA +WRRFLLKEKKLSGRTWNNKVAHMRAIFNYGIQRGLLHYDENPFNNSVVKPDKKRKKTLTQAQIEYAYQIMEQYENQEN +TGLGLKYSRCALFPAWFWLTVLDTLYYTGIRQNQLLHIRLNDVDLREGQIRLITEGCKNHKEHYVPVISFLRPRLTCL +MEKAQSEGLKGNDRLFNIALFTGKDPAIGDDMDSPQVRAFFRRLSKECQFAISPHRFRHTLATEMMKMPEQNLHMAQS +VLGHSNMKSTLEYVENDIAVMGRALEAQFMQIKAAHARSIYSGLTKNR" } , + annot { + { + data + ftable { + { + id + local + id 26 , + data + prot { + name { + "Tyrosine recombinase XerC" } } , + location + int { + from 0 , + to 336 , + id + local + str "this_is_genome_4_3" } } } } } } } , + annot { + { + data + ftable { + { + id + local + id 21 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_4_1" , + location + int { + from 0 , + to 1307 , + strand plus , + id + local + str "this_is_genome_4" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:O07838" } } , + xref { + { + data + gene { + locus "dctM" , + locus-tag "OHLPOIIB_00011" } } } } , + { + id + local + id 22 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole 
+ local + str "this_is_genome_4_2" , + location + int { + from 1339 , + to 1803 , + strand plus , + id + local + str "this_is_genome_4" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "OHLPOIIB_00012" } } } } , + { + id + local + id 23 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "this_is_genome_4_3" , + location + int { + from 1831 , + to 2844 , + strand plus , + id + local + str "this_is_genome_4" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P39776" } } , + xref { + { + data + gene { + locus "xerC" , + locus-tag "OHLPOIIB_00013" } } } } } } } } } } diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tbl b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tbl new file mode 100644 index 0000000000000000000000000000000000000000..a84c1e12fffdcf294a01f7461f87026eb390e347 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tbl @@ -0,0 +1,82 @@ +>Feature this_is_genome_1 +1 774 CDS + dbxref COG:COG4598 + gene hisP + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P02915 + locus_tag OHLPOIIB_00001 + product Histidine transport ATP-binding protein HisP +857 2260 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag OHLPOIIB_00002 + product hypothetical protein +>Feature this_is_genome_2 +28 252 CDS + gene pspB + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P0AFM9 + locus_tag OHLPOIIB_00003 + product Phage shock protein B +301 681 CDS + EC_number 5.3.3.10 + gene hpcD + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:Q05354 + locus_tag OHLPOIIB_00004 + product 
5-carboxymethyl-2-hydroxymuconate Delta-isomerase +764 1597 CDS + EC_number 3.5.1.28 + dbxref COG:COG3023 + gene amiD + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P75820 + locus_tag OHLPOIIB_00005 + product N-acetylmuramoyl-L-alanine amidase AmiD +1628 3295 CDS + EC_number 6.1.1.18 + dbxref COG:COG0008 + gene glnS + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P00962 + locus_tag OHLPOIIB_00006 + product Glutamine--tRNA ligase +>Feature this_is_genome_3 +30 797 CDS + EC_number 2.1.1.228 + dbxref COG:COG0336 + gene trmD + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P0A873 + locus_tag OHLPOIIB_00007 + product tRNA (guanine-N(1)-)-methyltransferase +862 1122 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag OHLPOIIB_00008 + product hypothetical protein +1170 1751 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag OHLPOIIB_00009 + product hypothetical protein +1895 2263 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag OHLPOIIB_00010 + product hypothetical protein +>Feature this_is_genome_4 +1 1308 CDS + dbxref COG:COG1593 + gene dctM + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:O07838 + locus_tag OHLPOIIB_00011 + product C4-dicarboxylate TRAP transporter large permease protein DctM +1340 1804 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag OHLPOIIB_00012 + product hypothetical protein +1832 2845 CDS + dbxref COG:COG4974 + gene xerC + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P39776 + locus_tag OHLPOIIB_00013 + product Tyrosine recombinase XerC diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tsv b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tsv new file mode 100644 index 
0000000000000000000000000000000000000000..73686a0ef5d0d079828e6630a7e50952851231d0 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.tsv @@ -0,0 +1,14 @@ +locus_tag ftype length_bp gene EC_number COG product +OHLPOIIB_00001 CDS 774 hisP COG4598 Histidine transport ATP-binding protein HisP +OHLPOIIB_00002 CDS 1404 hypothetical protein +OHLPOIIB_00003 CDS 225 pspB Phage shock protein B +OHLPOIIB_00004 CDS 381 hpcD 5.3.3.10 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +OHLPOIIB_00005 CDS 834 amiD 3.5.1.28 COG3023 N-acetylmuramoyl-L-alanine amidase AmiD +OHLPOIIB_00006 CDS 1668 glnS 6.1.1.18 COG0008 Glutamine--tRNA ligase +OHLPOIIB_00007 CDS 768 trmD 2.1.1.228 COG0336 tRNA (guanine-N(1)-)-methyltransferase +OHLPOIIB_00008 CDS 261 hypothetical protein +OHLPOIIB_00009 CDS 582 hypothetical protein +OHLPOIIB_00010 CDS 369 hypothetical protein +OHLPOIIB_00011 CDS 1308 dctM COG1593 C4-dicarboxylate TRAP transporter large permease protein DctM +OHLPOIIB_00012 CDS 465 hypothetical protein +OHLPOIIB_00013 CDS 1014 xerC COG4974 Tyrosine recombinase XerC diff --git a/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.txt b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.txt new file mode 100644 index 0000000000000000000000000000000000000000..7c6686d050eb977ffdce18b3fd8062a41495ef9b --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome2.fst-split5N.fna-prokkaRes/GEN2.0219.00001.txt @@ -0,0 +1,4 @@ +organism: Genus species strain +contigs: 4 +bases: 10711 +CDS: 13 diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna new file mode 100644 index 0000000000000000000000000000000000000000..35d4e193748f3c3fa8abcaa61bc65ae4cae8054f --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna @@ -0,0 +1,25 @@ +>header_genome3_chromo 
+ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +CGAGTCCGGGAGCGGCGAGATTCCGGGCAGAGGAGGCGGCTATAGCGG +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +CAGGGCGCGGTATCGCGAGGACTCTCTCTTCGAGGAC 
+ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTCGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAG +CCCGAGTATTAGCGCGCGGCGGCGCTATAGGCGCGCTATAGGC +ATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTGGAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAA +CGAGAGTCTCGGAGGAGCGGCGCTCTGg +ATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTTGCTGTTCTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGACGGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGATATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTCCTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAAC
GGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAA +CGCGTCTGCGCGCGCGGAGAGAGGCTCTCAGGAC +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGACGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCTCTCCGGTGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACTTCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGTTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +CCGGATGCGCGCGATATATCGGCGATGC +GTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAATGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGCACCGTACCGTGGACGATCGTCCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGAGTTCTGGGGCATGAAGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAG +CCGGCTCTGCGAGAGAGGAGCGCTCGC 
+ATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA + +>genome3_plasmid +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTACCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +CGCGTAGCGGCCGGAATCTTCTCGGAGAGGCGCTTCTCTCTCGGAG +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGATCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGTTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTAGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATCGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCAATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGC
CGATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTGGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCATACTGTTCTTACCCCGTCTACTGGGCATCATGTAA +ACCCGATGGCGCGCAGAGGCGCGAGTTCTGGANNNNNNCACAGGGCTTAGAGGCGCTATGGCA +ATGATAATTAATGGGAAATTAATTAAAGCAAAAGACTTAGCTAAGGCTGCAGGTGTATCTCGTTCAACAGTGATTAAATATTACGGCATTAGCCGTGAGAATTACGAAAGGGTAGCAACTGAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATCAGGTTTAAAATGGAAAGAAGTTGCTGAAAAAATGAACACGACAAAATATAGCGCAATTGCATATTATAGACGATATTTAGCATTAGAGAAAAACAAATAA +CAGGCGCTAAGGCGGCGATCCTAGCGCGCGATCGCGCATGCG +ATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGCCGAACAGCTCCAGCTGGACAGTCTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGTGGATCAGGAATGGAGCTACATGGACTTCCTGGAGCACCTGTTACATGAGGAGAAACTGGCCCGGCATCAGCGTAAACAGGCGATGTACACGCGGATGGCAGCCTTCCCGGCGGTAAAGACGTTCGAGGAGTACGACTTCACCTTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCTGCGATCCCTGAGCTTCATAGAGCGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGTGGGAAAAACGCATCTGGCGATAGCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGTTCGCTTCACAACAGCAGCGGACCTGCTGCTACAGCTGTCCACTTCACAGCGTCAGGGCCGTTACAAAACGACTCTCAATCGTGGTGTCATGGCCCCGAAGCTGCTTATCATCGATGAAATAGGTTATCTGCCGTTCAGTCAGGAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACGTTACGAGAAGAGCGCGATGATCCTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGACGTTCGCCGGTGATGCAGCGCTGACATCGGCGATGCTGGACCGGATCTTACATCACTCACACGTCGTGCAAATAAAAGGGGAAAGCTATCGACTGAAGCAGAAACGAAAGGCCGGGGTTATAGCTGAAGCTAATCCTGAGTAA \ No newline at end of file diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna new file mode 100644 index 0000000000000000000000000000000000000000..4ac241054ac16abe79be2b7ad970cc22310cad1c --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna @@ -0,0 +1,6 @@ +>header_genome3_1 
+ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGAGTCCGGGAGCGGCGAGATTCCGGGCAGAGGAGGCGGCTATAGCGGATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAGGGCGCGGTATCGCGAGGACTCTCTCTTCGAGGACATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTG
TTCGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCCGAGTATTAGCGCGCGGCGGCGCTATAGGCGCGCTATAGGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTGGAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAACGAGAGTCTCGGAGGAGCGGCGCTCTGGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTTGCTGTTCTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGACGGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGATATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTCCTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGCGTCTGCGCGCGCGGAGAGAGGCTCTCAGGACATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCAC
TTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGACGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCTCTCCGGTGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACTTCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGTTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACCGGATGCGCGCGATATATCGGCGATGCGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAATGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGCACCGTACCGTGGACGATCGTCCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGAGTTCTGGGGCATGAAGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCCGGCTCTGCGAGAGAGGAGCGCTCGCATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA +>genome3_plasmi_2 
+ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTACCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACGCGTAGCGGCCGGAATCTTCTCGGAGAGGCGCTTCTCTCTCGGAGATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGATCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGTTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTAGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATCGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCAATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTGGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCATACTGTTCTTACCCCGTCTACTGGGCATCATGTAAACCCGATGGCGCGCAGAGGCGCGAGTTCTGGA +>genome3_plasmi_3 
+CACAGGGCTTAGAGGCGCTATGGCAATGATAATTAATGGGAAATTAATTAAAGCAAAAGACTTAGCTAAGGCTGCAGGTGTATCTCGTTCAACAGTGATTAAATATTACGGCATTAGCCGTGAGAATTACGAAAGGGTAGCAACTGAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATCAGGTTTAAAATGGAAAGAAGTTGCTGAAAAAATGAACACGACAAAATATAGCGCAATTGCATATTATAGACGATATTTAGCATTAGAGAAAAACAAATAACAGGCGCTAAGGCGGCGATCCTAGCGCGCGATCGCGCATGCGATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGCCGAACAGCTCCAGCTGGACAGTCTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGTGGATCAGGAATGGAGCTACATGGACTTCCTGGAGCACCTGTTACATGAGGAGAAACTGGCCCGGCATCAGCGTAAACAGGCGATGTACACGCGGATGGCAGCCTTCCCGGCGGTAAAGACGTTCGAGGAGTACGACTTCACCTTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCTGCGATCCCTGAGCTTCATAGAGCGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGTGGGAAAAACGCATCTGGCGATAGCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGTTCGCTTCACAACAGCAGCGGACCTGCTGCTACAGCTGTCCACTTCACAGCGTCAGGGCCGTTACAAAACGACTCTCAATCGTGGTGTCATGGCCCCGAAGCTGCTTATCATCGATGAAATAGGTTATCTGCCGTTCAGTCAGGAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACGTTACGAGAAGAGCGCGATGATCCTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGACGTTCGCCGGTGATGCAGCGCTGACATCGGCGATGCTGGACCGGATCTTACATCACTCACACGTCGTGCAAATAAAAGGGGAAAGCTATCGACTGAAGCAGAAACGAAAGGCCGGGGTTATAGCTGAAGCTAATCCTGAGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokka.log b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokka.log new file mode 100644 index 0000000000000000000000000000000000000000..5b897f7f9cddeebf15360eadd439465292a33ed6 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokka.log @@ -0,0 +1,121 @@ +[13:18:48] This is prokka 1.14-dev +[13:18:48] Written by Torsten Seemann <torsten.seemann@gmail.com> +[13:18:48] Homepage is https://github.com/tseemann/prokka +[13:18:48] Local time is Tue Feb 12 13:18:48 2019 +[13:18:48] You are aperrin +[13:18:48] Operating system is darwin +[13:18:48] You have BioPerl 1.006924 +[13:18:48] System has 8 cores. +[13:18:48] Will use maximum of 1 cores. 
+[13:18:48] Annotating as >>> Bacteria <<< +[13:18:48] Generating locus_tag from 'Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna' contents. +[13:18:48] Setting --locustag PFKCIMHH from MD5 9f4c2611ce3f44c6b2e6fb97579c5725 +[13:18:48] Creating new output folder: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes +[13:18:48] Running: mkdir -p Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes +[13:18:48] Using filename prefix: EXAM.1216.00002.XXX +[13:18:48] Setting HMMER_NCPU=1 +[13:18:48] Writing log to: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.log +[13:18:48] Command: /usr/local/bin/prokka --outdir Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes --cpus 1 --prefix EXAM.1216.00002 Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna +[13:18:48] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin +[13:18:48] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common +[13:18:48] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin +[13:18:48] Looking for 'aragorn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/aragorn +[13:18:48] Determined aragorn version is 1.2 +[13:18:48] Looking for 'barrnap' - found /usr/local/bin/barrnap +[13:18:48] Determined barrnap version is 0.8 +[13:18:48] Looking for 'blastp' - found /Users/aperrin/Softwares/bin/blastp +[13:18:48] Determined blastp version is 2.3 +[13:18:48] Looking for 'cmpress' - found /usr/local/bin/cmpress +[13:18:48] Determined cmpress version is 1.1 +[13:18:48] Looking for 'cmscan' - found /usr/local/bin/cmscan +[13:18:48] Determined cmscan version is 1.1 +[13:18:48] Looking for 'egrep' - found /usr/bin/egrep +[13:18:48] Looking for 'find' - found /usr/bin/find +[13:18:48] Looking for 'grep' - found /usr/bin/grep +[13:18:48] 
Looking for 'hmmpress' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/hmmpress +[13:18:48] Determined hmmpress version is 3.1 +[13:18:48] Looking for 'hmmscan' - found /usr/local/bin/hmmscan +[13:18:48] Determined hmmscan version is 3.1 +[13:18:48] Looking for 'java' - found /usr/bin/java +[13:18:48] Looking for 'less' - found /usr/bin/less +[13:18:48] Looking for 'makeblastdb' - found /Users/aperrin/Softwares/bin/makeblastdb +[13:18:48] Determined makeblastdb version is 2.3 +[13:18:48] Looking for 'minced' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common/minced +[13:18:48] Determined minced version is 2.0 +[13:18:48] Looking for 'parallel' - found /usr/local/bin/parallel +[13:18:49] Determined parallel version is 20181022 +[13:18:49] Looking for 'prodigal' - found /usr/local/bin/prodigal +[13:18:49] Determined prodigal version is 2.6 +[13:18:49] Looking for 'prokka-genbank_to_fasta_db' - found /Users/aperrin/Softwares/src/prokka/bin/prokka-genbank_to_fasta_db +[13:18:49] Looking for 'sed' - found /usr/bin/sed +[13:18:49] Looking for 'tbl2asn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/tbl2asn +[13:18:49] Determined tbl2asn version is 25.6 +[13:18:49] Using genetic code table 11. +[13:18:49] Loading and checking input file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna +[13:18:49] Wrote 3 contigs totalling 8817 bp. +[13:18:49] Predicting tRNAs and tmRNAs +[13:18:49] Running: aragorn -l -gc11 -w Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.fna +[13:18:49] Found 0 tRNAs +[13:18:49] Predicting Ribosomal RNAs +[13:18:49] Running Barrnap with 1 threads +[13:18:49] Found 0 rRNAs +[13:18:49] Skipping ncRNA search, enable with --rfam if desired. 
+[13:18:49] Total of 0 tRNA + rRNA features +[13:18:49] Searching for CRISPR repeats +[13:18:49] Found 0 CRISPRs +[13:18:49] Predicting coding sequences +[13:18:49] Contigs total 8817 bp, so using meta mode +[13:18:49] Running: prodigal -i Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.fna -c -m -g 11 -p meta -f sco -q +[13:18:49] Found 12 CDS +[13:18:49] Connecting features back to sequences +[13:18:49] Not using genus-specific database. Try --usegenus to enable it. +[13:18:49] Annotating CDS, please be patient. +[13:18:49] Will use 1 CPUs for similarity searching. +[13:18:49] There are still 12 unannotated CDS left (started with 12) +[13:18:49] Will use blast to search against /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot with 1 CPUs +[13:18:49] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/sprot\.faa | parallel --gnu --plain -j 1 --block 1443 --recstart '>' --pipe blastp -query - -db /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot -evalue 1e-06 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/sprot\.blast 2> /dev/null +[13:18:50] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/sprot.faa +[13:18:50] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/sprot.blast +[13:18:50] There are still 4 unannotated CDS left (started with 12) +[13:18:50] Will use hmmer3 to search against /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm with 1 CPUs +[13:18:50] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.faa | parallel --gnu --plain -j 1 --block 427 --recstart '>' --pipe hmmscan --noali --notextw --acc -E 1e-06 --cpu 1 
/Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm /dev/stdin > Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.hmmer3 2> /dev/null +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/HAMAP.hmm.faa +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/HAMAP.hmm.hmmer3 +[13:18:51] Labelling remaining 4 proteins as 'hypothetical protein' +[13:18:51] Possible /pseudo 'N-acetylmuramoyl-L-alanine amidase AmiD' at header_genome3_1 position 880 +[13:18:51] Found 6 unique /gene codes. +[13:18:51] Fixed 2 duplicate /gene - amiD_1 amiD_2 +[13:18:51] Fixed 1 colliding /gene names. +[13:18:51] Adding /locus_tag identifiers +[13:18:51] Assigned 12 locus_tags to CDS and RNA features. +[13:18:51] Writing outputs to Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/ +[13:18:51] Generating annotation statistics file +[13:18:51] Generating Genbank and Sequin files +[13:18:51] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka' -Z Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.err -i Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.fsa 2> /dev/null +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/errorsummary.val +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.dr +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fixedproducts +[13:18:51] Deleting unwanted file: 
Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.ecn +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.val +[13:18:51] Repairing broken .GBK output that tbl2asn produces... +[13:18:51] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.gbf > Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.gbk +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gbf +[13:18:51] Output files: +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.log +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.ffn +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gbk +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fsa +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.faa +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.txt +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gff +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tbl +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.sqn +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tsv +[13:18:51] 
Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fna +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.err +[13:18:51] Annotation finished successfully. +[13:18:51] Walltime used: 0.05 minutes +[13:18:51] If you use this result please cite the Prokka paper: +[13:18:51] Seemann T (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-9. +[13:18:51] Type 'prokka --citation' for more details. +[13:18:51] Share and enjoy! diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.err b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.err new file mode 100644 index 0000000000000000000000000000000000000000..2a0b1de27bde2245f0d664366b748b61b299cd72 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.err @@ -0,0 +1,133 @@ +Discrepancy Report Results + +Summary +FATAL: MISSING_PROTEIN_ID:12 proteins have invalid IDs. +DISC_SOURCE_QUALS_ASNDISC:strain (all present, all same) +DISC_SOURCE_QUALS_ASNDISC:taxname (all present, all same) +DISC_FEATURE_COUNT:CDS: 12 present +DISC_COUNT_NUCLEOTIDES:3 nucleotide Bioseqs are present +FEATURE_LOCATION_CONFLICT:12 features have inconsistent gene locations. +DISC_QUALITY_SCORES:Quality scores are missing on all sequences. +ONCALLER_COMMENT_PRESENT:3 comment descriptors were found (all same) +MISSING_GENOMEASSEMBLY_COMMENTS:3 bioseqs are missing GenomeAssembly structured comments +MOLTYPE_NOT_MRNA:3 molecule types are not set as mRNA. +TECHNIQUE_NOT_TSA:3 technique are not set as TSA +MISSING_STRUCTURED_COMMENT:3 sequences do not include structured comments. +MISSING_PROJECT:15 sequences do not include project. 
+DISC_INCONSISTENT_MOLINFO_TECH:Molinfo Technique Report (some missing, all same) + + +Detailed Report + +FATAL: DiscRep_ALL:MISSING_PROTEIN_ID::12 proteins have invalid IDs. +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_1 (length 276) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_2 (length 276) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_3 (length 157) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_4 (length 74) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_5 (length 467) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_6 (length 257) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_7 (length 263) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_8 (length 95) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2_1 (length 193) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2_2 (length 435) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3_1 (length 84) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3_2 (length 259) + +DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::strain (all present, all same) +DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::3 sources have 'strain' for strain +DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::taxname (all present, all same) 
+DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::3 sources have 'Genus species' for taxname +DiscRep_ALL:DISC_FEATURE_COUNT::CDS: 12 present +DiscRep_ALL:DISC_COUNT_NUCLEOTIDES::3 nucleotide Bioseqs are present +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1 (length 5747) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2 (length 1968) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3 (length 1102) + +DiscRep_ALL:FEATURE_LOCATION_CONFLICT::12 features have inconsistent gene locations. +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS N-acetylmuramoyl-L-alanine amidase AmiD header_genome3_1:1-831 PFKCIMHH_00001 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS N-acetylmuramoyl-L-alanine amidase AmiD header_genome3_1:880-1710 PFKCIMHH_00002 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS 5-carboxymethyl-2-hydroxymuconate Delta-isomerase header_genome3_1:1655-2128 PFKCIMHH_00003 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS Phage shock protein B header_genome3_1:2172-2396 PFKCIMHH_00004 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS hypothetical protein header_genome3_1:2425-3828 PFKCIMHH_00005 + 
+DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS Histidine transport ATP-binding protein HisP header_genome3_1:3863-4636 PFKCIMHH_00006 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS tRNA (guanine-N(1)-)-methyltransferase header_genome3_1:4641-5432 PFKCIMHH_00007 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS hypothetical protein header_genome3_1:5460-5747 PFKCIMHH_00008 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS hypothetical protein genome3_plasmi_2:1-582 PFKCIMHH_00009 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS C4-dicarboxylate TRAP transporter large permease protein DctM genome3_plasmi_2:629-1936 PFKCIMHH_00010 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS hypothetical protein genome3_plasmi_3:26-280 PFKCIMHH_00011 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:CDS Insertion sequence IS5376 putative ATP-binding protein genome3_plasmi_3:323-1102 PFKCIMHH_00012 + +DiscRep_ALL:DISC_QUALITY_SCORES::Quality scores are missing on all sequences. 
+ +DiscRep_ALL:ONCALLER_COMMENT_PRESENT::3 comment descriptors were found (all same) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka + +DiscRep_ALL:MISSING_GENOMEASSEMBLY_COMMENTS::3 bioseqs are missing GenomeAssembly structured comments +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1 (length 5747) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2 (length 1968) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3 (length 1102) + +DiscRep_ALL:MOLTYPE_NOT_MRNA::3 molecule types are not set as mRNA. 
+Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1 (length 5747) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2 (length 1968) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3 (length 1102) + +DiscRep_ALL:TECHNIQUE_NOT_TSA::3 technique are not set as TSA +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1 (length 5747) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2 (length 1968) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3 (length 1102) + +DiscRep_ALL:MISSING_STRUCTURED_COMMENT::3 sequences do not include structured comments. +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1 (length 5747) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2 (length 1968) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3 (length 1102) + +DiscRep_ALL:MISSING_PROJECT::15 sequences do not include project. 
+Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1 (length 5747) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_1 (length 276) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_2 (length 276) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_3 (length 157) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_4 (length 74) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_5 (length 467) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_6 (length 257) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_7 (length 263) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1_8 (length 95) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2 (length 1968) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2_1 (length 193) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2_2 (length 435) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3 (length 1102) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3_1 (length 84) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3_2 (length 259) + 
+DiscRep_ALL:DISC_INCONSISTENT_MOLINFO_TECH::Molinfo Technique Report (some missing, all same) +DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::technique (all missing) +DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::3 Molinfos are missing field technique +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_2 (length 1968) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:genome3_plasmi_3 (length 1102) +Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002:header_genome3_1 (length 5747) + diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.faa b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.faa new file mode 100644 index 0000000000000000000000000000000000000000..b4ae62e66c2e02c8b6a49d5f9521fd5bdcaba8b4 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.faa @@ -0,0 +1,66 @@ +>PFKCIMHH_00001 N-acetylmuramoyl-L-alanine amidase AmiD +MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>PFKCIMHH_00002 N-acetylmuramoyl-L-alanine amidase AmiD +MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL +ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>PFKCIMHH_00003 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +MPKRRRLPKHYWRSTARINRARYREDSLFEDMPHFIAECTENIREQADLPGLFSKVNEAL +AASGIFPIGGIRSRAHWLDTWQMADGKHDYAFVHMTLKIGAGRSLESRQEVGEMLFGLIK 
+AHFADLMENRYLALSFEIAELHPTLNYKQNNVHALFK +>PFKCIMHH_00004 Phage shock protein B +MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLLQLTDDAQRMRERIQAL +EDILDAEHPNWRER +>PFKCIMHH_00005 hypothetical protein +MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFADGQ +TVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD +GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSTYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNTPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM +ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>PFKCIMHH_00006 Histidine transport ATP-binding protein HisP +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSED +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFVRHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>PFKCIMHH_00007 tRNA (guanine-N(1)-)-methyltransferase +MRAIYRRCVFIGIVSLFPEMFRAITDYGVTGRAVKNGLLNIQSWSPRDFTHDRHRTVDDR +PYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVC +GRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFA +DGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEE +QARLLAEFKTEHAQQQHKHDGMA +>PFKCIMHH_00008 hypothetical protein +MNNHFGKGLMAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGIL +TRRYGLDKEMVMDFFKENHSGMAVRFFMAGYRLEG +>PFKCIMHH_00009 hypothetical protein +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>PFKCIMHH_00010 C4-dicarboxylate TRAP transporter large permease protein DctM +MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMFSSLDS +FALLAVPFFVLSGVIMNSGGIAARLINFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIA +ASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFA 
+GGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQ +GIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSM +SITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGV +DPVHLGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITY +IPEIILFLPRLLGIM +>PFKCIMHH_00011 hypothetical protein +MIINGKLIKAKDLAKAAGVSRSTVIKYYGISRENYERVATERRKLAFELRASGLKWKEVA +EKMNTTKYSAIAYYRRYLALEKNK +>PFKCIMHH_00012 Insertion sequence IS5376 putative ATP-binding protein +MVELQHQRLMVLAEQLQLDSLIGAAPALSQQAVDQEWSYMDFLEHLLHEEKLARHQRKQA +MYTRMAAFPAVKTFEEYDFTFATGAPQKQIQSLRSLSFIERNENIVLLGPSGVGKTHLAI +AMGYEAVRAGIKVRFTTAADLLLQLSTSQRQGRYKTTLNRGVMAPKLLIIDEIGYLPFSQ +EEAKLFFQVIAKRYEKSAMILTSNLPFGQWDQTFAGDAALTSAMLDRILHHSHVVQIKGE +SYRLKQKRKAGVIAEANPE diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.ffn b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.ffn new file mode 100644 index 0000000000000000000000000000000000000000..ed27b471be4586da2fb4866febe34252e5f0a708 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.ffn @@ -0,0 +1,158 @@ +>PFKCIMHH_00001 N-acetylmuramoyl-L-alanine amidase AmiD +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG 
+TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>PFKCIMHH_00002 N-acetylmuramoyl-L-alanine amidase AmiD +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>PFKCIMHH_00003 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +ATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACAGG +GCGCGGTATCGCGAGGACTCTCTCTTCGAGGACATGCCGCACTTTATTGCTGAATGTACT +GAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTG +GCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACC +TGGCAGATGGCTGACGGTAAGCATGATTACGCGTTTGTGCATATGACGCTGAAAATCGGC +GCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTCGGGCTGATTAAA +GCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAG +TTACATCCAACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAG +>PFKCIMHH_00004 Phage shock protein B +ATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATT +TGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAG +CAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCATTCAGGCGCTG +GAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAA +>PFKCIMHH_00005 hypothetical protein 
+ATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTGCGCTTGCTGTT +CTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATT +AAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAG +ACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCG +GCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATT +TTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTG +GATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGCCCCGTC +GCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGAC +ATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGAC +GGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGTGGAAT +GTCGCGATTCACGACCGCGATATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGT +ACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAAC +AGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGAT +TGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCC +AAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCAATT +TATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTC +ATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAAC +GCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCC +GGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAA +CTGCATAATCAACCGCAGCACCTCTTCCTGCGTAATATCAACGTGATGCAAACTTCAGCG +ATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATG +GCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAG +AGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTT +TCGCTGCCGAAGCGGGGAGGGTAA +>PFKCIMHH_00006 Histidine transport ATP-binding protein HisP +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCC +GGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGAC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG 
+ATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCTCTCCGGTGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACT +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGTTCGCCATGTCTCTTCG +CACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>PFKCIMHH_00007 tRNA (guanine-N(1)-)-methyltransferase +ATGCGCGCGATATATCGGCGATGCGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATG +TTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAATGGCCTGCTGAAC +ATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGCACCGTACCGTGGACGATCGT +CCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCAC +GCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGC +AAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGT +GGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCA +ATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTC +GCCCGGTTTATACCGGGAGTTCTGGGGCATGAAGCATCAGCAATCGAAGATTCGTTTGCT +GATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTA +CCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCGCTGGCGTTTGAAACAGTCG +CTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAG +CAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGAT +GGGATGGCATAG +>PFKCIMHH_00008 hypothetical protein +ATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGC +GCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACA +CACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTG +ACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCC +GGGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA +>PFKCIMHH_00009 hypothetical protein +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTA +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC 
+CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>PFKCIMHH_00010 C4-dicarboxylate TRAP transporter large permease protein DctM +ATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTTTTACTGGCC +ATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTA +CTGCCTTTCGATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGT +TTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGA +ATTGCCGCCCGGCTGATCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGTTCGCTC +TCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCC +GCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGC +GGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACC +ACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCC +GGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTAGTC +GCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTT +GCCGTTGAGGCCATTCCCAGCCTGTTACTGATCGTGATTATCGTCGGCGGCATTGTGCAG +GGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACGTTATTGCTGACG +ATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTG +GTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCAATG +TCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAA +CTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATC +GGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTGGGCGTC +GATCCGGTGCATTTGGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCG +CCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTG +ATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTAC +ATTCCGGAAATCATACTGTTCTTACCCCGTCTACTGGGCATCATGTAA +>PFKCIMHH_00011 hypothetical protein +ATGATAATTAATGGGAAATTAATTAAAGCAAAAGACTTAGCTAAGGCTGCAGGTGTATCT +CGTTCAACAGTGATTAAATATTACGGCATTAGCCGTGAGAATTACGAAAGGGTAGCAACT +GAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATCAGGTTTAAAATGGAAAGAAGTTGCT +GAAAAAATGAACACGACAAAATATAGCGCAATTGCATATTATAGACGATATTTAGCATTA +GAGAAAAACAAATAA +>PFKCIMHH_00012 Insertion sequence 
IS5376 putative ATP-binding protein +ATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGCCGAACAGCTCCAGCTGGACAGT +CTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGTGGATCAGGAATGGAGCTACATG +GACTTCCTGGAGCACCTGTTACATGAGGAGAAACTGGCCCGGCATCAGCGTAAACAGGCG +ATGTACACGCGGATGGCAGCCTTCCCGGCGGTAAAGACGTTCGAGGAGTACGACTTCACC +TTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCTGCGATCCCTGAGCTTCATAGAG +CGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGTGGGAAAAACGCATCTGGCGATA +GCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGTTCGCTTCACAACAGCAGCGGAC +CTGCTGCTACAGCTGTCCACTTCACAGCGTCAGGGCCGTTACAAAACGACTCTCAATCGT +GGTGTCATGGCCCCGAAGCTGCTTATCATCGATGAAATAGGTTATCTGCCGTTCAGTCAG +GAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACGTTACGAGAAGAGCGCGATGATC +CTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGACGTTCGCCGGTGATGCAGCGCTG +ACATCGGCGATGCTGGACCGGATCTTACATCACTCACACGTCGTGCAAATAAAAGGGGAA +AGCTATCGACTGAAGCAGAAACGAAAGGCCGGGGTTATAGCTGAAGCTAATCCTGAGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fna b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fna new file mode 100644 index 0000000000000000000000000000000000000000..07bd5c4b160a7dc19f4e7c640e3813aeff601694 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fna @@ -0,0 +1,151 @@ +>header_genome3_1 +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG 
+GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGAGTCCGG +GAGCGGCGAGATTCCGGGCAGAGGAGGCGGCTATAGCGGATGAGAGCGCTACTGTGGCTG +GTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAG +GGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTG +GTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAAC +GTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATC +TGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGC +GCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATG +TCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCG +TTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCG +GATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCG +GCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGC +GCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTAT +GAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATG +CACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAA +GCATTACTGGAGAAGTACGGCCAGGATTAACAGGGCGCGGTATCGCGAGGACTCTCTCTT +CGAGGACATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTT +ACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGG +CGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGA +TTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCA +GGAAGTCGGCGAAATGCTGTTCGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAA +CCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACA +AAACAACGTACACGCGTTATTTAAATAGCCCGAGTATTAGCGCGCGGCGGCGCTATAGGC +GCGCTATAGGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTG +TGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGC +AAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCA +TTCAGGCGCTGGAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAACGAG +AGTCTCGGAGGAGCGGCGCTCTGGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTG +ACGGCAGGTTCTGCGCTTGCTGTTCTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGT 
+GAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTC +AAACAGGCCTTCGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAAT +ATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGT +GGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGC +GGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTG +GCGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTG +ATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGC +CAGGGATTTCATAACCAAATGGACGGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTA +CAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGATATCCTGATTTCCGAT +CATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGG +CTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTG +GTGGCCAATATTACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACAT +TTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGT +ATTGATAACGCAACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGAT +ATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCA +ATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAA +TTACGCGGCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTA +CGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTCCTGCGTAAT +ATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGT +AAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTT +CATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACC +GTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGCGTCTGCGCG +CGCGGAGAGAGGCTCTCAGGACATGTCAGAAAATAAATTACACGTTATCGATTTGCACAA +ACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGT +GATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTT +CCTCGAAAAACCGAGCGAAGACGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCG +CGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCG +CCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGT +GATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTT +GAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCT +CTCCGGTGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGT +TTTACTGTTCGATGAACCCACTTCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCG 
+CATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGG +CTTCGTTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGA +GGGCGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAA +AGGCTCGCTGAAATAACCGGATGCGCGCGATATATCGGCGATGCGTGTTTATTGGCATCG +TTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAG +TAAAAAATGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGC +ACCGTACCGTGGACGATCGTCCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAAC +CCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTT +ATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATC +AGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCG +AAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAA +TGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGAGTTCTGGGGCATGAAGCATCAG +CAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAG +TGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTC +GCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAA +ACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCAC +AACAGCAGCATAAACATGATGGGATGGCATAGCCGGCTCTGCGAGAGAGGAGCGCTCGCA +TGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCG +CGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACAC +ACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGA +CGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCG +GGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA +>genome3_plasmi_2 +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTA +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACGCGTAGCGGCCGGAATC 
+TTCTCGGAGAGGCGCTTCTCTCTCGGAGATGATTGACCCTATTTTTGCGTCCTGTACGCT +AATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTAT +CGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGC +GCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTT +GTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGATCAATTTTGCCAAACT +GTTTACTGGCAAACTGCCCGGTTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTT +CGGTGCAATTTCCGGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCC +GATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGC +GCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGG +GGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGT +TGGCTGTATGCTGGTCACGCTGGTAGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCAC +CGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGTTACTGAT +CGTGATTATCGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGAT +TGCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGA +TTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGC +AACCTCTTCGGCGATGTCCTTCTCAATGTCGATCACCAATATTCCTGCGGCGCTGAGCGA +TATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTT +GTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCT +GCTGCCGATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTGGGCATTATCATGATCTA +TAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAG +CGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGAT +TATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCATACTGTTCTTACCCCGTCT +ACTGGGCATCATGTAAACCCGATGGCGCGCAGAGGCGCGAGTTCTGGA +>genome3_plasmi_3 +CACAGGGCTTAGAGGCGCTATGGCAATGATAATTAATGGGAAATTAATTAAAGCAAAAGA +CTTAGCTAAGGCTGCAGGTGTATCTCGTTCAACAGTGATTAAATATTACGGCATTAGCCG +TGAGAATTACGAAAGGGTAGCAACTGAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATC +AGGTTTAAAATGGAAAGAAGTTGCTGAAAAAATGAACACGACAAAATATAGCGCAATTGC +ATATTATAGACGATATTTAGCATTAGAGAAAAACAAATAACAGGCGCTAAGGCGGCGATC +CTAGCGCGCGATCGCGCATGCGATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGC +CGAACAGCTCCAGCTGGACAGTCTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGT +GGATCAGGAATGGAGCTACATGGACTTCCTGGAGCACCTGTTACATGAGGAGAAACTGGC +CCGGCATCAGCGTAAACAGGCGATGTACACGCGGATGGCAGCCTTCCCGGCGGTAAAGAC 
+GTTCGAGGAGTACGACTTCACCTTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCT +GCGATCCCTGAGCTTCATAGAGCGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGT +GGGAAAAACGCATCTGGCGATAGCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGT +TCGCTTCACAACAGCAGCGGACCTGCTGCTACAGCTGTCCACTTCACAGCGTCAGGGCCG +TTACAAAACGACTCTCAATCGTGGTGTCATGGCCCCGAAGCTGCTTATCATCGATGAAAT +AGGTTATCTGCCGTTCAGTCAGGAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACG +TTACGAGAAGAGCGCGATGATCCTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGAC +GTTCGCCGGTGATGCAGCGCTGACATCGGCGATGCTGGACCGGATCTTACATCACTCACA +CGTCGTGCAAATAAAAGGGGAAAGCTATCGACTGAAGCAGAAACGAAAGGCCGGGGTTAT +AGCTGAAGCTAATCCTGAGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fsa b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fsa new file mode 100644 index 0000000000000000000000000000000000000000..b25cc5564aa6b94e47594a34e65b0524a5bcda96 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fsa @@ -0,0 +1,151 @@ +>header_genome3_1 [gcode=11] [organism=Genus species] [strain=strain] +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGAGTCCGG 
+GAGCGGCGAGATTCCGGGCAGAGGAGGCGGCTATAGCGGATGAGAGCGCTACTGTGGCTG +GTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAG +GGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTG +GTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAAC +GTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATC +TGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGC +GCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATG +TCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCG +TTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCG +GATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCG +GCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGC +GCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTAT +GAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATG +CACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAA +GCATTACTGGAGAAGTACGGCCAGGATTAACAGGGCGCGGTATCGCGAGGACTCTCTCTT +CGAGGACATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTT +ACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGG +CGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGA +TTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCA +GGAAGTCGGCGAAATGCTGTTCGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAA +CCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACA +AAACAACGTACACGCGTTATTTAAATAGCCCGAGTATTAGCGCGCGGCGGCGCTATAGGC +GCGCTATAGGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTG +TGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGC +AAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCA +TTCAGGCGCTGGAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAACGAG +AGTCTCGGAGGAGCGGCGCTCTGGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTG +ACGGCAGGTTCTGCGCTTGCTGTTCTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGT +GAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTC +AAACAGGCCTTCGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAAT +ATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGT +GGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGC 
+GGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTG +GCGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTG +ATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGC +CAGGGATTTCATAACCAAATGGACGGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTA +CAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGATATCCTGATTTCCGAT +CATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGG +CTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTG +GTGGCCAATATTACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACAT +TTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGT +ATTGATAACGCAACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGAT +ATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCA +ATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAA +TTACGCGGCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTA +CGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTCCTGCGTAAT +ATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGT +AAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTT +CATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACC +GTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGCGTCTGCGCG +CGCGGAGAGAGGCTCTCAGGACATGTCAGAAAATAAATTACACGTTATCGATTTGCACAA +ACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGT +GATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTT +CCTCGAAAAACCGAGCGAAGACGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCG +CGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCG +CCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGT +GATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTT +GAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCT +CTCCGGTGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGT +TTTACTGTTCGATGAACCCACTTCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCG +CATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGG +CTTCGTTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGA +GGGCGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAA +AGGCTCGCTGAAATAACCGGATGCGCGCGATATATCGGCGATGCGTGTTTATTGGCATCG 
+TTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAG +TAAAAAATGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGC +ACCGTACCGTGGACGATCGTCCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAAC +CCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTT +ATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATC +AGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCG +AAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAA +TGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGAGTTCTGGGGCATGAAGCATCAG +CAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAG +TGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTC +GCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAA +ACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCAC +AACAGCAGCATAAACATGATGGGATGGCATAGCCGGCTCTGCGAGAGAGGAGCGCTCGCA +TGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCG +CGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACAC +ACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGA +CGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCG +GGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA +>genome3_plasmi_2 [gcode=11] [organism=Genus species] [strain=strain] +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTA +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACGCGTAGCGGCCGGAATC +TTCTCGGAGAGGCGCTTCTCTCTCGGAGATGATTGACCCTATTTTTGCGTCCTGTACGCT +AATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTAT +CGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGC 
+GCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTT +GTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGATCAATTTTGCCAAACT +GTTTACTGGCAAACTGCCCGGTTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTT +CGGTGCAATTTCCGGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCC +GATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGC +GCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGG +GGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGT +TGGCTGTATGCTGGTCACGCTGGTAGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCAC +CGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGTTACTGAT +CGTGATTATCGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGAT +TGCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGA +TTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGC +AACCTCTTCGGCGATGTCCTTCTCAATGTCGATCACCAATATTCCTGCGGCGCTGAGCGA +TATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTT +GTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCT +GCTGCCGATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTGGGCATTATCATGATCTA +TAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAG +CGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGAT +TATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCATACTGTTCTTACCCCGTCT +ACTGGGCATCATGTAAACCCGATGGCGCGCAGAGGCGCGAGTTCTGGA +>genome3_plasmi_3 [gcode=11] [organism=Genus species] [strain=strain] +CACAGGGCTTAGAGGCGCTATGGCAATGATAATTAATGGGAAATTAATTAAAGCAAAAGA +CTTAGCTAAGGCTGCAGGTGTATCTCGTTCAACAGTGATTAAATATTACGGCATTAGCCG +TGAGAATTACGAAAGGGTAGCAACTGAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATC +AGGTTTAAAATGGAAAGAAGTTGCTGAAAAAATGAACACGACAAAATATAGCGCAATTGC +ATATTATAGACGATATTTAGCATTAGAGAAAAACAAATAACAGGCGCTAAGGCGGCGATC +CTAGCGCGCGATCGCGCATGCGATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGC +CGAACAGCTCCAGCTGGACAGTCTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGT +GGATCAGGAATGGAGCTACATGGACTTCCTGGAGCACCTGTTACATGAGGAGAAACTGGC +CCGGCATCAGCGTAAACAGGCGATGTACACGCGGATGGCAGCCTTCCCGGCGGTAAAGAC +GTTCGAGGAGTACGACTTCACCTTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCT +GCGATCCCTGAGCTTCATAGAGCGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGT 
+GGGAAAAACGCATCTGGCGATAGCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGT +TCGCTTCACAACAGCAGCGGACCTGCTGCTACAGCTGTCCACTTCACAGCGTCAGGGCCG +TTACAAAACGACTCTCAATCGTGGTGTCATGGCCCCGAAGCTGCTTATCATCGATGAAAT +AGGTTATCTGCCGTTCAGTCAGGAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACG +TTACGAGAAGAGCGCGATGATCCTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGAC +GTTCGCCGGTGATGCAGCGCTGACATCGGCGATGCTGGACCGGATCTTACATCACTCACA +CGTCGTGCAAATAAAAGGGGAAAGCTATCGACTGAAGCAGAAACGAAAGGCCGGGGTTAT +AGCTGAAGCTAATCCTGAGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gbk b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gbk new file mode 100644 index 0000000000000000000000000000000000000000..924905915875881e6008f7a3d4f3bb0db7b9eb77 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gbk @@ -0,0 +1,350 @@ +LOCUS header_genome3_1 5747 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. 
+FEATURES Location/Qualifiers + source 1..5747 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 1..831 + /gene="amiD_1" + /locus_tag="PFKCIMHH_00001" + /EC_number="3.5.1.28" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P75820" + /codon_start=1 + /transl_table=11 + /product="N-acetylmuramoyl-L-alanine amidase AmiD" + /translation="MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRI + KVLVIHYTAENFDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGV + SFWRGATRLNDTSIGIELENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKP + QNVVAHADIAPQRKDDPGPRFPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATV + LALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD + " + CDS 880..1710 + /gene="amiD_2" + /locus_tag="PFKCIMHH_00002" + /EC_number="3.5.1.28" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P75820" + /codon_start=1 + /transl_table=11 + /product="N-acetylmuramoyl-L-alanine amidase AmiD" + /translation="MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRI + KVLVIHYTAENFDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGV + SFWRGATRLNDTSIGIELENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKP + QNVVAHADIAPQRKDDPGPRFPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATV + LALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD + " + CDS 1655..2128 + /gene="hpcD" + /locus_tag="PFKCIMHH_00003" + /EC_number="5.3.3.10" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:Q05354" + /codon_start=1 + /transl_table=11 + /product="5-carboxymethyl-2-hydroxymuconate + Delta-isomerase" + /translation="MPKRRRLPKHYWRSTARINRARYREDSLFEDMPHFIAECTENIR + EQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYAFVHMTLKIGAG + RSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNNVHALFK" + CDS 2172..2396 + /gene="pspB" + /locus_tag="PFKCIMHH_00004" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P0AFM9" + /codon_start=1 + /transl_table=11 + /product="Phage shock 
protein B" + /translation="MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLL + QLTDDAQRMRERIQALEDILDAEHPNWRER" + CDS 2425..3828 + /locus_tag="PFKCIMHH_00005" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYP + ADDGIASFKQAFADGQTVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQ + DGCQVVGEQGGSLHNVTLDVRGSDCVIKGVAMSGFGPVAQIFIGGKEPQVMRNLIIDD + ITVTHANYAILRQGFHNQMDGARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERI + DCTNGKINWGIGIGLAGSTYDNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIR + NVKAKNITPDFSKNAGIDNATIAIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIP + QNFKLNAIRLDNRQVAYKLRGIQISSGNTPSFVAITNVRMTRATLELHNQPQHLFLRN + INVMQTSAIGPALKMHFDLRKDVRGQFMARQDTLLSLANVHAINENGQSSVDIDRINH + QTVNVEAVNFSLPKRGG" + CDS 3863..4636 + /gene="hisP" + /locus_tag="PFKCIMHH_00006" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P02915" + /codon_start=1 + /transl_table=11 + /product="Histidine transport ATP-binding protein HisP" + /translation="MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGK + STFLRCINFLEKPSEDAIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFN + LWSHMTVLENVMEAPIQVLGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQR + VSIARALAMEPDVLLFDEPTSALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFVRHV + SSHVIFLHQGKIEEEGDPEQVFGNPQSPRLQQFLKGSLK" + CDS 4641..5432 + /gene="trmD" + /locus_tag="PFKCIMHH_00007" + /EC_number="2.1.1.228" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P0A873" + /codon_start=1 + /transl_table=11 + /product="tRNA (guanine-N(1)-)-methyltransferase" + /translation="MRAIYRRCVFIGIVSLFPEMFRAITDYGVTGRAVKNGLLNIQSW + SPRDFTHDRHRTVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKL + DQAGVSELATNQKLILVCGRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSV + ARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLK + QSLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA" + CDS 5460..5747 + /locus_tag="PFKCIMHH_00008" + /inference="ab initio prediction:Prodigal:2.6" + 
/codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MNNHFGKGLMAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMF + EKTGDRQLSAWEAGILTRRYGLDKEMVMDFFKENHSGMAVRFFMAGYRLEG" +ORIGIN + 1 atgagagcgc tactgtggct ggtgggtctt gcgttgctgt taacaggctg cgcgagcgaa + 61 aaaggaatta tcgatgaaga gggatatcag cttgataccc gacatcgggc gcaggcggcc + 121 tatccgcgca ttaaagtcct ggtgattcac tatacggcgg aaaactttga cgtttcgctg + 181 gcgacgttaa cgggccgcaa cgtcagttcg cattacctga ttcccgcaac cccgccatta + 241 tatggcggta aaccgcgcat ctggcaactg gtgccggaac aggatcaggc ctggcatgcg + 301 ggcgtcagtt tctggcgagg cgccacgcgt ctcaatgata cgtctattgg cattgagctg + 361 gaaaatcgcg gctggcgaat gtccggcggg gtgaaatctt tcgcgccgtt tgaatccgcg + 421 caaattcagg cattgattcc gttagcgaag gacattatcg cgcgctatga catcaaaccg + 481 cagaatgtgg tggcccatgc ggatatcgcg ccgcagcgta aagacgatcc cggcccgcgc + 541 ttcccgtggc gcgagctggc ggcacagggg attggcgcct ggcctgacgc ccagcgtgtg + 601 gcgttttatc tggctggacg cgcgccgtat acgccagtcg ataccgcaac ggtgcttgcg + 661 ttactctcgc gctatggcta tgaagtcaaa gccgatatga cggcgcgcga gcaacagcgg + 721 gtgattatgg cgttccagat gcacttccgt ccggcgcaat ggaacggtat cgcagatgcc + 781 gaaacgcagg cgattgccga agcattactg gagaagtacg gccaggatta acgagtccgg + 841 gagcggcgag attccgggca gaggaggcgg ctatagcgga tgagagcgct actgtggctg + 901 gtgggtcttg cgttgctgtt aacaggctgc gcgagcgaaa aaggaattat cgatgaagag + 961 ggatatcagc ttgatacccg acatcgggcg caggcggcct atccgcgcat taaagtcctg + 1021 gtgattcact atacggcgga aaactttgac gtttcgctgg cgacgttaac gggccgcaac + 1081 gtcagttcgc attacctgat tcccgcaacc ccgccattat atggcggtaa accgcgcatc + 1141 tggcaactgg tgccggaaca ggatcaggcc tggcatgcgg gcgtcagttt ctggcgaggc + 1201 gccacgcgtc tcaatgatac gtctattggc attgagctgg aaaatcgcgg ctggcgaatg + 1261 tccggcgggg tgaaatcttt cgcgccgttt gaatccgcgc aaattcaggc attgattccg + 1321 ttagcgaagg acattatcgc gcgctatgac atcaaaccgc agaatgtggt ggcccatgcg + 1381 gatatcgcgc cgcagcgtaa agacgatccc ggcccgcgct tcccgtggcg cgagctggcg + 1441 gcacagggga ttggcgcctg gcctgacgcc cagcgtgtgg cgttttatct ggctggacgc + 
1501 gcgccgtata cgccagtcga taccgcaacg gtgcttgcgt tactctcgcg ctatggctat + 1561 gaagtcaaag ccgatatgac ggcgcgcgag caacagcggg tgattatggc gttccagatg + 1621 cacttccgtc cggcgcaatg gaacggtatc gcagatgccg aaacgcaggc gattgccgaa + 1681 gcattactgg agaagtacgg ccaggattaa cagggcgcgg tatcgcgagg actctctctt + 1741 cgaggacatg ccgcacttta ttgctgaatg tactgaaaat attcgcgagc aggctgattt + 1801 acccggcctg ttcagcaagg taaacgaggc gctggccgcc agcgggattt tccccatcgg + 1861 cggtatccgc agtcgcgccc actggctgga tacctggcag atggctgacg gtaagcatga + 1921 ttacgcgttt gtgcatatga cgctgaaaat cggcgccggg cgcagcctgg agagccgtca + 1981 ggaagtcggc gaaatgctgt tcgggctgat taaagcccac ttcgccgacc tgatggagaa + 2041 ccgctatctg gcgctgtcgt ttgagattgc cgagttacat ccaacgctca attacaaaca + 2101 aaacaacgta cacgcgttat ttaaatagcc cgagtattag cgcgcggcgg cgctataggc + 2161 gcgctatagg catgagcgcg ctatttctgg ccatcccgtt aaccattttt gtgttgtttg + 2221 tgttaccgat ttggctgtgg ctgcattaca gcaaccgcgc cggtcgggga gaactgtcgc + 2281 aaagcgagca gcaacgctta ctgcaactca cagacgacgc gcaacgtatg cgcgagcgca + 2341 ttcaggcgct ggaagacatt cttgatgcag agcatccgaa ctggagagag cgctaacgag + 2401 agtctcggag gagcggcgct ctggatgccc gcgactaaat tctcccgacg taccctcctg + 2461 acggcaggtt ctgcgcttgc tgttcttcct ttcctgcgcg ccttgccggt acaggcgcgt + 2521 gaacctcgcg agaccgtcga tattaaggat tatccggcgg atgacggtat cgcctcgttc + 2581 aaacaggcct tcgccgacgg acagaccgtg gtcgtaccgc caggatgggt gtgtgaaaat + 2641 atcaatgcgg cgataacgat tccggcggga aaaacgctgc gggtacaggg cgcggtgcgt + 2701 gggaatggcc ggggacggtt tattttgcag gacgggtgtc aggtggtggg ggagcagggc + 2761 ggcagtctgc acaatgtgac gctggatgtt cgcgggtcgg actgtgtgat taaaggcgtg + 2821 gcgatgagcg gctttggccc cgtcgcgcaa attttcatcg gtggtaagga accgcaggtg + 2881 atgcgtaatc tcattatcga tgacatcacc gttacccacg ccaactacgc cattctccgc + 2941 cagggatttc ataaccaaat ggacggcgcg aggattacgc atagccgctt tagcgattta + 3001 cagggggacg ccattgagtg gaatgtcgcg attcacgacc gcgatatcct gatttccgat + 3061 catgtcatcg aacgcattga ttgtaccaat ggcaaaatca actgggggat cggcatcggg + 3121 ctggcgggta gcacctatga 
caacagttat cctgaagacc aggcagtaaa aaactttgtg + 3181 gtggccaata ttaccggatc tgattgccga cagcttgtgc acgtagaaaa tggcaaacat + 3241 ttcgtcattc gcaatgtcaa agccaaaaac atcacgcccg atttcagtaa aaatgcgggt + 3301 attgataacg caacgatcgc aatttatggc tgtgataatt tcgtcattga taatattgat + 3361 atgacgaata gtgccgggat gctcatcggc tatggcgtcg ttaaaggaaa atacctgtca + 3421 attccgcaaa actttaaatt aaacgctatt cggttggata atcgccaggt tgcttataaa + 3481 ttacgcggca ttcaaatttc ctccggcaac accccctctt ttgtcgccat caccaatgta + 3541 cggatgacgc gtgctacgct ggaactgcat aatcaaccgc agcacctctt cctgcgtaat + 3601 atcaacgtga tgcaaacttc agcgattggc ccggcgttaa aaatgcattt cgatttgcgt + 3661 aaagatgtcc gtggtcaatt tatggcccgc caggacacgc tgctttccct cgctaatgtt + 3721 catgccatca atgaaaacgg gcagagttcc gtggatatcg acaggattaa tcaccaaacc + 3781 gtgaatgtcg aagcagtgaa tttttcgctg ccgaagcggg gagggtaacg cgtctgcgcg + 3841 cgcggagaga ggctctcagg acatgtcaga aaataaatta cacgttatcg atttgcacaa + 3901 acgctacggc ggtcatgaag tgctgaaagg ggtatcgctg caggcccgcg ccggagatgt + 3961 gattagcatc atcggctcgt ccggctccgg taaaagcact tttttgcgct gtattaactt + 4021 cctcgaaaaa ccgagcgaag acgcgattat cgtgaacggt cagaacatta atctggtgcg + 4081 cgacaaagac gggcagctca aagtggcgga taaaaatcag ctacgcttgt tgcgtacccg + 4141 cctgacgatg gtgtttcagc actttaacct ctggagccac atgacggtgc tggaaaatgt + 4201 gatggaagcg ccgattcagg tactgggatt aagcaagcac gacgcgcgcg agcgggcgtt + 4261 gaaatatctg gcgaaggtgg ggattgatga gcgcgctcag ggcaaatatc ccgttcatct + 4321 ctccggtggc caacagcagc gcgtttctat tgcgcgcgcg ctggcgatgg aacctgacgt + 4381 tttactgttc gatgaaccca cttcggcgct cgatcctgaa ctggtcggcg aagtgttgcg + 4441 catcatgcaa caactggcgg aagaaggcaa aacgatggtg gtggtcacgc atgaaatggg + 4501 cttcgttcgc catgtctctt cgcacgttat ttttctgcat caggggaaaa ttgaagaaga + 4561 gggcgatccg gagcaggtgt tcggcaatcc gcaaagcccg cgtttacagc aattcctgaa + 4621 aggctcgctg aaataaccgg atgcgcgcga tatatcggcg atgcgtgttt attggcatcg + 4681 ttagcctgtt tcctgaaatg ttccgcgcaa ttaccgatta cggggtaact ggccgggcag + 4741 taaaaaatgg cctgctgaac atccaaagct ggagtcctcg 
cgacttcacg catgaccggc + 4801 accgtaccgt ggacgatcgt ccttacggcg gcggaccagg gatgttaatg atggtgcaac + 4861 ccttgcggga cgccattcac gcagcaaaag ccgcggcagg tgaaggcgct aaagtgattt + 4921 atctgtcgcc tcagggacgc aagcttgatc aagcgggcgt tagcgagctg gccacgaatc + 4981 agaagcttat tctggtgtgt ggtcgctacg aaggcgtaga tgagcgcgta attcagaccg + 5041 aaattgacga agaatggtca attggcgatt acgttctcag cggtggcgaa ctaccggcaa + 5101 tgacgctgat tgactccgtc gcccggttta taccgggagt tctggggcat gaagcatcag + 5161 caatcgaaga ttcgtttgct gatgggttgc tggattgtcc gcactatacg cgccctgaag + 5221 tgttagaggg gatggaagta ccgccagtat tgctgtcggg aaaccatgct gagatacgtc + 5281 gctggcgttt gaaacagtcg ctgggccgaa cctggcttag aagacctgaa cttctggaaa + 5341 acctggctct gactgaagag caagcaaggt tgctggcgga gttcaaaaca gaacacgcac + 5401 aacagcagca taaacatgat gggatggcat agccggctct gcgagagagg agcgctcgca + 5461 tgaataatca ttttgggaaa gggttaatgg ccgggttgca cgcgccatat gcatatagcg + 5521 cgcatcatgc ggtgaatttc tgttctgagt ataaacgtgg ctttgtattg ggttttacac + 5581 accgtatgtt cgaaaagacc ggcgatcgtc aacttagcgc gtgggaggct ggaattctga + 5641 cgcgtcgcta tggtctggat aaagaaatgg tgatggattt ctttaaagag aatcattccg + 5701 ggatggcggt tcgcttcttt atggccggtt atcgactcga aggttga +// +LOCUS genome3_plasmi_2 1968 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. 
+FEATURES Location/Qualifiers + source 1..1968 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 1..582 + /locus_tag="PFKCIMHH_00009" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYP + ADAAELLESLVLEHGGAPLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDY + LGQNENPYTGQQYVLESRHIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYA + SCRQTVTEGGNHAFTGFEDYFNQIVDFLGLHSC" + CDS 629..1936 + /gene="dctM" + /locus_tag="PFKCIMHH_00010" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:O07838" + /codon_start=1 + /transl_table=11 + /product="C4-dicarboxylate TRAP transporter large permease + protein DctM" + /translation="MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFD + ISMFATAQKMFSSLDSFALLAVPFFVLSGVIMNSGGIAARLINFAKLFTGKLPGSLSY + TNIVGNMMFGAISGSAIAASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPT + TAFILYALASGGTSIAALFAGGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMAL + KVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSIL + LQTVVMTGVIMFLLATSSAMSFSMSITNIPAALSDMILGISANKLVILLVITVFLLII + GAFMDIGPAILIFTPILLPIMAKLGVDPVHLGIIMIYNLAIGTITPPVGSGLYVGASV + GKVKVEEVIKPLLPFYGAIIGVLLLITYIPEIILFLPRLLGIM" +ORIGIN + 1 atgtctacgc ttctctattt gcacggattc aacagttccc ctcgctcggc aaaagcgtgc + 61 cagctaaaaa actggctggc ggagcgtcat ccgcatgttg agatgatcgt ccctcaacta + 121 ccgccgtatc ctgccgatgc ggcggagttg ctggagtctc tcgtgcttga gcatggcggt + 181 gcgccattag ggctggtagg atcgtcgctg ggtggttatt acgccacctg gctgtcgcaa + 241 tgttttatgc tgccggctgt ggtggtgaat cccgccgtgc ggccctttga attactgacc + 301 gactatctcg gtcagaacga gaacccctac accgggcagc aatatgtgct agagtctcgc + 361 catatttatg atcttaaagt catgcagatt gacccgctgg aagcgccgga cctgatctgg + 421 ctactgcaac agacgggcga tgaagtgctg gattaccgcc aggcggtggc atattacgcc + 481 tcctgccgtc agacagtgac cgagggtggt aatcacgcat tcacgggctt cgaagattat + 541 ttcaaccaga ttgtcgattt tcttggactg cacagttgct gacgcgtagc ggccggaatc + 601 ttctcggaga 
ggcgcttctc tctcggagat gattgaccct atttttgcgt cctgtacgct + 661 aattgccgtc tttgttgttt tactggccat gggcgcgcct atcgggatct gcatcgttat + 721 cgcctctttc agcaccatga tgctggtact gcctttcgat atttcgatgt tcgccaccgc + 781 gcaaaaaatg ttctccagcc tggacagttt tgccttgctg gccgtgccgt tcttcgtttt + 841 gtccggggtg atcatgaata gcgggggaat tgccgcccgg ctgatcaatt ttgccaaact + 901 gtttactggc aaactgcccg gttcgctctc ttataccaac atcgtcggca atatgatgtt + 961 cggtgcaatt tccggatcgg caattgccgc ctcaacctcc atcggcggcg tgatggtgcc + 1021 gatgagcgcg cgcgaaggtt acgatcgcgg ctttgcggcc gcggtgaata tcgcctccgc + 1081 gccgacggga atgttaattc cgcccaccac ggcttttatc ctttatgcgc tggcaagcgg + 1141 gggaacatcg attgccgctc tgttcgccgg cggtctggtc gcgggagtgc tgtggggcgt + 1201 tggctgtatg ctggtcacgc tggtagtcgc taagcgtcga aattatcggg ttttcttcac + 1261 cgtccaaaaa ggcatggcgc taaaagttgc cgttgaggcc attcccagcc tgttactgat + 1321 cgtgattatc gtcggcggca ttgtgcaggg gattttcacc gccattgaag cctccgcgat + 1381 tgccgtggtg tatacgttat tgctgacgat ggtgttttac cgcacgctga aaattaagga + 1441 tttgccttcg attttgctcc agacagtggt aatgaccggg gtcatcatgt tcctgctggc + 1501 aacctcttcg gcgatgtcct tctcaatgtc gatcaccaat attcctgcgg cgctgagcga + 1561 tatgatcctc ggtatttccg ccaataaact ggttatcctg ttagtcatta ccgtcttttt + 1621 gttgattatc ggcgcattta tggatatcgg tccggccatt ctgattttta ccccgattct + 1681 gctgccgatc atggctaaac tgggcgtcga tccggtgcat ttgggcatta tcatgatcta + 1741 taacctggcg attggcacca ttacgccgcc agttggcagt ggtttatatg tcggggcgag + 1801 cgtcggtaag gtcaaagttg aggaagtgat taaaccgttg ctgccttttt acggcgcgat + 1861 tatcggcgtt ctgttattaa ttacctacat tccggaaatc atactgttct taccccgtct + 1921 actgggcatc atgtaaaccc gatggcgcgc agaggcgcga gttctgga +// +LOCUS genome3_plasmi_3 1102 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. 
+FEATURES Location/Qualifiers + source 1..1102 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 26..280 + /locus_tag="PFKCIMHH_00011" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MIINGKLIKAKDLAKAAGVSRSTVIKYYGISRENYERVATERRK + LAFELRASGLKWKEVAEKMNTTKYSAIAYYRRYLALEKNK" + CDS 323..1102 + /locus_tag="PFKCIMHH_00012" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:Q45619" + /codon_start=1 + /transl_table=11 + /product="Insertion sequence IS5376 putative ATP-binding + protein" + /translation="MVELQHQRLMVLAEQLQLDSLIGAAPALSQQAVDQEWSYMDFLE + HLLHEEKLARHQRKQAMYTRMAAFPAVKTFEEYDFTFATGAPQKQIQSLRSLSFIERN + ENIVLLGPSGVGKTHLAIAMGYEAVRAGIKVRFTTAADLLLQLSTSQRQGRYKTTLNR + GVMAPKLLIIDEIGYLPFSQEEAKLFFQVIAKRYEKSAMILTSNLPFGQWDQTFAGDA + ALTSAMLDRILHHSHVVQIKGESYRLKQKRKAGVIAEANPE" +ORIGIN + 1 cacagggctt agaggcgcta tggcaatgat aattaatggg aaattaatta aagcaaaaga + 61 cttagctaag gctgcaggtg tatctcgttc aacagtgatt aaatattacg gcattagccg + 121 tgagaattac gaaagggtag caactgaaag aaggaagctt gcttttgaac taagagcatc + 181 aggtttaaaa tggaaagaag ttgctgaaaa aatgaacacg acaaaatata gcgcaattgc + 241 atattataga cgatatttag cattagagaa aaacaaataa caggcgctaa ggcggcgatc + 301 ctagcgcgcg atcgcgcatg cgatggtcga actgcaacat caacggctga tggtgcttgc + 361 cgaacagctc cagctggaca gtcttatcgg cgcagcgccg gcgctgtcgc aacaggcggt + 421 ggatcaggaa tggagctaca tggacttcct ggagcacctg ttacatgagg agaaactggc + 481 ccggcatcag cgtaaacagg cgatgtacac gcggatggca gccttcccgg cggtaaagac + 541 gttcgaggag tacgacttca ccttcgccac cggcgctcct cagaagcaaa tccagtcgct + 601 gcgatccctg agcttcatag agcgtaacga aaacatcgtg ttgctggggc catcgggcgt + 661 gggaaaaacg catctggcga tagccatggg ctacgaagca gtacgggcgg gcatcaaggt + 721 tcgcttcaca acagcagcgg acctgctgct acagctgtcc acttcacagc gtcagggccg + 781 ttacaaaacg actctcaatc gtggtgtcat ggccccgaag ctgcttatca tcgatgaaat + 841 aggttatctg ccgttcagtc aggaggaagc caagctgttc 
ttccaggtca tcgccaaacg + 901 ttacgagaag agcgcgatga tcctgacctc caacctgccg ttcgggcagt gggatcagac + 961 gttcgccggt gatgcagcgc tgacatcggc gatgctggac cggatcttac atcactcaca + 1021 cgtcgtgcaa ataaaagggg aaagctatcg actgaagcag aaacgaaagg ccggggttat + 1081 agctgaagct aatcctgagt aa +// diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gff b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gff new file mode 100644 index 0000000000000000000000000000000000000000..46df1d1908945bc82b032bd6da755ffe027eed00 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gff @@ -0,0 +1,168 @@ +##gff-version 3 +##sequence-region header_genome3_1 1 5747 +##sequence-region genome3_plasmi_2 1 1968 +##sequence-region genome3_plasmi_3 1 1102 +header_genome3_1 Prodigal:2.6 CDS 1 831 . + 0 ID=PFKCIMHH_00001;eC_number=3.5.1.28;Name=amiD_1;dbxref=COG:COG3023;gene=amiD_1;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P75820;locus_tag=PFKCIMHH_00001;product=N-acetylmuramoyl-L-alanine amidase AmiD +header_genome3_1 Prodigal:2.6 CDS 880 1710 . + 0 ID=PFKCIMHH_00002;eC_number=3.5.1.28;Name=amiD_2;dbxref=COG:COG3023;gene=amiD_2;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P75820;locus_tag=PFKCIMHH_00002;product=N-acetylmuramoyl-L-alanine amidase AmiD +header_genome3_1 Prodigal:2.6 CDS 1655 2128 . + 0 ID=PFKCIMHH_00003;eC_number=5.3.3.10;Name=hpcD;gene=hpcD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:Q05354;locus_tag=PFKCIMHH_00003;product=5-carboxymethyl-2-hydroxymuconate Delta-isomerase +header_genome3_1 Prodigal:2.6 CDS 2172 2396 . 
+ 0 ID=PFKCIMHH_00004;Name=pspB;gene=pspB;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P0AFM9;locus_tag=PFKCIMHH_00004;product=Phage shock protein B +header_genome3_1 Prodigal:2.6 CDS 2425 3828 . + 0 ID=PFKCIMHH_00005;inference=ab initio prediction:Prodigal:2.6;locus_tag=PFKCIMHH_00005;product=hypothetical protein +header_genome3_1 Prodigal:2.6 CDS 3863 4636 . + 0 ID=PFKCIMHH_00006;Name=hisP;dbxref=COG:COG4598;gene=hisP;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P02915;locus_tag=PFKCIMHH_00006;product=Histidine transport ATP-binding protein HisP +header_genome3_1 Prodigal:2.6 CDS 4641 5432 . + 0 ID=PFKCIMHH_00007;eC_number=2.1.1.228;Name=trmD;dbxref=COG:COG0336;gene=trmD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P0A873;locus_tag=PFKCIMHH_00007;product=tRNA (guanine-N(1)-)-methyltransferase +header_genome3_1 Prodigal:2.6 CDS 5460 5747 . + 0 ID=PFKCIMHH_00008;inference=ab initio prediction:Prodigal:2.6;locus_tag=PFKCIMHH_00008;product=hypothetical protein +genome3_plasmi_2 Prodigal:2.6 CDS 1 582 . + 0 ID=PFKCIMHH_00009;inference=ab initio prediction:Prodigal:2.6;locus_tag=PFKCIMHH_00009;product=hypothetical protein +genome3_plasmi_2 Prodigal:2.6 CDS 629 1936 . + 0 ID=PFKCIMHH_00010;Name=dctM;dbxref=COG:COG1593;gene=dctM;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:O07838;locus_tag=PFKCIMHH_00010;product=C4-dicarboxylate TRAP transporter large permease protein DctM +genome3_plasmi_3 Prodigal:2.6 CDS 26 280 . + 0 ID=PFKCIMHH_00011;inference=ab initio prediction:Prodigal:2.6;locus_tag=PFKCIMHH_00011;product=hypothetical protein +genome3_plasmi_3 Prodigal:2.6 CDS 323 1102 . 
+ 0 ID=PFKCIMHH_00012;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:Q45619;locus_tag=PFKCIMHH_00012;product=Insertion sequence IS5376 putative ATP-binding protein +##FASTA +>header_genome3_1 +ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGAGTCCGG +GAGCGGCGAGATTCCGGGCAGAGGAGGCGGCTATAGCGGATGAGAGCGCTACTGTGGCTG +GTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAG +GGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTG +GTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAAC +GTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATC +TGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGC +GCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATG +TCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCG +TTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCG +GATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCG +GCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGC +GCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTAT +GAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATG +CACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAA 
+GCATTACTGGAGAAGTACGGCCAGGATTAACAGGGCGCGGTATCGCGAGGACTCTCTCTT +CGAGGACATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTT +ACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGG +CGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGA +TTACGCGTTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCA +GGAAGTCGGCGAAATGCTGTTCGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAA +CCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCAACGCTCAATTACAAACA +AAACAACGTACACGCGTTATTTAAATAGCCCGAGTATTAGCGCGCGGCGGCGCTATAGGC +GCGCTATAGGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTG +TGTTACCGATTTGGCTGTGGCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGC +AAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGACGCGCAACGTATGCGCGAGCGCA +TTCAGGCGCTGGAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTAACGAG +AGTCTCGGAGGAGCGGCGCTCTGGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTG +ACGGCAGGTTCTGCGCTTGCTGTTCTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGT +GAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTC +AAACAGGCCTTCGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGGTGTGTGAAAAT +ATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGT +GGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGC +GGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTG +GCGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTGGTAAGGAACCGCAGGTG +ATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGC +CAGGGATTTCATAACCAAATGGACGGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTA +CAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGATATCCTGATTTCCGAT +CATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGG +CTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACTTTGTG +GTGGCCAATATTACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACAT +TTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGT +ATTGATAACGCAACGATCGCAATTTATGGCTGTGATAATTTCGTCATTGATAATATTGAT +ATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCA +ATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAA +TTACGCGGCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTA +CGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTCCTGCGTAAT 
+ATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGT +AAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTT +CATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACC +GTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACGCGTCTGCGCG +CGCGGAGAGAGGCTCTCAGGACATGTCAGAAAATAAATTACACGTTATCGATTTGCACAA +ACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGAGATGT +GATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTT +CCTCGAAAAACCGAGCGAAGACGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCG +CGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCG +CCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGT +GATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTT +GAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCT +CTCCGGTGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGT +TTTACTGTTCGATGAACCCACTTCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCG +CATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGG +CTTCGTTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGA +GGGCGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAA +AGGCTCGCTGAAATAACCGGATGCGCGCGATATATCGGCGATGCGTGTTTATTGGCATCG +TTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAG +TAAAAAATGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGC +ACCGTACCGTGGACGATCGTCCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAAC +CCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTT +ATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATC +AGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCG +AAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAA +TGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGAGTTCTGGGGCATGAAGCATCAG +CAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAG +TGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTC +GCTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAA +ACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCAC +AACAGCAGCATAAACATGATGGGATGGCATAGCCGGCTCTGCGAGAGAGGAGCGCTCGCA +TGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATATAGCG 
+CGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACAC +ACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGA +CGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCG +GGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA +>genome3_plasmi_2 +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTA +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACGCGTAGCGGCCGGAATC +TTCTCGGAGAGGCGCTTCTCTCTCGGAGATGATTGACCCTATTTTTGCGTCCTGTACGCT +AATTGCCGTCTTTGTTGTTTTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTAT +CGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTCGATATTTCGATGTTCGCCACCGC +GCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTCGTTTT +GTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGATCAATTTTGCCAAACT +GTTTACTGGCAAACTGCCCGGTTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTT +CGGTGCAATTTCCGGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCC +GATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGC +GCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGG +GGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGT +TGGCTGTATGCTGGTCACGCTGGTAGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCAC +CGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGTTACTGAT +CGTGATTATCGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGAT +TGCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGA +TTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGC +AACCTCTTCGGCGATGTCCTTCTCAATGTCGATCACCAATATTCCTGCGGCGCTGAGCGA +TATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTT +GTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCT 
+GCTGCCGATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTGGGCATTATCATGATCTA +TAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAG +CGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGAT +TATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCATACTGTTCTTACCCCGTCT +ACTGGGCATCATGTAAACCCGATGGCGCGCAGAGGCGCGAGTTCTGGA +>genome3_plasmi_3 +CACAGGGCTTAGAGGCGCTATGGCAATGATAATTAATGGGAAATTAATTAAAGCAAAAGA +CTTAGCTAAGGCTGCAGGTGTATCTCGTTCAACAGTGATTAAATATTACGGCATTAGCCG +TGAGAATTACGAAAGGGTAGCAACTGAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATC +AGGTTTAAAATGGAAAGAAGTTGCTGAAAAAATGAACACGACAAAATATAGCGCAATTGC +ATATTATAGACGATATTTAGCATTAGAGAAAAACAAATAACAGGCGCTAAGGCGGCGATC +CTAGCGCGCGATCGCGCATGCGATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGC +CGAACAGCTCCAGCTGGACAGTCTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGT +GGATCAGGAATGGAGCTACATGGACTTCCTGGAGCACCTGTTACATGAGGAGAAACTGGC +CCGGCATCAGCGTAAACAGGCGATGTACACGCGGATGGCAGCCTTCCCGGCGGTAAAGAC +GTTCGAGGAGTACGACTTCACCTTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCT +GCGATCCCTGAGCTTCATAGAGCGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGT +GGGAAAAACGCATCTGGCGATAGCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGT +TCGCTTCACAACAGCAGCGGACCTGCTGCTACAGCTGTCCACTTCACAGCGTCAGGGCCG +TTACAAAACGACTCTCAATCGTGGTGTCATGGCCCCGAAGCTGCTTATCATCGATGAAAT +AGGTTATCTGCCGTTCAGTCAGGAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACG +TTACGAGAAGAGCGCGATGATCCTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGAC +GTTCGCCGGTGATGCAGCGCTGACATCGGCGATGCTGGACCGGATCTTACATCACTCACA +CGTCGTGCAAATAAAAGGGGAAAGCTATCGACTGAAGCAGAAACGAAAGGCCGGGGTTAT +AGCTGAAGCTAATCCTGAGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.log b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.log new file mode 100644 index 0000000000000000000000000000000000000000..5b897f7f9cddeebf15360eadd439465292a33ed6 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.log @@ -0,0 +1,121 @@ +[13:18:48] This is prokka 1.14-dev +[13:18:48] Written by Torsten 
Seemann <torsten.seemann@gmail.com> +[13:18:48] Homepage is https://github.com/tseemann/prokka +[13:18:48] Local time is Tue Feb 12 13:18:48 2019 +[13:18:48] You are aperrin +[13:18:48] Operating system is darwin +[13:18:48] You have BioPerl 1.006924 +[13:18:48] System has 8 cores. +[13:18:48] Will use maximum of 1 cores. +[13:18:48] Annotating as >>> Bacteria <<< +[13:18:48] Generating locus_tag from 'Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna' contents. +[13:18:48] Setting --locustag PFKCIMHH from MD5 9f4c2611ce3f44c6b2e6fb97579c5725 +[13:18:48] Creating new output folder: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes +[13:18:48] Running: mkdir -p Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes +[13:18:48] Using filename prefix: EXAM.1216.00002.XXX +[13:18:48] Setting HMMER_NCPU=1 +[13:18:48] Writing log to: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.log +[13:18:48] Command: /usr/local/bin/prokka --outdir Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes --cpus 1 --prefix EXAM.1216.00002 Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna +[13:18:48] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin +[13:18:48] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common +[13:18:48] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin +[13:18:48] Looking for 'aragorn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/aragorn +[13:18:48] Determined aragorn version is 1.2 +[13:18:48] Looking for 'barrnap' - found /usr/local/bin/barrnap +[13:18:48] Determined barrnap version is 0.8 +[13:18:48] Looking for 'blastp' - found /Users/aperrin/Softwares/bin/blastp +[13:18:48] Determined blastp version is 2.3 +[13:18:48] Looking for 'cmpress' - found /usr/local/bin/cmpress 
+[13:18:48] Determined cmpress version is 1.1 +[13:18:48] Looking for 'cmscan' - found /usr/local/bin/cmscan +[13:18:48] Determined cmscan version is 1.1 +[13:18:48] Looking for 'egrep' - found /usr/bin/egrep +[13:18:48] Looking for 'find' - found /usr/bin/find +[13:18:48] Looking for 'grep' - found /usr/bin/grep +[13:18:48] Looking for 'hmmpress' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/hmmpress +[13:18:48] Determined hmmpress version is 3.1 +[13:18:48] Looking for 'hmmscan' - found /usr/local/bin/hmmscan +[13:18:48] Determined hmmscan version is 3.1 +[13:18:48] Looking for 'java' - found /usr/bin/java +[13:18:48] Looking for 'less' - found /usr/bin/less +[13:18:48] Looking for 'makeblastdb' - found /Users/aperrin/Softwares/bin/makeblastdb +[13:18:48] Determined makeblastdb version is 2.3 +[13:18:48] Looking for 'minced' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common/minced +[13:18:48] Determined minced version is 2.0 +[13:18:48] Looking for 'parallel' - found /usr/local/bin/parallel +[13:18:49] Determined parallel version is 20181022 +[13:18:49] Looking for 'prodigal' - found /usr/local/bin/prodigal +[13:18:49] Determined prodigal version is 2.6 +[13:18:49] Looking for 'prokka-genbank_to_fasta_db' - found /Users/aperrin/Softwares/src/prokka/bin/prokka-genbank_to_fasta_db +[13:18:49] Looking for 'sed' - found /usr/bin/sed +[13:18:49] Looking for 'tbl2asn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/tbl2asn +[13:18:49] Determined tbl2asn version is 25.6 +[13:18:49] Using genetic code table 11. +[13:18:49] Loading and checking input file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna +[13:18:49] Wrote 3 contigs totalling 8817 bp. 
+[13:18:49] Predicting tRNAs and tmRNAs +[13:18:49] Running: aragorn -l -gc11 -w Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.fna +[13:18:49] Found 0 tRNAs +[13:18:49] Predicting Ribosomal RNAs +[13:18:49] Running Barrnap with 1 threads +[13:18:49] Found 0 rRNAs +[13:18:49] Skipping ncRNA search, enable with --rfam if desired. +[13:18:49] Total of 0 tRNA + rRNA features +[13:18:49] Searching for CRISPR repeats +[13:18:49] Found 0 CRISPRs +[13:18:49] Predicting coding sequences +[13:18:49] Contigs total 8817 bp, so using meta mode +[13:18:49] Running: prodigal -i Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.fna -c -m -g 11 -p meta -f sco -q +[13:18:49] Found 12 CDS +[13:18:49] Connecting features back to sequences +[13:18:49] Not using genus-specific database. Try --usegenus to enable it. +[13:18:49] Annotating CDS, please be patient. +[13:18:49] Will use 1 CPUs for similarity searching. 
+[13:18:49] There are still 12 unannotated CDS left (started with 12) +[13:18:49] Will use blast to search against /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot with 1 CPUs +[13:18:49] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/sprot\.faa | parallel --gnu --plain -j 1 --block 1443 --recstart '>' --pipe blastp -query - -db /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot -evalue 1e-06 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/sprot\.blast 2> /dev/null +[13:18:50] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/sprot.faa +[13:18:50] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/sprot.blast +[13:18:50] There are still 4 unannotated CDS left (started with 12) +[13:18:50] Will use hmmer3 to search against /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm with 1 CPUs +[13:18:50] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.faa | parallel --gnu --plain -j 1 --block 427 --recstart '>' --pipe hmmscan --noali --notextw --acc -E 1e-06 --cpu 1 /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm /dev/stdin > Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.hmmer3 2> /dev/null +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/HAMAP.hmm.faa +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/HAMAP.hmm.hmmer3 +[13:18:51] Labelling remaining 4 proteins as 'hypothetical protein' +[13:18:51] Possible /pseudo 'N-acetylmuramoyl-L-alanine amidase AmiD' at header_genome3_1 position 880 
+[13:18:51] Found 6 unique /gene codes. +[13:18:51] Fixed 2 duplicate /gene - amiD_1 amiD_2 +[13:18:51] Fixed 1 colliding /gene names. +[13:18:51] Adding /locus_tag identifiers +[13:18:51] Assigned 12 locus_tags to CDS and RNA features. +[13:18:51] Writing outputs to Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/ +[13:18:51] Generating annotation statistics file +[13:18:51] Generating Genbank and Sequin files +[13:18:51] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka' -Z Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.err -i Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.fsa 2> /dev/null +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/errorsummary.val +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.dr +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fixedproducts +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.ecn +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.val +[13:18:51] Repairing broken .GBK output that tbl2asn produces... 
+[13:18:51] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.gbf > Examples\/1\-res\-Annotate\/tmp_files\/genome3\-chromo\.fst\-all\.fna\-split5N\.fna\-prokkaRes\/EXAM\.1216\.00002\.gbk +[13:18:51] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gbf +[13:18:51] Output files: +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.log +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.ffn +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gbk +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fsa +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.faa +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.txt +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.gff +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tbl +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.sqn +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tsv +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.fna +[13:18:51] Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.err +[13:18:51] Annotation finished successfully. 
+[13:18:51] Walltime used: 0.05 minutes +[13:18:51] If you use this result please cite the Prokka paper: +[13:18:51] Seemann T (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-9. +[13:18:51] Type 'prokka --citation' for more details. +[13:18:51] Share and enjoy! diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.sqn b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.sqn new file mode 100644 index 0000000000000000000000000000000000000000..28bbce5bfc43d860c6f5597b8364755b32da770a --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.sqn @@ -0,0 +1,1159 @@ +Seq-entry ::= set { + class genbank , + seq-set { + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + day 12 } } , + seq-set { + seq { + id { + local + str "header_genome3_1" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 5747 , + seq-data + iupacna "ATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGTTGCTGTTAACAGGCTGCGCGA +GCGAAAAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTA +AAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGC +ATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGG 
+CCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATC +GCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAG +CGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAG +ACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGG +CGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCT +ATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGC +AATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGAG +TCCGGGAGCGGCGAGATTCCGGGCAGAGGAGGCGGCTATAGCGGATGAGAGCGCTACTGTGGCTGGTGGGTCTTGCGT +TGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATGAAGAGGGATATCAGCTTGATACCCGACATCGGGCGC +AGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAA +CGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAAC +TGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTA +TTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAA +TTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGG +ATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCACAGGGGATTGGCGCCT +GGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTG +CGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAACAGCGGGTGATTATGGCGTTCC +AGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGA +AGTACGGCCAGGATTAACAGGGCGCGGTATCGCGAGGACTCTCTCTTCGAGGACATGCCGCACTTTATTGCTGAATGT +ACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATT +TTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTACGCG +TTTGTGCATATGACGCTGAAAATCGGCGCCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTCGGG +CTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGTTACATCCA +ACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCCGAGTATTAGCGCGCGGCGGCGCTATAGGCG +CGCTATAGGCATGAGCGCGCTATTTCTGGCCATCCCGTTAACCATTTTTGTGTTGTTTGTGTTACCGATTTGGCTGTG 
+GCTGCATTACAGCAACCGCGCCGGTCGGGGAGAACTGTCGCAAAGCGAGCAGCAACGCTTACTGCAACTCACAGACGA +CGCGCAACGTATGCGCGAGCGCATTCAGGCGCTGGAAGACATTCTTGATGCAGAGCATCCGAACTGGAGAGAGCGCTA +ACGAGAGTCTCGGAGGAGCGGCGCTCTGGATGCCCGCGACTAAATTCTCCCGACGTACCCTCCTGACGGCAGGTTCTG +CGCTTGCTGTTCTTCCTTTCCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATT +ATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAGACCGTGGTCGTACCGCCAGGATGGG +TGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATG +GCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGG +ATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGGCGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGTG +GTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCC +AGGGATTTCATAACCAAATGGACGGCGCGAGGATTACGCATAGCCGCTTTAGCGATTTACAGGGGGACGCCATTGAGT +GGAATGTCGCGATTCACGACCGCGATATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAA +TCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGACCAGGCAGTAAAAAACT +TTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTTGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCA +ATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCAATTTATGGCT +GTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCCGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAA +AATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCG +GCATTCAAATTTCCTCCGGCAACACCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAAC +TGCATAATCAACCGCAGCACCTCTTCCTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAA +TGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTC +ATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGA +ATTTTTCGCTGCCGAAGCGGGGAGGGTAACGCGTCTGCGCGCGCGGAGAGAGGCTCTCAGGACATGTCAGAAAATAAA +TTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGCTGCAGGCCCGCGCCGGA +GATGTGATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCG +AGCGAAGACGCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGAT +AAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTG 
+CTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATAT +CTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTTCATCTCTCCGGTGGCCAACAGCAGCGCGTT +TCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACTTCGGCGCTCGATCCTGAACTG +GTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGC +TTCGTTCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTG +TTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACCGGATGCGCGCGATATATCG +GCGATGCGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCG +GGCAGTAAAAAATGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCACGCATGACCGGCACCGTACCGTGGA +CGATCGTCCTTACGGCGGCGGACCAGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGC +CGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCT +GGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGA +CGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCG +GTTTATACCGGGAGTTCTGGGGCATGAAGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCA +CTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCTGAGATACGTCG +CTGGCGTTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGA +GCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCCGGC +TCTGCGAGAGAGGAGCGCTCGCATGAATAATCATTTTGGGAAAGGGTTAATGGCCGGGTTGCACGCGCCATATGCATA +TAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGA +AAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCTGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGT +GATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGCCGGTTATCGACTCGAAGGTTGA" } } , + seq { + id { + local + str "header_genome3_1_1" } , + descr { + title "N-acetylmuramoyl-L-alanine amidase AmiD [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 276 , + seq-data + ncbieaa "MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAEN 
+FDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIELENRGWRMSGGVKS +FAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPRFPWRELAAQGIGAWPDAQRVAFYLAGRAPYT +PVDTATVLALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD" } , + annot { + { + data + ftable { + { + id + local + id 9 , + data + prot { + name { + "N-acetylmuramoyl-L-alanine amidase AmiD" } , + ec { + "3.5.1.28" } } , + location + int { + from 0 , + to 275 , + id + local + str "header_genome3_1_1" } } } } } } , + seq { + id { + local + str "header_genome3_1_2" } , + descr { + title "N-acetylmuramoyl-L-alanine amidase AmiD [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 276 , + seq-data + ncbieaa "MRALLWLVGLALLLTGCASEKGIIDEEGYQLDTRHRAQAAYPRIKVLVIHYTAEN +FDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIELENRGWRMSGGVKS +FAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPRFPWRELAAQGIGAWPDAQRVAFYLAGRAPYT +PVDTATVLALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD" } , + annot { + { + data + ftable { + { + id + local + id 10 , + data + prot { + name { + "N-acetylmuramoyl-L-alanine amidase AmiD" } , + ec { + "3.5.1.28" } } , + location + int { + from 0 , + to 275 , + id + local + str "header_genome3_1_2" } } } } } } , + seq { + id { + local + str "header_genome3_1_3" } , + descr { + title "5-carboxymethyl-2-hydroxymuconate Delta-isomerase [Genus + species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 157 , + seq-data + ncbieaa "MPKRRRLPKHYWRSTARINRARYREDSLFEDMPHFIAECTENIREQADLPGLFSK +VNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYAFVHMTLKIGAGRSLESRQEVGEMLFGLIKAHFADLMENRYLA +LSFEIAELHPTLNYKQNNVHALFK" } , + annot { + { + data + ftable { + { + id + local + id 11 , + data + prot { + name { + "5-carboxymethyl-2-hydroxymuconate Delta-isomerase" } , + ec { + "5.3.3.10" } } , + location + int { + from 0 , + to 156 , + id + local + str "header_genome3_1_3" } } } } } } , + seq { + id 
{ + local + str "header_genome3_1_4" } , + descr { + title "Phage shock protein B [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 74 , + seq-data + ncbieaa "MSALFLAIPLTIFVLFVLPIWLWLHYSNRAGRGELSQSEQQRLLQLTDDAQRMRE +RIQALEDILDAEHPNWRER" } , + annot { + { + data + ftable { + { + id + local + id 12 , + data + prot { + name { + "Phage shock protein B" } } , + location + int { + from 0 , + to 73 , + id + local + str "header_genome3_1_4" } } } } } } , + seq { + id { + local + str "header_genome3_1_5" } , + descr { + title "hypothetical protein PFKCIMHH_00005 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 467 , + seq-data + ncbieaa "MPATKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQA +FADGQTVVVPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTLDVRGSDCVIKGVA +MSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMDGARITHSRFSDLQGDAIEWNVAIHDRDILIS +DHVIERIDCTNGKINWGIGIGLAGSTYDNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSK +NAGIDNATIAIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISSGNTPSFV +AITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFMARQDTLLSLANVHAINENGQSSVDI +DRINHQTVNVEAVNFSLPKRGG" } , + annot { + { + data + ftable { + { + id + local + id 13 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 466 , + id + local + str "header_genome3_1_5" } } } } } } , + seq { + id { + local + str "header_genome3_1_6" } , + descr { + title "Histidine transport ATP-binding protein HisP [Genus + species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 257 , + seq-data + ncbieaa "MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLE +KPSEDAIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQVLGLSKHDARERAL +KYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPTSALDPELVGEVLRIMQQLAEEGKTMVVVTHE 
+MGFVRHVSSHVIFLHQGKIEEEGDPEQVFGNPQSPRLQQFLKGSLK" } , + annot { + { + data + ftable { + { + id + local + id 14 , + data + prot { + name { + "Histidine transport ATP-binding protein HisP" } } , + location + int { + from 0 , + to 256 , + id + local + str "header_genome3_1_6" } } } } } } , + seq { + id { + local + str "header_genome3_1_7" } , + descr { + title "tRNA (guanine-N(1)-)-methyltransferase [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 263 , + seq-data + ncbieaa "MRAIYRRCVFIGIVSLFPEMFRAITDYGVTGRAVKNGLLNIQSWSPRDFTHDRHR +TVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDERVIQT +EIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAE +IRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA" } , + annot { + { + data + ftable { + { + id + local + id 15 , + data + prot { + name { + "tRNA (guanine-N(1)-)-methyltransferase" } , + ec { + "2.1.1.228" } } , + location + int { + from 0 , + to 262 , + id + local + str "header_genome3_1_7" } } } } } } , + seq { + id { + local + str "header_genome3_1_8" } , + descr { + title "hypothetical protein PFKCIMHH_00008 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 95 , + seq-data + ncbieaa "MNNHFGKGLMAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAW +EAGILTRRYGLDKEMVMDFFKENHSGMAVRFFMAGYRLEG" } , + annot { + { + data + ftable { + { + id + local + id 16 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 94 , + id + local + str "header_genome3_1_8" } } } } } } } , + annot { + { + data + ftable { + { + id + local + id 1 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "header_genome3_1_1" , + location + int { + from 0 , + to 830 , + strand plus , + id + local + str "header_genome3_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + 
qual "inference" , + val "similar to AA sequence:UniProtKB:P75820" } } , + xref { + { + data + gene { + locus "amiD_1" , + locus-tag "PFKCIMHH_00001" } } } } , + { + id + local + id 2 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "header_genome3_1_2" , + location + int { + from 879 , + to 1709 , + strand plus , + id + local + str "header_genome3_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P75820" } } , + xref { + { + data + gene { + locus "amiD_2" , + locus-tag "PFKCIMHH_00002" } } } } , + { + id + local + id 3 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "header_genome3_1_3" , + location + int { + from 1654 , + to 2127 , + strand plus , + id + local + str "header_genome3_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:Q05354" } } , + xref { + { + data + gene { + locus "hpcD" , + locus-tag "PFKCIMHH_00003" } } } } , + { + id + local + id 4 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "header_genome3_1_4" , + location + int { + from 2171 , + to 2395 , + strand plus , + id + local + str "header_genome3_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P0AFM9" } } , + xref { + { + data + gene { + locus "pspB" , + locus-tag "PFKCIMHH_00004" } } } } , + { + id + local + id 5 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "header_genome3_1_5" , + location + int { + from 2424 , + to 3827 , + strand plus , + id + local + str "header_genome3_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "PFKCIMHH_00005" } 
} } } , + { + id + local + id 6 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "header_genome3_1_6" , + location + int { + from 3862 , + to 4635 , + strand plus , + id + local + str "header_genome3_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P02915" } } , + xref { + { + data + gene { + locus "hisP" , + locus-tag "PFKCIMHH_00006" } } } } , + { + id + local + id 7 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "header_genome3_1_7" , + location + int { + from 4640 , + to 5431 , + strand plus , + id + local + str "header_genome3_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P0A873" } } , + xref { + { + data + gene { + locus "trmD" , + locus-tag "PFKCIMHH_00007" } } } } , + { + id + local + id 8 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "header_genome3_1_8" , + location + int { + from 5459 , + to 5746 , + strand plus , + id + local + str "header_genome3_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "PFKCIMHH_00008" } } } } } } } } , + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + 
day 12 } } , + seq-set { + seq { + id { + local + str "genome3_plasmi_2" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 1968 , + seq-data + iupacna "ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAG +CGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTTGAGATGATCGTCCCTCAACTACCGCCGTATCCTG +CCGATGCGGCGGAGTTGCTGGAGTCTCTCGTGCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGG +GTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTG +AATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTT +ATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAG +TGCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCA +CGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACGCGTAGCGGCCGGAATCT +TCTCGGAGAGGCGCTTCTCTCTCGGAGATGATTGACCCTATTTTTGCGTCCTGTACGCTAATTGCCGTCTTTGTTGTT +TTACTGGCCATGGGCGCGCCTATCGGGATCTGCATCGTTATCGCCTCTTTCAGCACCATGATGCTGGTACTGCCTTTC +GATATTTCGATGTTCGCCACCGCGCAAAAAATGTTCTCCAGCCTGGACAGTTTTGCCTTGCTGGCCGTGCCGTTCTTC +GTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGATCAATTTTGCCAAACTGTTTACTGGCAAA +CTGCCCGGTTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCCGCC +TCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAAT +ATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACA +TCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTA +GTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATT +CCCAGCCTGTTACTGATCGTGATTATCGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATT +GCCGTGGTGTATACGTTATTGCTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTC +CAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCAATGTCGATCACC +AATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTC +TTTTTGTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATG 
+GCTAAACTGGGCGTCGATCCGGTGCATTTGGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCGCCA +GTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTT +TACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCATACTGTTCTTACCCCGTCTACTGGGC +ATCATGTAAACCCGATGGCGCGCAGAGGCGCGAGTTCTGGA" } } , + seq { + id { + local + str "genome3_plasmi_2_1" } , + descr { + title "hypothetical protein PFKCIMHH_00009 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 193 , + seq-data + ncbieaa "MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLV +LEHGGAPLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESRHIYDLKVMQIDPL +EAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDYFNQIVDFLGLHSC" } , + annot { + { + data + ftable { + { + id + local + id 19 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 192 , + id + local + str "genome3_plasmi_2_1" } } } } } } , + seq { + id { + local + str "genome3_plasmi_2_2" } , + descr { + title "C4-dicarboxylate TRAP transporter large permease protein + DctM [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 435 , + seq-data + ncbieaa "MIDPIFASCTLIAVFVVLLAMGAPIGICIVIASFSTMMLVLPFDISMFATAQKMF +SSLDSFALLAVPFFVLSGVIMNSGGIAARLINFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIAASTSIGGVMVPMS +AREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFAGGLVAGVLWGVGCMLVTLVVAKRRNYRVFFT +VQKGMALKVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFL +LATSSAMSFSMSITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGVDPVHLGI +IMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITYIPEIILFLPRLLGIM" } , + annot { + { + data + ftable { + { + id + local + id 20 , + data + prot { + name { + "C4-dicarboxylate TRAP transporter large permease + protein DctM" } } , + location + int { + from 0 , + to 434 , + id + local + str "genome3_plasmi_2_2" } } } } } } } , + annot { + { + data + ftable { + { + id + local + 
id 17 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome3_plasmi_2_1" , + location + int { + from 0 , + to 581 , + strand plus , + id + local + str "genome3_plasmi_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "PFKCIMHH_00009" } } } } , + { + id + local + id 18 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome3_plasmi_2_2" , + location + int { + from 628 , + to 1935 , + strand plus , + id + local + str "genome3_plasmi_2" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:O07838" } } , + xref { + { + data + gene { + locus "dctM" , + locus-tag "PFKCIMHH_00010" } } } } } } } } , + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + day 12 } } , + seq-set { + seq { + id { + local + str "genome3_plasmi_3" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 1102 , + seq-data + iupacna "CACAGGGCTTAGAGGCGCTATGGCAATGATAATTAATGGGAAATTAATTAAAGCA +AAAGACTTAGCTAAGGCTGCAGGTGTATCTCGTTCAACAGTGATTAAATATTACGGCATTAGCCGTGAGAATTACGAA +AGGGTAGCAACTGAAAGAAGGAAGCTTGCTTTTGAACTAAGAGCATCAGGTTTAAAATGGAAAGAAGTTGCTGAAAAA +ATGAACACGACAAAATATAGCGCAATTGCATATTATAGACGATATTTAGCATTAGAGAAAAACAAATAACAGGCGCTA 
+AGGCGGCGATCCTAGCGCGCGATCGCGCATGCGATGGTCGAACTGCAACATCAACGGCTGATGGTGCTTGCCGAACAG +CTCCAGCTGGACAGTCTTATCGGCGCAGCGCCGGCGCTGTCGCAACAGGCGGTGGATCAGGAATGGAGCTACATGGAC +TTCCTGGAGCACCTGTTACATGAGGAGAAACTGGCCCGGCATCAGCGTAAACAGGCGATGTACACGCGGATGGCAGCC +TTCCCGGCGGTAAAGACGTTCGAGGAGTACGACTTCACCTTCGCCACCGGCGCTCCTCAGAAGCAAATCCAGTCGCTG +CGATCCCTGAGCTTCATAGAGCGTAACGAAAACATCGTGTTGCTGGGGCCATCGGGCGTGGGAAAAACGCATCTGGCG +ATAGCCATGGGCTACGAAGCAGTACGGGCGGGCATCAAGGTTCGCTTCACAACAGCAGCGGACCTGCTGCTACAGCTG +TCCACTTCACAGCGTCAGGGCCGTTACAAAACGACTCTCAATCGTGGTGTCATGGCCCCGAAGCTGCTTATCATCGAT +GAAATAGGTTATCTGCCGTTCAGTCAGGAGGAAGCCAAGCTGTTCTTCCAGGTCATCGCCAAACGTTACGAGAAGAGC +GCGATGATCCTGACCTCCAACCTGCCGTTCGGGCAGTGGGATCAGACGTTCGCCGGTGATGCAGCGCTGACATCGGCG +ATGCTGGACCGGATCTTACATCACTCACACGTCGTGCAAATAAAAGGGGAAAGCTATCGACTGAAGCAGAAACGAAAG +GCCGGGGTTATAGCTGAAGCTAATCCTGAGTAA" } } , + seq { + id { + local + str "genome3_plasmi_3_1" } , + descr { + title "hypothetical protein PFKCIMHH_00011 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 84 , + seq-data + ncbieaa "MIINGKLIKAKDLAKAAGVSRSTVIKYYGISRENYERVATERRKLAFELRASGLK +WKEVAEKMNTTKYSAIAYYRRYLALEKNK" } , + annot { + { + data + ftable { + { + id + local + id 23 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 83 , + id + local + str "genome3_plasmi_3_1" } } } } } } , + seq { + id { + local + str "genome3_plasmi_3_2" } , + descr { + title "Insertion sequence IS5376 putative ATP-binding protein + [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 259 , + seq-data + ncbieaa "MVELQHQRLMVLAEQLQLDSLIGAAPALSQQAVDQEWSYMDFLEHLLHEEKLARH +QRKQAMYTRMAAFPAVKTFEEYDFTFATGAPQKQIQSLRSLSFIERNENIVLLGPSGVGKTHLAIAMGYEAVRAGIKV +RFTTAADLLLQLSTSQRQGRYKTTLNRGVMAPKLLIIDEIGYLPFSQEEAKLFFQVIAKRYEKSAMILTSNLPFGQWD +QTFAGDAALTSAMLDRILHHSHVVQIKGESYRLKQKRKAGVIAEANPE" } , + annot { + { + data + ftable { + { + id + local + id 
24 , + data + prot { + name { + "Insertion sequence IS5376 putative ATP-binding + protein" } } , + location + int { + from 0 , + to 258 , + id + local + str "genome3_plasmi_3_2" } } } } } } } , + annot { + { + data + ftable { + { + id + local + id 21 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome3_plasmi_3_1" , + location + int { + from 25 , + to 279 , + strand plus , + id + local + str "genome3_plasmi_3" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "PFKCIMHH_00011" } } } } , + { + id + local + id 22 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "genome3_plasmi_3_2" , + location + int { + from 322 , + to 1101 , + strand plus , + id + local + str "genome3_plasmi_3" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:Q45619" } } , + xref { + { + data + gene { + locus-tag "PFKCIMHH_00012" } } } } } } } } } } diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tbl b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tbl new file mode 100644 index 0000000000000000000000000000000000000000..834efec1575e3d9d3f0940ee9959c9e666791d43 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tbl @@ -0,0 +1,75 @@ +>Feature header_genome3_1 +1 831 CDS + EC_number 3.5.1.28 + dbxref COG:COG3023 + gene amiD_1 + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P75820 + locus_tag PFKCIMHH_00001 + product N-acetylmuramoyl-L-alanine amidase AmiD +880 1710 CDS + EC_number 3.5.1.28 + dbxref COG:COG3023 + gene amiD_2 + inference ab initio prediction:Prodigal:2.6 + inference similar to AA 
sequence:UniProtKB:P75820 + locus_tag PFKCIMHH_00002 + product N-acetylmuramoyl-L-alanine amidase AmiD +1655 2128 CDS + EC_number 5.3.3.10 + gene hpcD + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:Q05354 + locus_tag PFKCIMHH_00003 + product 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +2172 2396 CDS + gene pspB + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P0AFM9 + locus_tag PFKCIMHH_00004 + product Phage shock protein B +2425 3828 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag PFKCIMHH_00005 + product hypothetical protein +3863 4636 CDS + dbxref COG:COG4598 + gene hisP + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P02915 + locus_tag PFKCIMHH_00006 + product Histidine transport ATP-binding protein HisP +4641 5432 CDS + EC_number 2.1.1.228 + dbxref COG:COG0336 + gene trmD + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P0A873 + locus_tag PFKCIMHH_00007 + product tRNA (guanine-N(1)-)-methyltransferase +5460 5747 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag PFKCIMHH_00008 + product hypothetical protein +>Feature genome3_plasmi_2 +1 582 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag PFKCIMHH_00009 + product hypothetical protein +629 1936 CDS + dbxref COG:COG1593 + gene dctM + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:O07838 + locus_tag PFKCIMHH_00010 + product C4-dicarboxylate TRAP transporter large permease protein DctM +>Feature genome3_plasmi_3 +26 280 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag PFKCIMHH_00011 + product hypothetical protein +323 1102 CDS + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:Q45619 + locus_tag PFKCIMHH_00012 + product Insertion sequence IS5376 putative ATP-binding protein diff --git 
a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tsv b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tsv new file mode 100644 index 0000000000000000000000000000000000000000..0afd4f8ff32399cb6040568faebd0a2145f3676b --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.tsv @@ -0,0 +1,13 @@ +locus_tag ftype length_bp gene EC_number COG product +PFKCIMHH_00001 CDS 831 amiD_1 3.5.1.28 COG3023 N-acetylmuramoyl-L-alanine amidase AmiD +PFKCIMHH_00002 CDS 831 amiD_2 3.5.1.28 COG3023 N-acetylmuramoyl-L-alanine amidase AmiD +PFKCIMHH_00003 CDS 474 hpcD 5.3.3.10 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +PFKCIMHH_00004 CDS 225 pspB Phage shock protein B +PFKCIMHH_00005 CDS 1404 hypothetical protein +PFKCIMHH_00006 CDS 774 hisP COG4598 Histidine transport ATP-binding protein HisP +PFKCIMHH_00007 CDS 792 trmD 2.1.1.228 COG0336 tRNA (guanine-N(1)-)-methyltransferase +PFKCIMHH_00008 CDS 288 hypothetical protein +PFKCIMHH_00009 CDS 582 hypothetical protein +PFKCIMHH_00010 CDS 1308 dctM COG1593 C4-dicarboxylate TRAP transporter large permease protein DctM +PFKCIMHH_00011 CDS 255 hypothetical protein +PFKCIMHH_00012 CDS 780 Insertion sequence IS5376 putative ATP-binding protein diff --git a/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.txt b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.txt new file mode 100644 index 0000000000000000000000000000000000000000..89b53831ab4386399ff71a68e8a46275f9d39114 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome3-chromo.fst-all.fna-split5N.fna-prokkaRes/EXAM.1216.00002.txt @@ -0,0 +1,4 @@ +organism: Genus species strain +contigs: 3 +bases: 8817 +CDS: 12 diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna 
b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna new file mode 100644 index 0000000000000000000000000000000000000000..85013c024cf3d318b654b2e8b07613ef6052c0ae --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna @@ -0,0 +1,2 @@ +>g4_1 +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAAAAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGGCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAGCAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATCGCGCGCATATCGCGCGCGATAGATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTATGCGTTTGTGCATATGACGCTGAAAATCGGTACCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGCATATGCGGCGCTCAGGAGCGCTAGCATGCTCGATAAACAGACCCATACCCTGATCGCCCAGCGACTTAATCAGGCTGAAAAACAGCGTGAACAAATTCGCGCAGTGTCGCTGGATTATCCCAACATCACTATTGAAGATGCCTATGCCGTACAGCGTGAATGGGTCAATATCAAGATCGCCGAAGGGCGCACGCTCAAAGGCCACAAAATCGGCCTGACCTCAAAAGCGATGCAGGCCAGCTCGCAAATCAGCGAACCGGATTACGGCGCGCTGCTTGACGATATGTTCTTCCATGACGGCGGCGATATCCCCACCGACCGTTTTATCGTCCCGCGTATTGAAGTGGAGCTGGCGTTCGTGCTGGCGAAACCGCTGCGCGGCCCTCACTGCACGCTGTTCGACGTCTACAACGCCACGGATTATGTGATTCCGGCGCTGGAACTGATTGACGCCCGCAGCCACAACATCGACCCGGAAACCCA
GCGCCCGCGCAAAGTGTTCGACACCATTTCCGACAACGCCGCCAACGCCGGGGTGATCCTCGGTGGTCGCCCCATCAAACCAGACGAGCTGGATCTGCGCTGGATCTCCGCGCTGCTCTATCGCAACGGCGTGATCGAAGAAACCGGCGTCGCCGCAGGCGTGCTGAATCATCCGGCCAACGGCGTGGCGTGGCTGGCGAACAAGCTTGCCCCCTACGATGTCCAGCTTGAAGCCGGGCAGATCATCCTCGGCGGCTCGTTCACCCGCCCGGTGCCGGCGAGCAAGGGCGACACCTTCCATGTCGATTACGGCAACATGGGCGCGATCAGTTGCCGGTTTGTGTAACCAGATCCGCGCGCGCATATATATCGCGCGCATACGATGCCCGTGAATAAGTTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAGACCGTGGTCTTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGACGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGATCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTGGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCTGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCCGGCAACATCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACAGAGCTCGCGCGATCGCGAATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGTTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGCGCGATTATCGTGAA
CGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTCGGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACAGATACGCGCATGGCGAGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCCGAGATACGTCGCTGGCGCTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGATAGCGCTAGAGCGATGAATAATCATTTTGGGAAAGGGTTAATGGCTGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGTCGGTTATCGACTCGAAGGTTGACAGATGCGCGATCGATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTCGAGATGATCGTCCCTCAACTACCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGA
ACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACACGGATGCGCCATCGGCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATCGTCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACATTATTGTTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTGGGCGTCGATCCGGTGCATTTTGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokka.log b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokka.log new file mode 100644 index 0000000000000000000000000000000000000000..2d2168766f191824a1996d5ce2a42ff6654d9700 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokka.log @@ -0,0 +1,119 @@ +[13:18:51] This is prokka 1.14-dev +[13:18:51] Written by Torsten Seemann <torsten.seemann@gmail.com> +[13:18:51] Homepage is https://github.com/tseemann/prokka +[13:18:51] Local time is Tue Feb 12 13:18:51 2019 
+[13:18:51] You are aperrin +[13:18:51] Operating system is darwin +[13:18:51] You have BioPerl 1.006924 +[13:18:51] System has 8 cores. +[13:18:51] Will use maximum of 1 cores. +[13:18:51] Annotating as >>> Bacteria <<< +[13:18:51] Generating locus_tag from 'Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna' contents. +[13:18:51] Setting --locustag LIKMDGFN from MD5 5246d0f7b5f59e1da550241263d19049 +[13:18:51] Creating new output folder: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes +[13:18:51] Running: mkdir -p Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes +[13:18:51] Using filename prefix: GEN4.1111.00001.XXX +[13:18:51] Setting HMMER_NCPU=1 +[13:18:51] Writing log to: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.log +[13:18:51] Command: /usr/local/bin/prokka --outdir Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes --cpus 1 --prefix GEN4.1111.00001 Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna +[13:18:51] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin +[13:18:51] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common +[13:18:51] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin +[13:18:51] Looking for 'aragorn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/aragorn +[13:18:51] Determined aragorn version is 1.2 +[13:18:51] Looking for 'barrnap' - found /usr/local/bin/barrnap +[13:18:51] Determined barrnap version is 0.8 +[13:18:51] Looking for 'blastp' - found /Users/aperrin/Softwares/bin/blastp +[13:18:51] Determined blastp version is 2.3 +[13:18:51] Looking for 'cmpress' - found /usr/local/bin/cmpress +[13:18:51] Determined cmpress version is 1.1 +[13:18:51] Looking for 'cmscan' - found /usr/local/bin/cmscan +[13:18:51] Determined cmscan version is 1.1 +[13:18:51] Looking for 'egrep' - found /usr/bin/egrep +[13:18:51] Looking for 'find' - 
found /usr/bin/find +[13:18:51] Looking for 'grep' - found /usr/bin/grep +[13:18:51] Looking for 'hmmpress' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/hmmpress +[13:18:51] Determined hmmpress version is 3.1 +[13:18:51] Looking for 'hmmscan' - found /usr/local/bin/hmmscan +[13:18:51] Determined hmmscan version is 3.1 +[13:18:51] Looking for 'java' - found /usr/bin/java +[13:18:51] Looking for 'less' - found /usr/bin/less +[13:18:51] Looking for 'makeblastdb' - found /Users/aperrin/Softwares/bin/makeblastdb +[13:18:51] Determined makeblastdb version is 2.3 +[13:18:51] Looking for 'minced' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common/minced +[13:18:51] Determined minced version is 2.0 +[13:18:51] Looking for 'parallel' - found /usr/local/bin/parallel +[13:18:52] Determined parallel version is 20181022 +[13:18:52] Looking for 'prodigal' - found /usr/local/bin/prodigal +[13:18:52] Determined prodigal version is 2.6 +[13:18:52] Looking for 'prokka-genbank_to_fasta_db' - found /Users/aperrin/Softwares/src/prokka/bin/prokka-genbank_to_fasta_db +[13:18:52] Looking for 'sed' - found /usr/bin/sed +[13:18:52] Looking for 'tbl2asn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/tbl2asn +[13:18:52] Determined tbl2asn version is 25.6 +[13:18:52] Using genetic code table 11. +[13:18:52] Loading and checking input file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna +[13:18:52] Wrote 1 contigs totalling 7134 bp. +[13:18:52] Predicting tRNAs and tmRNAs +[13:18:52] Running: aragorn -l -gc11 -w Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.fna +[13:18:52] Found 0 tRNAs +[13:18:52] Predicting Ribosomal RNAs +[13:18:52] Running Barrnap with 1 threads +[13:18:52] Found 0 rRNAs +[13:18:52] Skipping ncRNA search, enable with --rfam if desired. 
+[13:18:52] Total of 0 tRNA + rRNA features +[13:18:52] Searching for CRISPR repeats +[13:18:52] Found 0 CRISPRs +[13:18:52] Predicting coding sequences +[13:18:52] Contigs total 7134 bp, so using meta mode +[13:18:52] Running: prodigal -i Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.fna -c -m -g 11 -p meta -f sco -q +[13:18:52] Found 9 CDS +[13:18:52] Connecting features back to sequences +[13:18:52] Not using genus-specific database. Try --usegenus to enable it. +[13:18:52] Annotating CDS, please be patient. +[13:18:52] Will use 1 CPUs for similarity searching. +[13:18:52] There are still 9 unannotated CDS left (started with 9) +[13:18:52] Will use blast to search against /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot with 1 CPUs +[13:18:52] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/sprot\.faa | parallel --gnu --plain -j 1 --block 1161 --recstart '>' --pipe blastp -query - -db /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot -evalue 1e-06 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/sprot\.blast 2> /dev/null +[13:18:53] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/sprot.faa +[13:18:53] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/sprot.blast +[13:18:53] There are still 3 unannotated CDS left (started with 9) +[13:18:53] Will use hmmer3 to search against /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm with 1 CPUs +[13:18:53] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.faa | parallel --gnu --plain -j 1 --block 379 --recstart '>' --pipe hmmscan --noali --notextw --acc -E 1e-06 --cpu 1 /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm /dev/stdin > 
Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.hmmer3 2> /dev/null +[13:18:53] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/HAMAP.hmm.faa +[13:18:53] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/HAMAP.hmm.hmmer3 +[13:18:53] Labelling remaining 3 proteins as 'hypothetical protein' +[13:18:53] Found 6 unique /gene codes. +[13:18:53] Fixed 0 colliding /gene names. +[13:18:53] Adding /locus_tag identifiers +[13:18:53] Assigned 9 locus_tags to CDS and RNA features. +[13:18:53] Writing outputs to Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/ +[13:18:53] Generating annotation statistics file +[13:18:53] Generating Genbank and Sequin files +[13:18:53] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka' -Z Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.err -i Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.fsa 2> /dev/null +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/errorsummary.val +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.dr +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fixedproducts +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.ecn +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.val +[13:18:54] Repairing broken .GBK output that tbl2asn produces... 
+[13:18:54] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.gbf > Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.gbk +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gbf +[13:18:54] Output files: +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fsa +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.faa +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.log +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.ffn +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gbk +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gff +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tbl +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.sqn +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tsv +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fna +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.err +[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.txt +[13:18:54] Annotation finished successfully. +[13:18:54] Walltime used: 0.05 minutes +[13:18:54] If you use this result please cite the Prokka paper: +[13:18:54] Seemann T (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-9. +[13:18:54] Type 'prokka --citation' for more details. +[13:18:54] Share and enjoy! 
diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.err b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.err new file mode 100644 index 0000000000000000000000000000000000000000..f57bb0e61a4773385690b5e8c9f095124859cdf6 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.err @@ -0,0 +1,102 @@ +Discrepancy Report Results + +Summary +FATAL: MISSING_PROTEIN_ID:9 proteins have invalid IDs. +DISC_SOURCE_QUALS_ASNDISC:strain (all present, all unique) +DISC_SOURCE_QUALS_ASNDISC:taxname (all present, all unique) +DISC_FEATURE_COUNT:CDS: 9 present +DISC_COUNT_NUCLEOTIDES:1 nucleotide Bioseqs are present +FEATURE_LOCATION_CONFLICT:9 features have inconsistent gene locations. +DISC_QUALITY_SCORES:Quality scores are missing on all sequences. +ONCALLER_COMMENT_PRESENT:1 comment descriptors were found (all same) +MISSING_GENOMEASSEMBLY_COMMENTS:1 bioseqs are missing GenomeAssembly structured comments +MOLTYPE_NOT_MRNA:1 molecule types are not set as mRNA. +TECHNIQUE_NOT_TSA:1 technique are not set as TSA +MISSING_STRUCTURED_COMMENT:1 sequences do not include structured comments. +MISSING_PROJECT:10 sequences do not include project. +DISC_INCONSISTENT_MOLINFO_TECH:Molinfo Technique Report (some missing, all same) + + +Detailed Report + +FATAL: DiscRep_ALL:MISSING_PROTEIN_ID::9 proteins have invalid IDs. 
+Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_1 (length 276) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_2 (length 126) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_3 (length 267) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_4 (length 467) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_5 (length 257) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_6 (length 255) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_7 (length 86) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_8 (length 193) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_9 (length 360) + +DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::strain (all present, all unique) +DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::1 sources have unique values for strain +DiscRep_ALL:DISC_SOURCE_QUALS_ASNDISC::taxname (all present, all unique) +DiscRep_SUB:DISC_SOURCE_QUALS_ASNDISC::1 sources have unique values for taxname +DiscRep_ALL:DISC_FEATURE_COUNT::CDS: 9 present +DiscRep_ALL:DISC_COUNT_NUCLEOTIDES::1 nucleotide Bioseqs are present +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1 (length 7134) + +DiscRep_ALL:FEATURE_LOCATION_CONFLICT::9 features have inconsistent gene locations. 
+DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:CDS N-acetylmuramoyl-L-alanine amidase AmiD g4_1:1-831 LIKMDGFN_00001 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:CDS 5-carboxymethyl-2-hydroxymuconate Delta-isomerase g4_1:861-1241 LIKMDGFN_00002 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:CDS 2-oxo-hept-4-ene-1,7-dioate hydratase g4_1:1271-2074 LIKMDGFN_00003 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:CDS hypothetical protein g4_1:2111-3514 LIKMDGFN_00004 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:CDS Histidine transport ATP-binding protein HisP g4_1:3535-4308 LIKMDGFN_00005 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:CDS tRNA (guanine-N(1)-)-methyltransferase g4_1:4327-5094 LIKMDGFN_00006 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:CDS hypothetical protein g4_1:5138-5398 LIKMDGFN_00007 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:CDS hypothetical protein g4_1:5413-5994 LIKMDGFN_00008 + +DiscRep_SUB:FEATURE_LOCATION_CONFLICT::Coding region xref gene does not exist 
+Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:CDS C4-dicarboxylate TRAP transporter large permease protein DctM g4_1:6052-7134 LIKMDGFN_00009 + +DiscRep_ALL:DISC_QUALITY_SCORES::Quality scores are missing on all sequences. + +DiscRep_ALL:ONCALLER_COMMENT_PRESENT::1 comment descriptors were found (all same) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1:Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka + +DiscRep_ALL:MISSING_GENOMEASSEMBLY_COMMENTS::1 bioseqs are missing GenomeAssembly structured comments +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1 (length 7134) + +DiscRep_ALL:MOLTYPE_NOT_MRNA::1 molecule types are not set as mRNA. +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1 (length 7134) + +DiscRep_ALL:TECHNIQUE_NOT_TSA::1 technique are not set as TSA +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1 (length 7134) + +DiscRep_ALL:MISSING_STRUCTURED_COMMENT::1 sequences do not include structured comments. +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1 (length 7134) + +DiscRep_ALL:MISSING_PROJECT::10 sequences do not include project. 
+Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1 (length 7134) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_1 (length 276) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_2 (length 126) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_3 (length 267) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_4 (length 467) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_5 (length 257) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_6 (length 255) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_7 (length 86) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_8 (length 193) +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1_9 (length 360) + +DiscRep_ALL:DISC_INCONSISTENT_MOLINFO_TECH::Molinfo Technique Report (some missing, all same) +DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::technique (all missing) +DiscRep_SUB:DISC_INCONSISTENT_MOLINFO_TECH::1 Molinfos are missing field technique +Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001:g4_1 (length 7134) + diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.faa b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.faa new file mode 100644 index 0000000000000000000000000000000000000000..a9dac9220a5843883e83e742358b8227b5d843ff --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.faa @@ -0,0 +1,52 @@ +>LIKMDGFN_00001 N-acetylmuramoyl-L-alanine amidase AmiD +MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAENFDVSL 
+ATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIEL +ENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPR +FPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATVLALLSRYGYEVKADMTAREQQR +VIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD +>LIKMDGFN_00002 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADGKHDYA +FVHMTLKIGTGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNN +VHALFK +>LIKMDGFN_00003 2-oxo-hept-4-ene-1,7-dioate hydratase +MLDKQTHTLIAQRLNQAEKQREQIRAVSLDYPNITIEDAYAVQREWVNIKIAEGRTLKGH +KIGLTSKAMQASSQISEPDYGALLDDMFFHDGGDIPTDRFIVPRIEVELAFVLAKPLRGP +HCTLFDVYNATDYVIPALELIDARSHNIDPETQRPRKVFDTISDNAANAGVILGGRPIKP +DELDLRWISALLYRNGVIEETGVAAGVLNHPANGVAWLANKLAPYDVQLEAGQIILGGSF +TRPVPASKGDTFHVDYGNMGAISCRFV +>LIKMDGFN_00004 hypothetical protein +MPVNKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQAFADGQ +TVVLPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTL +DVRGSDCVIKGVTMSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMD +GARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERIDCTNGKINWGIGIGLAGSTYDN +SYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSKNAGIDNATIAI +YGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISS +GNIPSFVAITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFM +ARQDTLLSLANVHAINENGQSSVDIDRINHQTVNVEAVNFSLPKRGG +>LIKMDGFN_00005 Histidine transport ATP-binding protein HisP +MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLEKPSEG +AIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQV +LGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPT +SALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHVSSHVIFLHQGKIEEEGDPEQVF +GNPQSPRLQQFLKGSLK +>LIKMDGFN_00006 tRNA (guanine-N(1)-)-methyltransferase +MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYGGGPGM +LMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDE +RVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPH +YTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWLRRPELLENLALTEEQARLLAEF +KTEHAQQQHKHDGMA +>LIKMDGFN_00007 hypothetical protein 
+MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRYGLDKE +MVMDFFKENHSGMAVRFFMVGYRLEG +>LIKMDGFN_00008 hypothetical protein +MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLVLEHGG +APLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESR +HIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDY +FNQIVDFLGLHSC +>LIKMDGFN_00009 C4-dicarboxylate TRAP transporter large permease protein DctM +MNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIAASTSIGGVMVPMSAR +EGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFAGGLVAGVLWGVGCML +VTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVY +TLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATSSAMSFSMSITNIPAALSDMILG +ISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGVDPVHFGIIMIYNLAI +GTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITYIPEITLFLPRLLGIM diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.ffn b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.ffn new file mode 100644 index 0000000000000000000000000000000000000000..40721d61f9f565ea758dd19633045b3da0e82ace --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.ffn @@ -0,0 +1,128 @@ +>LIKMDGFN_00001 N-acetylmuramoyl-L-alanine amidase AmiD +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG 
+TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAGCAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAA +>LIKMDGFN_00002 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +ATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTTACCCGGC +CTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATC +CGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTATGCG +TTTGTGCATATGACGCTGAAAATCGGTACCGGGCGCAGCCTGGAGAGCCGTCAGGAAGTC +GGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGATGGAGAACCGCTAT +CTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAAC +GTACACGCGTTATTTAAATAG +>LIKMDGFN_00003 2-oxo-hept-4-ene-1,7-dioate hydratase +ATGCTCGATAAACAGACCCATACCCTGATCGCCCAGCGACTTAATCAGGCTGAAAAACAG +CGTGAACAAATTCGCGCAGTGTCGCTGGATTATCCCAACATCACTATTGAAGATGCCTAT +GCCGTACAGCGTGAATGGGTCAATATCAAGATCGCCGAAGGGCGCACGCTCAAAGGCCAC +AAAATCGGCCTGACCTCAAAAGCGATGCAGGCCAGCTCGCAAATCAGCGAACCGGATTAC +GGCGCGCTGCTTGACGATATGTTCTTCCATGACGGCGGCGATATCCCCACCGACCGTTTT +ATCGTCCCGCGTATTGAAGTGGAGCTGGCGTTCGTGCTGGCGAAACCGCTGCGCGGCCCT +CACTGCACGCTGTTCGACGTCTACAACGCCACGGATTATGTGATTCCGGCGCTGGAACTG +ATTGACGCCCGCAGCCACAACATCGACCCGGAAACCCAGCGCCCGCGCAAAGTGTTCGAC +ACCATTTCCGACAACGCCGCCAACGCCGGGGTGATCCTCGGTGGTCGCCCCATCAAACCA +GACGAGCTGGATCTGCGCTGGATCTCCGCGCTGCTCTATCGCAACGGCGTGATCGAAGAA +ACCGGCGTCGCCGCAGGCGTGCTGAATCATCCGGCCAACGGCGTGGCGTGGCTGGCGAAC +AAGCTTGCCCCCTACGATGTCCAGCTTGAAGCCGGGCAGATCATCCTCGGCGGCTCGTTC +ACCCGCCCGGTGCCGGCGAGCAAGGGCGACACCTTCCATGTCGATTACGGCAACATGGGC +GCGATCAGTTGCCGGTTTGTGTAA +>LIKMDGFN_00004 hypothetical protein +ATGCCCGTGAATAAGTTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCGCTTGCTGTT +CTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATT +AAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAG +ACCGTGGTCTTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGATAACGATTCCG +GCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGGACGGTTTATT +TTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTG +GATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTC 
+GCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGAC +ATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAACCAAATGGAC +GGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGGAAT +GTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGT +ACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAAC +AGTTATCCTGAAGATCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTACCGGATCTGAT +TGCCGACAGCTGGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAATGTCAAAGCC +AAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATT +TATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCTGGGATGCTC +ATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTTTAAATTAAAC +GCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCAAATTTCCTCC +GGCAACATCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAA +CTGCATAATCAACCGCAGCACCTCTTTCTGCGTAATATCAACGTGATGCAAACTTCAGCG +ATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATG +GCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGAAAACGGGCAG +AGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAATTTT +TCGCTGCCGAAGCGGGGAGGGTAA +>LIKMDGFN_00005 Histidine transport ATP-binding protein HisP +ATGTCAGAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTG +CTGAAAGGGGTATCGTTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCC +GGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGC +GCGATTATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAA +GTGGCGGATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCAC +TTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTA +CTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGG +ATTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGC +GTTTCTATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACA +TCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAA +GAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCG +CACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTC +GGCAATCCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAA +>LIKMDGFN_00006 tRNA (guanine-N(1)-)-methyltransferase +GTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGG 
+GTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCTCGCGAC +TTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGGGGATG +TTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAA +GGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGC +GAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTAGATGAG +CGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTCAGCGGT +GGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTG +GGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCAC +TATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCGGGAAAC +CATGCCGAGATACGTCGCTGGCGCTTGAAACAGTCGCTGGGCCGAACCTGGCTTAGAAGA +CCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTC +AAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAG +>LIKMDGFN_00007 hypothetical protein +ATGGCTGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCT +GAGTATAAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGAT +CGTCAACTTAGCGCGTGGGAGGCCGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAA +ATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGTC +GGTTATCGACTCGAAGGTTGA +>LIKMDGFN_00008 hypothetical protein +ATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGC +CAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTCGAGATGATCGTCCCTCAACTA +CCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGT +GCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAA +TGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACC +GACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTGCTAGAGTCTCGC +CATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGATCTGG +CTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCC +TCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTAT +TTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGA +>LIKMDGFN_00009 C4-dicarboxylate TRAP transporter large permease protein DctM +ATGAATAGCGGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAA +CTGCCCGGCTCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCC +GGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGC 
+GAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATG +TTAATTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATT +GCCGCTCTGTTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTG +GTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGC +ATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATCGTC +GGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTAT +ACATTATTGTTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATT +TTGCTCCAGACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCG +ATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGT +ATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGC +GCATTTATGGATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATG +GCTAAACTGGGCGTCGATCCGGTGCATTTTGGCATTATCATGATCTATAACCTGGCGATT +GGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTC +AAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTG +TTATTAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATG +TAA diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fna b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fna new file mode 100644 index 0000000000000000000000000000000000000000..c9dacd184cf1469f26718f4e1d00a44270105582 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fna @@ -0,0 +1,120 @@ +>g4_1 +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG 
+GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAGCAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATCGC +GCGCATATCGCGCGCGATAGATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCG +AGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGA +TTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTG +ACGGTAAGCATGATTATGCGTTTGTGCATATGACGCTGAAAATCGGTACCGGGCGCAGCC +TGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCG +ACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGC +TCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGCATATGCGGCGCTCAG +GAGCGCTAGCATGCTCGATAAACAGACCCATACCCTGATCGCCCAGCGACTTAATCAGGC +TGAAAAACAGCGTGAACAAATTCGCGCAGTGTCGCTGGATTATCCCAACATCACTATTGA +AGATGCCTATGCCGTACAGCGTGAATGGGTCAATATCAAGATCGCCGAAGGGCGCACGCT +CAAAGGCCACAAAATCGGCCTGACCTCAAAAGCGATGCAGGCCAGCTCGCAAATCAGCGA +ACCGGATTACGGCGCGCTGCTTGACGATATGTTCTTCCATGACGGCGGCGATATCCCCAC +CGACCGTTTTATCGTCCCGCGTATTGAAGTGGAGCTGGCGTTCGTGCTGGCGAAACCGCT +GCGCGGCCCTCACTGCACGCTGTTCGACGTCTACAACGCCACGGATTATGTGATTCCGGC +GCTGGAACTGATTGACGCCCGCAGCCACAACATCGACCCGGAAACCCAGCGCCCGCGCAA +AGTGTTCGACACCATTTCCGACAACGCCGCCAACGCCGGGGTGATCCTCGGTGGTCGCCC +CATCAAACCAGACGAGCTGGATCTGCGCTGGATCTCCGCGCTGCTCTATCGCAACGGCGT +GATCGAAGAAACCGGCGTCGCCGCAGGCGTGCTGAATCATCCGGCCAACGGCGTGGCGTG +GCTGGCGAACAAGCTTGCCCCCTACGATGTCCAGCTTGAAGCCGGGCAGATCATCCTCGG +CGGCTCGTTCACCCGCCCGGTGCCGGCGAGCAAGGGCGACACCTTCCATGTCGATTACGG +CAACATGGGCGCGATCAGTTGCCGGTTTGTGTAACCAGATCCGCGCGCGCATATATATCG +CGCGCATACGATGCCCGTGAATAAGTTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGC +GCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGAC +CGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGC +CGACGGACAGACCGTGGTCTTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGAT +AACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGG +ACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAA +TGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTT 
+TGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCAT +TATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAA +CCAAATGGACGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTGCAGGGGGACGCCAT +TGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACG +CATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCAC +CTATGACAACAGTTATCCTGAAGATCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTAC +CGGATCTGATTGCCGACAGCTGGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAA +TGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAAC +GATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGC +TGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTT +TAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCA +AATTTCCTCCGGCAACATCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGC +TACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGTAATATCAACGTGATGCA +AACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGG +TCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGA +AAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGC +AGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACAGAGCTCGCGCGATCGCGAATGTCA +GAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAA +GGGGTATCGTTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCC +GGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGCGCGATT +ATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCG +GATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAAC +CTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGA +TTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGAT +GAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCT +ATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCG +CTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGC +AAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCGCACGTT +ATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTCGGCAAT +CCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACAGATACGCGCA +TGGCGAGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGAT +TACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCT 
+CGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCG +GGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCA +GGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGC +GTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTA +GATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTC +AGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGG +GTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGT +CCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCG +GGAAACCATGCCGAGATACGTCGCTGGCGCTTGAAACAGTCGCTGGGCCGAACCTGGCTT +AGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCG +GAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGATAG +CGCTAGAGCGATGAATAATCATTTTGGGAAAGGGTTAATGGCTGGGTTGCACGCGCCATA +TGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATT +GGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGC +CGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGA +GAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGTCGGTTATCGACTCGAAGGTTGACA +GATGCGCGATCGATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCG +GCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTCGAGATGATC +GTCCCTCAACTACCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTT +GAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACC +TGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTT +GAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTG +CTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCG +GACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTG +GCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGC +TTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACACGGA +TGCGCCATCGGCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGC +GGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGC +TCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCA +ATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTAC +GATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCG +CCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTG 
+TTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTG +GTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTA +AAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATCGTCGGCGGCATT +GTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACATTATTG +TTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAG +ACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTC +TCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCC +AATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATG +GATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTG +GGCGTCGATCCGGTGCATTTTGGCATTATCATGATCTATAACCTGGCGATTGGCACCATT +ACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAG +GAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATT +ACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fsa b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fsa new file mode 100644 index 0000000000000000000000000000000000000000..79f8dbb42edbbe6742fcab99338d387343c11dc1 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fsa @@ -0,0 +1,120 @@ +>g4_1 [gcode=11] [organism=Genus species] [strain=strain] +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG 
+TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAGCAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATCGC +GCGCATATCGCGCGCGATAGATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCG +AGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGA +TTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTG +ACGGTAAGCATGATTATGCGTTTGTGCATATGACGCTGAAAATCGGTACCGGGCGCAGCC +TGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCG +ACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGC +TCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGCATATGCGGCGCTCAG +GAGCGCTAGCATGCTCGATAAACAGACCCATACCCTGATCGCCCAGCGACTTAATCAGGC +TGAAAAACAGCGTGAACAAATTCGCGCAGTGTCGCTGGATTATCCCAACATCACTATTGA +AGATGCCTATGCCGTACAGCGTGAATGGGTCAATATCAAGATCGCCGAAGGGCGCACGCT +CAAAGGCCACAAAATCGGCCTGACCTCAAAAGCGATGCAGGCCAGCTCGCAAATCAGCGA +ACCGGATTACGGCGCGCTGCTTGACGATATGTTCTTCCATGACGGCGGCGATATCCCCAC +CGACCGTTTTATCGTCCCGCGTATTGAAGTGGAGCTGGCGTTCGTGCTGGCGAAACCGCT +GCGCGGCCCTCACTGCACGCTGTTCGACGTCTACAACGCCACGGATTATGTGATTCCGGC +GCTGGAACTGATTGACGCCCGCAGCCACAACATCGACCCGGAAACCCAGCGCCCGCGCAA +AGTGTTCGACACCATTTCCGACAACGCCGCCAACGCCGGGGTGATCCTCGGTGGTCGCCC +CATCAAACCAGACGAGCTGGATCTGCGCTGGATCTCCGCGCTGCTCTATCGCAACGGCGT +GATCGAAGAAACCGGCGTCGCCGCAGGCGTGCTGAATCATCCGGCCAACGGCGTGGCGTG +GCTGGCGAACAAGCTTGCCCCCTACGATGTCCAGCTTGAAGCCGGGCAGATCATCCTCGG +CGGCTCGTTCACCCGCCCGGTGCCGGCGAGCAAGGGCGACACCTTCCATGTCGATTACGG +CAACATGGGCGCGATCAGTTGCCGGTTTGTGTAACCAGATCCGCGCGCGCATATATATCG +CGCGCATACGATGCCCGTGAATAAGTTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGC +GCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGAC +CGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGC +CGACGGACAGACCGTGGTCTTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGAT +AACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGG +ACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAA +TGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTT +TGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCAT 
+TATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAA +CCAAATGGACGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTGCAGGGGGACGCCAT +TGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACG +CATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCAC +CTATGACAACAGTTATCCTGAAGATCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTAC +CGGATCTGATTGCCGACAGCTGGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAA +TGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAAC +GATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGC +TGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTT +TAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCA +AATTTCCTCCGGCAACATCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGC +TACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGTAATATCAACGTGATGCA +AACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGG +TCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGA +AAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGC +AGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACAGAGCTCGCGCGATCGCGAATGTCA +GAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAA +GGGGTATCGTTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCC +GGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGCGCGATT +ATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCG +GATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAAC +CTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGA +TTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGAT +GAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCT +ATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCG +CTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGC +AAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCGCACGTT +ATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTCGGCAAT +CCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACAGATACGCGCA +TGGCGAGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGAT +TACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCT +CGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCG 
+GGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCA +GGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGC +GTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTA +GATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTC +AGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGG +GTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGT +CCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCG +GGAAACCATGCCGAGATACGTCGCTGGCGCTTGAAACAGTCGCTGGGCCGAACCTGGCTT +AGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCG +GAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGATAG +CGCTAGAGCGATGAATAATCATTTTGGGAAAGGGTTAATGGCTGGGTTGCACGCGCCATA +TGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATT +GGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGC +CGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGA +GAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGTCGGTTATCGACTCGAAGGTTGACA +GATGCGCGATCGATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCG +GCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTCGAGATGATC +GTCCCTCAACTACCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTT +GAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACC +TGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTT +GAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTG +CTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCG +GACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTG +GCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGC +TTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACACGGA +TGCGCCATCGGCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGC +GGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGC +TCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCA +ATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTAC +GATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCG +CCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTG +TTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTG 
+GTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTA +AAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATCGTCGGCGGCATT +GTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACATTATTG +TTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAG +ACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTC +TCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCC +AATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATG +GATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTG +GGCGTCGATCCGGTGCATTTTGGCATTATCATGATCTATAACCTGGCGATTGGCACCATT +ACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAG +GAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATT +ACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gbk b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gbk new file mode 100644 index 0000000000000000000000000000000000000000..126bfa9b50b3eda02bd01d9e39a7009eac9f839c --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gbk @@ -0,0 +1,254 @@ +LOCUS g4_1 7134 bp DNA linear 12-FEB-2019 +DEFINITION Genus species strain strain. +ACCESSION +VERSION +KEYWORDS . +SOURCE Genus species + ORGANISM Genus species + Unclassified. +COMMENT Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka. 
+FEATURES Location/Qualifiers + source 1..7134 + /organism="Genus species" + /mol_type="genomic DNA" + /strain="strain" + CDS 1..831 + /gene="amiD" + /locus_tag="LIKMDGFN_00001" + /EC_number="3.5.1.28" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P75820" + /codon_start=1 + /transl_table=11 + /product="N-acetylmuramoyl-L-alanine amidase AmiD" + /translation="MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRI + KVLVIHYTAENFDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGV + SFWRGATRLNDTSIGIELENRGWRMSGGVKSFAPFESAQIQALIPLAKDIIARYDIKP + QNVVAHADIAPQRKDDPGPRFPWRELAAQGIGAWPDAQRVAFYLAGRAPYTPVDTATV + LALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD + " + CDS 861..1241 + /gene="hpcD" + /locus_tag="LIKMDGFN_00002" + /EC_number="5.3.3.10" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:Q05354" + /codon_start=1 + /transl_table=11 + /product="5-carboxymethyl-2-hydroxymuconate + Delta-isomerase" + /translation="MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRA + HWLDTWQMADGKHDYAFVHMTLKIGTGRSLESRQEVGEMLFGLIKAHFADLMENRYLA + LSFEIAELHPTLNYKQNNVHALFK" + CDS 1271..2074 + /gene="hpcG" + /locus_tag="LIKMDGFN_00003" + /EC_number="4.2.1.163" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P42270" + /codon_start=1 + /transl_table=11 + /product="2-oxo-hept-4-ene-1,7-dioate hydratase" + /translation="MLDKQTHTLIAQRLNQAEKQREQIRAVSLDYPNITIEDAYAVQR + EWVNIKIAEGRTLKGHKIGLTSKAMQASSQISEPDYGALLDDMFFHDGGDIPTDRFIV + PRIEVELAFVLAKPLRGPHCTLFDVYNATDYVIPALELIDARSHNIDPETQRPRKVFD + TISDNAANAGVILGGRPIKPDELDLRWISALLYRNGVIEETGVAAGVLNHPANGVAWL + ANKLAPYDVQLEAGQIILGGSFTRPVPASKGDTFHVDYGNMGAISCRFV" + CDS 2111..3514 + /locus_tag="LIKMDGFN_00004" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MPVNKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYP + 
ADDGIASFKQAFADGQTVVLPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQ + DGCQVVGEQGGSLHNVTLDVRGSDCVIKGVTMSGFGPVAQIFIGGKEPQVMRNLIIDD + ITVTHANYAILRQGFHNQMDGARITHSRFSDLQGDAIEWNVAIHDRDILISDHVIERI + DCTNGKINWGIGIGLAGSTYDNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIR + NVKAKNITPDFSKNAGIDNATIAIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIP + QNFKLNAIRLDNRQVAYKLRGIQISSGNIPSFVAITNVRMTRATLELHNQPQHLFLRN + INVMQTSAIGPALKMHFDLRKDVRGQFMARQDTLLSLANVHAINENGQSSVDIDRINH + QTVNVEAVNFSLPKRGG" + CDS 3535..4308 + /gene="hisP" + /locus_tag="LIKMDGFN_00005" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P02915" + /codon_start=1 + /transl_table=11 + /product="Histidine transport ATP-binding protein HisP" + /translation="MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGK + STFLRCINFLEKPSEGAIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFN + LWSHMTVLENVMEAPIQVLGLSKHDARERALKYLAKVGIDERAQGKYPVHLSGGQQQR + VSIARALAMEPDVLLFDEPTSALDPELVGEVLRIMQQLAEEGKTMVVVTHEMGFARHV + SSHVIFLHQGKIEEEGDPEQVFGNPQSPRLQQFLKGSLK" + CDS 4327..5094 + /gene="trmD" + /locus_tag="LIKMDGFN_00006" + /EC_number="2.1.1.228" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:P0A873" + /codon_start=1 + /transl_table=11 + /product="tRNA (guanine-N(1)-)-methyltransferase" + /translation="MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHD + RHRTVDDRPYGGGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSEL + ATNQKLILVCGRYEGVDERVIQTEIDEEWSIGDYVLSGGELPAMTLIDSVARFIPGVL + GHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQSLGRTWL + RRPELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA" + CDS 5138..5398 + /locus_tag="LIKMDGFN_00007" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + /translation="MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLS + AWEAGILTRRYGLDKEMVMDFFKENHSGMAVRFFMVGYRLEG" + CDS 5413..5994 + /locus_tag="LIKMDGFN_00008" + /inference="ab initio prediction:Prodigal:2.6" + /codon_start=1 + /transl_table=11 + /product="hypothetical protein" + 
/translation="MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYP + ADAAELLESLVLEHGGAPLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDY + LGQNENPYTGQQYVLESRHIYDLKVMQIDPLEAPDLIWLLQQTGDEVLDYRQAVAYYA + SCRQTVTEGGNHAFTGFEDYFNQIVDFLGLHSC" + CDS 6052..7134 + /gene="dctM" + /locus_tag="LIKMDGFN_00009" + /inference="ab initio prediction:Prodigal:2.6" + /inference="similar to AA sequence:UniProtKB:O07838" + /codon_start=1 + /transl_table=11 + /product="C4-dicarboxylate TRAP transporter large permease + protein DctM" + /translation="MNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAI + AASTSIGGVMVPMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAA + LFAGGLVAGVLWGVGCMLVTLVVAKRRNYRVFFTVQKGMALKVAVEAIPSLLLIVIIV + GGIVQGIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVIMFLLATS + SAMSFSMSITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPIL + LPIMAKLGVDPVHFGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYG + AIIGVLLLITYIPEITLFLPRLLGIM" +ORIGIN + 1 atgaaagcgc tactgtggct ggtgggtctc gcgttgctgt taacaggctg cgcgagcgaa + 61 aaaggaatta tcgataaaga gggatatcag cttgataccc gacatcgggc gcaggcggcc + 121 tatccgcgca ttaaagtcct ggtgattcac tatacggcgg aaaactttga cgtttcgctg + 181 gcgacgttaa cgggccgcaa cgtcagttcg cattacctga ttcccgcaac cccgccatta + 241 tatggcggta aaccgcgcat ctggcaactg gtgccggaac aggatcaggc ctggcatgcg + 301 ggcgtcagtt tctggcgagg cgccacgcgt ctcaatgata cgtctattgg cattgagctg + 361 gaaaatcgcg gctggcgaat gtccggcggg gtgaaatctt tcgcgccgtt tgaatccgcg + 421 caaattcagg cattgattcc gttagcgaag gacattatcg cgcgctatga catcaaaccg + 481 cagaatgtgg tggcccatgc ggatatcgcg ccgcagcgta aagacgatcc cggcccgcgc + 541 ttcccgtggc gcgagctggc ggcgcagggg attggcgcct ggcctgacgc ccagcgtgtg + 601 gcgttttatc tggctggacg cgcgccgtat acgccagtcg ataccgcaac ggtgcttgcg + 661 ttactctcgc gctatggcta tgaagtcaaa gccgatatga cggcgcgcga gcagcagcgg + 721 gtgattatgg cgttccagat gcacttccgt ccggcgcaat ggaacggtat cgcagatgcc + 781 gaaacgcagg cgattgccga agcattactg gagaagtacg gccaggatta acgcgatcgc + 841 gcgcatatcg cgcgcgatag atgccgcact ttattgctga atgtactgaa aatattcgcg + 901 agcaggctga 
tttacccggc ctgttcagca aggtaaacga ggcgctggcc gccagcggga + 961 ttttccccat cggcggtatc cgcagtcgcg cccactggct ggatacctgg cagatggctg + 1021 acggtaagca tgattatgcg tttgtgcata tgacgctgaa aatcggtacc gggcgcagcc + 1081 tggagagccg tcaggaagtc ggcgaaatgc tgtttgggct gattaaagcc cacttcgccg + 1141 acctgatgga gaaccgctat ctggcgctgt cgtttgagat tgccgagcta catccgacgc + 1201 tcaattacaa acaaaacaac gtacacgcgt tatttaaata gccgcatatg cggcgctcag + 1261 gagcgctagc atgctcgata aacagaccca taccctgatc gcccagcgac ttaatcaggc + 1321 tgaaaaacag cgtgaacaaa ttcgcgcagt gtcgctggat tatcccaaca tcactattga + 1381 agatgcctat gccgtacagc gtgaatgggt caatatcaag atcgccgaag ggcgcacgct + 1441 caaaggccac aaaatcggcc tgacctcaaa agcgatgcag gccagctcgc aaatcagcga + 1501 accggattac ggcgcgctgc ttgacgatat gttcttccat gacggcggcg atatccccac + 1561 cgaccgtttt atcgtcccgc gtattgaagt ggagctggcg ttcgtgctgg cgaaaccgct + 1621 gcgcggccct cactgcacgc tgttcgacgt ctacaacgcc acggattatg tgattccggc + 1681 gctggaactg attgacgccc gcagccacaa catcgacccg gaaacccagc gcccgcgcaa + 1741 agtgttcgac accatttccg acaacgccgc caacgccggg gtgatcctcg gtggtcgccc + 1801 catcaaacca gacgagctgg atctgcgctg gatctccgcg ctgctctatc gcaacggcgt + 1861 gatcgaagaa accggcgtcg ccgcaggcgt gctgaatcat ccggccaacg gcgtggcgtg + 1921 gctggcgaac aagcttgccc cctacgatgt ccagcttgaa gccgggcaga tcatcctcgg + 1981 cggctcgttc acccgcccgg tgccggcgag caagggcgac accttccatg tcgattacgg + 2041 caacatgggc gcgatcagtt gccggtttgt gtaaccagat ccgcgcgcgc atatatatcg + 2101 cgcgcatacg atgcccgtga ataagttctc ccgacgtacc ctcctgacgg caggttccgc + 2161 gcttgctgtt cttccttttc tgcgcgcctt gccggtacag gcgcgtgaac ctcgcgagac + 2221 cgtcgatatt aaggattatc cggcggatga cggtatcgcc tcgttcaaac aggccttcgc + 2281 cgacggacag accgtggtct taccgccagg atgggtgtgt gaaaatatca atgcggcgat + 2341 aacgattccg gcgggaaaaa cgctgcgggt acagggcgcg gtgcgtggga atggccgggg + 2401 acggtttatt ttgcaggacg ggtgtcaggt ggtgggggag cagggcggca gtctgcacaa + 2461 tgtgacgctg gatgttcgcg ggtcggactg tgtgattaaa ggcgtgacga tgagcggctt + 2521 tggccccgtc gcgcaaattt tcatcggcgg 
taaggaaccg caggtgatgc gtaatctcat + 2581 tatcgatgac atcaccgtta cccacgccaa ctacgccatt ctccgccagg gatttcataa + 2641 ccaaatggac ggcgcgcgga ttacgcatag ccgctttagc gatttgcagg gggacgccat + 2701 tgagtggaat gtcgcgattc acgaccgcga catcctgatt tccgatcatg tcatcgaacg + 2761 cattgattgt accaatggca aaatcaactg ggggatcggc atcgggctgg cgggtagcac + 2821 ctatgacaac agttatcctg aagatcaggc agtaaaaaac tttgtggtgg ccaatattac + 2881 cggatctgat tgccgacagc tggtgcacgt agaaaatggc aaacatttcg tcattcgcaa + 2941 tgtcaaagcc aaaaacatca cgcccgattt cagtaaaaat gcgggtattg ataacgcaac + 3001 gatcgccatt tatggctgtg ataatttcgt cattgataat attgatatga cgaatagtgc + 3061 tgggatgctc atcggctatg gcgtcgttaa aggaaaatac ctgtcaattc cgcaaaactt + 3121 taaattaaac gctattcggt tggataatcg ccaggttgct tataaattac gcggcattca + 3181 aatttcctcc ggcaacatcc cctcttttgt cgccatcacc aatgtacgga tgacgcgtgc + 3241 tacgctggaa ctgcataatc aaccgcagca cctctttctg cgtaatatca acgtgatgca + 3301 aacttcagcg attggcccgg cgttaaaaat gcatttcgat ttgcgtaaag atgtccgtgg + 3361 tcaatttatg gcccgccagg acacgctgct ttccctcgct aatgttcatg ccatcaatga + 3421 aaacgggcag agttccgtgg atatcgacag gattaatcac caaaccgtga atgtcgaagc + 3481 agtgaatttt tcgctgccga agcggggagg gtaacagagc tcgcgcgatc gcgaatgtca + 3541 gaaaataaat tacacgttat cgatttgcac aaacgctacg gcggtcatga agtgctgaaa + 3601 ggggtatcgt tgcaggcccg cgccggagat gtgattagca tcatcggctc gtccggctcc + 3661 ggtaaaagca cttttttgcg ctgtattaac ttcctcgaaa aaccgagcga aggcgcgatt + 3721 atcgtgaacg gtcagaacat taatctggtg cgcgacaaag acgggcagct caaagtggcg + 3781 gataaaaatc agctacgctt gttgcgtacc cgcctgacga tggtgtttca gcactttaac + 3841 ctctggagcc acatgacggt gctggaaaat gtgatggaag cgccgattca ggtactggga + 3901 ttaagcaagc acgacgcgcg cgagcgggcg ttgaaatatc tggcgaaggt ggggattgat + 3961 gagcgcgctc agggcaaata tcccgtccat ctctccggcg gccaacagca gcgcgtttct + 4021 attgcgcgcg cgctggcgat ggaacctgac gttttactgt tcgatgaacc cacatcggcg + 4081 ctcgatcctg aactggtcgg cgaagtgttg cgcatcatgc aacaactggc ggaagaaggc + 4141 aaaacgatgg tggtggtcac gcatgaaatg ggcttcgccc gccatgtctc 
ttcgcacgtt + 4201 atttttctgc atcaggggaa aattgaagaa gagggcgatc cggagcaggt gttcggcaat + 4261 ccgcaaagcc cgcgtttaca gcaattcctg aaaggctcgc tgaaataaca gatacgcgca + 4321 tggcgagtgt ttattggcat cgttagcctg tttcctgaaa tgttccgcgc aattaccgat + 4381 tacggggtaa ctggccgggc agtaaaaaaa ggcctgctga acatccaaag ctggagtcct + 4441 cgcgacttcg cgcatgaccg gcaccgtacc gtggacgacc gtccttacgg cggcggaccg + 4501 gggatgttaa tgatggtgca acccttgcgg gacgccattc acgcagcaaa agccgcggca + 4561 ggtgaaggcg ctaaagtgat ttatctgtcg cctcagggac gcaagcttga tcaagcgggc + 4621 gttagcgagc tggccacgaa tcagaagctt attctggtgt gtggtcgcta cgaaggcgta + 4681 gatgagcgcg taattcagac cgaaattgac gaagaatggt caattggcga ttacgttctc + 4741 agcggtggcg aactaccggc aatgacgctg attgactccg tcgcccggtt tataccgggg + 4801 gttctggggc atgaggcatc agcaatcgaa gattcgtttg ctgatgggtt gctggattgt + 4861 ccgcactata cgcgccctga agtgttagag gggatggaag taccgccagt attgctgtcg + 4921 ggaaaccatg ccgagatacg tcgctggcgc ttgaaacagt cgctgggccg aacctggctt + 4981 agaagacctg aacttctgga aaacctggct ctgactgaag agcaagcaag gttgctggcg + 5041 gagttcaaaa cagaacacgc acaacagcag cataaacatg atgggatggc atagcgatag + 5101 cgctagagcg atgaataatc attttgggaa agggttaatg gctgggttgc acgcgccata + 5161 tgcatatagc gcgcatcatg cggtgaattt ctgttctgag tataaacgtg gctttgtatt + 5221 gggttttaca caccgtatgt tcgaaaagac cggcgatcgt caacttagcg cgtgggaggc + 5281 cggaattctg acgcgtcgct atggtctgga taaagaaatg gtgatggatt tctttaaaga + 5341 gaatcattcc gggatggcgg ttcgcttctt tatggtcggt tatcgactcg aaggttgaca + 5401 gatgcgcgat cgatgtctac gcttctctat ttgcacggat tcaacagttc ccctcgctcg + 5461 gcaaaagcgt gccagctaaa aaactggctg gcggagcgtc atccgcatgt cgagatgatc + 5521 gtccctcaac taccgccgta tcctgccgat gcggcggagt tgctggaatc tctcgtgctt + 5581 gagcatggcg gtgcgccatt agggctggta ggatcgtcgc tgggtggtta ttacgccacc + 5641 tggctgtcgc aatgttttat gctgccggct gtggtggtga atcccgccgt gcggcccttt + 5701 gaattactga ccgactatct cggtcagaac gagaacccct acaccgggca gcaatatgtg + 5761 ctagagtctc gccatattta tgatcttaaa gtcatgcaga ttgacccgct ggaagcgccg + 5821 gacctgatct 
ggctactgca acagacgggc gatgaagtgc tggattaccg ccaggcggtg + 5881 gcatattacg cctcctgccg tcagacagtg accgagggtg gtaatcacgc attcacgggc + 5941 ttcgaagatt atttcaacca gattgtcgat tttcttggac tgcacagttg ctgacacgga + 6001 tgcgccatcg gcttgctggc cgtgccgttc ttcgttttgt ccggggtgat catgaatagc + 6061 gggggaattg ccgcccggct ggtcaatttt gccaaactgt ttactggcaa actgcccggc + 6121 tcgctctctt ataccaacat cgtcggcaat atgatgttcg gtgcaatttc cggatcggca + 6181 attgccgcct caacctccat cggcggcgtg atggtgccga tgagcgcgcg cgaaggttac + 6241 gatcgcggct ttgcggccgc ggtgaatatc gcctccgcgc cgacgggaat gttaattccg + 6301 cccaccacgg cttttatcct ttatgcgctg gcaagcgggg gaacatcgat tgccgctctg + 6361 ttcgccggcg gtctggtcgc gggagtgctg tggggcgttg gctgtatgct ggtcacgctg + 6421 gtggtcgcta agcgtcgaaa ttatcgggtt ttcttcaccg tccaaaaagg catggcgcta + 6481 aaagttgccg ttgaggccat tcccagcctg ctgctgatcg tgattatcgt cggcggcatt + 6541 gtgcagggga ttttcaccgc cattgaagcc tccgcgattg ccgtggtgta tacattattg + 6601 ttgacgatgg tgttttaccg cacgctgaaa attaaggatt tgccttcgat tttgctccag + 6661 acagtggtaa tgaccggggt catcatgttc ctgctggcaa cctcttcggc gatgtccttc + 6721 tcgatgtcga tcaccaatat tcctgcggcg ctgagcgata tgatcctcgg tatttccgcc + 6781 aataaactgg ttatcctgtt agtcattacc gtctttttgt tgattatcgg cgcatttatg + 6841 gatatcggtc cggccattct gatttttacc ccgattctgc tgccgatcat ggctaaactg + 6901 ggcgtcgatc cggtgcattt tggcattatc atgatctata acctggcgat tggcaccatt + 6961 acgccgccag ttggcagtgg tttatatgtc ggggcgagcg tcggtaaggt caaagttgag + 7021 gaagtgatta aaccgttgct gcctttttac ggcgcgatta tcggcgttct gttattaatt + 7081 acctacattc cggaaatcac actgttctta ccccgtctac tgggcatcat gtaa +// diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gff b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gff new file mode 100644 index 0000000000000000000000000000000000000000..1ee608aac1393847560afc7f300043966f1d885c --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gff @@ 
-0,0 +1,132 @@ +##gff-version 3 +##sequence-region g4_1 1 7134 +g4_1 Prodigal:2.6 CDS 1 831 . + 0 ID=LIKMDGFN_00001;eC_number=3.5.1.28;Name=amiD;dbxref=COG:COG3023;gene=amiD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P75820;locus_tag=LIKMDGFN_00001;product=N-acetylmuramoyl-L-alanine amidase AmiD +g4_1 Prodigal:2.6 CDS 861 1241 . + 0 ID=LIKMDGFN_00002;eC_number=5.3.3.10;Name=hpcD;gene=hpcD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:Q05354;locus_tag=LIKMDGFN_00002;product=5-carboxymethyl-2-hydroxymuconate Delta-isomerase +g4_1 Prodigal:2.6 CDS 1271 2074 . + 0 ID=LIKMDGFN_00003;eC_number=4.2.1.163;Name=hpcG;gene=hpcG;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P42270;locus_tag=LIKMDGFN_00003;product=2-oxo-hept-4-ene-1%2C7-dioate hydratase +g4_1 Prodigal:2.6 CDS 2111 3514 . + 0 ID=LIKMDGFN_00004;inference=ab initio prediction:Prodigal:2.6;locus_tag=LIKMDGFN_00004;product=hypothetical protein +g4_1 Prodigal:2.6 CDS 3535 4308 . + 0 ID=LIKMDGFN_00005;Name=hisP;dbxref=COG:COG4598;gene=hisP;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P02915;locus_tag=LIKMDGFN_00005;product=Histidine transport ATP-binding protein HisP +g4_1 Prodigal:2.6 CDS 4327 5094 . + 0 ID=LIKMDGFN_00006;eC_number=2.1.1.228;Name=trmD;dbxref=COG:COG0336;gene=trmD;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:P0A873;locus_tag=LIKMDGFN_00006;product=tRNA (guanine-N(1)-)-methyltransferase +g4_1 Prodigal:2.6 CDS 5138 5398 . + 0 ID=LIKMDGFN_00007;inference=ab initio prediction:Prodigal:2.6;locus_tag=LIKMDGFN_00007;product=hypothetical protein +g4_1 Prodigal:2.6 CDS 5413 5994 . + 0 ID=LIKMDGFN_00008;inference=ab initio prediction:Prodigal:2.6;locus_tag=LIKMDGFN_00008;product=hypothetical protein +g4_1 Prodigal:2.6 CDS 6052 7134 . 
+ 0 ID=LIKMDGFN_00009;Name=dctM;dbxref=COG:COG1593;gene=dctM;inference=ab initio prediction:Prodigal:2.6,similar to AA sequence:UniProtKB:O07838;locus_tag=LIKMDGFN_00009;product=C4-dicarboxylate TRAP transporter large permease protein DctM +##FASTA +>g4_1 +ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGAGCGAA +AAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCC +TATCCGCGCATTAAAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTG +GCGACGTTAACGGGCCGCAACGTCAGTTCGCATTACCTGATTCCCGCAACCCCGCCATTA +TATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGGCCTGGCATGCG +GGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTG +GAAAATCGCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCG +CAAATTCAGGCATTGATTCCGTTAGCGAAGGACATTATCGCGCGCTATGACATCAAACCG +CAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAGACGATCCCGGCCCGCGC +TTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTG +GCGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCG +TTACTCTCGCGCTATGGCTATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAGCAGCGG +GTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGCAATGGAACGGTATCGCAGATGCC +GAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCGATCGC +GCGCATATCGCGCGCGATAGATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCG +AGCAGGCTGATTTACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGA +TTTTCCCCATCGGCGGTATCCGCAGTCGCGCCCACTGGCTGGATACCTGGCAGATGGCTG +ACGGTAAGCATGATTATGCGTTTGTGCATATGACGCTGAAAATCGGTACCGGGCGCAGCC +TGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCG +ACCTGATGGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGC +TCAATTACAAACAAAACAACGTACACGCGTTATTTAAATAGCCGCATATGCGGCGCTCAG +GAGCGCTAGCATGCTCGATAAACAGACCCATACCCTGATCGCCCAGCGACTTAATCAGGC +TGAAAAACAGCGTGAACAAATTCGCGCAGTGTCGCTGGATTATCCCAACATCACTATTGA +AGATGCCTATGCCGTACAGCGTGAATGGGTCAATATCAAGATCGCCGAAGGGCGCACGCT +CAAAGGCCACAAAATCGGCCTGACCTCAAAAGCGATGCAGGCCAGCTCGCAAATCAGCGA +ACCGGATTACGGCGCGCTGCTTGACGATATGTTCTTCCATGACGGCGGCGATATCCCCAC +CGACCGTTTTATCGTCCCGCGTATTGAAGTGGAGCTGGCGTTCGTGCTGGCGAAACCGCT +GCGCGGCCCTCACTGCACGCTGTTCGACGTCTACAACGCCACGGATTATGTGATTCCGGC 
+GCTGGAACTGATTGACGCCCGCAGCCACAACATCGACCCGGAAACCCAGCGCCCGCGCAA +AGTGTTCGACACCATTTCCGACAACGCCGCCAACGCCGGGGTGATCCTCGGTGGTCGCCC +CATCAAACCAGACGAGCTGGATCTGCGCTGGATCTCCGCGCTGCTCTATCGCAACGGCGT +GATCGAAGAAACCGGCGTCGCCGCAGGCGTGCTGAATCATCCGGCCAACGGCGTGGCGTG +GCTGGCGAACAAGCTTGCCCCCTACGATGTCCAGCTTGAAGCCGGGCAGATCATCCTCGG +CGGCTCGTTCACCCGCCCGGTGCCGGCGAGCAAGGGCGACACCTTCCATGTCGATTACGG +CAACATGGGCGCGATCAGTTGCCGGTTTGTGTAACCAGATCCGCGCGCGCATATATATCG +CGCGCATACGATGCCCGTGAATAAGTTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGC +GCTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGAC +CGTCGATATTAAGGATTATCCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGC +CGACGGACAGACCGTGGTCTTACCGCCAGGATGGGTGTGTGAAAATATCAATGCGGCGAT +AACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGCCGGGG +ACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAA +TGTGACGCTGGATGTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTT +TGGCCCCGTCGCGCAAATTTTCATCGGCGGTAAGGAACCGCAGGTGATGCGTAATCTCAT +TATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAGGGATTTCATAA +CCAAATGGACGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTGCAGGGGGACGCCAT +TGAGTGGAATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACG +CATTGATTGTACCAATGGCAAAATCAACTGGGGGATCGGCATCGGGCTGGCGGGTAGCAC +CTATGACAACAGTTATCCTGAAGATCAGGCAGTAAAAAACTTTGTGGTGGCCAATATTAC +CGGATCTGATTGCCGACAGCTGGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAA +TGTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAAC +GATCGCCATTTATGGCTGTGATAATTTCGTCATTGATAATATTGATATGACGAATAGTGC +TGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAATACCTGTCAATTCCGCAAAACTT +TAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGCATTCA +AATTTCCTCCGGCAACATCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGC +TACGCTGGAACTGCATAATCAACCGCAGCACCTCTTTCTGCGTAATATCAACGTGATGCA +AACTTCAGCGATTGGCCCGGCGTTAAAAATGCATTTCGATTTGCGTAAAGATGTCCGTGG +TCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCATGCCATCAATGA +AAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGC +AGTGAATTTTTCGCTGCCGAAGCGGGGAGGGTAACAGAGCTCGCGCGATCGCGAATGTCA +GAAAATAAATTACACGTTATCGATTTGCACAAACGCTACGGCGGTCATGAAGTGCTGAAA 
+GGGGTATCGTTGCAGGCCCGCGCCGGAGATGTGATTAGCATCATCGGCTCGTCCGGCTCC +GGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGCGCGATT +ATCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCG +GATAAAAATCAGCTACGCTTGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAAC +CTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGGAAGCGCCGATTCAGGTACTGGGA +TTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGATTGAT +GAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCT +ATTGCGCGCGCGCTGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCG +CTCGATCCTGAACTGGTCGGCGAAGTGTTGCGCATCATGCAACAACTGGCGGAAGAAGGC +AAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCTCTTCGCACGTT +ATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTCGGCAAT +CCGCAAAGCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACAGATACGCGCA +TGGCGAGTGTTTATTGGCATCGTTAGCCTGTTTCCTGAAATGTTCCGCGCAATTACCGAT +TACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACATCCAAAGCTGGAGTCCT +CGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCG +GGGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCA +GGTGAAGGCGCTAAAGTGATTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGC +GTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGGTGTGTGGTCGCTACGAAGGCGTA +GATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACGTTCTC +AGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGG +GTTCTGGGGCATGAGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGT +CCGCACTATACGCGCCCTGAAGTGTTAGAGGGGATGGAAGTACCGCCAGTATTGCTGTCG +GGAAACCATGCCGAGATACGTCGCTGGCGCTTGAAACAGTCGCTGGGCCGAACCTGGCTT +AGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCG +GAGTTCAAAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGATAG +CGCTAGAGCGATGAATAATCATTTTGGGAAAGGGTTAATGGCTGGGTTGCACGCGCCATA +TGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTATAAACGTGGCTTTGTATT +GGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGC +CGGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGA +GAATCATTCCGGGATGGCGGTTCGCTTCTTTATGGTCGGTTATCGACTCGAAGGTTGACA +GATGCGCGATCGATGTCTACGCTTCTCTATTTGCACGGATTCAACAGTTCCCCTCGCTCG +GCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTCGAGATGATC 
+GTCCCTCAACTACCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTT +GAGCATGGCGGTGCGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACC +TGGCTGTCGCAATGTTTTATGCTGCCGGCTGTGGTGGTGAATCCCGCCGTGCGGCCCTTT +GAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGCAGCAATATGTG +CTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCG +GACCTGATCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTG +GCATATTACGCCTCCTGCCGTCAGACAGTGACCGAGGGTGGTAATCACGCATTCACGGGC +TTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGCACAGTTGCTGACACGGA +TGCGCCATCGGCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGC +GGGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGC +TCGCTCTCTTATACCAACATCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCA +ATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGCCGATGAGCGCGCGCGAAGGTTAC +GATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAATTCCG +CCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTG +TTCGCCGGCGGTCTGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTG +GTGGTCGCTAAGCGTCGAAATTATCGGGTTTTCTTCACCGTCCAAAAAGGCATGGCGCTA +AAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATCGTCGGCGGCATT +GTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACATTATTG +TTGACGATGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAG +ACAGTGGTAATGACCGGGGTCATCATGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTC +TCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGATCCTCGGTATTTCCGCC +AATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATG +GATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTG +GGCGTCGATCCGGTGCATTTTGGCATTATCATGATCTATAACCTGGCGATTGGCACCATT +ACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGAGCGTCGGTAAGGTCAAAGTTGAG +GAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTATTAATT +ACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.log b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.log new file mode 100644 index 0000000000000000000000000000000000000000..2d2168766f191824a1996d5ce2a42ff6654d9700 --- /dev/null +++ 
b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.log @@ -0,0 +1,119 @@ +[13:18:51] This is prokka 1.14-dev +[13:18:51] Written by Torsten Seemann <torsten.seemann@gmail.com> +[13:18:51] Homepage is https://github.com/tseemann/prokka +[13:18:51] Local time is Tue Feb 12 13:18:51 2019 +[13:18:51] You are aperrin +[13:18:51] Operating system is darwin +[13:18:51] You have BioPerl 1.006924 +[13:18:51] System has 8 cores. +[13:18:51] Will use maximum of 1 cores. +[13:18:51] Annotating as >>> Bacteria <<< +[13:18:51] Generating locus_tag from 'Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna' contents. +[13:18:51] Setting --locustag LIKMDGFN from MD5 5246d0f7b5f59e1da550241263d19049 +[13:18:51] Creating new output folder: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes +[13:18:51] Running: mkdir -p Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes +[13:18:51] Using filename prefix: GEN4.1111.00001.XXX +[13:18:51] Setting HMMER_NCPU=1 +[13:18:51] Writing log to: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.log +[13:18:51] Command: /usr/local/bin/prokka --outdir Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes --cpus 1 --prefix GEN4.1111.00001 Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna +[13:18:51] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin +[13:18:51] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common +[13:18:51] Appending to PATH: /Users/aperrin/Softwares/src/prokka/bin +[13:18:51] Looking for 'aragorn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/aragorn +[13:18:51] Determined aragorn version is 1.2 +[13:18:51] Looking for 'barrnap' - found /usr/local/bin/barrnap +[13:18:51] Determined barrnap version is 0.8 +[13:18:51] Looking for 'blastp' - found /Users/aperrin/Softwares/bin/blastp +[13:18:51] Determined blastp 
version is 2.3 +[13:18:51] Looking for 'cmpress' - found /usr/local/bin/cmpress +[13:18:51] Determined cmpress version is 1.1 +[13:18:51] Looking for 'cmscan' - found /usr/local/bin/cmscan +[13:18:51] Determined cmscan version is 1.1 +[13:18:51] Looking for 'egrep' - found /usr/bin/egrep +[13:18:51] Looking for 'find' - found /usr/bin/find +[13:18:51] Looking for 'grep' - found /usr/bin/grep +[13:18:51] Looking for 'hmmpress' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/hmmpress +[13:18:51] Determined hmmpress version is 3.1 +[13:18:51] Looking for 'hmmscan' - found /usr/local/bin/hmmscan +[13:18:51] Determined hmmscan version is 3.1 +[13:18:51] Looking for 'java' - found /usr/bin/java +[13:18:51] Looking for 'less' - found /usr/bin/less +[13:18:51] Looking for 'makeblastdb' - found /Users/aperrin/Softwares/bin/makeblastdb +[13:18:51] Determined makeblastdb version is 2.3 +[13:18:51] Looking for 'minced' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/../common/minced +[13:18:51] Determined minced version is 2.0 +[13:18:51] Looking for 'parallel' - found /usr/local/bin/parallel +[13:18:52] Determined parallel version is 20181022 +[13:18:52] Looking for 'prodigal' - found /usr/local/bin/prodigal +[13:18:52] Determined prodigal version is 2.6 +[13:18:52] Looking for 'prokka-genbank_to_fasta_db' - found /Users/aperrin/Softwares/src/prokka/bin/prokka-genbank_to_fasta_db +[13:18:52] Looking for 'sed' - found /usr/bin/sed +[13:18:52] Looking for 'tbl2asn' - found /Users/aperrin/Softwares/src/prokka/bin/../binaries/darwin/tbl2asn +[13:18:52] Determined tbl2asn version is 25.6 +[13:18:52] Using genetic code table 11. +[13:18:52] Loading and checking input file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna +[13:18:52] Wrote 1 contigs totalling 7134 bp. 
+[13:18:52] Predicting tRNAs and tmRNAs +[13:18:52] Running: aragorn -l -gc11 -w Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.fna +[13:18:52] Found 0 tRNAs +[13:18:52] Predicting Ribosomal RNAs +[13:18:52] Running Barrnap with 1 threads +[13:18:52] Found 0 rRNAs +[13:18:52] Skipping ncRNA search, enable with --rfam if desired. +[13:18:52] Total of 0 tRNA + rRNA features +[13:18:52] Searching for CRISPR repeats +[13:18:52] Found 0 CRISPRs +[13:18:52] Predicting coding sequences +[13:18:52] Contigs total 7134 bp, so using meta mode +[13:18:52] Running: prodigal -i Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.fna -c -m -g 11 -p meta -f sco -q +[13:18:52] Found 9 CDS +[13:18:52] Connecting features back to sequences +[13:18:52] Not using genus-specific database. Try --usegenus to enable it. +[13:18:52] Annotating CDS, please be patient. +[13:18:52] Will use 1 CPUs for similarity searching. 
+[13:18:52] There are still 9 unannotated CDS left (started with 9) +[13:18:52] Will use blast to search against /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot with 1 CPUs +[13:18:52] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/sprot\.faa | parallel --gnu --plain -j 1 --block 1161 --recstart '>' --pipe blastp -query - -db /Users/aperrin/Softwares/src/prokka/db/kingdom/Bacteria/sprot -evalue 1e-06 -num_threads 1 -num_descriptions 1 -num_alignments 1 -seg no > Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/sprot\.blast 2> /dev/null +[13:18:53] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/sprot.faa +[13:18:53] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/sprot.blast +[13:18:53] There are still 3 unannotated CDS left (started with 9) +[13:18:53] Will use hmmer3 to search against /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm with 1 CPUs +[13:18:53] Running: cat Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.faa | parallel --gnu --plain -j 1 --block 379 --recstart '>' --pipe hmmscan --noali --notextw --acc -E 1e-06 --cpu 1 /Users/aperrin/Softwares/src/prokka/db/hmm/HAMAP.hmm /dev/stdin > Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/HAMAP\.hmm\.hmmer3 2> /dev/null +[13:18:53] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/HAMAP.hmm.faa +[13:18:53] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/HAMAP.hmm.hmmer3 +[13:18:53] Labelling remaining 3 proteins as 'hypothetical protein' +[13:18:53] Found 6 unique /gene codes. +[13:18:53] Fixed 0 colliding /gene names. +[13:18:53] Adding /locus_tag identifiers +[13:18:53] Assigned 9 locus_tags to CDS and RNA features. 
+[13:18:53] Writing outputs to Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/ +[13:18:53] Generating annotation statistics file +[13:18:53] Generating Genbank and Sequin files +[13:18:53] Running: tbl2asn -V b -a r10k -l paired-ends -M n -N 1 -y 'Annotated using prokka 1.14-dev from https://github.com/tseemann/prokka' -Z Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.err -i Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.fsa 2> /dev/null +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/errorsummary.val +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.dr +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fixedproducts +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.ecn +[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.val +[13:18:54] Repairing broken .GBK output that tbl2asn produces... 
+[13:18:54] Running: sed 's/COORDINATES: profile/COORDINATES:profile/' < Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.gbf > Examples\/1\-res\-Annotate\/tmp_files\/genome4\.fst\-split5N\.fna\-prokkaRes\/GEN4\.1111\.00001\.gbk
+[13:18:54] Deleting unwanted file: Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gbf
+[13:18:54] Output files:
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fsa
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.faa
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.log
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.ffn
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gbk
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.gff
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tbl
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.sqn
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tsv
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.fna
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.err
+[13:18:54] Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.txt
+[13:18:54] Annotation finished successfully.
+[13:18:54] Walltime used: 0.05 minutes
+[13:18:54] If you use this result please cite the Prokka paper:
+[13:18:54] Seemann T (2014) Prokka: rapid prokaryotic genome annotation. Bioinformatics. 30(14):2068-9.
+[13:18:54] Type 'prokka --citation' for more details.
+[13:18:54] Share and enjoy!
diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.sqn b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.sqn new file mode 100644 index 0000000000000000000000000000000000000000..cde29bf57f4b0c7718991f9a1678c2acbf175f68 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.sqn @@ -0,0 +1,798 @@ +Seq-entry ::= set { + class genbank , + seq-set { + set { + class nuc-prot , + descr { + source { + org { + taxname "Genus species" , + orgname { + mod { + { + subtype strain , + subname "strain" } } , + gcode 11 } } } , + comment "Annotated using prokka 1.14-dev from + https://github.com/tseemann/prokka" , + user { + type + str "NcbiCleanup" , + data { + { + label + str "method" , + data + str "SeriousSeqEntryCleanup" } , + { + label + str "version" , + data + int 8 } , + { + label + str "month" , + data + int 2 } , + { + label + str "day" , + data + int 12 } , + { + label + str "year" , + data + int 2019 } } } , + create-date + std { + year 2019 , + month 2 , + day 12 } } , + seq-set { + seq { + id { + local + str "g4_1" } , + descr { + molinfo { + biomol genomic } } , + inst { + repr raw , + mol dna , + length 7134 , + seq-data + iupacna "ATGAAAGCGCTACTGTGGCTGGTGGGTCTCGCGTTGCTGTTAACAGGCTGCGCGA +GCGAAAAAGGAATTATCGATAAAGAGGGATATCAGCTTGATACCCGACATCGGGCGCAGGCGGCCTATCCGCGCATTA +AAGTCCTGGTGATTCACTATACGGCGGAAAACTTTGACGTTTCGCTGGCGACGTTAACGGGCCGCAACGTCAGTTCGC +ATTACCTGATTCCCGCAACCCCGCCATTATATGGCGGTAAACCGCGCATCTGGCAACTGGTGCCGGAACAGGATCAGG +CCTGGCATGCGGGCGTCAGTTTCTGGCGAGGCGCCACGCGTCTCAATGATACGTCTATTGGCATTGAGCTGGAAAATC +GCGGCTGGCGAATGTCCGGCGGGGTGAAATCTTTCGCGCCGTTTGAATCCGCGCAAATTCAGGCATTGATTCCGTTAG +CGAAGGACATTATCGCGCGCTATGACATCAAACCGCAGAATGTGGTGGCCCATGCGGATATCGCGCCGCAGCGTAAAG +ACGATCCCGGCCCGCGCTTCCCGTGGCGCGAGCTGGCGGCGCAGGGGATTGGCGCCTGGCCTGACGCCCAGCGTGTGG +CGTTTTATCTGGCTGGACGCGCGCCGTATACGCCAGTCGATACCGCAACGGTGCTTGCGTTACTCTCGCGCTATGGCT 
+ATGAAGTCAAAGCCGATATGACGGCGCGCGAGCAGCAGCGGGTGATTATGGCGTTCCAGATGCACTTCCGTCCGGCGC +AATGGAACGGTATCGCAGATGCCGAAACGCAGGCGATTGCCGAAGCATTACTGGAGAAGTACGGCCAGGATTAACGCG +ATCGCGCGCATATCGCGCGCGATAGATGCCGCACTTTATTGCTGAATGTACTGAAAATATTCGCGAGCAGGCTGATTT +ACCCGGCCTGTTCAGCAAGGTAAACGAGGCGCTGGCCGCCAGCGGGATTTTCCCCATCGGCGGTATCCGCAGTCGCGC +CCACTGGCTGGATACCTGGCAGATGGCTGACGGTAAGCATGATTATGCGTTTGTGCATATGACGCTGAAAATCGGTAC +CGGGCGCAGCCTGGAGAGCCGTCAGGAAGTCGGCGAAATGCTGTTTGGGCTGATTAAAGCCCACTTCGCCGACCTGAT +GGAGAACCGCTATCTGGCGCTGTCGTTTGAGATTGCCGAGCTACATCCGACGCTCAATTACAAACAAAACAACGTACA +CGCGTTATTTAAATAGCCGCATATGCGGCGCTCAGGAGCGCTAGCATGCTCGATAAACAGACCCATACCCTGATCGCC +CAGCGACTTAATCAGGCTGAAAAACAGCGTGAACAAATTCGCGCAGTGTCGCTGGATTATCCCAACATCACTATTGAA +GATGCCTATGCCGTACAGCGTGAATGGGTCAATATCAAGATCGCCGAAGGGCGCACGCTCAAAGGCCACAAAATCGGC +CTGACCTCAAAAGCGATGCAGGCCAGCTCGCAAATCAGCGAACCGGATTACGGCGCGCTGCTTGACGATATGTTCTTC +CATGACGGCGGCGATATCCCCACCGACCGTTTTATCGTCCCGCGTATTGAAGTGGAGCTGGCGTTCGTGCTGGCGAAA +CCGCTGCGCGGCCCTCACTGCACGCTGTTCGACGTCTACAACGCCACGGATTATGTGATTCCGGCGCTGGAACTGATT +GACGCCCGCAGCCACAACATCGACCCGGAAACCCAGCGCCCGCGCAAAGTGTTCGACACCATTTCCGACAACGCCGCC +AACGCCGGGGTGATCCTCGGTGGTCGCCCCATCAAACCAGACGAGCTGGATCTGCGCTGGATCTCCGCGCTGCTCTAT +CGCAACGGCGTGATCGAAGAAACCGGCGTCGCCGCAGGCGTGCTGAATCATCCGGCCAACGGCGTGGCGTGGCTGGCG +AACAAGCTTGCCCCCTACGATGTCCAGCTTGAAGCCGGGCAGATCATCCTCGGCGGCTCGTTCACCCGCCCGGTGCCG +GCGAGCAAGGGCGACACCTTCCATGTCGATTACGGCAACATGGGCGCGATCAGTTGCCGGTTTGTGTAACCAGATCCG +CGCGCGCATATATATCGCGCGCATACGATGCCCGTGAATAAGTTCTCCCGACGTACCCTCCTGACGGCAGGTTCCGCG +CTTGCTGTTCTTCCTTTTCTGCGCGCCTTGCCGGTACAGGCGCGTGAACCTCGCGAGACCGTCGATATTAAGGATTAT +CCGGCGGATGACGGTATCGCCTCGTTCAAACAGGCCTTCGCCGACGGACAGACCGTGGTCTTACCGCCAGGATGGGTG +TGTGAAAATATCAATGCGGCGATAACGATTCCGGCGGGAAAAACGCTGCGGGTACAGGGCGCGGTGCGTGGGAATGGC +CGGGGACGGTTTATTTTGCAGGACGGGTGTCAGGTGGTGGGGGAGCAGGGCGGCAGTCTGCACAATGTGACGCTGGAT +GTTCGCGGGTCGGACTGTGTGATTAAAGGCGTGACGATGAGCGGCTTTGGCCCCGTCGCGCAAATTTTCATCGGCGGT +AAGGAACCGCAGGTGATGCGTAATCTCATTATCGATGACATCACCGTTACCCACGCCAACTACGCCATTCTCCGCCAG 
+GGATTTCATAACCAAATGGACGGCGCGCGGATTACGCATAGCCGCTTTAGCGATTTGCAGGGGGACGCCATTGAGTGG +AATGTCGCGATTCACGACCGCGACATCCTGATTTCCGATCATGTCATCGAACGCATTGATTGTACCAATGGCAAAATC +AACTGGGGGATCGGCATCGGGCTGGCGGGTAGCACCTATGACAACAGTTATCCTGAAGATCAGGCAGTAAAAAACTTT +GTGGTGGCCAATATTACCGGATCTGATTGCCGACAGCTGGTGCACGTAGAAAATGGCAAACATTTCGTCATTCGCAAT +GTCAAAGCCAAAAACATCACGCCCGATTTCAGTAAAAATGCGGGTATTGATAACGCAACGATCGCCATTTATGGCTGT +GATAATTTCGTCATTGATAATATTGATATGACGAATAGTGCTGGGATGCTCATCGGCTATGGCGTCGTTAAAGGAAAA +TACCTGTCAATTCCGCAAAACTTTAAATTAAACGCTATTCGGTTGGATAATCGCCAGGTTGCTTATAAATTACGCGGC +ATTCAAATTTCCTCCGGCAACATCCCCTCTTTTGTCGCCATCACCAATGTACGGATGACGCGTGCTACGCTGGAACTG +CATAATCAACCGCAGCACCTCTTTCTGCGTAATATCAACGTGATGCAAACTTCAGCGATTGGCCCGGCGTTAAAAATG +CATTTCGATTTGCGTAAAGATGTCCGTGGTCAATTTATGGCCCGCCAGGACACGCTGCTTTCCCTCGCTAATGTTCAT +GCCATCAATGAAAACGGGCAGAGTTCCGTGGATATCGACAGGATTAATCACCAAACCGTGAATGTCGAAGCAGTGAAT +TTTTCGCTGCCGAAGCGGGGAGGGTAACAGAGCTCGCGCGATCGCGAATGTCAGAAAATAAATTACACGTTATCGATT +TGCACAAACGCTACGGCGGTCATGAAGTGCTGAAAGGGGTATCGTTGCAGGCCCGCGCCGGAGATGTGATTAGCATCA +TCGGCTCGTCCGGCTCCGGTAAAAGCACTTTTTTGCGCTGTATTAACTTCCTCGAAAAACCGAGCGAAGGCGCGATTA +TCGTGAACGGTCAGAACATTAATCTGGTGCGCGACAAAGACGGGCAGCTCAAAGTGGCGGATAAAAATCAGCTACGCT +TGTTGCGTACCCGCCTGACGATGGTGTTTCAGCACTTTAACCTCTGGAGCCACATGACGGTGCTGGAAAATGTGATGG +AAGCGCCGATTCAGGTACTGGGATTAAGCAAGCACGACGCGCGCGAGCGGGCGTTGAAATATCTGGCGAAGGTGGGGA +TTGATGAGCGCGCTCAGGGCAAATATCCCGTCCATCTCTCCGGCGGCCAACAGCAGCGCGTTTCTATTGCGCGCGCGC +TGGCGATGGAACCTGACGTTTTACTGTTCGATGAACCCACATCGGCGCTCGATCCTGAACTGGTCGGCGAAGTGTTGC +GCATCATGCAACAACTGGCGGAAGAAGGCAAAACGATGGTGGTGGTCACGCATGAAATGGGCTTCGCCCGCCATGTCT +CTTCGCACGTTATTTTTCTGCATCAGGGGAAAATTGAAGAAGAGGGCGATCCGGAGCAGGTGTTCGGCAATCCGCAAA +GCCCGCGTTTACAGCAATTCCTGAAAGGCTCGCTGAAATAACAGATACGCGCATGGCGAGTGTTTATTGGCATCGTTA +GCCTGTTTCCTGAAATGTTCCGCGCAATTACCGATTACGGGGTAACTGGCCGGGCAGTAAAAAAAGGCCTGCTGAACA +TCCAAAGCTGGAGTCCTCGCGACTTCGCGCATGACCGGCACCGTACCGTGGACGACCGTCCTTACGGCGGCGGACCGG +GGATGTTAATGATGGTGCAACCCTTGCGGGACGCCATTCACGCAGCAAAAGCCGCGGCAGGTGAAGGCGCTAAAGTGA 
+TTTATCTGTCGCCTCAGGGACGCAAGCTTGATCAAGCGGGCGTTAGCGAGCTGGCCACGAATCAGAAGCTTATTCTGG +TGTGTGGTCGCTACGAAGGCGTAGATGAGCGCGTAATTCAGACCGAAATTGACGAAGAATGGTCAATTGGCGATTACG +TTCTCAGCGGTGGCGAACTACCGGCAATGACGCTGATTGACTCCGTCGCCCGGTTTATACCGGGGGTTCTGGGGCATG +AGGCATCAGCAATCGAAGATTCGTTTGCTGATGGGTTGCTGGATTGTCCGCACTATACGCGCCCTGAAGTGTTAGAGG +GGATGGAAGTACCGCCAGTATTGCTGTCGGGAAACCATGCCGAGATACGTCGCTGGCGCTTGAAACAGTCGCTGGGCC +GAACCTGGCTTAGAAGACCTGAACTTCTGGAAAACCTGGCTCTGACTGAAGAGCAAGCAAGGTTGCTGGCGGAGTTCA +AAACAGAACACGCACAACAGCAGCATAAACATGATGGGATGGCATAGCGATAGCGCTAGAGCGATGAATAATCATTTT +GGGAAAGGGTTAATGGCTGGGTTGCACGCGCCATATGCATATAGCGCGCATCATGCGGTGAATTTCTGTTCTGAGTAT +AAACGTGGCTTTGTATTGGGTTTTACACACCGTATGTTCGAAAAGACCGGCGATCGTCAACTTAGCGCGTGGGAGGCC +GGAATTCTGACGCGTCGCTATGGTCTGGATAAAGAAATGGTGATGGATTTCTTTAAAGAGAATCATTCCGGGATGGCG +GTTCGCTTCTTTATGGTCGGTTATCGACTCGAAGGTTGACAGATGCGCGATCGATGTCTACGCTTCTCTATTTGCACG +GATTCAACAGTTCCCCTCGCTCGGCAAAAGCGTGCCAGCTAAAAAACTGGCTGGCGGAGCGTCATCCGCATGTCGAGA +TGATCGTCCCTCAACTACCGCCGTATCCTGCCGATGCGGCGGAGTTGCTGGAATCTCTCGTGCTTGAGCATGGCGGTG +CGCCATTAGGGCTGGTAGGATCGTCGCTGGGTGGTTATTACGCCACCTGGCTGTCGCAATGTTTTATGCTGCCGGCTG +TGGTGGTGAATCCCGCCGTGCGGCCCTTTGAATTACTGACCGACTATCTCGGTCAGAACGAGAACCCCTACACCGGGC +AGCAATATGTGCTAGAGTCTCGCCATATTTATGATCTTAAAGTCATGCAGATTGACCCGCTGGAAGCGCCGGACCTGA +TCTGGCTACTGCAACAGACGGGCGATGAAGTGCTGGATTACCGCCAGGCGGTGGCATATTACGCCTCCTGCCGTCAGA +CAGTGACCGAGGGTGGTAATCACGCATTCACGGGCTTCGAAGATTATTTCAACCAGATTGTCGATTTTCTTGGACTGC +ACAGTTGCTGACACGGATGCGCCATCGGCTTGCTGGCCGTGCCGTTCTTCGTTTTGTCCGGGGTGATCATGAATAGCG +GGGGAATTGCCGCCCGGCTGGTCAATTTTGCCAAACTGTTTACTGGCAAACTGCCCGGCTCGCTCTCTTATACCAACA +TCGTCGGCAATATGATGTTCGGTGCAATTTCCGGATCGGCAATTGCCGCCTCAACCTCCATCGGCGGCGTGATGGTGC +CGATGAGCGCGCGCGAAGGTTACGATCGCGGCTTTGCGGCCGCGGTGAATATCGCCTCCGCGCCGACGGGAATGTTAA +TTCCGCCCACCACGGCTTTTATCCTTTATGCGCTGGCAAGCGGGGGAACATCGATTGCCGCTCTGTTCGCCGGCGGTC +TGGTCGCGGGAGTGCTGTGGGGCGTTGGCTGTATGCTGGTCACGCTGGTGGTCGCTAAGCGTCGAAATTATCGGGTTT +TCTTCACCGTCCAAAAAGGCATGGCGCTAAAAGTTGCCGTTGAGGCCATTCCCAGCCTGCTGCTGATCGTGATTATCG 
+TCGGCGGCATTGTGCAGGGGATTTTCACCGCCATTGAAGCCTCCGCGATTGCCGTGGTGTATACATTATTGTTGACGA +TGGTGTTTTACCGCACGCTGAAAATTAAGGATTTGCCTTCGATTTTGCTCCAGACAGTGGTAATGACCGGGGTCATCA +TGTTCCTGCTGGCAACCTCTTCGGCGATGTCCTTCTCGATGTCGATCACCAATATTCCTGCGGCGCTGAGCGATATGA +TCCTCGGTATTTCCGCCAATAAACTGGTTATCCTGTTAGTCATTACCGTCTTTTTGTTGATTATCGGCGCATTTATGG +ATATCGGTCCGGCCATTCTGATTTTTACCCCGATTCTGCTGCCGATCATGGCTAAACTGGGCGTCGATCCGGTGCATT +TTGGCATTATCATGATCTATAACCTGGCGATTGGCACCATTACGCCGCCAGTTGGCAGTGGTTTATATGTCGGGGCGA +GCGTCGGTAAGGTCAAAGTTGAGGAAGTGATTAAACCGTTGCTGCCTTTTTACGGCGCGATTATCGGCGTTCTGTTAT +TAATTACCTACATTCCGGAAATCACACTGTTCTTACCCCGTCTACTGGGCATCATGTAA" } } , + seq { + id { + local + str "g4_1_1" } , + descr { + title "N-acetylmuramoyl-L-alanine amidase AmiD [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 276 , + seq-data + ncbieaa "MKALLWLVGLALLLTGCASEKGIIDKEGYQLDTRHRAQAAYPRIKVLVIHYTAEN +FDVSLATLTGRNVSSHYLIPATPPLYGGKPRIWQLVPEQDQAWHAGVSFWRGATRLNDTSIGIELENRGWRMSGGVKS +FAPFESAQIQALIPLAKDIIARYDIKPQNVVAHADIAPQRKDDPGPRFPWRELAAQGIGAWPDAQRVAFYLAGRAPYT +PVDTATVLALLSRYGYEVKADMTAREQQRVIMAFQMHFRPAQWNGIADAETQAIAEALLEKYGQD" } , + annot { + { + data + ftable { + { + id + local + id 10 , + data + prot { + name { + "N-acetylmuramoyl-L-alanine amidase AmiD" } , + ec { + "3.5.1.28" } } , + location + int { + from 0 , + to 275 , + id + local + str "g4_1_1" } } } } } } , + seq { + id { + local + str "g4_1_2" } , + descr { + title "5-carboxymethyl-2-hydroxymuconate Delta-isomerase [Genus + species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 126 , + seq-data + ncbieaa "MPHFIAECTENIREQADLPGLFSKVNEALAASGIFPIGGIRSRAHWLDTWQMADG +KHDYAFVHMTLKIGTGRSLESRQEVGEMLFGLIKAHFADLMENRYLALSFEIAELHPTLNYKQNNVHALFK" } , + annot { + { + data + ftable { + { + id + local + id 11 , + data + prot { + name { + "5-carboxymethyl-2-hydroxymuconate Delta-isomerase" } , + ec { + "5.3.3.10" } } , + location + int { + from 0 , + 
to 125 , + id + local + str "g4_1_2" } } } } } } , + seq { + id { + local + str "g4_1_3" } , + descr { + title "2-oxo-hept-4-ene-1,7-dioate hydratase [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 267 , + seq-data + ncbieaa "MLDKQTHTLIAQRLNQAEKQREQIRAVSLDYPNITIEDAYAVQREWVNIKIAEGR +TLKGHKIGLTSKAMQASSQISEPDYGALLDDMFFHDGGDIPTDRFIVPRIEVELAFVLAKPLRGPHCTLFDVYNATDY +VIPALELIDARSHNIDPETQRPRKVFDTISDNAANAGVILGGRPIKPDELDLRWISALLYRNGVIEETGVAAGVLNHP +ANGVAWLANKLAPYDVQLEAGQIILGGSFTRPVPASKGDTFHVDYGNMGAISCRFV" } , + annot { + { + data + ftable { + { + id + local + id 12 , + data + prot { + name { + "2-oxo-hept-4-ene-1,7-dioate hydratase" } , + ec { + "4.2.1.163" } } , + location + int { + from 0 , + to 266 , + id + local + str "g4_1_3" } } } } } } , + seq { + id { + local + str "g4_1_4" } , + descr { + title "hypothetical protein LIKMDGFN_00004 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 467 , + seq-data + ncbieaa "MPVNKFSRRTLLTAGSALAVLPFLRALPVQAREPRETVDIKDYPADDGIASFKQA +FADGQTVVLPPGWVCENINAAITIPAGKTLRVQGAVRGNGRGRFILQDGCQVVGEQGGSLHNVTLDVRGSDCVIKGVT +MSGFGPVAQIFIGGKEPQVMRNLIIDDITVTHANYAILRQGFHNQMDGARITHSRFSDLQGDAIEWNVAIHDRDILIS +DHVIERIDCTNGKINWGIGIGLAGSTYDNSYPEDQAVKNFVVANITGSDCRQLVHVENGKHFVIRNVKAKNITPDFSK +NAGIDNATIAIYGCDNFVIDNIDMTNSAGMLIGYGVVKGKYLSIPQNFKLNAIRLDNRQVAYKLRGIQISSGNIPSFV +AITNVRMTRATLELHNQPQHLFLRNINVMQTSAIGPALKMHFDLRKDVRGQFMARQDTLLSLANVHAINENGQSSVDI +DRINHQTVNVEAVNFSLPKRGG" } , + annot { + { + data + ftable { + { + id + local + id 13 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 466 , + id + local + str "g4_1_4" } } } } } } , + seq { + id { + local + str "g4_1_5" } , + descr { + title "Histidine transport ATP-binding protein HisP [Genus + species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 257 , + seq-data + ncbieaa 
"MSENKLHVIDLHKRYGGHEVLKGVSLQARAGDVISIIGSSGSGKSTFLRCINFLE +KPSEGAIIVNGQNINLVRDKDGQLKVADKNQLRLLRTRLTMVFQHFNLWSHMTVLENVMEAPIQVLGLSKHDARERAL +KYLAKVGIDERAQGKYPVHLSGGQQQRVSIARALAMEPDVLLFDEPTSALDPELVGEVLRIMQQLAEEGKTMVVVTHE +MGFARHVSSHVIFLHQGKIEEEGDPEQVFGNPQSPRLQQFLKGSLK" } , + annot { + { + data + ftable { + { + id + local + id 14 , + data + prot { + name { + "Histidine transport ATP-binding protein HisP" } } , + location + int { + from 0 , + to 256 , + id + local + str "g4_1_5" } } } } } } , + seq { + id { + local + str "g4_1_6" } , + descr { + title "tRNA (guanine-N(1)-)-methyltransferase [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 255 , + seq-data + ncbieaa "MFIGIVSLFPEMFRAITDYGVTGRAVKKGLLNIQSWSPRDFAHDRHRTVDDRPYG +GGPGMLMMVQPLRDAIHAAKAAAGEGAKVIYLSPQGRKLDQAGVSELATNQKLILVCGRYEGVDERVIQTEIDEEWSI +GDYVLSGGELPAMTLIDSVARFIPGVLGHEASAIEDSFADGLLDCPHYTRPEVLEGMEVPPVLLSGNHAEIRRWRLKQ +SLGRTWLRRPELLENLALTEEQARLLAEFKTEHAQQQHKHDGMA" } , + annot { + { + data + ftable { + { + id + local + id 15 , + data + prot { + name { + "tRNA (guanine-N(1)-)-methyltransferase" } , + ec { + "2.1.1.228" } } , + location + int { + from 0 , + to 254 , + id + local + str "g4_1_6" } } } } } } , + seq { + id { + local + str "g4_1_7" } , + descr { + title "hypothetical protein LIKMDGFN_00007 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 86 , + seq-data + ncbieaa "MAGLHAPYAYSAHHAVNFCSEYKRGFVLGFTHRMFEKTGDRQLSAWEAGILTRRY +GLDKEMVMDFFKENHSGMAVRFFMVGYRLEG" } , + annot { + { + data + ftable { + { + id + local + id 16 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 85 , + id + local + str "g4_1_7" } } } } } } , + seq { + id { + local + str "g4_1_8" } , + descr { + title "hypothetical protein LIKMDGFN_00008 [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 
193 , + seq-data + ncbieaa "MSTLLYLHGFNSSPRSAKACQLKNWLAERHPHVEMIVPQLPPYPADAAELLESLV +LEHGGAPLGLVGSSLGGYYATWLSQCFMLPAVVVNPAVRPFELLTDYLGQNENPYTGQQYVLESRHIYDLKVMQIDPL +EAPDLIWLLQQTGDEVLDYRQAVAYYASCRQTVTEGGNHAFTGFEDYFNQIVDFLGLHSC" } , + annot { + { + data + ftable { + { + id + local + id 17 , + data + prot { + name { + "hypothetical protein" } } , + location + int { + from 0 , + to 192 , + id + local + str "g4_1_8" } } } } } } , + seq { + id { + local + str "g4_1_9" } , + descr { + title "C4-dicarboxylate TRAP transporter large permease protein + DctM [Genus species]" , + molinfo { + biomol peptide , + tech concept-trans } } , + inst { + repr raw , + mol aa , + length 360 , + seq-data + ncbieaa "MNSGGIAARLVNFAKLFTGKLPGSLSYTNIVGNMMFGAISGSAIAASTSIGGVMV +PMSAREGYDRGFAAAVNIASAPTGMLIPPTTAFILYALASGGTSIAALFAGGLVAGVLWGVGCMLVTLVVAKRRNYRV +FFTVQKGMALKVAVEAIPSLLLIVIIVGGIVQGIFTAIEASAIAVVYTLLLTMVFYRTLKIKDLPSILLQTVVMTGVI +MFLLATSSAMSFSMSITNIPAALSDMILGISANKLVILLVITVFLLIIGAFMDIGPAILIFTPILLPIMAKLGVDPVH +FGIIMIYNLAIGTITPPVGSGLYVGASVGKVKVEEVIKPLLPFYGAIIGVLLLITYIPEITLFLPRLLGIM" } , + annot { + { + data + ftable { + { + id + local + id 18 , + data + prot { + name { + "C4-dicarboxylate TRAP transporter large permease + protein DctM" } } , + location + int { + from 0 , + to 359 , + id + local + str "g4_1_9" } } } } } } } , + annot { + { + data + ftable { + { + id + local + id 1 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "g4_1_1" , + location + int { + from 0 , + to 830 , + strand plus , + id + local + str "g4_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P75820" } } , + xref { + { + data + gene { + locus "amiD" , + locus-tag "LIKMDGFN_00001" } } } } , + { + id + local + id 2 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "g4_1_2" , + location + int { + from 860 , + to 1240 , + strand plus , + id + 
local + str "g4_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:Q05354" } } , + xref { + { + data + gene { + locus "hpcD" , + locus-tag "LIKMDGFN_00002" } } } } , + { + id + local + id 3 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "g4_1_3" , + location + int { + from 1270 , + to 2073 , + strand plus , + id + local + str "g4_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P42270" } } , + xref { + { + data + gene { + locus "hpcG" , + locus-tag "LIKMDGFN_00003" } } } } , + { + id + local + id 4 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "g4_1_4" , + location + int { + from 2110 , + to 3513 , + strand plus , + id + local + str "g4_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "LIKMDGFN_00004" } } } } , + { + id + local + id 5 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "g4_1_5" , + location + int { + from 3534 , + to 4307 , + strand plus , + id + local + str "g4_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P02915" } } , + xref { + { + data + gene { + locus "hisP" , + locus-tag "LIKMDGFN_00005" } } } } , + { + id + local + id 6 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "g4_1_6" , + location + int { + from 4326 , + to 5093 , + strand plus , + id + local + str "g4_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:P0A873" } } , + xref { + { + data + gene { + locus "trmD" , + locus-tag 
"LIKMDGFN_00006" } } } } , + { + id + local + id 7 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "g4_1_7" , + location + int { + from 5137 , + to 5397 , + strand plus , + id + local + str "g4_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "LIKMDGFN_00007" } } } } , + { + id + local + id 8 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "g4_1_8" , + location + int { + from 5412 , + to 5993 , + strand plus , + id + local + str "g4_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } } , + xref { + { + data + gene { + locus-tag "LIKMDGFN_00008" } } } } , + { + id + local + id 9 , + data + cdregion { + frame one , + code { + id 11 } } , + product + whole + local + str "g4_1_9" , + location + int { + from 6051 , + to 7133 , + strand plus , + id + local + str "g4_1" } , + qual { + { + qual "inference" , + val "ab initio prediction:Prodigal:2.6" } , + { + qual "inference" , + val "similar to AA sequence:UniProtKB:O07838" } } , + xref { + { + data + gene { + locus "dctM" , + locus-tag "LIKMDGFN_00009" } } } } } } } } } } diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tbl b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tbl new file mode 100644 index 0000000000000000000000000000000000000000..13f7f5d45429a32dbaea30869471d9f45f3c87bb --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tbl @@ -0,0 +1,57 @@ +>Feature g4_1 +1 831 CDS + EC_number 3.5.1.28 + dbxref COG:COG3023 + gene amiD + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P75820 + locus_tag LIKMDGFN_00001 + product N-acetylmuramoyl-L-alanine amidase AmiD +861 1241 CDS + EC_number 5.3.3.10 + gene hpcD + inference ab initio 
prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:Q05354 + locus_tag LIKMDGFN_00002 + product 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +1271 2074 CDS + EC_number 4.2.1.163 + gene hpcG + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P42270 + locus_tag LIKMDGFN_00003 + product 2-oxo-hept-4-ene-1,7-dioate hydratase +2111 3514 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag LIKMDGFN_00004 + product hypothetical protein +3535 4308 CDS + dbxref COG:COG4598 + gene hisP + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P02915 + locus_tag LIKMDGFN_00005 + product Histidine transport ATP-binding protein HisP +4327 5094 CDS + EC_number 2.1.1.228 + dbxref COG:COG0336 + gene trmD + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:P0A873 + locus_tag LIKMDGFN_00006 + product tRNA (guanine-N(1)-)-methyltransferase +5138 5398 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag LIKMDGFN_00007 + product hypothetical protein +5413 5994 CDS + inference ab initio prediction:Prodigal:2.6 + locus_tag LIKMDGFN_00008 + product hypothetical protein +6052 7134 CDS + dbxref COG:COG1593 + gene dctM + inference ab initio prediction:Prodigal:2.6 + inference similar to AA sequence:UniProtKB:O07838 + locus_tag LIKMDGFN_00009 + product C4-dicarboxylate TRAP transporter large permease protein DctM diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tsv b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tsv new file mode 100644 index 0000000000000000000000000000000000000000..6fb4e7be29674a310922cdb82a5fbe08125db815 --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.tsv @@ -0,0 +1,10 @@ +locus_tag ftype length_bp gene EC_number COG product +LIKMDGFN_00001 CDS 831 amiD 3.5.1.28 COG3023 
N-acetylmuramoyl-L-alanine amidase AmiD +LIKMDGFN_00002 CDS 381 hpcD 5.3.3.10 5-carboxymethyl-2-hydroxymuconate Delta-isomerase +LIKMDGFN_00003 CDS 804 hpcG 4.2.1.163 2-oxo-hept-4-ene-1,7-dioate hydratase +LIKMDGFN_00004 CDS 1404 hypothetical protein +LIKMDGFN_00005 CDS 774 hisP COG4598 Histidine transport ATP-binding protein HisP +LIKMDGFN_00006 CDS 768 trmD 2.1.1.228 COG0336 tRNA (guanine-N(1)-)-methyltransferase +LIKMDGFN_00007 CDS 261 hypothetical protein +LIKMDGFN_00008 CDS 582 hypothetical protein +LIKMDGFN_00009 CDS 1083 dctM COG1593 C4-dicarboxylate TRAP transporter large permease protein DctM diff --git a/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.txt b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.txt new file mode 100644 index 0000000000000000000000000000000000000000..61a7be1fcd34429f898e3d10baea743e4d6e049c --- /dev/null +++ b/Examples/1-res-Annotate/tmp_files/genome4.fst-split5N.fna-prokkaRes/GEN4.1111.00001.txt @@ -0,0 +1,4 @@ +organism: Genus species strain +contigs: 1 +bases: 7134 +CDS: 9 diff --git a/Examples/commands/1-Annotate.sh b/Examples/commands/1-Annotate.sh new file mode 100644 index 0000000000000000000000000000000000000000..30a63cc14e4e848a78bfd81ec4d2b78f82f84cf0 --- /dev/null +++ b/Examples/commands/1-Annotate.sh @@ -0,0 +1,2 @@ +genomeAPCAT annotate -d Examples/genomes -r Examples/1-res-Annotate Examples/input_files/list_genomes.lst -n EXAM + diff --git a/Examples/commands/2-Pangenome.sh b/Examples/commands/2-Pangenome.sh new file mode 100644 index 0000000000000000000000000000000000000000..308db8f7ad1ed40a82f5eb384f229b8faddb794b --- /dev/null +++ b/Examples/commands/2-Pangenome.sh @@ -0,0 +1,2 @@ +genomeAPCAT pangenome -l Examples/1-res-Annotate/LSTINFO-list_genomes.lst -n EXAMPLE4 -d Examples/1-res-Annotate/Proteins -i 0.8 -o Examples/2-res-pangenome + diff --git a/Examples/commands/3-Corepers.sh b/Examples/commands/3-Corepers.sh new file mode 100644 index 
0000000000000000000000000000000000000000..754080c98befb74f72e7cd9d916f4b9a959c6ece
--- /dev/null
+++ b/Examples/commands/3-Corepers.sh
@@ -0,0 +1 @@
+genomeAPCAT corepers -p Examples/2-res-pangenome/PanGenome-EXAMPLE4.All.prt-clust-0.8-mode1_2019-02-12_13-25-01.tsv.lst
diff --git a/Examples/commands/4-Align.sh b/Examples/commands/4-Align.sh
new file mode 100644
index 0000000000000000000000000000000000000000..2140ae065d04a54e8b1033aaf3a341d0002c76f0
--- /dev/null
+++ b/Examples/commands/4-Align.sh
@@ -0,0 +1 @@
+genomeAPCAT align -c Examples/3-coregenome/PersGenome_PanGenome-EXAMPLE4.All.prt-clust-0.8-mode1_2019-02-12_13-25-01.tsv.lst_1.lst -l Examples/1-res-Annotate/LSTINFO-list_genomes.lst -d Examples/1-res-Annotate/ -o Examples/4-Algn -n EXAMPLE4
diff --git a/Examples/commands/5-Tree.sh b/Examples/commands/5-Tree.sh
new file mode 100644
index 0000000000000000000000000000000000000000..180e2dda3e216f95747c27fba4c304a2fd4012ff
--- /dev/null
+++ b/Examples/commands/5-Tree.sh
@@ -0,0 +1 @@
+genomeAPCAT tree -a Examples/4-Algn/Phylo-EXAMPLE4/EXAMPLE4.grp.aln
diff --git a/README.md b/README.md
index debf9d5c165cbb42c2326572afbcb22a37a472a1..70b02b93207300b3412968e2abcbfa358e4ace45 100755
--- a/README.md
+++ b/README.md
@@ -1,17 +1,17 @@
-# **-- genomeAPCAT --**
+# **-- PanACoTA --**
 [](https://gitlab.pasteur.fr/aperrin/pipeline_annotation/commits/master) [](http://aperrin.pages.pasteur.fr/pipeline_annotation/htmlcov) [](COPYING)
-This README file provides some essential information to install/use genomeAPCAT. But it is better to read the [**full documentation**](http://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc), providing more details : [ ](http://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc)
+This README file provides some essential information to install/use PanACoTA.
But it is better to read the [**full documentation**](http://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc), providing more details : [ ](http://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc) --- --- --- -genomeAPCAT is a software providing tools for large scale comparative genomics: +PanACoTA is a software providing tools for large scale comparative genomics: - annotation of genomes - pan-genome - persistent genome @@ -22,10 +22,10 @@ genomeAPCAT is a software providing tools for large scale comparative genomics: ## Dependencies -genomeAPCAT is written in **python3**. So, you need python3 (and pip3 for installation) to run it. +PanACoTA is written in **python3**. So, you need python3 (and pip3 for installation) to run it. Its external dependencies are: -- [**prokka**](https://github.com/tseemann/prokka) (to annotate the genomes). If you do not already have it, this software will be automatically installed by `make` script, along with `genomeAPCAT` (see [Install section](#install) +- [**prokka**](https://github.com/tseemann/prokka) (to annotate the genomes). If you do not already have it, this software will be automatically installed by `make` script, along with `PanACoTA` (see [Install section](#install) ) - [**mmseqs**](https://github.com/soedinglab/MMseqs2) (to generate pangenomes) - [**mafft**](http://mafft.cbrc.jp/alignment/software/) (to align persistent genome) @@ -50,26 +50,26 @@ For FastTree, we advise to download C code from [here](http://www.microbesonline You can then add the output `FastTreeMP` to your `$PATH` to be able to run it from everywhere. -## Downloading and updating `genomeAPCAT` +## Downloading and updating `PanACoTA` -You can download `genomeAPCAT` source code by downloading a [zip file](https://gitlab.pasteur.fr/aperrin/pipeline_annotation/repository/archive.zip?ref=master), or by cloning its gitlab repository. 
By cloning the gitlab repository, you will then be able to update the code to new versions very easily and quickly. Here is how to clone the repository: +You can download `PanACoTA` source code by downloading a [zip file](https://gitlab.pasteur.fr/aperrin/pipeline_annotation/repository/archive.zip?ref=master), or by cloning its gitlab repository. By cloning the gitlab repository, you will then be able to update the code to new versions very easily and quickly. Here is how to clone the repository: git clone https://gitlab.pasteur.fr/aperrin/pipeline_annotation Give your gitlab login, and password. -This will create a repository called `pipeline_annotation`. Go inside this repository to install `genomeAPCAT`, as described hereafter. +This will create a repository called `pipeline_annotation`. Go inside this repository to install `PanACoTA`, as described hereafter. -If a new version of `genomeAPCAT` is released, and you want to use it, type the following command to update the source code: +If a new version of `PanACoTA` is released, and you want to use it, type the following command to update the source code: git pull Then, you will be able to upgrade to the new version (see bellow). 
-## <a name="install"></a> Installing `genomeAPCAT` (final mode) +## <a name="install"></a> Installing `PanACoTA` (final mode) -To install `genomeAPCAT`, and all its dependencies, from the root directory, type: +To install `PanACoTA`, and all its dependencies, from the root directory, type: ./make @@ -88,15 +88,15 @@ If you have permission issues, you can either use 'sudo' before the previous com - prokka - barrnap (if prokka is not already installed) -## <a name="uninstall"></a> Uninstalling `genomeAPCAT` +## <a name="uninstall"></a> Uninstalling `PanACoTA` -If you don't want `genomeAPCAT` anymore, or if you want to install a newer version, uninstall it by typing: +If you don't want `PanACoTA` anymore, or if you want to install a newer version, uninstall it by typing: ./make uninstall ## Upgrade to new version -If you want to install a new version of `genomeAPCAT` (and you downloaded it by cloning the gitlab repository): +If you want to install a new version of `PanACoTA` (and you downloaded it by cloning the gitlab repository): - update source code to the new version (`git pull`) - upgrade installation to the new version (`./make upgrade`) @@ -108,11 +108,11 @@ If you installed the dependencies (such as prokka) via our installation script, ./make clean -# Running `genomeAPCAT` +# Running `PanACoTA` ## Quick run -`genomeAPCAT` contains 5 different subcommands: +`PanACoTA` contains 5 different subcommands: - `annotate` (annotate all genomes of the dataset, after a quality control) - `pangenome` (generate pan-genome) - `corepers` (generate core-genome or persistent-genome) @@ -121,30 +121,30 @@ If you installed the dependencies (such as prokka) via our installation script, You can run them by typing: - genomeAPCAT <subcommand_name> <arguments_for_subcommand> + PanACoTA <subcommand_name> <arguments_for_subcommand> Each subcommand has its own options and inputs. 
To get the list of required arguments and other available options for the subcommand you want to run, type: - genomeAPCAT <subcommand> -h + PanACoTA <subcommand> -h ## Examples We provide a folder, `Examples`, containing genomic sequences (in `Examples/genomes`) and examples of input files (in `Examples/input_files`) for the software. -In the [example part of documentation](http://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc/examples.html), you will find information explaining you how to run the different modules of `genomeAPCAT` with this dataset, so that you can try the software. We also describe the results that should be created by each command line. +In the [example part of documentation](http://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc/examples.html), you will find information explaining you how to run the different modules of `PanACoTA` with this dataset, so that you can try the software. We also describe the results that should be created by each command line. **Note:** the provided genomic sequences are taken from real genomes, but then modified and shortened in order to have an example showing different situations, but running very fast. Hence, the examples results should not be interpreted biologically! ## Documentation -You can find more information in [genomeAPCAT documentation](http://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc)! +You can find more information in [PanACoTA documentation](http://aperrin.pages.pasteur.fr/pipeline_annotation/html-doc)! # <a name="develop"></a> Development -This part is for people who want to work on developing `genomeAPCAT` package. +This part is for people who want to work on developing `PanACoTA` package. 
-## Installing `genomeAPCAT` (development mode) +## Installing `PanACoTA` (development mode) -If you want to install `genomeAPCAT` while still working on modifying the scripts, type: +If you want to install `PanACoTA` while still working on modifying the scripts, type: ./make develop @@ -154,7 +154,7 @@ If you don't want to install the software, you can still test it, and contribute by installing the libraries needed for the software, and those needed for development by running: - pip3 install -r requirements.txt # dependencies used by genomeAPCAT + pip3 install -r requirements.txt # dependencies used by PanACoTA pip3 install -r requirements-dev.txt # libraries used to run tests, generate documentation etc. **Note:** biopython is only used for 'tree' subcommand, with option ``--soft fastme`` or ``--soft quicktree``. If you do not diff --git a/bin/genomeAPCAT b/bin/genomeAPCAT index 72fc087e50e5de022dbb84c9b3a668d1efa44db1..3bf2de3f8ce2b99ee9fbff27af0f04e0d38f4e66 100755 --- a/bin/genomeAPCAT +++ b/bin/genomeAPCAT @@ -2,6 +2,8 @@ # coding: utf-8 import sys +from textwrap import dedent + from genomeAPCAT import __version__ as version from genomeAPCAT.subcommands import annote @@ -26,9 +28,28 @@ def parse_arguments(argv): import argparse # Create main parser - parser = argparse.ArgumentParser(description=("genomeAPCAT - Large scale comparative" - " genomics tools"), - prog='genomeAPCAT') + + parser = argparse.ArgumentParser( + epilog="For more details, visit the MacSyFinder website and see " + "the MacSyFinder documentation.", + formatter_class=argparse.RawDescriptionHelpFormatter, + description=dedent(''' + + + + ___ _____ ___ _____ _____ +( _`\ ( _ )( _`\ (_ _)( _ ) +| |_) ) _ _ ___ | (_) || ( (_) _ | | | (_) | +| ,__/'/'_` )/' _ `\| _ || | _ /'_`\ | | | _ | +| | ( (_| || ( ) || | | || (_( )( (_) )| | | | | | +(_) `\__,_)(_) (_)(_) (_)(____/'`\___/'(_) (_) (_) + + + Large scale comparative genomics tools + + ------------------------------------------- + ''') ) + 
parser.add_argument('-V', '--version', action='version', version='genomeAPCAT - v. ' + str(version), diff --git a/doc/source/about.rst b/doc/source/about.rst index afbf189f50a9ba8adb2348e248cb136df9c1e7f8..e65dddf60ec9feb3c54634a8126e825afb34f1a0 100755 --- a/doc/source/about.rst +++ b/doc/source/about.rst @@ -1,9 +1,9 @@ ================= -About genomeAPCAT +About PanACoTA ================= -The genomeAPCAT program is available at `<https://gitlab.pasteur.fr/aperrin/pipeline_annotation>`_. +The PanACoTA program is available at `<https://gitlab.pasteur.fr/aperrin/pipeline_annotation>`_. Copyright © `GEM <https://research.pasteur.fr/fr/team/microbial-evolutionary-genomics/>`_ - Institut Pasteur, CNRS UMR3525 @@ -11,4 +11,4 @@ contact: amandine.perrin@pasteur.fr This software is a computer program whose purpose is to provide tools for large scale bacterial comparative genomics. -LICENCE \ No newline at end of file +LICENCE diff --git a/doc/source/develop.rst b/doc/source/develop.rst index 01e322f9d9ef1b6d4fc831ec053e791e13d1e740..05676e3ed2669bac4bc7f149f9180fc19077c1c4 100755 --- a/doc/source/develop.rst +++ b/doc/source/develop.rst @@ -1,14 +1,14 @@ ======================== -Work on genomeAPCAT code +Work on PanACoTA code ======================== -This part is for people who want to work on developing `genomeAPCAT` package: adding new features, correcting bugs etc. +This part is for people who want to work on developing `PanACoTA` package: adding new features, correcting bugs etc. -Installing ``genomeAPCAT`` in development mode +Installing ``PanACoTA`` in development mode ============================================== -If you want to install ``genomeAPCAT`` while still working on modifying the scripts, type: +If you want to install ``PanACoTA`` while still working on modifying the scripts, type: .. code-block:: bash @@ -21,7 +21,7 @@ needed for development by running: .. 
code-block:: bash - pip3 install -r requirements.txt # dependencies used by genomeAPCAT + pip3 install -r requirements.txt # dependencies used by PanACoTA pip3 install -r requirements-dev.txt # libraries used to run tests, generate documentation etc. .. note:: biopython is only used for 'tree' subcommand, with option ``--soft fastme`` or ``--soft quicktree``. If you do not plan to use this, you do not need to install biopython. You can comment the ``biopython>=1.60`` line in ``requirements.txt`` (add a ``#`` at the beginning of the line). diff --git a/doc/source/examples.rst b/doc/source/examples.rst index 8138324b5a61ef8a9d1ce6b6cc6cd832d14056d4..be3e61e1d2f27440bb6e2f11611cee9827911a08 100755 --- a/doc/source/examples.rst +++ b/doc/source/examples.rst @@ -57,7 +57,7 @@ Quality control If you just want to do quality control on the dataset, type:: - genomeAPCAT annotate -d Examples/genomes -r my_results Examples/input_files/list_genomes.lst -Q + PanACoTA annotate -d Examples/genomes -r my_results Examples/input_files/list_genomes.lst -Q This will create a folder ``my_results``, containing: @@ -74,9 +74,9 @@ This will create a folder ``my_results``, containing: :width: 40% - ``discarded-list_genomes.lst``: should be empty. The default limits are :math:`L90 \leq 100` and :math:`nbcontigs \leq 999`. In the png files, we can see that we are very far from those limits, so, no genome is discarded. -- ``genomeAPCAT-annotate_list_genomes.log``: log file. See information on what happened during the run: traceback of stdout. -- ``genomeAPCAT-annotate_list_genomes.log.err``: log file but only with Warnings and Errors. If it is empty, everything went well! -- ``genomeAPCAT-annotate_list_genomes.log.details``: with the quality control only option, this file is exactly the same as the ``.log`` file. It will add details when annotation step is run. +- ``PanACoTA-annotate_list_genomes.log``: log file. See information on what happened during the run: traceback of stdout. 
+- ``PanACoTA-annotate_list_genomes.log.err``: log file but only with Warnings and Errors. If it is empty, everything went well! +- ``PanACoTA-annotate_list_genomes.log.details``: with the quality control only option, this file is exactly the same as the ``.log`` file. It will add details when annotation step is run. - ``info-genomes-list_genomes.lst``: file with information on each genome: size, number of contigs and L90:: orig_name gsize nb_conts L90 @@ -99,7 +99,7 @@ Annotation Now that you have seen the distribution of L90 and #contig values in your genomes, and decided which limits you want to use (if you do not want to use the default ones), you can annotate the genomes which are under those limits with:: - genomeAPCAT annotate -d Examples/genomes -r my_results Examples/input_files/list_genomes.lst -n GENO --l90 3 --nbcont 10 + PanACoTA annotate -d Examples/genomes -r my_results Examples/input_files/list_genomes.lst -n GENO --l90 3 --nbcont 10 Here, we put the L90 limit to 3, which should lead to the removal of 1 genome (genome2, according to the ``info-genomes-list_genomes.lst``). We also put the nbcont limit to 10. However, this should not remove any genome, as all have less than 10 contigs. We put these limits just to show how the program works with your own limits, but they do not have any significance here, as a genome with L90 = 4 is not a bad quality genome! @@ -130,7 +130,7 @@ PanGenome step To do a pangenome, you need to provide the list of genomes to consider, with 1 genome per line. Only the first column (genome name) will be considered, but you can use a file containing other columns...such as the one you already have in the result folder of annotation step: ``LSTINFO-list_genomes.lst``! However, of course, if you want to do a pangenome of less genomes than the ones you annotated, you are free to create a new file with the genomes you want! Here, we are doing a pangenome of the 3 genomes annotated before. 
Here is the command line: - genomeAPCAT pangenome -l my_results/LSTINFO-list_genomes.lst -n GENO3 -d my_results/Proteins -i 0.8 -o my_results/pangenome + PanACoTA pangenome -l my_results/LSTINFO-list_genomes.lst -n GENO3 -d my_results/Proteins -i 0.8 -o my_results/pangenome With: @@ -151,7 +151,7 @@ Core/Persistent Genome step The core genome is inferred from the PanGenome. So, the only required file is your pangenome, obtained at last step. By default, it will generate a CoreGenome. Here is the command line to obtain the CoreGenome of our dataset:: - genomeAPCAT corepers -p my_results/pangenome/PanGenome-GENO3.All.prt-clust-0.8-mode1_<date>.tsv.lst + PanACoTA corepers -p my_results/pangenome/PanGenome-GENO3.All.prt-clust-0.8-mode1_<date>.tsv.lst **Replace `<date>` by your real filename** @@ -165,7 +165,7 @@ Alignment step You can then do an alignment of all the proteins of each persistent family. For example, to align the 6 core families found in the previous step:: - genomeAPCAT align -c my_results/pangenome/PersGenome_<pangenome-filename>_1.lst -l my_results/LSTINFO-list_genomes.lst -n GENO3-1 -d my_results -o my_results/Phylogeny + PanACoTA align -c my_results/pangenome/PersGenome_<pangenome-filename>_1.lst -l my_results/LSTINFO-list_genomes.lst -n GENO3-1 -d my_results -o my_results/Phylogeny **Replace `PersGenome_<pangenome-filename>_1.lst` by your real persistent genome filename** @@ -187,7 +187,7 @@ Tree step You can infer a phylogenetic tree from the alignment of persistent families. By default, it uses FastTree to infer the phylogenetic tree, with a GTR DNA substitution model, and no bootstrap. 
To run this on the alignment generated by the previous step using 5 threads, use:: - genomeAPCAT tree -a my_results/Phylogeny/Phylo-GENO3-1/GENO3-1.grp.aln --threads 5 + PanACoTA tree -a my_results/Phylogeny/Phylo-GENO3-1/GENO3-1.grp.aln --threads 5 In your output directory, ``my_results/Phylogeny/Phylo-GENO3-1``, you will find your phylogenetic tree file, called ``GENO3-1.grp.aln.fasttree_tree.nwk``. If you followed all previous steps, your file should contain something close to: diff --git a/doc/source/index.rst b/doc/source/index.rst index 56e35fb8797530e0d1e2a8991d937d07c02cee86..03427c4a7a79a4f95ecb9a1ab1b941894d823a08 100755 --- a/doc/source/index.rst +++ b/doc/source/index.rst @@ -1,6 +1,6 @@ -.. genomeAPCAT documentation master file +.. PanACoTA documentation master file -Welcome to genomeAPCAT's documentation! +Welcome to PanACoTA's documentation! ======================================= .. toctree:: diff --git a/doc/source/starting.rst b/doc/source/starting.rst index 125ee477c1f55d88d7b0a0ccd09ad3ebd48d658c..cd5b18f783cf81f0cfcf62e8c9a3edfd312c7d32 100755 --- a/doc/source/starting.rst +++ b/doc/source/starting.rst @@ -1,15 +1,15 @@ ========================= -Starting with genomeAPCAT +Starting with PanACoTA ========================= -``genomeAPCAT`` is a Python package, developed in Python 3. +``PanACoTA`` is a Python package, developed in Python 3. Downloading and updating ======================== -You can download ``genomeAPCAT`` source code by downloading an archive (zip, tar.gz), or by cloning its gitlab repository. By cloning the gitlab repository, you will then be able to update the code to new versions very easily and quickly. Here is how to clone the repository: +You can download ``PanACoTA`` source code by downloading an archive (zip, tar.gz), or by cloning its gitlab repository. By cloning the gitlab repository, you will then be able to update the code to new versions very easily and quickly. Here is how to clone the repository: .. 
code-block:: bash @@ -17,9 +17,9 @@ You can download ``genomeAPCAT`` source code by downloading an archive (zip, tar When asked, give your gitlab login, and password. -This will create a repository called ``pipeline_annotation``. Go inside this repository (``cd pipeline_annotation``) to install ``genomeAPCAT``, as described hereafter. +This will create a repository called ``pipeline_annotation``. Go inside this repository (``cd pipeline_annotation``) to install ``PanACoTA``, as described hereafter. -If a new version of ``genomeAPCAT`` is released, and you want to use it, type the following command to update the source code: +If a new version of ``PanACoTA`` is released, and you want to use it, type the following command to update the source code: .. code-block:: bash @@ -35,11 +35,11 @@ Installation: '**./make**' and its options Dependencies ------------ -``genomeAPCAT`` is written in **python3**. So, you need python3 (and pip3 for installation) to run it. +``PanACoTA`` is written in **python3**. So, you need python3 (and pip3 for installation) to run it. Its external dependencies are: -- `prokka <https://github.com/tseemann/prokka>`_ (to annotate the genomes). If you do not already have it, this software will be automatically installed by `make` script, along with `genomeAPCAT` (see :ref:`Install section <installing>`) +- `prokka <https://github.com/tseemann/prokka>`_ (to annotate the genomes). If you do not already have it, this software will be automatically installed by `make` script, along with `PanACoTA` (see :ref:`Install section <installing>`) - `mmseqs <https://github.com/soedinglab/MMseqs2>`_ (to generate pangenomes) - `mafft <http://mafft.cbrc.jp/alignment/software/>`_ (to align persistent genome) - At least one of those softwares, to infer a phylogenetic tree: @@ -69,11 +69,11 @@ You can then add the output ``FastTreeMP`` to your ``$PATH`` to be able to run i .. 
_installing: -Installing ``genomeAPCAT`` +Installing ``PanACoTA`` -------------------------- -To install ``genomeAPCAT`` from the ``pipeline_annotation`` directory, type: +To install ``PanACoTA`` from the ``pipeline_annotation`` directory, type: .. code-block:: bash @@ -94,10 +94,10 @@ just as any other software. .. warning:: If you plan to work on the scripts, choose the development installation (see :doc:`Developer documentation <develop>`). -Uninstalling ``genomeAPCAT`` +Uninstalling ``PanACoTA`` ---------------------------- -If you don't want ``genomeAPCAT`` anymore, uninstall it by typing: +If you don't want ``PanACoTA`` anymore, uninstall it by typing: .. code-block:: bash @@ -110,7 +110,7 @@ If you don't want ``genomeAPCAT`` anymore, uninstall it by typing: Upgrade to new version ---------------------- -If you want to install a new version of ``genomeAPCAT``: +If you want to install a new version of ``PanACoTA``: .. code-block:: bash @@ -133,7 +133,7 @@ If you installed the dependencies (such as prokka) via our installation script, Quick run ========= -``genomeAPCAT`` contains 5 different subcommands: +``PanACoTA`` contains 5 different subcommands: - ``annotate`` (annotate all genomes of the dataset, after a quality control) - ``pangenome`` (generate pan-genome) @@ -145,11 +145,11 @@ You can run them by typing: .. code-block:: bash - genomeAPCAT <subcommand_name> <arguments_for_subcommand> + PanACoTA <subcommand_name> <arguments_for_subcommand> Each subcommand has its own options and inputs. To get the list of required arguments and other available options for the subcommand you want to run, type: .. 
code-block:: bash - genomeAPCAT <subcommand> -h + PanACoTA <subcommand> -h diff --git a/doc/source/usage.rst b/doc/source/usage.rst index 9a66e2c342d1d42d945e4f300aacad1b39075f89..a317fa0be8e134b465f360ae2728ea3a359ff1fd 100755 --- a/doc/source/usage.rst +++ b/doc/source/usage.rst @@ -1,8 +1,8 @@ ============================= -Running genomeAPCAT: tutorial +Running PanACoTA: tutorial ============================= -``genomeAPCAT`` contains 5 subcommands, for the different steps: +``PanACoTA`` contains 5 subcommands, for the different steps: - ``annotate`` (annotate all genomes of the dataset, after a quality control) - ``pangenome`` (generate pan-genome) - ``corepers`` (generate core-genome or persistent-genome) @@ -11,11 +11,11 @@ Running genomeAPCAT: tutorial You can run them by typing:: - genomeAPCAT <subcommand_name> <arguments_for_subcommand> + PanACoTA <subcommand_name> <arguments_for_subcommand> Each subcommand has its own options and inputs. To get the list of required arguments and other available options for the subcommand you want to run, type:: - genomeAPCAT <subcommand> -h + PanACoTA <subcommand> -h .. note:: In the example command lines, we put ``<>`` around the fields that you have to replace by the information corresponding to what you want. For example, if we write ``command -D <seqfile>`` and the sequence file you want to use is in your current directory and is called ``my_sequence.fa``, then you should write ``command -D my_sequence.fa``. @@ -41,7 +41,7 @@ We will now describe each subcommand, with its options. You can see all required arguments and available options with:: - genomeAPCAT annotate -h + PanACoTA annotate -h The input for annotation is a set of genomes, in (multi-)fasta format. All files to annotate must be in a same directory, referred after by ``<db_path>``. However, this directory can also contain other files/sequences, not used in this study. 
The program will only use the files specified in the ``<list_file>``, which is the main file you have to provide for this step. @@ -109,7 +109,7 @@ With some softwares, the different contigs of a draft genome are all concatenate >genome_seq AACACACGATCTCGGCAGCGCANNNNNNNNNNNNNACAGCATNNNNTCGCGCCGACGNNACTATAACAGCAGACNNNNNNNNNNCACACCGGGTATCAGCAGCAGACGACGACGAACGAANNNNNNNNNNACACAGCACTATACGNACAGCA... -This genome is a draft with 4 contigs. By default, ``genomeAPCAT`` will split the sequences each time there is stretch of at least 5 ``N``, in order to have 1 replicon per fasta entry. For example, with the previous file in input, it will create a new multi-fasta file with:: +This genome is a draft with 4 contigs. By default, ``PanACoTA`` will split the sequences each time there is stretch of at least 5 ``N``, in order to have 1 replicon per fasta entry. For example, with the previous file in input, it will create a new multi-fasta file with:: >genome_seq_cont1 AACACACGATCTCGGCAGCGCA @@ -230,7 +230,7 @@ With this information, you will be able to see which genomes should be removed f You can run this quality control with (order of arguments does not matter):: - genomeAPCAT annotate <list_file> -d <dbpath> -r <res_path> -Q + PanACoTA annotate <list_file> -d <dbpath> -r <res_path> -Q with: @@ -251,9 +251,9 @@ This will create a folder ``<res_path>``, with the following files inside: And log files: - - ``genomeAPCAT-annotate_<list_file>.log``: log file. See information on what happened during the run: traceback of stdout. - - ``genomeAPCAT-annotate_<list_file>.log.err``: log file but only with Warnings and errors. If it is empty, everything went well! - - ``genomeAPCAT-annotate_<list_file>.log.details``: same as ``.log`` file, but with more detailed information (for example, while running annotation, you can have the time of start/end of annotation of each individual genome). This file can be quite big if you have a lot of genomes. 
+ - ``PanACoTA-annotate_<list_file>.log``: log file. See information on what happened during the run: traceback of stdout. + - ``PanACoTA-annotate_<list_file>.log.err``: log file but only with Warnings and errors. If it is empty, everything went well! + - ``PanACoTA-annotate_<list_file>.log.details``: same as ``.log`` file, but with more detailed information (for example, while running annotation, you can have the time of start/end of annotation of each individual genome). This file can be quite big if you have a lot of genomes. .. _annot: @@ -262,7 +262,7 @@ Annotation When you know the limits you want to use for the L90 and number of contigs, you can run the full annotation step, and not only the quality control. Use:: - genomeAPCAT annotate <list_file> -d <dbpath> -r <res_path> -n <name> [--l90 <num> --nbcont <num>] + PanACoTA annotate <list_file> -d <dbpath> -r <res_path> -n <name> [--l90 <num> --nbcont <num>] with: - same arguments as before @@ -289,7 +289,7 @@ This will also create a folder ``<res_path>``, with the following files inside: Options ------- -Here is the list of options available when running ``genomeAPCAT annotate``: +Here is the list of options available when running ``PanACoTA annotate``: - ``-n <name>``: required when not running quality control only (see :ref:`annotation<annot>`) - ``-Q``: run quality control only (see :ref:`QC only<qco>`) @@ -308,7 +308,7 @@ Here is the list of options available when running ``genomeAPCAT annotate``: You can see all required arguments and available options with:: - genomeAPCAT pangenome -h + PanACoTA pangenome -h To construct a pangenome, you need to specify **which genomes** you want to include in the dataset. Each of these genomes must have a unique file, called ``<genome_name>.prt``, containing all **amino-acid sequences of its CDS**. Those ``.prt`` files must all be in **a same directory**, referenced here after by ``<dbdir>``. 
As for the annotation step, this folder can contain other files, but only the ones given in the list_file will be taken into account. @@ -357,7 +357,7 @@ Ideally, you should follow the 'gembase_format', ``<name>.<date>.<strain_num>.<p - ``<strain_num>`` strain number (only numeric characters) -If your protein files were generated by ``genomeAPCAT annotate``, they are already in this format! +If your protein files were generated by ``PanACoTA annotate``, they are already in this format! Those fields will be used to sort pangenome families by species (if you do a pangenome containing different species), strain number (inside a same species), and protein number (inside a same strain). They will also be essential if you want to generate a core or persistent genome after. @@ -444,7 +444,7 @@ Do pangenome To do a pangenome, run the following command:: - genomeAPCAT pangenome -l <list_file> -n <dataset_name> -d <path/to/dbdir> -o <path/to/outdir> -i <min_id> + PanACoTA pangenome -l <list_file> -n <dataset_name> -d <path/to/dbdir> -o <path/to/outdir> -i <min_id> with: @@ -468,7 +468,7 @@ It will also contain other files and directories, that could help you if you nee - ``tmp_<dataset_name>.All.prt-mode<mode_num_given>_<current_date_and_time>`` folder, containing all temporary files used by MMseqs2 to cluster your proteins. - ``<dataset_name>.All.prt-msDB*``: 5 files (``*`` being nothing, ``.index``, ``.lookup``, ``_h``, ``_h.index``) corresponding to the protein databank, in the format used by MMseqs2. - ``<dataset_name>.All.prt-clust-<min_id>-mode<mode_num_given>_<current_date_and_time>*``: 3 files (``*`` being nothing, ``.index``, ``.tsv``) generated by MMseqs2 corresponding to the clustering of your proteins - - ``genomeAPCAT-pangenome_<dataset_name>.log*``: the 3 log files as in the annotate subcommand (.log, .log.details, .log.err). 
See their description :ref:`here<logf>` + - ``PanACoTA-pangenome_<dataset_name>.log*``: the 3 log files as in the annotate subcommand (.log, .log.details, .log.err). See their description :ref:`here<logf>` - ``mmseq_<dataset_name>.All.prt_<min_id>-mode<mode_num_given>_<current_date_and_time>.log``: MMseqs2 log file. - ``Pangenome-<dataset_name>.All.prt-clust-<min_id>-mode<mode_num_given>_<current_date_and_time>.tsv.lst.bin`` is a binary file containing Python objects corresponding to the pangenome. File only used by the program to do calculations faster the next time it needs this information (to generate Core or Persistent genome for example). @@ -492,7 +492,7 @@ You can also specify other options with: You can see all required arguments and available options with:: - genomeAPCAT corepers -h + PanACoTA corepers -h As core and persistent genomes are inferred from the pangenome, the only file required to generate a core or persistent genome is the pangenome of your dataset, in the format described in :ref:`pangenome part<panfile>`. @@ -515,7 +515,7 @@ Do corepers To do a coregenome, run the following command:: - genomeAPCAT corepers -p <pangenome_file> + PanACoTA corepers -p <pangenome_file> If you want to do a persistent genome, use the following options to specify what you want: @@ -538,9 +538,9 @@ In your pangenome folder (or where you specified if you used the ``-o`` option), You can see all required arguments and available options with:: - genomeAPCAT align -h + PanACoTA align -h -In order to align your persistent families, you need to provide your persistent genome file (as generated by genomeAPCAT corepers), and the list of genome names included in the dataset. +In order to align your persistent families, you need to provide your persistent genome file (as generated by PanACoTA corepers), and the list of genome names included in the dataset. 
Input file formats ------------------ @@ -677,7 +677,7 @@ Align To do the alignment of all proteins of your persistent genome, run:: - genomeAPCAT align -c <pers_genome> -l <list_file> -n <dataset_name> -d <dbdir> -o <resdir> + PanACoTA align -c <pers_genome> -l <list_file> -n <dataset_name> -d <dbdir> -o <resdir> with: @@ -691,7 +691,7 @@ Add ``--threads <num>`` to parallelize the alignments. Put 0 to use all cores of In your ``<resdir>`` directory, you will find: - - ``genomeAPCAT-align_<dataset_name>.log*``: the 3 log files as in the :ref:`other steps<logf>`. + - ``PanACoTA-align_<dataset_name>.log*``: the 3 log files as in the :ref:`other steps<logf>`. - a folder ``List-<dataset_name>``: contains, for each genome, the list of persistent proteins (that must be extracted to align them). - a folder ``Align-<dataset_name>``: contains: @@ -709,7 +709,7 @@ In your ``<resdir>`` directory, you will find: You can see all required arguments and available options with:: - genomeAPCAT tree -h + PanACoTA tree -h To infer a phylogenetic tree, you need to provide an alignment file, in fasta format. Each fasta entry will be a leaf of the phylogenetic tree. @@ -736,12 +736,12 @@ By default, 'tree' subcommand will use `FastTreeMP <http://journals.plos.org/plo .. code-block:: bash - genomeAPCAT tree -a <align_file> + PanACoTA tree -a <align_file> However, we also provide the possibility to use `FastME <https://academic.oup.com/mbe/article/32/10/2798/1212138/FastME-2-0-A-Comprehensive-Accurate-and-Fast>`_ or `Quicktree <https://www.ncbi.nlm.nih.gov/pubmed/12424131>`_. For that, add the option ``-s <soft>`` with ``fastme`` or ``quicktree`` in ``<soft>``. -See ``genomeAPCAT tree -h`` to have an overview of all options available. +See ``PanACoTA tree -h`` to have an overview of all options available. 
FastTree options ^^^^^^^^^^^^^^^^ @@ -757,7 +757,7 @@ In your ``<outdir>`` directory, you will find: - ``<dataset_name>.grp.aln.fasttree.log``: logfile of FastTree, with information on running steps, and intermediate trees inferred - ``<dataset_name>.grp.aln.fasttree_tree.nwk``: the final tree inferred, in Newick format - - ``genomeAPCAT-tree-fasttree.log*``: the 3 log files as in the :ref:`other steps<logf>` + - ``PanACoTA-tree-fasttree.log*``: the 3 log files as in the :ref:`other steps<logf>` FastME options @@ -765,7 +765,7 @@ FastME options To use fastme with default options, run:: - genomeAPCAT tree -a <align_file> -s fastme + PanACoTA tree -a <align_file> -s fastme You can also specify the following options: @@ -781,14 +781,14 @@ In your ``<outdir>`` directory, you will find: - ``<dataset_name>.grp.aln.phylip.fastme.log``: logfile of FastME, with information on running steps - ``<dataset_name>.grp.aln.phylip.fastme_dist-mat.txt``: distance matrix of all given genomes - ``<dataset_name>.grp.aln.phylip.fastme_tree.nwk``: the final tree inferred in Newick format - - ``genomeAPCAT-tree-fastme.log*``: the 3 log files as in the :ref:`other steps<logf>` + - ``PanACoTA-tree-fastme.log*``: the 3 log files as in the :ref:`other steps<logf>` Quicktree options ^^^^^^^^^^^^^^^^^ To use Quicktree with default options, run:: - genomeAPCAT tree -a <align_file> -s quicktree + PanACoTA tree -a <align_file> -s quicktree You can also specify the following options: @@ -800,4 +800,4 @@ In your ``<outdir>`` directory, you will find: - ``<dataset_name>.grp.aln.stockholm``: alignment converted in Stockholm format, the input of Quicktree - ``<dataset_name>.grp.aln.stockholm.quicktree.log``: logfile of quicktree, empty if no error occurred - ``<dataset_name>.grp.aln.stockholm.quicktree_tree.nwk``: the final tree inferred in Newick format - - ``genomeAPCAT-tree-quicktree.log*``: the 3 log files as in the :ref:`other steps<logf>` + - ``PanACoTA-tree-quicktree.log*``: the 3 log files as 
in the :ref:`other steps<logf>` diff --git a/doc/source/whatis.rst b/doc/source/whatis.rst index e2eafd6616d1bba56fd6e3fd26c227203f876e6f..269a97a2188b871325f25c2703fa700276d21c1b 100755 --- a/doc/source/whatis.rst +++ b/doc/source/whatis.rst @@ -1,12 +1,12 @@ =================== -What is genomeAPCAT +What is PanACoTA =================== -``genomeAPCAT`` is a software providing tools for large scale bacterial comparative genomics. From a set of complete and/or draft genomes, you can: +``PanACoTA`` is a software providing tools for large scale bacterial comparative genomics. From a set of complete and/or draft genomes, you can: - Do a quality control of your strains, to eliminate poor quality genomes, which would not give any information for the comparative study - Uniformly annotate all genomes - Do a Pan-genome - Do a Core or Persistent genome - Align all Core/Persistent families -- Infer a phylogenetic tree from the Core/Persistent families \ No newline at end of file +- Infer a phylogenetic tree from the Core/Persistent families diff --git a/titles.txt b/titles.txt new file mode 100644 index 0000000000000000000000000000000000000000..2c6050b6e2892fab30ca682866d9f5564109bb42 --- /dev/null +++ b/titles.txt @@ -0,0 +1,50 @@ + ___ _ __ ___ _ +| o \ _ _ / \ / _||_ _|/ \ +| _//o\ |/ \| o ( (_/o\ || o | +|_| \_,]L_n||_n_|\__\_/_||_n_| + + +efti + + ____ _ ____ _____ _ + | _ \ __ _ _ __ / \ / ___|__|_ _|/ \ + | |_) / _` | '_ \ / _ \| | / _ \| | / _ \ + | __/ (_| | | | |/ ___ \ |__| (_) | |/ ___ \ + |_| \__,_|_| |_/_/ \_\____\___/|_/_/ \_\ + + +ivrit + + + ___ _____ ___ _____ _____ +( _`\ ( _ )( _`\ (_ _)( _ ) +| |_) ) _ _ ___ | (_) || ( (_) _ | | | (_) | +| ,__/'/'_` )/' _ `\| _ || | _ /'_`\ | | | _ | +| | ( (_| || ( ) || | | || (_( )( (_) )| | | | | | +(_) `\__,_)(_) (_)(_) (_)(____/'`\___/'(_) (_) (_) + + + +puffy + + _____ _____ _______ + | __ \ /\ / ____| |__ __|/\ + | |__) |_ _ _ __ / \ | | ___ | | / \ + | ___/ _` | '_ \ / /\ \| | / _ \| | / /\ \ + | | | (_| | | | 
|/ ____ \ |___| (_) | |/ ____ \ + |_| \__,_|_| |_/_/ \_\_____\___/|_/_/ \_\ + + + +big + +______ ___ _____ _____ ___ +| ___ \ / _ \/ __ \ |_ _/ _ \ +| |_/ /_ _ _ __ / /_\ \ / \/ ___ | |/ /_\ \ +| __/ _` | '_ \| _ | | / _ \| || _ | +| | | (_| | | | | | | | \__/\ (_) | || | | | +\_| \__,_|_| |_\_| |_/\____/\___/\_/\_| |_/ + + + +doom \ No newline at end of file
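The banner parser added to `bin/genomeAPCAT` in this diff can be sketched in isolation as follows. This is a minimal sketch under stated assumptions, not the project's actual code: the ASCII art is a short placeholder rather than the real banner, `VERSION` is a hypothetical stand-in for the package's `__version__`, and the epilog wording is assumed. It uses a raw string (`r'''...'''`) because the art contains backslashes such as `\_`, which a normal string literal would treat as unrecognized escape sequences and which recent Python versions warn about.

```python
import argparse
from textwrap import dedent

# Hypothetical placeholder; the real script imports __version__ from the package.
VERSION = "0.0.0"

# Raw string so that backslashes in the ASCII art stay literal.
# dedent() strips the common leading indentation of the block.
BANNER = dedent(r'''
     ___
    | _ \__ _ _ _
    |  _/ _` | ' \ (placeholder art, not the real banner)
    |_| \__,_|_||_|

    Large scale comparative genomics tools
    -------------------------------------------
''')


def build_parser():
    """Build a main parser whose help output starts with the banner."""
    parser = argparse.ArgumentParser(
        prog="PanACoTA",
        # Keep the description exactly as written (no re-wrapping of the art).
        formatter_class=argparse.RawDescriptionHelpFormatter,
        description=BANNER,
        # Assumed wording for the epilog.
        epilog="For more details, see the PanACoTA documentation.",
    )
    parser.add_argument("-V", "--version", action="version",
                        version="PanACoTA - v. " + VERSION)
    return parser


if __name__ == "__main__":
    build_parser().print_help()
```

`RawDescriptionHelpFormatter` matters here: with the default formatter, argparse re-wraps the description and the banner would be collapsed into running text.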