Merge branch 'master' of gitlab.pasteur.fr:hub-courses/python_one_week_4_biologists_solutions

dd840dd8 · Bertrand NÉRON · 145cc286 · f1d462b6 · dd840dd8 · dd840dd8
Commit dd840dd8 authored 2 months ago by Bertrand NÉRON
--- a/source/Data_Types.rst
+++ b/source/Data_Types.rst
@@ -330,39 +330,6 @@ Exercise
 .. #. Using the shorter string  ``s = 'gaattc'`` draw what happens in memory when you reverse ``s``.


-Exercise
--------
-
-| The ``il2_human`` sequence contains 4 cysteins (C) in positions 9, 78, 125, 145.
-| We want to generate the sequence of a mutant where the cysteins 78 and 125 are replaced by serins (S)
-| Write the pseudocode, before proposing an implementation:
-
-
-We have to take care of the difference between Python string numbering and usual position numbering:
-
-| C in seq -> in string
-|     9 -> 8
-|    78 -> 77
-|   125 -> 124
-|   145 -> 144
-
-| *generate 3 slices from the il2_human*
-| *head <- from the begining and cut between the first cystein and the second*
-| *body <- include  the 2nd and 3rd cystein*
-| *tail <- cut after the 3rd cystein until the end*
-| *replace body cystein by serin*
-| *make new sequence with head body_mutate tail*
-
-
-::
-
-    il2_human = 'MYRMQLLSCIALSLALVTNSAPTSSSTKKTQLQLEHLLLDLQMILNGINNYKNPKLTRMLTFKFYMPKKATELKHLQCLEEELKPLEEVLNLAQSKNFHLRPRDLISNINVIVLELKGSETTFMCEYADETATIVEFLNRWITFCQSIISTLT'
-    head = il2_human[:77]
-    body = il2_human[77:125]
-    tail = il2_human[126:]
-    body_mutate = body.replace('C', 'S')
-    il2_mutate = head + body_mutate + tail
-
 Exercise
 --------

@@ -388,7 +355,8 @@ Use the sv40 sequence to test your function.
   >>>
   >>> sequence = fasta_to_one_line(sv40)
   >>> gc_pc = gc_percent(sequence)
-   >>> report = "The sv40 is {0} bp length and has {1:.2%} gc".format(len(sequence), gc_pc)
+   >>> # report = "The sv40 is {0} bp length and has {1:.2%} gc".format(len(sequence), gc_pc)
+   >>> report = f"The sv40 is {len(sequence)} bp length and has {gc_pc:.2%} gc"
   >>> print report
   'The sv40 is 5243 bp length and has 40.80% gc'


--- a/source/Input_Output.rst
+++ b/source/Input_Output.rst
@@ -89,7 +89,7 @@ Exercise

 We ran a blast with the following command *blastall -p blastp -d uniprot_sprot -i query_seq.fasta -e 1e-05 -m 8 -o blast2.txt*

-m 8 is the tabular output. So each fields is separate to the following by a '\t' 
+-m 8 is the tabular output. So each fields is separated from the following by a '\\t'

 The fields are: query id, database sequence (subject) id, percent identity, alignment length, number of mismatches, number of gap openings, 
 query start, query end, subject start, subject end, Expect value, HSP bit score.

--- a/source/_static/code/gc_percent.py
+++ b/source/_static/code/gc_percent.py

 def gc_percent(seq):
    """
+    Compute the ratio of GC in a DNA sequence
+
    :param seq: The nucleic sequence to compute
    :type seq: string
    :return: the percent of GC in the sequence
    :rtype: float
    """
    seq = seq.upper()
-    gc_pc = float(seq.count('G') + seq.count('C')) / float(len(seq))
+    gc_pc = seq.count('G') + seq.count('C') / len(seq)
    return gc_pc
--- a/source/_static/code/parse_blast.py
+++ b/source/_static/code/parse_blast.py
@@ -45,5 +45,5 @@ if __name__ == '__main__':
    table_sorted = sorted(table_hits, key=itemgetter(2), reverse=True)
    # alternative
    # table_sorted = sorted(table, key = lambda x : x[2], reversed = True)
-    write_blast_output(table_hits, 'blast_sorted.txt')  
+    write_blast_output(table_sorted, 'blast_sorted.txt')