diff --git a/source/Collection_Data_Types.rst b/source/Collection_Data_Types.rst
index c1bcc6464c39bcd43099dcdc6f2c6020f5ce4e5d..967456f1c75d6bc52a60147d1ce5d329c3252fb3 100644
--- a/source/Collection_Data_Types.rst
+++ b/source/Collection_Data_Types.rst
@@ -110,19 +110,16 @@ wihout using python shell, what is the results of the following statements:
    x[3] = -4 # what is the value of x now ?
    y = sum(x)/len(x) #what is the value of y ? why ?
 
-   y = 0
+   y = 0.5
+.. warning::
 
-because sum(x) is an integer, len(x) is also an integer so in python2.x the result is an integer,
-all the digits after the periods are discarded.
-In python3 we will obtain the expected result (see :ref:``)
-
-
-Exercise
----------
+   In python2 the result is ::
+
+      y = 0
 
-How to compute safely the average of a list? ::
+   because sum(x) is an integer, len(x) is also an integer so in python2.x the result is an integer,
+   all the digits after the periods are discarded.
 
-   float(sum(l)) / float(len(l))
 
 Exercise
 --------
@@ -206,9 +203,9 @@ first implementation:
 second implementation:
 """"""""""""""""""""""
 
-Mathematically speaking the generation of all codons can be the cartesiens product
+Mathematically speaking the generation of all codons can be the cartesian product
 between 3 vectors 'acgt'.
-In python there is a function to do that in ``itertools module``: `https://docs.python.org/2/library/itertools.html#itertools.product <product>`_
+In python there is a function to do that in ``itertools module``: `https://docs.python.org/3/library/itertools.html#itertools.product <product>`_
 
 .. literalinclude:: _static/code/codons_itertools.py
 
@@ -274,7 +271,7 @@ So we can use the specifycity of set ::
 Exercise
 --------
 
-We need to compute the occurence of all kmers of a given lenght present in a sequence.
+We need to compute the occurrence of all kmers of a given length present in a sequence.
 Below we propose 2 algorithms.
 
 
@@ -378,8 +375,8 @@ bonus: Print the kmers by ordered by occurences.
-| see `https://docs.python.org/2/library/stdtypes.html#mutable-sequence-types <sort>`_
-| see `https://docs.python.org/2/library/operator.html#operator.itemgetter <operator.itemgetter>`_
+| see `https://docs.python.org/3/library/stdtypes.html#mutable-sequence-types <sort>`_
+| see `https://docs.python.org/3/library/operator.html#operator.itemgetter <operator.itemgetter>`_
 
 .. literalinclude:: _static/code/kmer_2.py
 
@@ -475,7 +472,7 @@ Exercise
 let the following enzymes collection: ::
 
    import collections
-   RestrictEnzyme = collections.namedtuple("RestrictEnzyme", "name comment sequence cut end")
+   RestrictEnzyme = collections.namedtuple("RestrictEnzyme", ("name", "comment", "sequence", "cut", "end"))
 
    ecor1 = RestrictEnzyme("EcoRI", "Ecoli restriction enzime I", "gaattc", 1, "sticky")
    ecor5 = RestrictEnzyme("EcoRV", "Ecoli restriction enzime V", "gatatc", 3, "blunt")
@@ -504,7 +501,7 @@ and the 2 dna fragments: ::
 
 #. Write a function *seq_one_line* which take a multi lines sequence and return a sequence in one line.
 #. Write a function *enz_filter* which take a sequence and a list of enzymes and return a new list containing
-   the enzymes which are a binding site in the sequence
+   the enzymes which have a binding site in the sequence
 #. use the functions above to compute the enzymes which cut the dna_1
    apply the same functions to compute the enzymes which cut the dna_2
    compute the difference between the enzymes which cut the dna_1 and enzymes which cut the dna_2
@@ -532,7 +529,7 @@ with this algorithm we find if an enzyme cut the dna but we cannot find all cuts
    for enz in enzymes:
        print enz.name, dna_1.count(enz.sequence)
 
-the latter algorithm display the number of occurence of each enzyme, But we cannot determine the position of every sites.
+the latter algorithm display the number of occurrence of each enzyme, But we cannot determine the position of every sites.
 We will see how to do this later.
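
Note on the last hunk: its text says that ``count`` reports how many sites each enzyme has, but not where they are. The following is a minimal sketch of one way to list the positions, assuming Python 3. The names ``find_sites`` and ``toy_dna`` are hypothetical and do not appear in the patched file, and the sequence is made up for illustration.

```python
import collections

RestrictEnzyme = collections.namedtuple(
    "RestrictEnzyme", ("name", "comment", "sequence", "cut", "end"))


def find_sites(dna, enzyme):
    """Return the 0-based start position of every binding site of enzyme in dna.

    Hypothetical helper, not part of the patched document.
    """
    positions = []
    start = dna.find(enzyme.sequence)
    while start != -1:
        positions.append(start)
        # Advance by one character so that overlapping sites are also reported.
        start = dna.find(enzyme.sequence, start + 1)
    return positions


ecor1 = RestrictEnzyme("EcoRI", "Ecoli restriction enzyme I", "gaattc", 1, "sticky")
toy_dna = "ccgaattcaagaattcgg"  # made-up sequence, for illustration only
print(ecor1.name, find_sites(toy_dna, ecor1))
```

Unlike ``str.count``, which only counts non-overlapping matches, this loop restarts the search one character after each hit, so the length of the returned list can differ from the count when sites overlap.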