Commit 5b40406f authored by Nicolas MAILLET

User guide is done

parent 8dc1d397
......@@ -11,6 +11,10 @@ On the following, nomenclature of `Schechter and Berger <https://www.ncbi.nlm.ni
...P3-P2-P1-|-P1'-P2'-P3'...
In **RPG**, this nomenclature is represented as::
...(P3)(P2)(P1)(,)(P1')(P2')(P3')...
-----------------
Available enzymes
-----------------
......
......@@ -237,7 +237,7 @@ Output of **RPG** contains several informations for each generated peptide, in t
- Peptide isoelectric point estimation (pI)
- Peptide sequence
Peptide molecular weight approximation is computed as the sum of the average isotopic masses of each amino acid present in the peptide. The average isotopic mass of one water molecule is then added to it. Molecular weight values are given in Dalton (Da). It does not take into account any kind of modification, and for the first and last peptides the computation is not exact: instead of one water molecule, around 17 Da should be added to the N terminal and 1 Da to the C terminal.
Peptide molecular weight approximation is computed as the sum of the average isotopic masses of each amino acid present in the peptide. The average isotopic mass of one water molecule is then added to it. Molecular weight values are given in Dalton (Da). It does not take into account any kind of modification.
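As a minimal sketch of this computation (not **RPG**'s actual code), the approximation can be written in a few lines of Python. The mass table holds commonly tabulated average residue masses and is an assumption, as is the choice to count unknown symbols such as B as 0 Da; the function name ``approx_molecular_weight`` is hypothetical.

```python
# Average residue masses in Da (assumed values from standard tables).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
# Average mass of one water molecule in Da.
WATER_MASS = 18.01528


def approx_molecular_weight(peptide):
    """Sum the average residue masses, then add one water molecule."""
    # Assumption: symbols missing from the table (e.g. B or X) add 0 Da.
    residues = sum(AVERAGE_RESIDUE_MASS.get(aa, 0.0) for aa in peptide.upper())
    return round(residues + WATER_MASS, 5)


print(approx_molecular_weight("ADE"))
```

With these assumed masses, ``ADE`` gives 333.29818 Da, the value reported for that peptide in the example outputs of this guide.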
The isoelectric point is computed by solving the Henderson–Hasselbalch equation using binary search. It is based on the work of Lukasz P. Kozlowski (http://isoelectric.org/index.html).
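The idea can be sketched as follows: the net charge of a peptide decreases monotonically with pH, so a binary search over the pH range finds the point where it crosses zero. This is only an illustration, not **RPG**'s implementation. The pKa values below are an assumed, illustrative set and may differ from the one **RPG** uses, so the results are not expected to reproduce **RPG**'s pI values; the function names are hypothetical.

```python
# Illustrative pKa values (assumption: RPG may use a different set).
PKA_POSITIVE = {"Nterm": 9.564, "K": 10.517, "R": 12.503, "H": 6.018}
PKA_NEGATIVE = {"Cterm": 2.383, "D": 3.887, "E": 4.317, "C": 8.297, "Y": 10.071}


def net_charge(peptide, ph):
    """Net charge at a given pH, from the Henderson-Hasselbalch equation."""
    groups = ["Nterm", "Cterm"] + list(peptide.upper())
    charge = 0.0
    for group in groups:
        if group in PKA_POSITIVE:
            # Fraction of the basic group that is protonated (charge +1).
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[group]))
        elif group in PKA_NEGATIVE:
            # Fraction of the acidic group that is deprotonated (charge -1).
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[group] - ph))
    return charge


def isoelectric_point(peptide, tolerance=1e-4):
    """Binary search for the pH at which the net charge is zero."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        middle = (low + high) / 2
        if net_charge(peptide, middle) > 0:
            low = middle   # still positively charged: the pI is higher
        else:
            high = middle  # negatively charged or neutral: the pI is lower
    return round((low + high) / 2, 2)


print(isoelectric_point("ADE"), isoelectric_point("AKR"))
```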
......@@ -342,11 +342,15 @@ On the following, nomenclature of `Schechter and Berger <https://www.ncbi.nlm.ni
...P3-P2-P1-|-P1'-P2'-P3'...
In **RPG**, this nomenclature is represented as::
...(P3)(P2)(P1)(,)(P1')(P2')(P3')...
Definition of rules
-------------------
A rule specifies which amino acid is targeted by the enzyme, the cleavage position (i.e. **before** or **after** the targeted amino acid) and optionally the surrounding context. Each amino acid must be surrounded by parentheses, i.e. '**(**' and '**)**', and the cleavage position is symbolized by a comma, i.e. '**,**'. The comma must always be directly before or after a parenthesis.
For example, to define a cleavage occurring **before** A, one must input::
......@@ -391,7 +395,7 @@ To make this enzyme always cleaves, for example, before A (`P1'`) followed by D
(,A)(D)
(C)()(B)(,A)(D)
Note that this enzyme will **not** cleave BAD, as it is specified that it will cleave before A preceded by B in `P1` **if there is C in `P3`**. Likewise, it will **not** cleave C*BA*, as D is required in `P2'` for the second rule::
$rpg -a
Name of the new enzyme?
......@@ -444,24 +448,223 @@ It is possible to define none related cleavage rules for the same enzyme, for ex
This enzyme will cleave after G (position `P1`) followed by G in `P1'` and also after W (`P1`) preceded by P in `P2` and followed by E in `P1'` and T in `P2'`.
Note that each rule must concern only **one** cleavage site. It is not possible to input a rule like::
(A,)(B,)
This would define an enzyme cleaving after A in `P1` followed by B in `P1'` but also cleaving after B in `P1` preceded by A in `P2`. The proper way to input this is by using two separate rules::
(A,)(B)
(A)(B,)
However, it is possible to write rules in a more efficient way, as explained in :ref:`easy`.
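To make the notation concrete, here is a minimal sketch that maps each parenthesised group of a simple rule to its Schechter and Berger position and rejects rules with more than one cleavage site. It is an illustration only, not **RPG**'s parser: the function name ``parse_rule`` is hypothetical, and the condensed notations described in :ref:`easy` are deliberately not handled.

```python
import re


def parse_rule(rule):
    """Return {position: amino acid} for a simple rule like '(C)()(B)(,A)(D)'."""
    cleaned = rule.replace(" ", "")
    groups = re.findall(r"\(([^()]*)\)", cleaned)
    # Everything in the rule must be enclosed in parentheses.
    if "".join("(" + group + ")" for group in groups) != cleaned:
        raise ValueError("every amino acid must be enclosed in parentheses")
    sites = [i for i, group in enumerate(groups) if "," in group]
    if len(sites) != 1:
        raise ValueError("a rule must contain exactly one cleavage site")
    site = sites[0]
    target = groups[site]
    if target.count(",") != 1 or not (target.startswith(",") or target.endswith(",")):
        raise ValueError("the comma must be directly next to a parenthesis")
    cleave_before = target.startswith(",")
    positions = {}
    for i, group in enumerate(groups):
        residue = group.strip(",")
        if not residue:
            continue  # empty group: any amino acid is accepted here
        if cleave_before:
            # The targeted amino acid sits in P1', just after the cleavage.
            label = f"P{site - i}" if i < site else f"P{i - site + 1}'"
        else:
            # The targeted amino acid sits in P1, just before the cleavage.
            label = f"P{site - i + 1}" if i <= site else f"P{i - site}'"
        positions[label] = residue
    return positions


print(parse_rule("(C)()(B)(,A)(D)"))
```

For ``(C)()(B)(,A)(D)`` this yields C in `P3`, B in `P1`, A in `P1'` and D in `P2'`, while ``(A,)(B,)`` raises a ``ValueError`` because it contains two cleavage sites.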
Definition of exceptions
------------------------
An exception specifies when a cleavage should **not** occur. **Exceptions must always be linked to a rule**.
For example, to define a cleavage occurring **before** A (`P1'`), one must input::
(,A)
Exceptions can then be added. For example, a cleavage occurring before A (`P1'`), except when P is in `P2'`, is defined by adding this exception::
(,A)(P)
This enzyme will always cleave before A when not followed by P::
rpg -a
Name of the new enzyme?
rpg_example_userguide
Create a cleaving rule (c) or an exception (e)? (q) to quit:
c
Write your cleaving rule, (q) to quit:
(,A)
Create a cleaving rule (c) or an exception (e)? (q) to quit:
e
Write your exception rule, (q) to quit:
(,A)(P)
Create a cleaving rule (c) or an exception (e)? (q) to quit:
q
Add an other enzyme? (y/n)
n
rpg -i CWBADE -e 28
>Input_0_rpg_example_userguide_3_3_307.36728_5.46
CWB
>Input_1_rpg_example_userguide_6_3_333.29818_3.4
ADE
rpg -i CWBAPE -e 28
>Input_0_rpg_example_userguide_0_6_604.67828_3.6
CWBAPE
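The behaviour of this session can be mimicked with a simplified model, given here only to illustrate how a rule and its exception interact. It is not how **RPG** is implemented: the function ``digest`` and its parameters are hypothetical, and it hard-codes a single rule (cleave before a target) with a single exception (unless the next residue is the blocking one).

```python
def digest(sequence, target="A", blocked_next="P"):
    """Cleave before `target`, except when it is followed by `blocked_next`."""
    peptides = []
    start = 0
    for i, amino_acid in enumerate(sequence):
        # Cleaving before the first residue would give an empty peptide.
        if amino_acid != target or i == 0:
            continue
        following = sequence[i + 1] if i + 1 < len(sequence) else ""
        if following == blocked_next:
            continue  # the exception applies: no cleavage at this site
        peptides.append(sequence[start:i])
        start = i
    peptides.append(sequence[start:])
    return peptides


print(digest("CWBADE"))  # ['CWB', 'ADE']
print(digest("CWBAPE"))  # ['CWBAPE']
```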
It is possible to input complex exceptions. For the previous enzyme, we can add the following exception::
(G)(T)()(,A)()(F)
This enzyme will always cleave before A when not followed by P or preceded by G in `P3`, T in `P2` and F in `P3'` **at the same time**::
rpg -a
Name of the new enzyme?
rpg_example_userguide
Create a cleaving rule (c) or an exception (e)? (q) to quit:
c
Write your cleaving rule, (q) to quit:
(,A)
Create a cleaving rule (c) or an exception (e)? (q) to quit:
e
Write your exception rule, (q) to quit:
(,A)(P)
Create a cleaving rule (c) or an exception (e)? (q) to quit:
e
Write your exception rule, (q) to quit:
(G)(T)()(,A)()(F)
Create a cleaving rule (c) or an exception (e)? (q) to quit:
q
Add an other enzyme? (y/n)
n
rpg -i CWBADE -e 28
>Input_0_rpg_example_userguide_3_3_307.36728_5.46
CWB
>Input_1_rpg_example_userguide_6_3_333.29818_3.4
ADE
rpg -i CWBAPE -e 28
>Input_0_rpg_example_userguide_0_6_604.67828_3.6
CWBAPE
rpg -i GTBAPF -e 28
>Input_0_rpg_example_userguide_0_6_491.54438_5.54
GTBAPF
rpg -i GTBAPE -e 28
>Input_0_rpg_example_userguide_3_3_176.17228_5.54
GTB
>Input_1_rpg_example_userguide_6_3_315.32628_3.6
APE
It is important to understand that an exception should always be linked to a rule. If one inputs this rule::
(A,)
followed by this exception::
(B,)(C)
the exception will not be taken into account. This enzyme will just always cleave after A.
.. _easy:
Easily writing complex enzymes
------------------------------
Keyword 'or' and double comma
copy/paste
To make enzyme creation easier, two tricks are available.
The first one simplifies the definition of enzymes cleaving **before** and **after** a given amino acid. Defining an enzyme cleaving, for example, before **and** after A, can be done with two rules::
(,A)
(A,)
or simply using::
(,A,)
The second trick is the use of the keyword `or`. This allows multiple possibilities for one position. For example::
(,A or B)
is equivalent to::
(,A)
(,B)
.. warning:: Do not input ``(,A or ,B)``, as a comma must always directly precede or follow a parenthesis.
These two tricks help define complex enzymes. For example, :ref:`peps13` preferentially cleaves around F or L, sometimes before, sometimes after, depending on the context. More specifically, it will not cleave before F or L in `P1'` followed by P in `P2'`. It will not cleave before F or L in `P1'` preceded by R in `P1` or P in `P2` or H/K/R in `P3`. It will not cleave after F or L in `P1` followed by P in `P2'`. And it will not cleave after F or L in `P1` preceded by P in `P2` or H/K/R in `P3`.
It can be defined either by::
cleaving rules:
(F,)
(L,)
(,F)
(,L)
exception rules:
(,F)(P)
(,L)(P)
(R)(,F)
(R)(,L)
(P)()(,F)
(P)()(,L)
(H)()()(,F)
(K)()()(,F)
(R)()()(,F)
(H)()()(,L)
(K)()()(,L)
(R)()()(,L)
(F,)()(P)
(L,)()(P)
(P)(F,)
(P)(L,)
(H)()(F,)
(K)()(F,)
(R)()(F,)
(H)()(L,)
(K)()(L,)
(R)()(L,)
or, in a condensed way::
cleaving rule:
(,F or L,)
exception rules:
(,F or L)(P)
(R)(,F or L)
(P)()(,F or L)
(H or K or R)()()(,F or L)
(F or L,)()(P)
(P)(F or L,)
(H or K or R)()(F or L,)
Those two definitions are completely equivalent for **RPG**.
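The equivalence can be checked with a short sketch that expands the condensed notation into simple rules. This is an illustration under assumptions, not **RPG**'s own code: the function name ``expand`` is hypothetical, and it only understands the two tricks described above (the double comma and the keyword `or`).

```python
import itertools
import re


def expand(rule):
    """Expand the double comma and the keyword 'or' into simple rules."""
    groups = re.findall(r"\(([^()]*)\)", rule)
    choices = []
    for group in groups:
        before = group.startswith(",")
        after = group.endswith(",")
        core = group.strip(",").strip()
        # An empty group keeps a single empty alternative.
        residues = re.split(r"\s+or\s+", core) if core else [""]
        variants = []
        for residue in residues:
            if before:
                variants.append("," + residue)
            if after:
                variants.append(residue + ",")
            if not (before or after):
                variants.append(residue)
        choices.append(variants)
    # One simple rule per combination of alternatives.
    return ["".join(f"({v})" for v in combo) for combo in itertools.product(*choices)]


print(expand("(,F or L,)"))  # ['(,F)', '(F,)', '(,L)', '(L,)']
print(expand("(H or K or R)()(F or L,)"))
```

Applied to the seven condensed exception rules above, this expansion produces the same 22 exception rules as the long definition.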
Example of enzymes
------------------
All available enzymes are listed in :ref:`enzymes`, including their **RPG** definitions.
Deleting user-defined enzymes
=============================
All user-defined enzymes are stored in ``~/rpg_user.py``. This file is automatically generated by **RPG** and written in **Python**.
Each enzyme definition starts with::
# User-defined enzyme <name of the enzyme>
and finishes with::
CPT_ENZ += 1
followed by 3 blank lines.
To remove an enzyme, be sure to back up the file **before** any modification. Then remove the whole Python code of the enzyme, including the lines mentioned above. Do not make any other modification, as this code is used by **RPG** and any wrong modification will make the software unable to run.
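The manual procedure can also be scripted. The following is a minimal sketch, assuming the file layout is exactly as described above (a ``# User-defined enzyme <name>`` header, a closing ``CPT_ENZ += 1`` line and up to 3 blank lines); both function names and the ``.bak`` backup suffix are hypothetical and not part of **RPG**.

```python
import shutil
from pathlib import Path


def remove_enzyme_block(text, name):
    """Return `text` without the definition block of the enzyme `name`."""
    lines = text.splitlines(keepends=True)
    header = f"# User-defined enzyme {name}"
    kept = []
    removed = False
    i = 0
    while i < len(lines):
        if lines[i].strip() != header:
            kept.append(lines[i])
            i += 1
            continue
        # Skip everything up to and including the closing counter line.
        while i < len(lines) and lines[i].strip() != "CPT_ENZ += 1":
            i += 1
        if i == len(lines):
            raise ValueError("enzyme block is not terminated by 'CPT_ENZ += 1'")
        i += 1
        # Skip the (up to) 3 blank lines that follow the block.
        blanks = 0
        while i < len(lines) and blanks < 3 and lines[i].strip() == "":
            i += 1
            blanks += 1
        removed = True
    if not removed:
        raise KeyError(name)
    return "".join(kept)


def delete_user_enzyme(name, path=None):
    """Back up the user file, then rewrite it without the enzyme `name`."""
    path = Path(path) if path is not None else Path.home() / "rpg_user.py"
    # Always keep a copy before touching the file.
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    path.write_text(remove_enzyme_block(path.read_text(), name))
```

Keeping the text transformation separate from the file handling makes it possible to check the result on a copy before overwriting the real file.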
To remove all user-defined enzymes, delete the ``~/rpg_user.py`` file. It will be created again (empty) at the next launch of **RPG**.
Deleted enzymes cannot be recovered: they need to be defined again in **RPG**, using the ``-a`` option, if one wants to use them again.
\ No newline at end of file