Commit edf49df9 authored by Etienne Kornobis

update seaborn practicals (2)

parent e3a50101
%% Cell type:markdown id:rotary-designation tags:
%% Cell type:markdown id:instrumental-personal tags:
# <center>**TP**</center>
# <center><b>Hands-on</b></center>
<div style="text-align:center">
<img src="images/seaborn.png" width="600px">
<div>
Bertrand Néron, François Laurent, Etienne Kornobis
<br />
<a href="https://research.pasteur.fr/en/team/bioinformatics-and-biostatistics-hub/">Bioinformatics and Biostatistics HUB</a>
<br />
© Institut Pasteur, 2021
</div>
</div>
%% Cell type:markdown id:respected-history tags:
%% Cell type:markdown id:compliant-basis tags:
Practice your graphing skills using the Milieu Intérieur data in `data/mi.csv`:
%% Cell type:code id:adolescent-spirituality tags:
%% Cell type:markdown id:departmental-exhibition tags:
- Draw a boxplot showing the differences in temperature between females and males:
%% Cell type:code id:98e904b6-6e90-4c74-a463-2339d3961250 tags:
``` python
```
%% Cell type:markdown id:widespread-rendering tags:
%% Cell type:markdown id:portuguese-worse tags:
- Using a histogram and a continuous probability density curve, display the distribution of age in the dataset.
%% Cell type:code id:dressed-performer tags:
%% Cell type:code id:55756807-e1fb-4fb5-878c-5e46acea7a11 tags:
``` python
```
%% Cell type:markdown id:acute-debut tags:
%% Cell type:markdown id:prepared-stephen tags:
- Using a histogram, display the distribution of age in the dataset (with a KDE as well)
- Use a barplot to show the number of people vaccinated against yellow fever (see the documentation for `countplot`)
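As a minimal sketch, `seaborn.countplot` draws one bar per category with the number of rows in that category. The column name `VaccineYF` and its `Yes`/`No` values are hypothetical; check the actual column in `data/mi.csv`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import pandas as pd
import seaborn as sns

# Toy data; "VaccineYF" is a hypothetical column name for yellow fever vaccination.
mi = pd.DataFrame({"VaccineYF": ["Yes", "No", "No", "Yes", "No", "No"]})

# One bar per category, bar height = number of rows in that category.
ax = sns.countplot(data=mi, x="VaccineYF")
ax.set_ylabel("Number of people")
```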
%% Cell type:code id:crucial-bracelet tags:
%% Cell type:code id:1425046c-a058-45fe-95b5-5eca6ebbd33a tags:
``` python
```
%% Cell type:code id:minor-secretariat tags:
%% Cell type:markdown id:immediate-method tags:
- Plot the distribution of age for the people vaccinated against the flu
%% Cell type:code id:d567194c-3698-44c9-b5f8-b8a3d3493b0c tags:
``` python
```
%% Cell type:markdown id:processed-diameter tags:
%% Cell type:markdown id:temporal-synthesis tags:
- Feel free to explore more of [seaborn](https://seaborn.pydata.org/examples/index.html)!
%% Cell type:markdown id:db56d49a-4770-4f9e-af6b-78960574d338 tags:
# Exploring count matrices from RNA-seq data
%% Cell type:markdown id:5377668b-dea5-4c20-8249-5266f98774eb tags:
<img src="images/rnaseq.png" style="margin:0 auto;width:800px">
%% Cell type:code id:indian-response tags:
%% Cell type:markdown id:ebf1606b-0b21-4821-a899-551ec33c977e tags:
- Import the count_matrix tsv file from the data folder
%% Cell type:code id:eb53a1f5-9ea7-491e-bcfa-820cb1663af5 tags:
``` python
```
%% Cell type:markdown id:scenic-adoption tags:
%% Cell type:markdown id:c80d9947-9ccf-4499-a1c2-9194377cd054 tags:
- Simplify the dataframe to only have the "Geneid", "WTx" and "Cx" columns
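One way to do this is a regular-expression filter on the column names. This sketch assumes the replicate columns are named `WT1`, `WT2`, ... and `C1`, `C2`, ...; the toy table and its extra `Chr` and `Length` columns are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the count matrix; the real file may have other columns.
counts = pd.DataFrame({
    "Geneid": ["gene-A", "gene-B", "gene-C"],
    "Chr": ["chr1", "chr1", "chr1"],
    "Length": [900, 1200, 450],
    "WT1": [10, 0, 250],
    "WT2": [12, 1, 240],
    "C1": [30, 0, 100],
    "C2": [28, 2, 110],
})

# Keep "Geneid" plus every column named WT<number> or C<number>.
# The anchors (^ and $) prevent "Chr" from matching the C pattern.
counts_df = counts.filter(regex=r"^(Geneid|WT\d+|C\d+)$")
```

An explicit list such as `counts[["Geneid", "WT1", "WT2", "C1", "C2"]]` works just as well when the sample names are known in advance.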
%% Cell type:code id:weighted-terrain tags:
%% Cell type:code id:56e90032-75ce-47b5-9cd3-95219cd7b26e tags:
``` python
```
%% Cell type:markdown id:operating-union tags:
%% Cell type:markdown id:eb65b51f-f689-4a66-b47c-e79f0e9eba52 tags:
- Format your DataFrame properly so that you can use [seaborn.clustermap](https://seaborn.pydata.org/generated/seaborn.clustermap.html) to draw a heatmap.
%% Cell type:code id:9b422fcb-7cc1-4766-92e3-276742381ae6 tags:
``` python
```
%% Cell type:markdown id:f8d6188e-3a37-4ba5-b377-a11696054e9c tags:
- Explore the clustermap documentation to make the heatmap more readable by standardizing the data within each gene.
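`seaborn.clustermap` accepts a `z_score` argument (`0` for rows, `1` for columns). To make clear what standardizing within genes means, the sketch below computes the same row-wise z-score by hand with pandas, assuming genes are in rows and samples in columns. The values are made up.

```python
import pandas as pd

# Toy matrix: genes in rows, samples in columns (values are made up).
matrix = pd.DataFrame(
    {
        "WT1": [10, 200, 5],
        "WT2": [12, 220, 7],
        "C1": [30, 100, 6],
        "C2": [28, 90, 8],
    },
    index=["gene-A", "gene-B", "gene-C"],
)

# Row-wise z-score: subtract each gene's mean, divide by its standard deviation.
z = matrix.sub(matrix.mean(axis=1), axis=0).div(matrix.std(axis=1), axis=0)
```

After this step every gene has mean 0 and standard deviation 1, so highly and weakly expressed genes share one colour scale. Genes with identical counts in all samples have a standard deviation of 0 and would need to be filtered out first.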
%% Cell type:code id:06be3f98-2167-44ac-9318-955286d77903 tags:
``` python
```
%% Cell type:markdown id:2e61a207-223a-4c01-88ea-76b1b8c3a0b9 tags:
- Reformat the `counts_df` dataframe to have genes in columns and samples in rows.
- Add a "group" column defining the grouping of the samples:
  - "WTx" samples will be from the "WT" group.
  - "Cx" samples will be from the "C" group.
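The two steps above could be sketched as follows, assuming sample names are a group prefix followed by a replicate number (`WT1`, `C2`, ...). The toy counts are made up.

```python
import pandas as pd

# Toy version of counts_df: one row per gene, one column per sample.
counts_df = pd.DataFrame({
    "Geneid": ["gene-A", "gene-B", "gene-C"],
    "WT1": [10, 0, 250],
    "WT2": [12, 1, 240],
    "C1": [30, 0, 100],
    "C2": [28, 2, 110],
})

# Transpose so that samples become rows and genes become columns.
samples_df = counts_df.set_index("Geneid").T
samples_df.index.name = "sample"

# Derive the group by stripping the trailing replicate number from the sample name.
samples_df["group"] = samples_df.index.str.replace(r"\d+$", "", regex=True)
```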
%% Cell type:code id:eea3f521-6960-44ab-ac0b-fcf5a002237f tags:
``` python
```
%% Cell type:markdown id:9a88ecb1-9ed3-4160-91ee-24a30e994b71 tags:
- Display a barplot showing the mean expression for each group for a particular gene (for example "gene-LEPBI_RS00065").
%% Cell type:code id:cf74e85e-eef3-4023-bb88-5a864cf3c3f9 tags:
``` python
```
%% Cell type:markdown id:99e2455a-cb7d-44d5-a4a0-2cf272c814ab tags:
- Try plotting a swarmplot on top of the previous barplot:
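A possible sketch: draw both plots on the same `Axes` so that the individual points sit on top of the bars. The table layout (samples in rows, a `group` column) follows the previous steps; the counts are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line inside a notebook
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

gene = "gene-LEPBI_RS00065"

# Toy samples-in-rows table; the counts are made up for illustration.
samples_df = pd.DataFrame(
    {gene: [10, 12, 14, 30, 28, 32], "group": ["WT"] * 3 + ["C"] * 3},
    index=["WT1", "WT2", "WT3", "C1", "C2", "C3"],
)

fig, ax = plt.subplots()
# barplot shows the mean of each group by default.
sns.barplot(data=samples_df, x="group", y=gene, ax=ax)
# Passing the same ax overlays one point per sample on the bars.
sns.swarmplot(data=samples_df, x="group", y=gene, color="black", ax=ax)
```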
%% Cell type:code id:7cf225f9-aea7-4cd9-ac90-a99592799527 tags:
``` python
```
%% Cell type:markdown id:d200d375-362e-4c1d-a88e-130b094e6feb tags:
- Now plot the same data using a boxplot. Can you see the problem with displaying boxplots for this kind of data?
%% Cell type:code id:e4daf00e-9a2c-4ec4-9d26-aa18aae5d82d tags:
``` python
```
%% Cell type:markdown id:2e1cabe0-aab7-4f0e-888e-81aae7d5df8d tags:
- Compute the median of each gene by group:
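One way to get this, assuming the samples-in-rows table with a `group` column built earlier, is `groupby` followed by `median`. The values below are made up.

```python
import pandas as pd

# Toy samples-in-rows table with a "group" column (values are made up).
samples_df = pd.DataFrame(
    {
        "gene-A": [10, 12, 14, 30, 28, 32],
        "gene-B": [5, 7, 6, 6, 8, 7],
        "group": ["WT"] * 3 + ["C"] * 3,
    },
    index=["WT1", "WT2", "WT3", "C1", "C2", "C3"],
)

# Median per group for every gene, then transpose to get one row per gene.
medians = samples_df.groupby("group").median().T
```

The result has one row per gene and one column per group (`C` and `WT`), which is a convenient shape for the annotation steps that follow.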
%% Cell type:code id:6ffd0f59-0fd7-41b9-a87a-c6e1a74145e8 tags:
``` python
```
%% Cell type:markdown id:308cc10b-6727-4bc5-b05d-4777037e252e tags:
We are now going to add extra annotations to this median table in order to identify genes of interest.
- Import the annotation.csv table from the data folder:
%% Cell type:code id:9be6ee5b-d497-47fa-8ac5-cf5514fd52c0 tags:
``` python
```
%% Cell type:markdown id:50fa81a7-3f34-4160-ad2d-f77d21be9ac0 tags:
Annotations in this table are available for many types of loci (the "genetic_type" column), but here we will focus on the "gene" genetic_type.
- Filter the annotation dataframe to have only "gene" as "genetic_type".
%% Cell type:code id:f9a8bcf7-0bcc-43e8-828a-ec204658e528 tags:
``` python
```
%% Cell type:markdown id:f8a4e744-e7e2-43b6-b3d4-e59feb40d3ff tags:
- Concatenate the dataframe of medians by group and the annotation dataframe:
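The filtering and concatenation steps might look like the sketch below. The `genetic_type` and `Name` columns come from the text; the key column `ID`, the gene names and all values are hypothetical, so adapt them to the real annotation table.

```python
import pandas as pd

# Toy medians table: one row per gene, one column per group (made-up values).
medians = pd.DataFrame(
    {"C": [30.0, 7.0], "WT": [12.0, 6.0]},
    index=["gene-A", "gene-B"],
)

# Toy annotation table; "ID" is a hypothetical key matching the gene identifiers.
annotation = pd.DataFrame({
    "ID": ["gene-A", "cds-A", "gene-B"],
    "genetic_type": ["gene", "CDS", "gene"],
    "Name": ["dnaA", "dnaA", "recF"],
})

# Keep only rows annotated as genes, indexed by the same identifiers as medians.
genes = annotation[annotation["genetic_type"] == "gene"].set_index("ID")

# Column-wise concatenation aligns rows on the index; "inner" keeps shared genes only.
annotated = pd.concat([medians, genes], axis=1, join="inner")
```

`pd.concat` with `axis=1` relies on both tables sharing the same index; `DataFrame.merge` is an alternative when the key lives in a regular column.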
%% Cell type:code id:afd8467a-33e1-4b9e-8f6d-b2229099c874 tags:
``` python
```
%% Cell type:markdown id:af9f8e1f-5f8b-4152-b08a-44e957f13cec tags:
- Calculate an estimate of the gene expression fold change for each gene (by dividing the C median expressions by WT median expressions).
- Add it as a "FoldChange" column to the previous dataframe.
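A minimal sketch of the fold-change estimate, using made-up medians. Replacing a WT median of 0 with `NaN` before dividing is an illustrative choice made here to avoid infinite ratios, not something the exercise prescribes.

```python
import numpy as np
import pandas as pd

# Toy medians table (made-up values); gene-C has a WT median of 0.
medians = pd.DataFrame(
    {"C": [30.0, 7.0, 4.0], "WT": [12.0, 6.0, 0.0]},
    index=["gene-A", "gene-B", "gene-C"],
)

# Fold change = C median / WT median.
# A WT median of 0 is masked as NaN so the ratio is undefined rather than infinite.
medians["FoldChange"] = medians["C"] / medians["WT"].replace(0, np.nan)
```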
%% Cell type:code id:bb617d00-2c2d-45cc-ace0-3656dc999b17 tags:
``` python
```
%% Cell type:markdown id:d70eb26b-0a26-4bbc-af03-ba8781b09fb5 tags:
- Use a barplot to display the fold changes, labelling genes with the new gene annotation (the "Name" column).
%% Cell type:code id:4dd4cbee-547f-43f1-9ed7-173f3040b8d5 tags:
``` python
```
%% Cell type:markdown id:34a26492-7c6b-4a07-a4de-67ec8f693cdc tags:
- Calculate the length of each gene and use a visualisation to check: does gene expression appear correlated with gene length?
%% Cell type:code id:6f35b696-0807-4df4-9310-cb9197e7bf85 tags:
``` python
```
%% Cell type:markdown id:a2627322-e6a5-422f-8a69-b89dbd4b777e tags:
- Create a function which produces a single figure with four different plots of your choice and saves it to a PDF file.
%% Cell type:code id:70e001a1-2848-4fb7-9f33-7beb4475e0fc tags:
``` python
```
%% Cell type:markdown id:0d05aba4-3c85-4cd9-85f3-5296b19308fb tags:
# Extras
%% Cell type:markdown id:66d6668e-683f-462e-a72f-28bdda8736f2 tags:
- Using ipywidgets, write a function that displays a barplot of gene expression by group, with the gene selected by the user (using a Dropdown widget, for example).
%% Cell type:code id:e587f202-7ca4-43fb-ac3c-015c740c69d2 tags:
``` python
```