diff --git a/Tuesday/GSEA_camera/data/LFC_genes_human.RData b/Tuesday/GSEA_camera/data/LFC_genes_human.RData
new file mode 100644
index 0000000000000000000000000000000000000000..fef875cb5cf7ce118c20f071d28a3f0cd4232199
Binary files /dev/null and b/Tuesday/GSEA_camera/data/LFC_genes_human.RData differ
diff --git a/Tuesday/GSEA_camera/data/differential_genes_human.RData b/Tuesday/GSEA_camera/data/differential_genes_human.RData
new file mode 100644
index 0000000000000000000000000000000000000000..562aef32767bf3efa1bec966d3000285b7df3bda
Binary files /dev/null and b/Tuesday/GSEA_camera/data/differential_genes_human.RData differ
diff --git a/Tuesday/GSEA_camera/data/differential_genes_myc.RData b/Tuesday/GSEA_camera/data/differential_genes_myc.RData
new file mode 100644
index 0000000000000000000000000000000000000000..2176cdccb6584786e61c144f785c85996b155a6f
Binary files /dev/null and b/Tuesday/GSEA_camera/data/differential_genes_myc.RData differ
diff --git a/Tuesday/practicals_day2_PHINDAccess.Rmd b/Tuesday/practicals_day2_PHINDAccess.Rmd
new file mode 100644
index 0000000000000000000000000000000000000000..e70eece4acd7f99706d105074913c77de3366191
--- /dev/null
+++ b/Tuesday/practicals_day2_PHINDAccess.Rmd
@@ -0,0 +1,545 @@
+---
+title: "Practical session - PHINDAccess course: gene set enrichment analysis"
+author: "Hélène Lopez-Maestre, Emeline Perthame, Natalia Pietrosemoli"
+date: "24 May 2022"
+output:
+  html_document:
+    code_folding: hide
+    theme: spacelab
+    toc: yes
+    toc_float: yes
+editor_options: 
+  chunk_output_type: console
+---
+
+```{r setup, message=FALSE, warning=FALSE}
+# option for html output
+knitr::opts_chunk$set(echo = TRUE)
+
+# Set the random generator seed so we can reproduce exactly results without any stochasticity
+set.seed(1234)
+
+# Load packages that will be useful for the analysis
+library(FactoMineR)
+library(factoextra)
+library(limma)
+library(ggplot2)
+library(RColorBrewer)
+library(pheatmap)
+library(ggpubr)
+library(org.Hs.eg.db)
+library(pathview)
+library(DESeq2)
+# library(AnnotationDbi)
+
+# library(dplyr)
+
+# library(KEGGREST)
+# library(biomaRt)
+# library(png)
+
+
+# library(progress)
+
+# library(gage)
+
+# Cosmetic choice: set light theme for ggplot2 plots
+theme_set(theme_light())
+
+# Set significance level for the statistical tests (e.g. False Discovery Rate)
+alpha <- 0.05
+
+```
+
+# I. Gene set analysis 
+
+**Goals**
+
+- Identify and understand the steps of a functional enrichment analysis
+
+- Understand the output of provided R code lines. There are often several ways to write an R code obtaining the same result. This practical offers one possible solution! 
+
+- Even if you are not expected to be completely independent, ideally, by the end of the course, you should be able to reproduce a similar analysis. 
+
+For the enrichment analysis, we will use the Competitive Gene Set Test Accounting for Inter-gene Correlation, CAMERA,  which is  
+a gene set test implemented in the limma package. 
+This comprehensive package was originally implemented for the differential analysis of microarray and  later adapted for RNA-Seq data analysis.
+
+Do not hesitate to check the documentation  (https://www.bioconductor.org/packages/release/bioc/vignettes/limma/inst/doc/usersguide.pdf).
+
+
+## A. Gene set enrichment analysis on Mycobacterium tuberculosis - Gene Ontology
+
+We will perform the analysis on our Mycobacterium tuberculosis dataset,
+for which we have previously obtained the Gene Ontology annotations.
+
+### Load data and inspect the data.
+
+For this exercise, the input dataset is from an experiment conducted in  Myc tb, with 3 biological conditions (FR, CR and HR) and 3 replicates per condition annotated with the corresponding gene ontology terms. 
+
+We perform a PCA to reduce the dimensionality of our data and inspect the 
+distribution of our samples accoording to the biological condition they belong to. 
+
+```{r}
+# Import  data
+
+## Define YOUR OWN working directory:
+mywd = "/Users/npietros/Documents/Course-PHINDAccess-2022/Hands on Day 2-Functional Analysis/"
+setwd(mywd)
+
+# Load annotated filtered counts created in the annotation hands-on
+
+count.data <- read.csv("/Users/npietros/Documents/Course-PHINDAccess-2022/Hands on Day 1-Annotation/output/annotatedFilteredCounts_myc.txt", sep = "\t")
+
+# Create a target object containing the biological conditions
+# this represents the experimental design of your experiment:
+# 9 samples and three conditions
+
+target <- data.frame(samples = c("FR1","FR2","FR3","CR1","CR2","CR3","HR1","HR2","HR3"),
+                     condition = c("FR","FR","FR","CR","CR","CR","HR","HR","HR"))
+target$condition <- factor(target$condition)
+
+# Inspect the target file
+
+# Q1. How many conditions do you have in the target file? 
+# target$condition
+
+# Q2. Which pairwise comparisons can we define?  
+
+# ****** Step 1: Perform Principal Component Analysis to 
+# reduce the dimensionality of our data and see how samples are distributed.
+
+
+# Before performing the PCA, we need to perform a variance stabilizing transformation to make sure our data values are approximately homoskedastic (having constant variance along the range of mean values).
+# These transformation is useful when checking for outliers too.
+
+# Prepare count matrix and apply vst 
+
+counts.trans <- varianceStabilizingTransformation(as.matrix(count.data[,c("FR1","FR2","FR3","CR1","CR2","CR3","HR1","HR2","HR3")]))
+rownames(counts.trans) <- count.data$geneID
+
+# After transforming our counts, we will perfom our PCA only on the most variable genes.
+
+# Calculate the variation in expression for all genes.
+std <- apply(counts.trans,1,sd)
+
+# Filter most variant features
+selected.genes.pca <- order(std, decreasing = TRUE)[1:200]
+
+# prepare data for PCA: extract the 200 most variant features
+counts.pca <- t(counts.trans[selected.genes.pca,])
+
+# perform PCA
+pca <- PCA(counts.pca, graph = FALSE)
+
+# Q3. How are the samples distributed  according to the PCA analysis?  
+fviz_pca_ind(pca)
+
+# use the habillage option of fviz_pca_ind function to colour samples by condition
+fviz_pca_ind(pca, habillage = target$condition, mean.point = FALSE)
+```
+
+### Perform gene set enrichment analysis using CAMERA
+
+Camera main three inputs are 1) the count matrix 2) the indexes to match the count matrix genes and the gene set indexes and 3)  the design of the model.
+
+```{r}
+# ****** Step 2: Perform GSEA using CAMERA
+
+# 1) load list of GO annotation object previously saved
+load("/Users/npietros/Documents/Course-PHINDAccess-2022/Hands on Day 1-Annotation/output/list_go_myc.RData")
+# head(list_go)
+
+# 2) We need to get the index giving the correspondence between gene names  in this list and our count matrix. This is done directly with a camera function ids2indices
+
+# 
+# ?ids2indices
+# ids2indices = Convert Gene Identifiers to Indices for Gene Sets
+index <- ids2indices(list_go,rownames(counts.trans))
+
+# CAMERA needs to be applied on counts transformed using the voom  transformation that transforms count data to log2-counts per million (logCPM).
+
+counts.trans.voom <- voom(count.data[,c("FR1","FR2","FR3","CR1","CR2","CR3","HR1","HR2","HR3")])
+
+# 3) create design matrix 
+
+# first possible design
+design <- model.matrix( ~ condition, data = target)
+# create contrasts vectors to compare the 3 conditions
+FRvsCR <- c(0,1,0)
+FRvsHR <- c(0,1,-1)
+HRvsCR <- c(0,0,1)
+HRvsFR <- c(0,-1,1)
+
+# run camera for each comparison of the design
+cam.FRvsCR <- camera(counts.trans.voom,design=design,contrast=FRvsCR,index=index, inter.gene.cor = 0, sort = FALSE)
+cam.FRvsHR <- camera(counts.trans.voom,design=design,contrast=FRvsHR,index=index, inter.gene.cor = 0, sort = FALSE)
+cam.HRvsCR <- camera(counts.trans.voom,design=design,contrast=HRvsCR,index=index, inter.gene.cor = 0, sort = FALSE)
+
+# alternative design
+design0 <- model.matrix( ~ -1 + condition, data = target)
+# and then repeat the steps (write corresponding contrasts and run camera)
+FRvsCR0 <- c(-1,1,0)
+FRvsHR0 <- c(0,1,-1)
+HRvsCR0 <- c(-1,0,1)
+
+# Q4. How many GO terms are significant for each comparison? What are their names? 
+
+sum(cam.FRvsCR$FDR <= alpha)
+sum(cam.FRvsHR$FDR <= alpha)
+sum(cam.HRvsCR$FDR <= alpha)
+
+# Retrieves the names of the significant GO terms
+# rownames(cam.FRvsHR)[cam.FRvsHR$FDR <= alpha]
+# unique(count.data$GO[grepl("GO:0004418",count.data$GO)])
+
+# Number of genes in this GO term
+# list_go[["GO:0004418"]]
+# Number of genes in this GO term AND in our dataset
+# index[["GO:0004418"]]
+
+rownames(cam.HRvsCR)[cam.HRvsCR$FDR <= alpha]
+
+```
+
+NB1 : check the documentation of "contrast" R package which is useful to compute contrasts for complex designs (https://cran.r-project.org/web/packages/contrast/contrast.pdf)
+
+NB2 : If there are too many signficant go terms as a result of the enrichment, one alternative to reduce the number of terms is  using REVIGO  (http://revigo.irb.hr/).
+
+### Visualize results
+
+The visualization step is crucial for interpreting the results from the gene set enrichment analysis.
+
+We plot the results for a given  pairwise comparison using a "bubble plot" that orders the significant GO terms by it's significance level. The size of the "bubble" is proportional to the number of genes in our dataset that are annotated with that specific GO term, and the color of the bubble represents the direction of the enrichment.
+
+We can also add a more systemic view of the results of all the biological comparisons using a heatmap plot,
+where each column corresponds to a pairwise comparison, and each row corresponds to a gene set. The color of each cell corresponds to the direction of the enrichment.
+
+#### 1. Within a given comparison : Bubble plot 
+
+```{r}
+
+# ****** Step 3: Visualization of the results 
+
+# 1. Within a given comparison : Bubble plot 
+cam.HRvsCR.signif <- cam.HRvsCR[cam.HRvsCR$FDR <= 0.05,]
+
+ggplot(cam.HRvsCR.signif,
+       aes(x=reorder(rownames(cam.HRvsCR.signif),-log10(FDR)),
+           y=-log10(FDR),size = NGenes, color = Direction)) +
+  geom_point() + coord_flip() +
+  scale_color_manual(values=c("Down"="blue","Up"="Red")) +
+  ggtitle("HR vsCR" ) + xlab("GO terms")
+```
+
+
+#### 2. For all the comparsions of the system
+
+```{r}
+# 2. Visualize significant terms for the 3 comparisons through a heatmap
+
+# Create a matrix of significance with terms in rows and comparisons in column
+# Each cell of the matrix contains the significance information of a term for a given comparison and the direction of the pathway
+FDR.mat <- data.frame(HRvsCR=cam.HRvsCR$FDR,
+                      FRvsHR=cam.FRvsHR$FDR,
+                      FRvsCR=cam.FRvsCR$FDR)
+rownames(FDR.mat) <- rownames(cam.HRvsCR)
+signif <- 1*(FDR.mat <= alpha)
+
+Direction.mat <- data.frame(HRvsCR=cam.HRvsCR$Direction,
+                      FRvsHR=cam.FRvsHR$Direction,
+                      FRvsCR=cam.FRvsCR$Direction)
+rownames(Direction.mat) <- rownames(cam.HRvsCR)
+Direction <- +1*(Direction.mat == "Up") -1*(Direction.mat == "Down")
+
+df <- signif * Direction
+rownames(df) <- rownames(signif)
+idx <- rowSums(abs(df))
+df <- df[idx > 0, ,drop = FALSE]
+
+# Make the plot 
+col <- rev(brewer.pal(3,"RdBu"))
+pheatmap(df, scale = "none", clustering_method = "ward.D", 
+         treeheight_row = 0, treeheight_col = 0, na_col = "grey",
+         color = col,
+         cluster_cols = FALSE, 
+         cluster_rows = TRUE,
+         cellwidth = 12, cellheight = 12, fontsize = 12,
+         legend_breaks = c(-1,-0.5,0,0.5,1),
+         legend_labels = c("Down","","Non sig.","","Up"))
+
+# Replace GO terms ids by their name !!
+# When you create list_go you should add the name of the GO terms in the names of the elements of the list instead of the GO ids ! 
+
+# Q5. How do you interpret the  cell color  in this heatmap?  
+```
+
+### Cross results of GSEA and differential analysis
+
+We can also identify how many of our (previously identified) differentially expressed genes 
+are present in the statistically significant GO terms.
+
+
+```{r}
+
+# 1) load the lists of differentially expressed genes computed from a previous analysis not shown here 
+load("data/differential_genes_myc.RData")
+# inspect the loaded
+# names(DEG.myc)
+# head(DEG.myc$DEG.FRvsCR)
+
+# Q6. How many genes are differentially expressed for each comparison?
+sapply(DEG.myc,length)
+
+# It is interesting to find out if the genes in the significant go terms are significant in the differential analysis
+
+# list of genes belonging to significant GO terms
+
+sel.go <- "GO:0008855"
+DEG.myc$DEG.FRvsCR[DEG.myc$DEG.FRvsCR %in% list_go[[sel.go]]]
+# 0
+DEG.myc$DEG.FRvsHR[DEG.myc$DEG.FRvsHR %in% list_go[[sel.go]]] 
+# "ERDMAN_1240" is significant for comparison FRvsHR and belongs to "GO:0008855" which is significant in GSEA
+DEG.myc$DEG.HRvsCR[DEG.myc$DEG.HRvsCR %in% list_go[[sel.go]]] 
+# "ERDMAN_1240" "ERDMAN_1241" are significant for comparison HRvsCR and belong to "GO:0008855" which is significant in GSEA
+```
+
+## B. Gene set analysis on Human - KEGG Pathways
+
+We will perform the analysis on our Human dataset
+for which we have previously obtained the Kegg pathway annotations.
+
+### Load data and inspect the data
+
+```{r}
+# Load annotated filtered counts created in the annotation hands-on
+
+count.data <- read.csv("/Users/npietros/Documents/Course-PHINDAccess-2022/Hands on Day 1-Annotation/output/annotatedCounts_Human_solution1.txt", sep = ",", row.names = 1)
+
+
+# Create a target object containing the biological conditions
+# this represents the experimental design of your experiment:
+# 6 samples, two conditions, 3 donnors
+
+target <- data.frame(samples=colnames(count.data),
+                     condition=c("C","C","C","T","T","T"),
+                     donor=c("1","2","3","1","2","3"))
+target$condition <- factor(target$condition)
+target$donor <- factor(target$donor)
+
+# Q1. What are the conditions? What groups will be compared?  
+
+# filter low counts genes
+count.data <- count.data[rowSums(count.data) > 3,]
+
+# Q2. How many genes do you have now?
+
+# ***** Step 1: visualisation through PCA
+# Perform PCA
+counts.trans <- varianceStabilizingTransformation(as.matrix(count.data))
+# filter most variant genes
+std <- apply(counts.trans,1,sd)
+selected.genes.pca <- order(std, decreasing = TRUE)[1:200]
+counts.pca <- t(counts.trans[selected.genes.pca,])
+pca <- PCA(counts.pca, graph = FALSE)
+p1 <- fviz_pca_ind(pca)
+
+# similarly, color the samples by condition using the habillage argument
+
+# also, visualise the PCA on axes 2 and 3 
+p2 <- fviz_pca_ind(pca, axes = c(2,3))
+
+# we can also plot side by side the two plots
+ggarrange(p1,p2,ncol=2)
+
+# Q3. How do the samples segregate? What do you notice? 
+
+# We can explore the effect of removing the batch effect
+# by doing a new PCA
+# remove batch effect for PCA (warning !!! only for visualization)
+
+adjusted.counts.pca <- counts.trans
+adjusted.counts.pca <- removeBatchEffect(counts.trans, batch=target$donor)
+# filter most variant genes
+std <- apply(adjusted.counts.pca,1,sd)
+selected.genes.pca <- order(std, decreasing = TRUE)[1:200]
+counts.pca <- t(adjusted.counts.pca[selected.genes.pca,])
+pca <- PCA(counts.pca, graph = FALSE)
+fviz_pca_ind(pca)
+```
+
+
+### Perform GSEA using CAMERA
+
+```{r}
+
+# 1) load list of Kegg pathways annotation object previously saved
+load("/Users/npietros/Documents/Course-PHINDAccess-2022/Hands on Day 1-Annotation/output/Kegg.GeneIDs.RData")
+
+# Inspect the object
+# Q4: How many Kegg Pathways are there ?
+
+# Make index for CAMERA 
+index <- ids2indices(Kegg.GeneIDs,rownames(counts.trans))
+
+# Make design matrix, which accounts for condition and donor effect
+design <- model.matrix( ~ condition + donor, data = target)
+# design
+# Create contrast vector
+TvsC <- c(0,1,0,0)
+
+# Q5. Which groups will be compared with this contrast? 
+
+# run camera
+counts.trans.voom <- voom(count.data)
+cam.TvsC <- camera(counts.trans.voom,design=design,contrast=TvsC,index=index, inter.gene.cor = 0)
+```
+
+### Visualize results
+
+We plot the results for a given  pairwise comparison using a "bubble plot" that orders the significant GO terms by it's significance level. The size of the "bubble" is proportional to the number of genes in our dataset that are annotated with that specific GO term, and the color of the bubble represents the direction of the enrichment.
+
+We can also add a more specific view of the results for a given pathway using a heatmap plot,
+where each column corresponds to a sample, and each row corresponds to a gene whithin the pathway. The color of each cell corresponds to the expression level of the gene.
+
+Finally, we can produce a plot for a specific pathway using the pathview R package,
+where genes present in our dataset will be highlighted in the pathway, and the color of the pathway element corresponds to log fold change between the two given conditions.
+
+#### 1. Within comparison : Bubble plot 
+
+```{r, fig.height=8}
+# ****** Step 3: Visualization of the results 
+
+# 1. Within comparison : Bubble plot 
+
+cam.TvsC.signif <- cam.TvsC[cam.TvsC$FDR <= alpha,]
+ggplot(cam.TvsC.signif,aes(x=reorder(rownames(cam.TvsC.signif),1/FDR), y=-log10(FDR),size = NGenes, color = Direction)) + 
+  geom_point() + 
+  coord_flip() +
+  scale_color_manual(values=c("Down"="blue","Up"="Red")) + 
+  xlab("Kegg pathways")+ ggtitle("T vs C")
+
+# Q6. How many gene sets   are significant for this comparison? 
+# dim(cam.TvsC.signif)
+# or 
+# sum(cam.TvsC$FDR <= 0.05)
+
+```
+
+#### 2. Within a pathway: heatmap
+
+```{r}
+ # 2. Within a pathway : Visualize gene expression within a pathway and cross the differential expression results
+
+# load the lists of differentially expressed genes computed from a previous analysis (not shown here)
+load("data/differential_genes_human.RData")
+# head(DEG.TvsC)
+
+# Q7. How many genes are differentially expressed for each comparison?
+# length(DEG.TvsC)
+
+# 2. Within a specific pathway
+
+# vector of colors
+col <- rev(colorRampPalette(brewer.pal(n = 7, name = "RdBu"))(100))
+
+# select a significant pathway
+sel.path <- "Melanoma"
+
+
+# select genes in the count table that belong to "sel.path" pathway
+idx <- rownames(count.data)[rownames(count.data) %in% Kegg.GeneIDs[[grep(sel.path,names(Kegg.GeneIDs))]]]
+
+df <- counts.trans.voom[idx,]$E
+df <- t(scale(t(df)))
+
+m <- max(abs(df), na.rm=TRUE)
+mybreaks <- seq(-m,m,length.out=101)
+df <- na.omit(df)
+pheatmap(df,cluster_rows = TRUE, cluster_cols = FALSE,
+         color = col,
+         breaks = mybreaks,
+         border_color = NA,
+         scale = "none",
+         cellwidth = 9, cellheight = 9, fontsize = 9,
+         clustering_method = "ward.D2",
+        gaps_col = c(3),
+        main = sel.path 
+         )
+
+## 4. ADVANCED PART to change gene ids in the heatmap
+
+### But ... identifiers of genes are ENTREZ ids and gene symbols are more easy to interpret
+## So ... we have to deal with identifiers again to retrieve gene symbols 
+
+df <- counts.trans.voom[idx,]$E
+df <- t(scale(t(df)))
+symbol <- mapIds(org.Hs.eg.db,keys=rownames(df),keytype="ENTREZID",column="SYMBOL")
+
+# and also cross results with the DEG so we can highligth those genes 
+
+labrow <- NULL
+for (i in 1:nrow(df)){
+  if (rownames(df)[i] %in% DEG.TvsC){labrow[i] <- as.expression(bquote(bold(.(symbol[i]))))} else {labrow[i] <- symbol[i]}
+}
+
+m <- max(abs(df), na.rm=TRUE)
+mybreaks <- seq(-m,m,length.out=101)
+df <- na.omit(df)
+pheatmap(df,cluster_rows = TRUE, cluster_cols = FALSE,
+         labels_row = labrow,
+         color = col,
+         breaks = mybreaks,
+         border_color = NA,
+         scale = "none",
+         cellwidth = 9, cellheight = 9, fontsize = 9,
+         clustering_method = "ward.D2",
+        gaps_col = c(3),
+        main = sel.path 
+         )
+
+
+```
+
+#### 3.  Visualise a pathway with pathview
+
+Pathview is a tool set for pathway based data integration and visualization. It maps and renders a wide variety of biological data on relevant pathway graphs. 
+Fore more documentation (pathview has many advanced options), check the vignette of the package (https://bioconductor.org/packages/release/bioc/vignettes/pathview/inst/doc/pathview.pdf).
+
+```{r}
+# 1) Load a vector containing the log2 fold changes (LFC) from the differential analysis
+load("data/LFC_genes_human.RData")
+# Inspect the loaded object
+# str(LFC.TvsC)
+# Q8 What kind of gene identifiers are being used here ?
+
+
+# one argument of pathview() is the LFC computed during the differential analysis
+# check the documentation for this function 
+
+# ?pathview
+# look at the "gene.data" argument 
+
+
+# get the pathway id of "NF-kappa B signaling pathway" KEGG pathway
+pv.out <- pathview(gene.data = LFC.TvsC, pathway.id = "04064",
+                   species = "hsa", out.suffix = "human")
+# change the sign 
+pv.out <- pathview(gene.data = -LFC.TvsC, pathway.id = "04064",
+                   species = "hsa", out.suffix = "human")
+# the result is not printed in the R console
+# pathview() indicates where the output is saved 
+# can you find to which directory is saved ?
+
+# Q9: try with another pathway 
+
+```
+
+
+# R session information
+
+In the reproducible research framework, an important step is to save all the versions of the packages used to perform the statistical analysis. They must be provided when submitting a paper.
+
+```{r sessionInfo, results='asis'}
+sessionInfo()
+```