Commit afc977a3 authored by Anne BITON
Load saved cell ids to reproduce exact same cell selection as in the paper

parent 71efd3e5
......@@ -202,7 +202,7 @@ There are `r sum(table(maxfrac$genemax)>=50)` genes that are predominant in at l
## Seurat violin plots
A Seurat object containing the unfiltered UMI counts is saved in the RData file `data/derived/YS_library123/YS_library123_umis.rda`.
A Seurat object containing the unfiltered UMI counts is saved in the RData file `data/derived/YS/YS_umis.rda`.
```{r, fig.height=15, fig.width=9}
......@@ -228,8 +228,8 @@ VlnPlot(object = umis, features = c("nFeature_RNA", "nCount_RNA", "percent.mito"
cols = annot$nextseq500_run[match(levels(umis$condition_halfplate), annot$condition_halfplate)]+1)
#as.character(umis$nextseq500_run))
dir.create(paste0(dirdata,'/derived/YS_library123/'))
save(umis, file=paste0(dirdata,'derived/YS_library123/YS_library123_umis.rda'))
dir.create(paste0(dirdata,'/derived/YS/'))
save(umis, file=paste0(dirdata,'derived/YS/YS_umis.rda'))
```
......@@ -299,6 +299,8 @@ sceall <- subset(sceall, subset = percent.mito < 0.05)
# remove spike-ins which are not usable anyway given their very low counts
sceall <- sceall[grep('^ERCC-', rownames(sceall), invert = TRUE), ]
sce <- as.SingleCellExperiment(sceall)
```
`r ncol(sceall)` cells and `r nrow(sceall)` genes were selected.
......@@ -308,9 +310,8 @@ sceall <- sceall[grep('^ERCC-', rownames(sceall), invert = TRUE), ]
Doublet detection is performed with the `scDblFinder` package using default parameters, running the analysis separately for each plate.
See https://www.biorxiv.org/content/10.1101/2020.02.02.930578v1.full.pdf.
```{r doublet}
```{r doublet, eval=FALSE}
library(scDblFinder)
sce <- as.SingleCellExperiment(sceall)
sce$groupForDoubletDetection <- sce$plate
sce$groupForDoubletDetection[sce$nextseq500_run == 2] <- 'run2' #regroup run2 since there are very few cells after passing preliminary filtering
sce <- scDblFinder(sce, samples=sce$groupForDoubletDetection)
......@@ -319,26 +320,35 @@ table(sce$scDblFinder.class)
```
Distribution of cells detected as doublets across plates by `scDblFinder`: `r table(sce$scDblFinder.class, sce$condition_halfplate) %>% knitr::kable()`.
We filter out `r sum(sce$scDblFinder.class == 'doublet')` cells detected as being doublets by `scDblFinder`.
```{r remove doublets}
```{r remove doublets, eval=FALSE}
sceall <- sceall[, sce$scDblFinder.class != 'doublet']
sce <- sce[, sce$scDblFinder.class != 'doublet']
```
<!-- Distribution of cells detected as doublets across plates by `scDblFinder`: r table(sce$scDblFinder.class, sce$condition_halfplate) %>% knitr::kable(). -->
## Number of cells after this first filtering step
<!-- We filter out r sum(sce$scDblFinder.class == 'doublet') cells detected as being doublets by `scDblFinder`. -->
We discard plate 15, which contains fewer than 10 cells passing the quality thresholds.
Since we cannot exactly reproduce the cell selection used in the paper (scDblFinder v1.1.8 doublet selection seems to vary slightly across runs of the same function), we saved the IDs of the selected cells in `data/derived/YS` and load them here.
We also discard plate 15, which contains fewer than 10 cells passing the quality thresholds.
```{r}
sce <- sce[, sce$plate != '15']
# save cell selection
#selcells <- colnames(sce)
#saveRDS(selcells, file =paste0(dirdata,'derived/YS/YS_selcells.rds'))
selcells <- readRDS(file =paste0(dirdata,'derived/YS/YS_selcells.rds'))
sce <- sce[,selcells]
sceall <- sceall[,selcells]
```
## Number of cells after this first filtering step
The number of selected plates/cells for each plate is: `r table(sceall$condition_replicate_plate) %>% knitr::kable()`.
The number of cells selected per condition is: `r table(sceall$condition) %>% knitr::kable()`.
The number of cells selected per condition and genotype is: `r table(sceall$condition,sceall$genotype, useNA='ifany') %>% knitr::kable()`.
......@@ -346,11 +356,11 @@ The number of cells selected per condition and genotype is: `r table(sceall$cond
```{r}
#colData(sce) <- annot[match(colnames(sce), annot$cellID),]
save(sce, file=paste0(dirdata,'derived/YS_library123/YS_library123_sce.rda'))
save(sce, file=paste0(dirdata,'derived/YS/YS_sce.rda'))
```
The number of selected cells for each FACS annotation is: `r table(sceall$FACS) %>% knitr::kable()`.
The UMI counts of the data remaining after this first filtering step are available in `data/derived/YS_library123/YS_library123_sce.rda`.
The UMI counts of the data remaining after this first filtering step are available in `data/derived/YS/YS_sce.rda`.