Commit dc2876ab authored by Rachel LEGENDRE

Change name from ChIPflow to ePeak

parent 58eb4beb
<img align="left" width="100" height="100" src="images/EpiFlow_logo.png" alt="ChIPflow">
<img align="left" width="100" height="100" src="images/EpiFlow_logo.png" alt="ePeak">
# ChIPflow: from replicated ChIP-seq raw data to epigenomic dynamics
# ePeak: from replicated chromatin profiling data to epigenomic dynamics
......@@ -10,15 +10,15 @@
[[_TOC_]]
# What is ChIPflow ?
# What is ePeak ?
ChIPflow is a snakemake-based workflow for the analysis of ChIP-seq data from raw FASTQ files to differential analysis of transcription factor binding or histone modification marking. It streamlines critical steps like the quality assessment of the immunoprecipitation using the cross correlation and the replicate comparison for both narrow and broad peaks. For the differential analysis ChIPflow provides linear and non linear methods for normalisation between samples as well as conservative and stringent models for estimating the variance and testing the significance of the observed differences (see [chipflowr](https://gitlab.pasteur.fr/hub/chipflowr)).
ePeak is a snakemake-based workflow for the analysis of ChIP-seq data from raw FASTQ files to differential analysis of transcription factor binding or histone modification marking. It streamlines critical steps like the quality assessment of the immunoprecipitation using the cross correlation and the replicate comparison for both narrow and broad peaks. For the differential analysis ePeak provides linear and non linear methods for normalisation between samples as well as conservative and stringent models for estimating the variance and testing the significance of the observed differences (see [chipflowr](https://gitlab.pasteur.fr/hub/chipflowr)).
<img src="images/chipflow_pipeline.svg" width="700">
<img src="images/ePeak_pipeline.svg" width="700">
# How to install ChIPflow ?
# How to install ePeak ?
## Installation with singularity
......@@ -29,11 +29,11 @@ Pre-required tools:
- pysam
- singularity
A tutorial to create a conda environment with all dependencies is available here : [env.sh](https://gitlab.pasteur.fr/hub/chipflow/-/blob/master/env.sh)
A tutorial to create a conda environment with all dependencies is available here : [env.sh](https://gitlab.pasteur.fr/hub/ePeak/-/blob/master/env.sh)
Download the singularity container:
` singularity pull --arch amd64 --name chipflow.img library://rlegendre/default/chipflow:latest `
` singularity pull --arch amd64 --name ePeak.img library://rlegendre/default/ePeak:latest `
## Manual installation
......@@ -73,38 +73,38 @@ module load pysam
* Clone workflow:
`git clone https://gitlab.pasteur.fr/hub/chipflow.git`
`git clone https://gitlab.pasteur.fr/hub/ePeak.git`
* Download singularity container:
```
cd chipflow
singularity pull --arch amd64 --name chipflow.img library://rlegendre/default/chipflow:latest
cd ePeak
singularity pull --arch amd64 --name ePeak.img library://rlegendre/default/ePeak:latest
```
Then, you can configure the workflow.
# How to run ChIPflow ?
# How to run ePeak ?
> Note: you can test the pipeline with an example dataset: [here](https://gitlab.pasteur.fr/hub/chipflow#run-the-pipeline-on-test-data)
> Note: you can test the pipeline with an example dataset: [here](https://gitlab.pasteur.fr/hub/ePeak#run-the-pipeline-on-test-data)
## Usage
* Step 1: Download workflow
`git clone https://gitlab.pasteur.fr/hub/chipflow.git`
`git clone https://gitlab.pasteur.fr/hub/ePeak.git`
In any case, if you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this repository and its DOI (see [above](https://gitlab.pasteur.fr/hub/chipflow#how-to-cite-chipflow-))
In any case, if you use this workflow in a paper, don't forget to give credit to the authors by citing the URL of this repository and its DOI (see [above](https://gitlab.pasteur.fr/hub/ePeak#how-to-cite-ePeak-))
If you are using Singularity, you need to copy the Singularity image into the cloned ChIPflow directory.
If you are using Singularity, you need to copy the Singularity image into the cloned ePeak directory.
* Step 2: Configure workflow
To configure your analysis, you need to fill in two configuration files (stored in `config/`): one specifying the experimental design of your project and one setting the parameters of each step of the pipeline:
- [config.yaml](https://gitlab.pasteur.fr/hub/chipflow#how-to-fill-the-design)
- [design.txt](https://gitlab.pasteur.fr/hub/chipflow#how-to-fill-the-design)
- [config.yaml](https://gitlab.pasteur.fr/hub/ePeak#how-to-fill-the-design)
- [design.txt](https://gitlab.pasteur.fr/hub/ePeak#how-to-fill-the-design)
In addition, you can customize the MultiQC report by configuring this file: [multiqc_config.yaml](https://gitlab.pasteur.fr/hub/chipflow#how-to-fill-the-multiqc-config)
In addition, you can customize the MultiQC report by configuring this file: [multiqc_config.yaml](https://gitlab.pasteur.fr/hub/ePeak#how-to-fill-the-multiqc-config)
* Step 3: Execute workflow
......@@ -170,12 +170,12 @@ Design columns:
| NB_IP | Replicate number of the histone mark or TFs (i.e. 1 or 2) |
| NB_INPUT | Replicate number of INPUT file (i.e. 1 or 2) |
Link to an Example: [design.txt](https://gitlab.pasteur.fr/hub/chipflow/-/blob/master/test/design.txt)
Link to an Example: [design.txt](https://gitlab.pasteur.fr/hub/ePeak/-/blob/master/test/design.txt)
### How to fill the config file
All the parameters to run the pipeline are gathered in a YAML config file that the user has to fill in before running the pipeline. Here is a filled-in example: [config.yaml](https://gitlab.pasteur.fr/hub/chipflow/-/blob/master/test/config.yaml)
All the parameters to run the pipeline are gathered in a YAML config file that the user has to fill in before running the pipeline. Here is a filled-in example: [config.yaml](https://gitlab.pasteur.fr/hub/ePeak/-/blob/master/test/config.yaml)
This config file is divided into 2 sections:
......@@ -230,7 +230,7 @@ At the beginning of `config/multiqc_config.yaml` file, you have the possibility
# Title to use for the report.
title: "ChIP-seq analysis"
subtitle: "ChIP-seq analysis of CTCF factor in breast tumor cells" # Set your own text
intro_text: "MultiQC reports summarise analysis results produced from ChIPflow"
intro_text: "MultiQC reports summarise analysis results produced from ePeak"
# Add generic information to the top of reports
report_header_info:
......@@ -244,7 +244,7 @@ report_header_info:
## ChIPflow running modes
## ePeak running modes
After the read pre-processing steps of the pipeline, MACS2 peak calling software is used to identify enriched regions. Several settings of MACS2 are possible:
- To estimate the fragment size you can: use MACS2 model (default) or use PhantomPeakQualTool.
......@@ -397,10 +397,10 @@ done
- Inside `IP_NAME` you can use "-" but not "\_", because "\_" is used to separate `MARK`, `COND`, `REP` and `MATE` in FASTQ filenames. For example: `TF4_Ctl-HeLa_rep1_R1.fastq.gz`
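The naming rule above can be checked before launching a run. The following is a minimal sketch, assuming paired-end filenames of the form `MARK_COND_REP_MATE.fastq.gz` exactly as in the example; the function `parse_fastq_name` and the keys it returns are hypothetical and not part of the workflow.

```python
import os


def parse_fastq_name(path):
    """Split a FASTQ filename into its MARK, COND, REP and MATE fields."""
    name = os.path.basename(path)
    # Strip the extension; only these two suffixes are assumed here.
    for suffix in (".fastq.gz", ".fastq"):
        if name.endswith(suffix):
            name = name[: -len(suffix)]
            break
    else:
        raise ValueError("not a FASTQ file: {}".format(path))
    fields = name.split("_")
    # A stray "_" inside any field yields more than four parts.
    if len(fields) != 4:
        raise ValueError("expected MARK_COND_REP_MATE, got: {}".format(name))
    return dict(zip(("MARK", "COND", "REP", "MATE"), fields))


print(parse_fastq_name("TF4_Ctl-HeLa_rep1_R1.fastq.gz"))
```

A name such as `TF4_Ctl_HeLa_rep1_R1.fastq.gz` splits into five fields and is rejected, which is why "-" is the safe separator inside a field.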
**Can I use relative path in config ?**
- Yes, but relative paths must be given with the ChIPflow directory as their origin.
- Yes, but relative paths must be given with the ePeak directory as their origin.
**What if I have 3 INPUT replicates?**
- You can use ChIPflow with more than 2 replicates; the replicates will be merged into a Maxi Pool.
- You can use ePeak with more than 2 replicates; the replicates will be merged into a Maxi Pool.
**What if I have 3 IP replicates?**
- The IDR for 3 IP replicates is not yet implemented.
......@@ -429,7 +429,7 @@ done
**Can I force the re-calculation of all the steps ?**
- Yes, add the snakemake option `--forceall`, which forces the execution of the selected (or first) rule and of all the rules it depends on.
**Can I rename ChIPflow directory ?**
**Can I rename ePeak directory ?**
- Yes, but the new name must match `analysis_dir` in config.yaml, or you must use relative paths.
**The pipeline fails because the IDR doesn't select enough reads?**
......@@ -439,9 +439,9 @@ done
- If MACS2 cannot estimate the fragment size (or if you prefer not to use its model), set `no_model` to yes; the fragment length used for MACS2 will then be the one computed by PhantompeakQualTools for each sample.
**What if I don't know whether my chromatin factor is narrow or broad?**
- The output directory names of the peak calling, peak reproducibility and differential analysis steps include the peak calling mode name, the peak reproducibility method name and the normalisation and variance estimation method names. This allows ChIPflow to test multiple combinations of peak calling, peak reproducibility and differential analysis parameters without erasing any output.
- The output directory names of the peak calling, peak reproducibility and differential analysis steps include the peak calling mode name, the peak reproducibility method name and the normalisation and variance estimation method names. This allows ePeak to test multiple combinations of peak calling, peak reproducibility and differential analysis parameters without erasing any output.
- For example, if you have run the pipeline in narrow mode and now want broad mode, you only need to modify the corresponding parameter in the configuration YAML file. The pipeline will then restart at the peak calling step and all the output will be stored in "06-PeakCalling/{}" directories.
- If the type of chromatin factor is unknown, we advise running ChIPflow in narrow mode with IDR and IA, and afterwards in broad mode. Results from narrow peak calling will be stored in the "06-PeakCalling/Narrow" directory, and results from broad peak calling in "06-PeakCalling/Broad".
- If the type of chromatin factor is unknown, we advise running ePeak in narrow mode with IDR and IA, and afterwards in broad mode. Results from narrow peak calling will be stored in the "06-PeakCalling/Narrow" directory, and results from broad peak calling in "06-PeakCalling/Broad".
---
......@@ -454,6 +454,6 @@ done
* Claudia Chica
# How to cite ChIPflow ?
# How to cite ePeak ?
https://doi.org/10.1101/2021.02.02.429342
......@@ -1001,7 +1001,7 @@ final_output = [multiqc_output]
include: os.path.join(RULES, "multiqc.rules")
rule chipflow:
rule ePeak:
input: final_output
......
......@@ -33,7 +33,7 @@
# Title to use for the report.
title: "ChIP-seq analysis"
subtitle: "test of epigenomics" # Set your own text
intro_text: "MultiQC reports summarise analysis results produced from ChIPflow"
intro_text: "MultiQC reports summarise analysis results produced from ePeak"
# Add generic information to the top of reports
report_header_info:
......
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ChIPflow workflow. #
# This file is part of ePeak workflow. #
# #
# ChIPflow is free software: you can redistribute it and/or modify #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ChIPflow is distributed in the hope that it will be useful, #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ChIPflow (LICENSE). #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
rule EstimateLibraryComplexity:
input:
lib_complexity_input
output:
metrics = lib_complexity_metrics
singularity:
"chipflow.img"
"epeak.img"
log:
out = lib_complexity_log_std,
err = lib_complexity_log_err
......
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ChIPflow workflow. #
# This file is part of ePeak workflow. #
# #
# ChIPflow is free software: you can redistribute it and/or modify #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ChIPflow is distributed in the hope that it will be useful, #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ChIPflow (LICENSE). #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
rule bamCoverage:
input:
bamCoverage_input
......@@ -29,7 +30,7 @@ rule bamCoverage:
output:
bamCoverage_output
singularity:
"chipflow.img"
"epeak.img"
params:
options = bamCoverage_options
envmodules:
......
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ChIPflow workflow. #
# This file is part of ePeak workflow. #
# #
# ChIPflow is free software: you can redistribute it and/or modify #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ChIPflow is distributed in the hope that it will be useful, #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ChIPflow (LICENSE). #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
rule bed_to_gff:
input:
bed_to_gff_input
......@@ -30,7 +31,7 @@ rule bed_to_gff:
output:
temp(bed_to_gff_output)
singularity:
"chipflow.img"
"epeak.img"
shell:
"""
awk 'BEGIN{{OFS="\t"}} {{print $1"\tChIPflow\tpeak\t"$2+1"\t"$3+1"\t.\t.\t.\tgene_id \"$1"_"$2"_"$3}}' {input} > {output} 2> {log.out}
......
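For readers less familiar with awk, the conversion done by the shell command in `bed_to_gff` can be sketched in Python. This is an illustrative reading of that one-liner, not code from the repository: `bed_to_gff_line` and its `source` parameter are hypothetical names, and the input is assumed to be tab-separated BED with at least three columns.

```python
def bed_to_gff_line(bed_line, source="ChIPflow"):
    """Convert one BED line to a GFF-like line, mirroring the awk command."""
    fields = bed_line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], fields[1], fields[2]
    # The identifier is built from the untouched BED coordinates.
    gene_id = "{}_{}_{}".format(chrom, start, end)
    # As in the awk command, both coordinates are shifted by one.
    columns = [chrom, source, "peak",
               str(int(start) + 1), str(int(end) + 1),
               ".", ".", ".", "gene_id " + gene_id]
    return "\t".join(columns)


print(bed_to_gff_line("chr1\t99\t200\n"))
```

The second column is left at `ChIPflow` by default because the awk command in this hunk still writes that label after the rename.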
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ChIPflow workflow. #
# This file is part of ePeak workflow. #
# #
# ChIPflow is free software: you can redistribute it and/or modify #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ChIPflow is distributed in the hope that it will be useful, #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ChIPflow (LICENSE). #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
......@@ -32,7 +32,7 @@ rule bedgraph:
genome = bedgraph_genome,
options = bedgraph_options
singularity:
"chipflow.img"
"epeak.img"
log:
out = bedgraph_logs
shell:
......@@ -44,15 +44,10 @@ rule bedgraph:
if [[ -s {input.scaleFactor} && {input.scaleFactor} == *"_scaleFactor.txt"* ]]
then
S=$(cat {input.scaleFactor})
echo $S
bedtools genomecov -bg -i $temp_file -g {params.genome} -scale $S > {output} 2> {log.out}
else
bedtools genomecov -bg -i $temp_file -g {params.genome} > {output} 2> {log.out}
fi
"""
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ChIPflow workflow. #
# This file is part of ePeak workflow. #
# #
# ChIPflow is free software: you can redistribute it and/or modify #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ChIPflow is distributed in the hope that it will be useful, #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ChIPflow (LICENSE). #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
rule bowtie2_index:
input:
fasta = bowtie2_index_fasta
output:
bowtie2_index_output_done
singularity:
"chipflow.img"
"epeak.img"
params:
prefix = bowtie2_index_output_prefix
log:
......
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ChIPflow workflow. #
# This file is part of ePeak workflow. #
# #
# ChIPflow is free software: you can redistribute it and/or modify #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ChIPflow is distributed in the hope that it will be useful, #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ChIPflow (LICENSE). #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
rule bowtie2_mapping:
input:
fastq = bowtie2_mapping_input,
......@@ -30,7 +31,7 @@ rule bowtie2_mapping:
sort = bowtie2_mapping_sort,
bam = temp(bowtie2_mapping_bam)
singularity:
"chipflow.img"
"epeak.img"
log:
err = bowtie2_mapping_logs_err,
out = bowtie2_mapping_logs_out
......
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ChIPflow workflow. #
# This file is part of ePeak workflow. #
# #
# ChIPflow is free software: you can redistribute it and/or modify #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ChIPflow is distributed in the hope that it will be useful, #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ChIPflow (LICENSE). #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
rule chipflowr:
input:
config = chipflowr_config_r
......@@ -32,7 +33,7 @@ rule chipflowr:
log:
out = chipflowr_logs
singularity:
"chipflow.img"
"epeak.img"
shell:
"""
set +o pipefail
......
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ChIPflow workflow. #
# This file is part of ePeak workflow. #
# #
# ChIPflow is free software: you can redistribute it and/or modify #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ChIPflow is distributed in the hope that it will be useful, #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ChIPflow (LICENSE). #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
rule chipflowr_init:
input:
matrix = chipflowr_init_input,
......
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ChIPflow workflow. #
# This file is part of ePeak workflow. #
# #
# ChIPflow is free software: you can redistribute it and/or modify #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ChIPflow is distributed in the hope that it will be useful, #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ChIPflow (LICENSE). #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
rule computeMatrix:
input:
computeMatrix_input
......@@ -33,7 +34,7 @@ rule computeMatrix:
log:
out = computeMatrix_logs
singularity:
"chipflow.img"
"epeak.img"
envmodules:
"deepTools"
threads:
......
#########################################################################
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #
# #
# This file is part of ePeak workflow. #
# #
# ePeak is free software: you can redistribute it and/or modify #
# it under the terms of the GNU General Public License as published by #
# the Free Software Foundation, either version 3 of the License, or #
# (at your option) any later version. #
# #
# ePeak is distributed in the hope that it will be useful, #
# but WITHOUT ANY WARRANTY; without even the implied warranty of #
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the #
# GNU General Public License for more details . #
# #
# You should have received a copy of the GNU General Public License #
# along with ePeak (LICENSE). #
# If not, see <https://www.gnu.org/licenses/>. #
#########################################################################
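rule compute_FRiP_scores:
    input:
        bam = compute_FRiP_scores_input
    output:
        tab = compute_FRiP_scores_output
    threads: 8
    run:
        import pysam
        from deeptools import countReadsPerBin

        # Assumption: input.bam holds exactly two BAM files, and "peaks.bed"
        # is the placeholder peak file kept from the original code.
        bam_file1, bam_file2 = input.bam
        bed_files = ["peaks.bed"]
        # Count the reads of each BAM file that overlap the peaks
        cr = countReadsPerBin.CountReadsPerBin([bam_file1, bam_file2],
                                               bedFile=bed_files,
                                               numberOfProcessors=threads)
        reads_at_peaks = cr.run()
        total = reads_at_peaks.sum(axis=0)
        # FRiP = reads in peaks / mapped reads (.mapped requires a BAM index)
        bam1 = pysam.AlignmentFile(bam_file1)
        bam2 = pysam.AlignmentFile(bam_file2)
        frip1 = float(total[0]) / bam1.mapped
        frip2 = float(total[1]) / bam2.mapped
        # Assumption: one tab-separated line holding both scores
        with open(output.tab, 'w') as file_fp:
            file_fp.write("{}\t{}\n".format(frip1, frip2))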
\ No newline at end of file
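The quantity this rule targets, the fraction of reads in peaks (FRiP), is the number of mapped reads falling inside peaks divided by the total number of mapped reads. The toy sketch below illustrates only that ratio: it reduces each read to a single position and uses plain tuples instead of BAM and BED files, so `frip_score` and its arguments are hypothetical and do not reflect how the workflow itself counts reads.

```python
def frip_score(read_positions, peaks):
    """Fraction of mapped reads whose position falls inside any peak.

    read_positions: list of (chrom, pos) tuples, one per mapped read.
    peaks: list of (chrom, start, end) half-open intervals, as in BED.
    """
    if not read_positions:
        raise ValueError("no mapped reads")
    in_peaks = sum(
        1 for chrom, pos in read_positions
        if any(c == chrom and s <= pos < e for c, s, e in peaks)
    )
    return in_peaks / len(read_positions)


reads = [("chr1", 10), ("chr1", 150), ("chr2", 10), ("chr1", 199)]
peaks = [("chr1", 100, 200)]
print(frip_score(reads, peaks))
```

Two of the four reads fall inside the single peak, so the score here is 0.5.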
#########################################################################
# ChIPflow: Standardize and reproducible ChIP-seq analysis from raw #
# ePeak: Standardize and reproducible ChIP-seq analysis from raw #
# data to differential analysis #
# Authors: Rachel Legendre, Maelle Daunesse #
# Copyright (c) 2019-2020 Institut Pasteur (Paris) and CNRS. #