Link to an Example: [design.txt](https://gitlab.pasteur.fr/hub/ePeak/-/blob/master/test/design.txt)
**Example of config file and associated fastq file:**
**Example of design file and associated fastq file:**
For an experiment with one mark (H3K27ac) and two conditions (shCtrl and shUbc9), each in duplicate except for one INPUT that failed, the list of fastq files is the following:
The IPs of the first condition are linked to its unique INPUT, and each IP file of the second condition is linked to its own INPUT.
> For a design with INPUT files, NB_IP and NB_INPUT correspond to the replicate numbers of the files, and are used to associate each IP with a specific INPUT. So, one line per IP (for each replicate) is expected.
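
The IP-to-INPUT association described above can be checked with a small script. This is only a sketch and not part of ePeak: the `INPUT_NAME` column, the `INPUT_shCtrl`/`INPUT_shUbc9` sample names, the tab delimiter and the `link_ip_to_input` helper are assumptions made for illustration.

```python
import csv
import io

# Hypothetical design content: the INPUT_NAME column and the INPUT sample
# names are assumptions; only NB_IP/NB_INPUT semantics come from the text.
DESIGN = """\
IP_NAME\tINPUT_NAME\tNB_IP\tNB_INPUT
H3K27ac_shCtrl\tINPUT_shCtrl\t1\t1
H3K27ac_shCtrl\tINPUT_shCtrl\t2\t1
H3K27ac_shUbc9\tINPUT_shUbc9\t1\t1
H3K27ac_shUbc9\tINPUT_shUbc9\t2\t2
"""


def link_ip_to_input(text):
    """Map each (IP sample, IP replicate) to its (INPUT sample, INPUT replicate)."""
    links = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        ip = (row["IP_NAME"], int(row["NB_IP"]))
        # One line per IP replicate is expected, so a repeated key is an error.
        if ip in links:
            raise ValueError(f"duplicate line for IP replicate {ip}")
        links[ip] = (row["INPUT_NAME"], int(row["NB_INPUT"]))
    return links


links = link_ip_to_input(DESIGN)
for ip, ctl in sorted(links.items()):
    print(ip, "->", ctl)
```

Here both shCtrl IP replicates point to INPUT replicate 1 (the only one available), while each shUbc9 IP replicate points to the INPUT replicate with the same number.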
### How to fill the config file
...
...
@@ -241,7 +242,9 @@ design:
Each step has its own settings in an independent chunk.
The second section of the configuration file is divided into multiple chunks. Each chunk gathers the parameters of one step of the pipeline: `adapters`, `bowtie2_mapping`, `mark_duplicates`, `remove_blacklist`, `peak_calling`, `compute_idr` and `differential_analysis`.
The `options` parameter present in `adapters`, `bowtie2_mapping` and `peak_calling` allows you to provide any parameter recognised by cutadapt, bowtie2 and macs2 respectively. For example, for the `bowtie2_mapping` chunk, `options` can be filled with "--very-sensitive".
The `options` parameter present in `adapters`, `bowtie2_mapping` and `peak_calling` allows you to provide any parameter recognised by cutadapt, bowtie2 and macs2 respectively. For example, for the `bowtie2_mapping` chunk, `options` can be filled with "--very-sensitive" only, or with more options (adapted to paired-end data, see example 2).
At the beginning of the `config/multiqc_config.yaml` file, you can customize the header of the MultiQC report according to your experiment, as you can see below:
...
...
@@ -333,9 +344,9 @@ intersectionApproach:
```
### Default mode for CUT&RUN/CUT&Tag
### Default mode for CUT&RUN (with INPUT)
With CUT&RUN/CUT&Tag data, deduplicate only the INPUT/IgG data (set `dedup_IP` to False). Then perform a stringent peak calling with SEACR and use the Intersection Approach (IA). The IA overlap parameter on peaks is set to 0.8. Set SEACR normalization to `non` if the experiment has a control genome (a scaling factor will be calculated from the spike-in).
With CUT&RUN data, deduplicate only the INPUT/IgG data (set `dedup_IP` to False). Then perform a stringent peak calling with SEACR and use the Intersection Approach (IA). The IA overlap parameter on peaks is set to 0.8. Set SEACR normalization to `non` if the experiment has a control genome (a scaling factor will be calculated from the spike-in).
```
mark_duplicates:
...
...
@@ -360,6 +371,58 @@ intersectionApproach:
```
### Default mode for CUT&Tag (or with no INPUT)
With CUT&Tag data, which do not require an input control or IgG control sample, the "noCTL" version of ePeak is recommended.
To run ePeak without any control, the design file should be adapted:
```
IP_NAME NB_IP
H3K27ac_shCtrl 2
H3K27ac_shUbc9 2
```
> For a design without any INPUT file, NB_IP corresponds to the number of replicates available for each sample. So, one line per sample (covering all replicates) is expected.
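
A design file of this shape can be sanity-checked before launching the pipeline. The sketch below is not part of ePeak: the `parse_noctl_design` helper is hypothetical, and it assumes whitespace-separated columns with exactly the `IP_NAME NB_IP` header shown above.

```python
def parse_noctl_design(text):
    """Return {sample name: number of replicates} from a no-control design."""
    lines = [ln.split() for ln in text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    if header != ["IP_NAME", "NB_IP"]:
        raise ValueError(f"unexpected header: {header}")
    replicates = {}
    for name, nb_ip in rows:
        # One line per sample is expected, so a repeated name is an error.
        if name in replicates:
            raise ValueError(f"{name} is listed more than once")
        replicates[name] = int(nb_ip)
    return replicates


# Design content copied from the example above.
DESIGN = """\
IP_NAME NB_IP
H3K27ac_shCtrl 2
H3K27ac_shUbc9 2
"""

replicates = parse_noctl_design(DESIGN)
print(replicates)  # {'H3K27ac_shCtrl': 2, 'H3K27ac_shUbc9': 2}
```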