# JASS analysis pipeline
## Overview

This Nextflow pipeline harmonizes, imputes, and jointly analyzes GWAS summary statistics.

The pipeline currently integrates the following workflow:

![workflow image](./doc/workflow.png)

## Quick Start - Run pipeline on test data

Install Nextflow as explained here: https://www.nextflow.io/docs/latest/getstarted.html

Clone the current repository locally:

```
    git clone https://gitlab.pasteur.fr/statistical-genetics/jass_suite_pipeline.git
```

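Place your reference panel in the Ref_Panel subfolder of the pipeline folder (the path passed to --ref_panel must match this folder name exactly, including case).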

Download and extract the LD-score reference panel into the pipeline folder:
```
    wget https://data.broadinstitute.org/alkesgroup/LDSCORE/eur_w_ld_chr.tar.bz2
    tar -jxvf eur_w_ld_chr.tar.bz2
```
Once done, you can launch the pipeline with:

```
    nextflow run jass_pipeline.nf --ref_panel {ABSOLUTE_PATH_TO_PIPELINE_FOLDER}/Ref_panel --gwas_folder {ABSOLUTE_PATH_TO_PIPELINE_FOLDER}/test_data/ -with-report jass_report.html

```
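
Before launching, it can help to confirm that the folders used by the command above are in place. The following is a minimal sketch, not part of the pipeline: the function name `check_quickstart_inputs` and the `PIPELINE_DIR` variable are hypothetical, and it assumes that the LD-score archive extracts to a folder named `eur_w_ld_chr`.

```shell
# Pre-flight check for the quick-start run (illustrative helper, not shipped with the pipeline).
# Folder names come from the quick-start steps; eur_w_ld_chr is the assumed
# name of the folder produced by extracting eur_w_ld_chr.tar.bz2.
check_quickstart_inputs() {
    pipeline_dir="$1"
    missing=0
    for sub in Ref_panel test_data eur_w_ld_chr; do
        if [ ! -d "$pipeline_dir/$sub" ]; then
            echo "missing folder: $pipeline_dir/$sub" >&2
            missing=1
        fi
    done
    # Returns 0 when every folder exists, 1 otherwise.
    return "$missing"
}

# PIPELINE_DIR is a hypothetical variable; it defaults to the current directory.
if check_quickstart_inputs "${PIPELINE_DIR:-$PWD}"; then
    echo "all quick-start folders found"
else
    echo "some folders are missing (see messages above)" >&2
fi
```

Run it from the pipeline folder, or set `PIPELINE_DIR` to the absolute path of the pipeline folder first.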
## Required Input

The following items are required to run the JASS pipeline on real data:

1. --meta_data: a path to a metadata file describing the GWAS (see the example file ./input_files/test1.csv and the [jass_preprocessing documentation](http://statistical-genetics.pages.pasteur.fr/jass_preprocessing/))
2. --gwas_folder: a path to a folder containing the summary statistics to analyze
3. --ref_panel: a path to a folder containing a reference panel in the .bim, .bed, .fam format
4. --ld-folder: a path to a folder containing LD matrices (these can be generated from the reference panel with the raiss package, as described here: http://statistical-genetics.pages.pasteur.fr/raiss/#precomputation-of-ld-correlation)
5. --group: a group file with each phenotype group written on a separate line (needed if you wish to compute joint analyses with the pipeline)
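
The inputs listed above can be combined into a single launch command. The sketch below only assembles and prints such a command so it can be reviewed before running. The flag names are copied verbatim from the list; the helper name `build_jass_command` and every file and folder name under the data directory (`meta_data.csv`, `gwas/`, `ld_matrices`, `groups.txt`) are hypothetical placeholders.

```shell
# Assemble a real-data launch command (illustrative; all paths are placeholders).
build_jass_command() {
    data_dir="$1"
    # Flag names follow the list above; only the values are assumptions.
    echo "nextflow run jass_pipeline.nf" \
        "--meta_data $data_dir/meta_data.csv" \
        "--gwas_folder $data_dir/gwas/" \
        "--ref_panel $data_dir/Ref_panel" \
        "--ld-folder $data_dir/ld_matrices" \
        "--group $data_dir/groups.txt"
}

# DATA_DIR is a hypothetical variable pointing at your own study folder.
build_jass_command "${DATA_DIR:-/path/to/my_study}"
```

The function prints the command instead of executing it, so a wrong path shows up before any computation starts. Paths containing spaces would need quoting when the printed command is pasted into a terminal.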

## Optional parameters

* --output_folder: a path to a folder where pipeline results are written (inittable, worktable...). By default, results are published in the workflow directory.

Parameters can be specified in command line or by editing the