update conf+README

Commit ccdcb7e9 authored 2 years ago by Hanna JULIENNE
--- a/README.md
+++ b/README.md
@@ -14,8 +14,8 @@ The current pipeline integrate the following workflow:
 This pipeline enables you to run multi-trait GWAS in a computationally efficient way

 1. Install nextflow as explained here: https://www.nextflow.io/docs/latest/getstarted.html
-2. Install the jass_preprocessing python package or use its docker container (see below).
-3. Install the JASS python package or download its docker container.
+2. Install the [jass_preprocessing](https://statistical-genetics.pages.pasteur.fr/jass_preprocessing/#installation) python package or use its docker container (see below).
+3. Install the [JASS](https://statistical-genetics.pages.pasteur.fr/jass/install.html) python package or download its docker container.


 ### Launch pipeline on test data ###
@@ -63,13 +63,26 @@ The following Item are necessary to run JASS pipeline on real data
 1. --meta_data : A path toward a meta-data file describing GWAS (see example file in ./input_files/test1.csv and [jass_preprocessing documentation](http://statistical-genetics.pages.pasteur.fr/jass_preprocessing/))
 2. --gwas_folder : A path toward a folder containing the summary statistics to analyze
 3. --ref_panel_WG : a path toward a reference panel (all genome as 1 file). See below to download curated reference panels by ancestries derived from 1000G
-4. --ld-folder : A path toward a folder containing LD matrices (that can be generated from the reference panel with the raiss package as described here : http://statistical-genetics.pages.pasteur.fr/raiss/#precomputation-of-ld-correlation)
-5. --group If you wish to compute joint analyses with the pipeline, a group file with the each phenotype group written on a separated line

 ## Optional parameters

 * --output_folder : A path toward a folder to write pipeline results (inittable, worktable...). By default, results will be published in the workflow directory.
+
+### To launch multi-trait GWAS at the end of the pipeline
+
+The pipeline can launch a batch of multi-trait GWAS as its final step:
+* --group : A group file with each phenotype group written on a separate line. Provide it if you wish to compute joint analyses with the pipeline.
+
+Alternatively, run the **jass create-project-data** command on the stored inittable file (which contains all your harmonized summary statistics).
+See JASS documentation for its usage (https://statistical-genetics.pages.pasteur.fr/jass/generating_joint_analysis.html).
+
+### To launch imputation based on summary statistics
+
+For this step you will need to install an additional dependency, the [RAISS](https://gitlab.pasteur.fr/statistical-genetics/raiss) python package.
+
 * --ref_panel : A folder containing a Reference Panel in the .bim, .bed, .fam format for imputation with RAISS
+* --ld-folder : A path toward a folder containing LD matrices (that can be generated from the reference panel with the raiss package as described here : http://statistical-genetics.pages.pasteur.fr/raiss/#precomputation-of-ld-correlation)
+
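The options listed above can be assembled into a single `nextflow run` call. The following is a minimal Python sketch, not part of the pipeline: the script name `jass_pipeline.nf` and every path except the example meta-data file are placeholders assumed for illustration. Only the option names come from this README.

```python
import shlex

# Options this README lists as necessary to run the pipeline on real data.
REQUIRED = ("meta_data", "gwas_folder", "ref_panel_WG")


def build_command(script, params):
    """Return the argv list for `nextflow run <script> --key value ...`."""
    missing = [key for key in REQUIRED if key not in params]
    if missing:
        raise ValueError("missing required parameter(s): " + ", ".join(missing))
    argv = ["nextflow", "run", script]
    for key, value in params.items():
        # Pipeline parameters take a double dash, e.g. --gwas_folder <path>.
        argv += ["--" + key, str(value)]
    return argv


if __name__ == "__main__":
    # Hypothetical script name and placeholder paths, for illustration only.
    command = build_command(
        "jass_pipeline.nf",
        {
            "meta_data": "./input_files/test1.csv",
            "gwas_folder": "./my_gwas_folder",
            "ref_panel_WG": "./my_ref_panel.csv",
            "output_folder": "./results",
        },
    )
    print(shlex.join(command))
```

Checking the required options before launching avoids starting a long run that fails on a missing path. Optional parameters such as `--group`, `--ref_panel` or `--ld-folder` can be added to the same dictionary.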
 ## Available reference panels

 To make reference panel readily available, we use git lfs.
@@ -101,26 +114,47 @@ If you wish to perform imputation step using RAISS you will need to:
 3. Follow RAISS documentation to generate linkage disequilibrium matrices

 ## Running the LDSC regression covariance step
+### To infer the multi-trait Z-score null distribution, heritabilities, and genetic correlations using the LDscore regression

-Download and extract reference panel for LD-score in the pipeline folder:
+For accuracy, we recommend using the LDscore regression to infer the multivariate distribution of Z-scores under the null.
+The alternative, implemented by default, is to estimate the null distribution by computing the covariance of Z-scores with low genetic signal.
+Hence this step is not strictly required.
+
+When computed for a large number of traits, this step can be computationally intensive
+and may require an HPC cluster.
+
+1. For hg37 and the EUR ancestry, download and extract the reference panel for LD-score in the pipeline folder:
 ```
    wget https://data.broadinstitute.org/alkesgroup/LDSCORE/eur_w_ld_chr.tar.bz2
    tar -jxvf eur_w_ld_chr.tar.bz2
 ```
+If you want to analyze data in hg38 and for all ancestries, you can contact the main developer of this pipeline (hanna.julienne@pasteur.fr)
+to request the needed input files.
+2. To activate the LDscore option, set this flag to true:
+    * --group : A group file with each phenotype group written on a separate line. Provide it if you wish to compute joint analyses with the pipeline.
+
+3. Give the path of the reference panel used by the LDscore regression:
+```
+    --LD_SCORE_folder ${PATH_to_REFERENCE}
+```
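Before passing the folder to `--LD_SCORE_folder`, it can help to confirm that the archive was fully extracted. The following is a minimal Python sketch, assuming the extracted folder holds one `<chr>.l2.ldscore.gz` file per autosome, which is the layout expected for `eur_w_ld_chr`; adjust the file pattern if your reference differs. The function name `missing_ldscore_files` is hypothetical and not part of the pipeline.

```python
from pathlib import Path


def missing_ldscore_files(folder, chromosomes=range(1, 23)):
    """List the expected per-chromosome LD-score files absent from folder."""
    folder = Path(folder)
    # Assumed naming scheme: 1.l2.ldscore.gz ... 22.l2.ldscore.gz
    expected = [f"{chrom}.l2.ldscore.gz" for chrom in chromosomes]
    return [name for name in expected if not (folder / name).is_file()]


if __name__ == "__main__":
    missing = missing_ldscore_files("eur_w_ld_chr")
    if missing:
        print("Missing LD-score files:", ", ".join(missing))
    else:
        print("All expected LD-score files found.")
```

An empty result suggests the folder is complete for chromosomes 1 to 22; it does not validate the content of the files.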

 ##  Usage Example on HPC Cluster

 If you are working with a HPC server (Slurm job scheduler), you can adapt the nextflow_sbatch.config file and launch the pipeline with a command like:

-sbatch --mem-per-cpu 32G -p common,dedicated,ggs --qos=long --wrap "module load java/13.0.2;module load singularity/3.8.3;module load graphviz/2.42.3;./nextflow run imputation_only.nf  -with-report imput_report.html -with-timeline imput_timeline.html -c nextflow_sbatch.config -qs 300"
+sbatch --mem-per-cpu 32G -p common,dedicated,ggs --qos=long --wrap "module load java/13.0.2;module load singularity/3.8.3;module load graphviz/2.42.3;./nextflow run imputation_only.nf  -with-report imput_report.html -with-timeline imput_timeline.html -c nextflow_slurm.config -qs 300"

 ## Using docker container

-Stable versions of JASS tools are available as docker container:
+Stable versions of JASS tools and dependencies are available as docker containers:

+- plink:
+https://quay.io/repository/biocontainers/plink?tab=tags
+- LDscore:
+https://quay.io/repository/biocontainers/ldsc?tab=tags
 - JASS preprocessing:
 https://quay.io/repository/biocontainers/jass_preprocessing?tab=tags
-
 - JASS containers:
 https://quay.io/repository/biocontainers/jass?tab=tags
 - RAISS containers:

--- a/nextflow_local.config
+++ b/nextflow_local.config
--- a/nextflow_test.config
+++ b/nextflow_test.config