Commit ffc624ab authored by hjulienne

more details in the doc index

parent efaed84e
Pipeline #7784 passed
@@ -41,25 +41,55 @@ execute the following lines:
cd JASS_Pre-processing/
pip3 install ./jass_preprocessing/
Preprocessing example
=====================
The file "/JASS_Pre-processing/main_preprocessing.py" gives an example of how to use
this package.
The file "/JASS_Pre-processing/main_preprocessing.py" gives a complete example of
how to use this package.
Input
======
* A reference panel (1000 Genomes format)
* A reference panel (1000 Genomes format). The user is expected to provide a reference panel in TSV format with the following columns, in this order and without a header:
+-----+-----+------------+-----+-----+---------+
| chr | pos | snp_id     | ref | alt | MAF     |
+=====+=====+============+=====+=====+=========+
| 1   |13116| rs62635286 | T   | G   |0.0970447|
+-----+-----+------------+-----+-----+---------+
| 1   |13118| rs200579949| A   | G   |0.0970447|
+-----+-----+------------+-----+-----+---------+
| 1   |14604| rs541940975| A   | G   | 0.147564|
+-----+-----+------------+-----+-----+---------+
| 1   |14930| rs75454623 | A   | G   | 0.482228|
+-----+-----+------------+-----+-----+---------+
* A folder containing all raw GWAS data (all chromosomes in one file)
* A list containing the names of the GWAS files, as strings.
* A descriptor CSV file that describes each GWAS summary statistics file
* A descriptor CSV file that describes each GWAS summary statistics file:
* a header
* 1 line per study
* the fields are:
+-------------------------------------------+------------------------------------------------------------+
| category                                  | field name                                                 |
+===========================================+============================================================+
| path to the data                          | filename                                                   |
+-------------------------------------------+------------------------------------------------------------+
| study info fields                         | consortia,outcome,fullName,type,Nsample,Ncase,Ncontrol,Nsnp|
+-------------------------------------------+------------------------------------------------------------+
| names of the header in the GWAS file      | snpid,a1,a2,freq,pval,n,z,OR,se,code,imp,ncas,ncont        |
+-------------------------------------------+------------------------------------------------------------+
.. | I don't know | altNcas,altNcont|
* it must contain the following columns:
Hard-coded paths (lines 20-29 of JASS_Pre-processing/main_preprocessing.py)
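To illustrate the reference panel layout described in the Input section (tab-separated, no header, columns in the order chr, pos, snp_id, ref, alt, MAF), here is a minimal sketch of a reader. It is not taken from the jass_preprocessing package: the function name `read_reference_panel` and the constant `REF_PANEL_COLUMNS` are hypothetical, and only the Python standard library is used.

```python
import csv
import io

# Column order as stated in the documentation; the file itself has no header row.
REF_PANEL_COLUMNS = ["chr", "pos", "snp_id", "ref", "alt", "MAF"]


def read_reference_panel(handle):
    """Parse a headerless, tab-separated reference panel into a list of dicts.

    Hypothetical helper written for illustration only.
    """
    records = []
    for line_number, row in enumerate(csv.reader(handle, delimiter="\t"), start=1):
        if not row:
            continue  # skip blank lines
        if len(row) != len(REF_PANEL_COLUMNS):
            raise ValueError(
                f"line {line_number}: expected {len(REF_PANEL_COLUMNS)} "
                f"columns, found {len(row)}"
            )
        record = dict(zip(REF_PANEL_COLUMNS, row))
        record["pos"] = int(record["pos"])    # base-pair position
        record["MAF"] = float(record["MAF"])  # allele frequency
        # "chr" is kept as a string so that non-numeric labels are not rejected.
        records.append(record)
    return records


# The first two rows of the example table, written as tab-separated text.
sample = (
    "1\t13116\trs62635286\tT\tG\t0.0970447\n"
    "1\t13118\trs200579949\tA\tG\t0.0970447\n"
)
panel = read_reference_panel(io.StringIO(sample))
print(panel[0]["snp_id"], panel[0]["pos"], panel[0]["MAF"])
```

Raising an error on a wrong column count is a design choice of this sketch: a file that accidentally contains a header or uses another delimiter then fails early instead of producing misaligned columns.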
Indices and tables
==================
......
@@ -16,6 +16,24 @@ import pandas as pd
import seaborn as sns
import time
#Hard-coded paths (lines 20-29 of JASS_Pre-processing/main_preprocessing.py)
#| variable name | description | current default value|
#|---------------|-------------|----------------------|
#| netPath | Main project folder, must end with "/" | /mnt/atlas/ |
#| GWAS_labels* | Path to the file describing the format of the individual GWAS files | netPath+'PCMA/1._DATA/RAW.GWAS/GWAS_labels.csv' |
#| GWAS_path* | Path to the folder containing the GWAS summary statistics files, must end with "/" | netPath+'PCMA/1._DATA/RAW.GWAS/'|
#| diagnostic_folder | Folder for histograms of the sample size distribution among SNPs | /mnt/atlas/PCMA/1._DATA/sample_size_distribution/ |
#| ldscore_format | Data formatted for use with LDscore, 1 file per study | /mnt/atlas/PCMA/1._DATA/ldscore_data/ |
#| REF_filename* | File containing the reference panel for imputation | netPath+'PCMA/0._REF/1KGENOME/summary_genome_Filter_part2.out' |
#| pathOUT | **unused in main_preprocessing.py** | netPath+'PCMA/1._DATA/RAW.summary/'|
#| ImpG_output_Folder | Main output folder | netPath+ 'PCMA/1._DATA/preprocessing_test/' |
#+ Hard-coded variable: perSS = 0.7: the proportion of the 90th percentile of the sample size used to filter the SNPs
perSS = 0.7
netPath = "/mnt/atlas/"
GWAS_labels = netPath+'PCMA/1._DATA/RAW.GWAS/GWAS_labels.csv'
......
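The comment on `perSS` describes a filter based on a proportion of the 90th percentile of the sample size. The sketch below shows one way such a threshold could be computed. It is not the code of main_preprocessing.py: the function names are hypothetical, the percentile interpolation method is an assumption, and it is also an assumption that SNPs whose sample size falls below the threshold are the ones discarded.

```python
import statistics

perSS = 0.7  # value set in main_preprocessing.py


def sample_size_threshold(sample_sizes, proportion=perSS):
    """Return proportion * (90th percentile of the per-SNP sample sizes).

    Hypothetical helper; the interpolation method ("inclusive") is an assumption.
    """
    # With n=10, quantiles() returns the 9 deciles; the last one is the 90th percentile.
    p90 = statistics.quantiles(sample_sizes, n=10, method="inclusive")[-1]
    return proportion * p90


def filter_snps(snp_sample_sizes, proportion=perSS):
    """Keep SNPs whose sample size reaches the threshold (assumed filter direction)."""
    threshold = sample_size_threshold(list(snp_sample_sizes.values()), proportion)
    return {snp: n for snp, n in snp_sample_sizes.items() if n >= threshold}


# Toy data: four SNPs measured on 1000 samples, one on only 500.
toy_sizes = {"rs1": 1000, "rs2": 1000, "rs3": 1000, "rs4": 1000, "rs5": 500}
threshold = sample_size_threshold(list(toy_sizes.values()))
kept = filter_snps(toy_sizes)
print(threshold, sorted(kept))
```

Scaling a high percentile rather than the maximum makes the threshold less sensitive to a few SNPs with unusually large sample sizes.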