diff --git a/doc/build/html/_static/manhattan.png b/doc/build/html/_static/manhattan.png
new file mode 100755
index 0000000000000000000000000000000000000000..af738508cf5b9ff5e56552b1e2db81e550a1876a
Binary files /dev/null and b/doc/build/html/_static/manhattan.png differ
diff --git a/doc/source/about.rst b/doc/source/about.rst
index 90dc9db1de040572f1603ecda5cf3a9ca5cb1d87..c6aa632940deb4c64831fb469441ca5a16fd91ab 100644
--- a/doc/source/about.rst
+++ b/doc/source/about.rst
@@ -2,15 +2,15 @@
 What is JASS: A python package to perform Multi-trait GWAS
 ==========================================================
-JASS is a python package that handles the computation of joint statistics over sets of selected GWAS results through a command line tool.
+JASS is a Python package that handles the computation of joint statistics over sets of selected GWAS results through a command-line tool.
-More precisely, users can perform multi-trait GWAS, and display static plots of the results on their own data through the command line interface. Functionality of this package can be explored on a web server, https://jass.pasteur.fr/, which allows to perform multi-trait GWAS on set a databases of 184 summary statistics.
+More precisely, users can perform multi-trait GWAS and display static plots of the results on their own data through the command-line interface. The functionality of this package can be explored on a web server, https://jass.pasteur.fr/, which allows users to perform multi-trait GWAS on a database of 184 summary statistics.
-In this documentation, we cover the steps required for installing the software, and illustrate its usage through examples.
+In this documentation, we cover the steps required for installing the software and illustrate its usage through examples.
-We also briefly describe in the next section the pre-processing of raw GWAS data which can be performed through a companion python package provided on behalf of the JASS package.
+We also briefly describe in the next section the pre-processing of raw GWAS data, which can be performed through a companion `Nextflow pipeline <https://gitlab.pasteur.fr/statistical-genetics/jass_suite_pipeline>`_ provided alongside the JASS package.
-For method details and applications check out our publications with JASS or its accompanying packages (RAISS):
+For method details and applications, check out our publications related to JASS or its accompanying packages (RAISS).
 JASS application paper :cite:`julienne2021multitrait`
diff --git a/doc/source/data_import.rst b/doc/source/data_import.rst
index 094f1c89c9498e9f07b5e81178f61b90c50dabe5..cb4bf63261ff24c1356ab215b2b4426d5bddb045 100644
--- a/doc/source/data_import.rst
+++ b/doc/source/data_import.rst
@@ -1,29 +1,32 @@
 Data preparation
 ================
-The **first paragraph of this section describes the input data** that have to be provided in order to run JASS, as well as their format.
-To generate these data, **you can use the procedure described in the second paragraph.**
-The final paragraph describes an imputation tool compatible with JASS input format (optional preparation step).
+
+JASS requires GWAS summary statistics to be harmonized and formatted (see the **JASS input data** section below).
+We advise users to follow the methods provided in the **first section** of this page to harmonize their data.
+The **second section** describes the different input data required by JASS.
+The **third section** describes an imputation tool compatible with the JASS input format (optional preparation step).
+Finally, we provide a command-line example to assemble input data into the JASS inittable (the database of curated summary statistics used to perform the multi-trait GWAS).
 How to generate input data for JASS
 -----------------------------------
-Option 1 nextflow pipeline:
+Option 1 Nextflow pipeline:
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-Preprocessing steps for JASS (data harmonisation and imputation)have been gathered in one nextflow pipeline : `JASS pipeline Suite <https://gitlab.pasteur.fr/statistical-genetics/jass_suite_pipeline>`_.
+Preprocessing steps for JASS (data harmonisation and imputation) have been gathered into a Nextflow pipeline: `JASS pipeline Suite <https://gitlab.pasteur.fr/statistical-genetics/jass_suite_pipeline>`_.
 While this option might have stronger installation requirements, it ensure reproducibility by leveraging docker containers (fixed version of JASS and accompanying packages). It will also be much more efficient is you a large number of heterogeneous data to handle and a computing cluster available.
-Option 2 manually prepare input data:
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+Option 2 Prepare input data using the JASS pre-processing Python package:
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 To standardize the format of the input GWAS datasets, you can use the `JASS Pre-processing package <https://gitlab.pasteur.fr/statistical-genetics/JASS_Pre-processing>`_. The `JASS Pre-processing documentation <http://statistical-genetics.pages.pasteur.fr/JASS_Pre-processing/>`_ details the use of this tool.
 JASS input data
 ---------------
-JASS data, from which all statistics can be computed, are stored in an HDF5 file.
+JASS data, from which multi-trait GWAS can be computed, are stored in an HDF5 file.
 This file can be created with the procedure `create-inittable`.
 This procedure needs the following input files to complete:
 GWAS description
@@ -51,6 +54,7 @@ GWAS results files in the tabular format by chromosome (tab separated) *all in t
 | rs6548219| 30762 | A | G | -1.133 |
 +----------+-------+------+-----+--------+
+**A0** is the effect allele.
 The name of file *MUST* follow this pattern : "z_{CONSORTIUM}_{TRAIT}_chr{chromosome number}.txt". The consortium and the trait must be capitalized and must *NOT* contain _ .
diff --git a/doc/source/generating_joint_analysis.rst b/doc/source/generating_joint_analysis.rst
index cdbea1d21eaacd0fd658492a720488789ee87832..517c617821204daeb4a4f19bcfe9ce191ad1e14c 100644
--- a/doc/source/generating_joint_analysis.rst
+++ b/doc/source/generating_joint_analysis.rst
@@ -1,24 +1,43 @@
 Compute Multi-trait GWAS with JASS
 ==================================
-Once the GWAS summary statistics are integrated in the inittable,
-you can generate multi-trait GWAS for any set of traits
+Once the GWAS summary statistics are integrated in the inittable (see Data preparation),
+you can generate multi-trait GWAS for any set of traits
 and for several joint tests with the command jass create-project-data
-(see command line usage for the detail of arguments).
+(see command line reference for details of the arguments).
-Whatever the test used, the command will generate three output:
+
+Command Line example
+--------------------
+Here is a mock-up example of a command line to generate a multi-trait GWAS on 4 traits using the *Omnibus* test.
+See command line usage for more details.
+
+.. code-block:: console
+
+ jass create-project-data --init-table-path init_table/init_table.hdf5 --phenotype z_MAGIC_GLUCOSE-TOLERANCE z_MAGIC_FAST-GLUCOSE z_MAGIC_FAST-INSULIN z_MAGIC_HBA1C --worktable-path ./work_glycemic.hdf5 --manhattan-plot-path ./manhattan_glycemic.png --quadrant-plot-path ./quadrant_glycemic.png
+
+
+Generated Results
+-----------------
+Whatever the test used, the command will generate three outputs:
 * A **HDFStore containing several tables** (Each table can be extracted using the can extracted to a tsv using the `jass extract-tsv <https://statistical-genetics.pages.pasteur.fr/jass/command_line_usage.html#extract-tsv>`_ be read from the HDFStore with the pandas.read_hdf function):
- - 'SumStatTab' : The results of the joint analysis by SNPs
- - 'PhenoList' : the meta data of analysed GWAS
- - 'COV' : The H0 covariance used to perform joint analysis
- - 'GENCOV' (If present in the initTable): The genetic covariance as computed by the LDscore.
- - 'Regions' : Results of the joint analysis summarised by LD regions (Notably Lead SNPs by regions)
- - 'summaryTable': a double entry table summarizing the number of significant regions by test (univariate vs joint test)
+
+ * 'SumStatTab' : The results of the joint analysis by SNPs
+
+ * 'PhenoList' : the metadata of GWAS included in the multi-trait GWAS
+
+ * 'COV' : The H0 covariance used to perform joint analysis
+
+ * 'GENCOV' (If present in the initTable): The genetic covariance as computed by the LDscore.
+
+ * 'Regions' : Results of the joint analysis summarised by LD-independent regions (notably Lead SNPs by regions)
+
+ * 'summaryTable': a double entry table summarizing the number of significant regions by test (univariate vs joint test)
 * A **.png Manhattan plot** of the joint test p-values:
-.. image:: ./_static/manhattan_glycemic_blood_asthma.png
+.. image:: ./_static/manhattan.png
 * A **.png Quadrant plot** which is a scatter plot of the minimum p-value by region of the joint test with respect to the minimum p-value by region of the univariate tests. This plot provides an easy way to see if your joint analysis detected association not previously reported in the litterature.
@@ -39,7 +58,7 @@ The Omnibus tests
 If no option is provided to specify the test, a Omnibus test analysis will be performed. For instance:
-.. code-block:: console
+.. code-block:: shell
  jass create-project-data --init-table-path inittable_name.hdf5 --phenotypes z_CONSORTIUM1_TRAIT1 z_CONSORTIUM1_TRAIT2 z_CONSORTIUM2_TRAIT1 --worktable-path worktable_name.hdf5 --manhattan-plot-path manhattan_name.png --quadrant-plot-path /quadrant_name.png --qq-plot-path QQplots_name.png
@@ -86,11 +105,3 @@ For instance if you want to access the Regions table :
 Note that is you wish that the SumStatTab table to be saved as a csv file you can provide the command lines with the --csv-file-path option and a csv will be generated as well. Outputting a csv will lengthen the execution and require the appropriate storage space (several 10Gb depending of the number of traits).
-
-Command Line example
---------------------
-Here is a mock up example of a command line to generate a multi-trait GWAS on 4 traits using the *Omnibus* test.
-See command line usage for more details
-.. code-block:: shell
-
- jass create-project-data --init-table-path init_table/init_table.hdf5 --phenotype z_MAGIC_GLUCOSE-TOLERANCE z_MAGIC_FAST-GLUCOSE z_MAGIC_FAST-INSULIN z_MAGIC_HBA1C --worktable-path ./work_glycemic.hdf5 --manhattan-plot-path ./manhattan_glycemic.png --quadrant-plot-path ./quadrant_glycemic.png
diff --git a/doc/source/get_predicted_gain.rst b/doc/source/get_predicted_gain.rst
index c714777798421ea303450eecb03c155ecf55f589..ffff58305137edfb7d5652850a4c006e703cbc88 100644
--- a/doc/source/get_predicted_gain.rst
+++ b/doc/source/get_predicted_gain.rst
@@ -36,4 +36,3 @@ When executed the command will created a report at --gain-path
 The last column provide the predicted gain ("the higher the more promising"). Note that extrapoling on new data might give lesser performances than reported in :cite:`suzuki2024trait`.
-.. bibliography:: reference.bib
\ No newline at end of file
diff --git a/doc/source/install.rst b/doc/source/install.rst
index 1cf522adeb1ed40877292977461a8c89d131335e..0d26d35e88c83a7ed3fa291e3ea3444a418cdc8f 100644
--- a/doc/source/install.rst
+++ b/doc/source/install.rst
@@ -1,9 +1,8 @@
 Installation
 ============
-You can use JASS locally either using the command line interface in a terminal, or by running a web server. Deployment in a public server is also later discussed in this document.
-
-You need **python3** to install and use JASS. As of April 2022, JASS runs on python from 3.8 to 3.10.
+You can use JASS locally through the command-line interface in a terminal.
+You need **python3** to install and use JASS. As of May 2025, JASS runs on Python 3.8 to 3.12.
 Installation with pip (recommended)
 -----------------------------------
diff --git a/jass/__init__.py b/jass/__init__.py
index 1f1936152ef05f96cc5a43fe50a4c93359902bd4..989eb0d44f40125065ecb821b40a24cbc6759589 100644
--- a/jass/__init__.py
+++ b/jass/__init__.py
@@ -9,9 +9,5 @@ Submodules
  config
  models
- server
- tasks
  util
 """
-
-#from jass.tasks import celery