From 894e05847554c1031a227fae48b779460a205dc2 Mon Sep 17 00:00:00 2001
From: Kenzo-Hugo Hillion <kenzo-hugo.hillion1@pasteur.fr>
Date: Thu, 11 Jul 2019 17:40:10 +0200
Subject: [PATCH] add documentation for sunbeam

---
 README.md            |  1 +
 simulation/README.md |  8 ++++----
 sunbeam/README.md    | 42 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 47 insertions(+), 4 deletions(-)
 create mode 100644 sunbeam/README.md

diff --git a/README.md b/README.md
index a058f72..3ea1c2a 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,7 @@
 Name | Description
 ---- | -----------
 [Data simulation](simulation/) | Generate simulated metagenomics data for benchmarking
+[Sunbeam](sunbeam/) | How to use sunbeam at Pasteur on TARS
 
 ## Projects and repository
 
diff --git a/simulation/README.md b/simulation/README.md
index 5904e19..1959af1 100644
--- a/simulation/README.md
+++ b/simulation/README.md
@@ -24,6 +24,9 @@ make the process a bit faster. Files can be found in the `example/` directory.
 All the path are preceded by the `/input` directory since we are going to mount our
 config files into this directory.
 
+> **Warning** Generated paired-end reads are in one unique `fastq` file that need to be
+splitted in `_R1` and `_R2` files.
+
 ### With docker
 
 ```bash
@@ -71,13 +74,10 @@ output_directory=out
 temp_directory=/tmp
 gsa=True # whether a gold standard assembly should be created
 pooled_gsa=True # whether a pooled gold standard over all samples is created
-anonymous=False # whether the output is anonymized
+anonymous=True # whether the output is anonymized (reads from genomes are shuffled)
 compress=1 # 0 is for no compression, 9 is maximum compression
 ```
 
-Since we do not need the data for a challenge, we can switch off the anonymous part of the process. _For the moment it seems that it [does not work without anonymization]
-(https://github.com/CAMI-challenge/CAMISIM/issues/64) so it is better to keep the `anonymous` flag to `True`_.
-
 #### Read Simulator
 
 ```ini
diff --git a/sunbeam/README.md b/sunbeam/README.md
new file mode 100644
index 0000000..c53f01f
--- /dev/null
+++ b/sunbeam/README.md
@@ -0,0 +1,42 @@
+# Sunbeam
+
+[Sunbeam](https://github.com/sunbeam-labs/sunbeam) is a pipeline written in snakemake
+that simplifies and automates many of the steps in metagenomic sequencing analysis.
+
+Here is a brief note to help you use sunbeam at Pasteur on TARS. You can also read the
+full [documentation](https://sunbeam.readthedocs.io/en/latest/?badge=latest).
+
+## Install
+
+Procedure is very similar to what you would do locally. Install deal with conda install
+if you are not using conda. If you want the latest available version:
+
+```bash
+git clone https://github.com/eclarke/sunbeam
+cd sunbeam
+bash install.sh
+source activate sunbeam
+```
+
+> **Warning** You need internet access to do that so you need to perform install on
+the head of submission.
+
+## Run sunbeam
+
+You can follow the [documentation](https://sunbeam.readthedocs.io/en/latest/?badge=latest)
+for the initialization of your analysis directory. This will generate a config file in this
+directory that we will call `sunbeam_config.yml`.
+
+We also consider that we are working on the dedicated `atm` queue.
+
+Once done you can run sunbeam for the quality control on tars:
+
+```bash
+sbatch --qos=atm -p atm -c 1 sunbeam run --configfile sunbeam_config.yml all_qc --jobs 10 --cluster-config cluster.yml --cluster "sbatch --qos=atm -p atm -c {threads}"
+```
+
+Refer to the [documentation](https://sunbeam.readthedocs.io/en/latest/?badge=latest) to
+see all the different pipelines available.
+
+You can also personalize required resources by adding a `cluster.yml` file as described in
+this [page](https://gitlab.pasteur.fr/metagenomics/snakemake/tree/master/workflows#using-on-tars)
-- 
GitLab
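
Note on the warning this patch adds to `simulation/README.md`: the patch says the single `fastq` file has to be split into `_R1` and `_R2` files but does not show how. The following is a minimal sketch, not part of the patch. It assumes the file is strictly interleaved (each mate 1 record immediately followed by its mate 2 record) and that every record spans exactly four lines; neither assumption is stated in the patch. The function name `split_interleaved_fastq` and the output naming are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: split an interleaved FASTQ file into
# <prefix>_R1.fastq and <prefix>_R2.fastq.
# Assumptions (not confirmed by the patch): records alternate
# mate 1 / mate 2, and each record is exactly 4 lines long.
split_interleaved_fastq() {
    local input="$1" prefix="$2"
    awk -v r1="${prefix}_R1.fastq" -v r2="${prefix}_R2.fastq" '
        # In every block of 8 lines, lines 1-4 are mate 1 and lines 5-8 are mate 2.
        { if ((NR - 1) % 8 < 4) print > r1; else print > r2 }
    ' "$input"
}

# Example call (file names are placeholders):
#   split_interleaved_fastq sample.fastq sample
```

If the reads turn out to be ordered differently (for instance all mate 1 reads first), this sketch would produce wrong pairs, so it is worth checking the read headers of the first few records before relying on it.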