diff --git a/README.md b/README.md
index a058f7225091cc706704a6b949f61ae2a774c9ee..3ea1c2a84d846eade4939132ab212bde02024556 100644
--- a/README.md
+++ b/README.md
@@ -5,6 +5,7 @@
 Name | Description
 ---- | -----------
 [Data simulation](simulation/) | Generate simulated metagenomics data for benchmarking
+[Sunbeam](sunbeam/) | How to use sunbeam at Pasteur on TARS
 
 ## Projects and repository
 
diff --git a/simulation/README.md b/simulation/README.md
index 5904e19592bf64c0129a39f6e0791f263865ca2a..1959af1caeb06fb3a03a9b224ec23e140ff2b34b 100644
--- a/simulation/README.md
+++ b/simulation/README.md
@@ -24,6 +24,9 @@ make the process a bit faster.
 Files can be found in the `example/` directory. All the path are preceded by the `/input`
 directory since we are going to mount our config files into this directory.
 
+> **Warning** Generated paired-end reads are written to a single `fastq` file that needs to be
+split into `_R1` and `_R2` files.
+
 ### With docker
 
 ```bash
@@ -71,13 +74,10 @@ output_directory=out
 temp_directory=/tmp
 gsa=True # whether a gold standard assembly should be created
 pooled_gsa=True # whether a pooled gold standard over all samples is created
-anonymous=False # whether the output is anonymized
+anonymous=True # whether the output is anonymized (reads from genomes are shuffled)
 compress=1 # 0 is for no compression, 9 is maximum compression
 ```
 
-Since we do not need the data for a challenge, we can switch off the anonymous part of the process. _For the moment it seems that it [does not work without anonymization]
-(https://github.com/CAMI-challenge/CAMISIM/issues/64) so it is better to keep the `anonymous` flag to `True`_.
-
 #### Read Simulator
 
 ```ini
diff --git a/sunbeam/README.md b/sunbeam/README.md
new file mode 100644
index 0000000000000000000000000000000000000000..c53f01f53f0c852f22dd0a81083ef8d1bc69783f
--- /dev/null
+++ b/sunbeam/README.md
@@ -0,0 +1,42 @@
+# Sunbeam
+
+[Sunbeam](https://github.com/sunbeam-labs/sunbeam) is a pipeline written in snakemake
+that simplifies and automates many of the steps in metagenomic sequencing analysis.
+
+Here is a brief note to help you use sunbeam at Pasteur on TARS. You can also read the
+full [documentation](https://sunbeam.readthedocs.io/en/latest/?badge=latest).
+
+## Install
+
+The procedure is very similar to a local install. The install script deals with the conda
+install if you are not already using conda. If you want the latest available version:
+
+```bash
+git clone https://github.com/eclarke/sunbeam
+cd sunbeam
+bash install.sh
+source activate sunbeam
+```
+
+> **Warning** You need internet access for this, so you need to perform the install on
+the submission head node.
+
+## Run sunbeam
+
+You can follow the [documentation](https://sunbeam.readthedocs.io/en/latest/?badge=latest)
+for the initialization of your analysis directory. This will generate a config file in this
+directory that we will call `sunbeam_config.yml`.
+
+We also assume that we are working on the dedicated `atm` queue.
+
+Once done, you can run the sunbeam quality control on TARS:
+
+```bash
+sbatch --qos=atm -p atm -c 1 sunbeam run --configfile sunbeam_config.yml all_qc --jobs 10 --cluster-config cluster.yml --cluster "sbatch --qos=atm -p atm -c {threads}"
+```
+
+Refer to the [documentation](https://sunbeam.readthedocs.io/en/latest/?badge=latest) to
+see all the different pipelines available.
+
+You can also customize the required resources by adding a `cluster.yml` file as described in
+this [page](https://gitlab.pasteur.fr/metagenomics/snakemake/tree/master/workflows#using-on-tars)
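
Note on the warning added to `simulation/README.md`: the patch says the paired-end reads end up in a single `fastq` file that has to be split into `_R1` and `_R2`, but does not show how. A minimal sketch follows, assuming the file is uncompressed and interleaved (each 4-line R1 record is immediately followed by its R2 mate). That layout is an assumption, not something the patch states, and the function name `split_interleaved` and the output naming are hypothetical.

```bash
#!/usr/bin/env bash
# Split an interleaved FASTQ file into <prefix>_R1.fastq and <prefix>_R2.fastq.
# Assumption: 4-line records, and every R1 record is directly followed by its
# R2 mate, so lines 1-4 go to R1, lines 5-8 go to R2, and so on.
split_interleaved() {
    local input="$1"    # interleaved, uncompressed FASTQ file
    local prefix="$2"   # output prefix (hypothetical naming scheme)
    awk -v r1="${prefix}_R1.fastq" -v r2="${prefix}_R2.fastq" '
        (NR - 1) % 8 < 4 { print > r1; next }   # first record of each pair
                         { print > r2 }          # second record of each pair
    ' "$input"
}
```

It could be called as `split_interleaved reads.fastq sample`, which would write `sample_R1.fastq` and `sample_R2.fastq`. If the simulator wrote a compressed file, decompress it first (for example with `gunzip -c`). If the reads are not interleaved in this way, pairs have to be matched on read names instead.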