RVDB-prot
This repository contains scripts to translate and cluster the original nucleotide sequences of RVDB.
Requirements
The pipeline depends on the following tools, listed with the versions it was used with:
- snakemake 5.4.0 https://bitbucket.org/snakemake/snakemake (or install with: pip install snakemake)
- golden 3.4.1 https://github.com/C3BI-pasteur-fr/golden
- silix 1.2.6 http://lbbe.univ-lyon1.fr/-SiLiX-
- blast+ 2.6.0 https://blast.ncbi.nlm.nih.gov
- Python 3.6 https://www.python.org
- hmmer 3.2 http://hmmer.org
- mafft 7.023 https://mafft.cbrc.jp/alignment/software/
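Before running the pipeline, it can help to confirm that the tools listed above are reachable from the shell. The following is a minimal sketch, not part of the rvdb-prot scripts: the function name check_tools is hypothetical, and the executable names passed to it (blastp for blast+, hmmsearch for hmmer, and so on) are assumptions about what each package installs on your system.

```shell
# Report every required executable that is not found on PATH.
# Returns 0 when all tools are found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Executable names are assumptions; adjust them to your installation.
check_tools snakemake golden silix blastp python3 hmmsearch mafft \
    || echo "install the missing tools before running the pipeline"
```

This check only tests for presence on PATH; it does not compare installed versions with the ones listed above.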
Usage
Edit the following commands to match the RVDB version you want to process and the paths on your system:
RVDB_VERSION=19.0
mkdir -p ${RVDB_VERSION} && cd ${RVDB_VERSION}
# a place where temporary files will be written
RVDB_SCRATCH_DIR=$myscratchdir/rvdb${RVDB_VERSION}
# taxadb can be generated or downloaded (please refer to taxadb website)
taxadb=/my/databases/taxadb_full.sqlite
# the directory containing this readme file and the rvdb-prot scripts
rvdbPipeline=$HOME/my/bioinformatics/rvdb-prot/
RVDB_ORIG_DIR=${PWD}
wget https://rvdb.dbi.udel.edu/download/U-RVDBv${RVDB_VERSION}.fasta.gz
mkdir -p ${RVDB_SCRATCH_DIR}
# example of running snakemake with the Slurm workload manager; adapt it to your needs
slurmpartition="common"
slurmqos="normal"
# this variable is needed for snakemake to launch subsequent jobs
export sbcmd="sbatch --cpus-per-task={threads} --mem={cluster.mem} --partition={cluster.partition} --qos={cluster.qos} --out={cluster.out} --error={cluster.error} --job-name={cluster.name} {cluster.extra}"
sbatch --qos ${slurmqos} --partition ${slurmpartition} -c 1 --job-name=rvdb${RVDB_VERSION} --wrap="snakemake --rerun-incomplete --snakefile $rvdbPipeline/pipeline/Snakefile -p -pr --local-cores 1 --jobs 1000 --cluster-config $rvdbPipeline/slurm.json --cluster \"$sbcmd\" --config origNucl=${RVDB_ORIG_DIR}/U-RVDBv${RVDB_VERSION}.fasta.gz scratchDir=${RVDB_SCRATCH_DIR} sampleName=rvdb${RVDB_VERSION} taxadb=$taxadb prefix=U-RVDBv${RVDB_VERSION}"
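Because the job is submitted to a queue, a bad download or a wrong path may only surface after the job has started. The sketch below shows one way to check the inputs before calling sbatch. It is an illustration, not part of the rvdb-prot scripts: the function name preflight is hypothetical, and it only assumes that the three paths set in the commands above (the downloaded FASTA archive, the taxadb file, and the scratch directory) should exist.

```shell
# Hypothetical pre-flight check, not part of the rvdb-prot scripts.
# Arguments: <gzipped FASTA> <taxadb file> <scratch directory>
# Returns 0 when all three inputs look usable, 1 otherwise.
preflight() {
    fasta=$1
    taxadb_file=$2
    scratch=$3
    status=0
    # gzip -t tests the integrity of the archive without extracting it
    if ! gzip -t "$fasta" 2>/dev/null; then
        echo "FASTA archive missing or corrupt: $fasta" >&2
        status=1
    fi
    if [ ! -r "$taxadb_file" ]; then
        echo "taxadb database not readable: $taxadb_file" >&2
        status=1
    fi
    if [ ! -d "$scratch" ] || [ ! -w "$scratch" ]; then
        echo "scratch directory missing or not writable: $scratch" >&2
        status=1
    fi
    return "$status"
}

# Usage with the variables defined above:
# preflight "${RVDB_ORIG_DIR}/U-RVDBv${RVDB_VERSION}.fasta.gz" "$taxadb" "${RVDB_SCRATCH_DIR}"
```

If the check passes, the sbatch command can be submitted as shown. Adding snakemake's -n (--dry-run) flag to the wrapped snakemake command lists the jobs that would run without executing them, which is another inexpensive way to validate the configuration first.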