Commit f4eee4b5 authored by Alexis CRISCUOLO

25.03

parent f6ab69a6
@@ -44,7 +44,7 @@ You will need to install the required programs listed in the following table, or
| _fqconvert_ <br> _fqduplicate_ <br> _fqextract_ <br> _fqstats_ | [fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) | &ge; 1.1a | [ftp.pasteur.fr/pub/gensoft/projects/fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) |
| [_Musket_](http://musket.sourceforge.net/homepage.htm)<sup>&nbsp;&#x2726;</sup> | - | &ge; 1.1 | [sourceforge.net/projects/musket](https://sourceforge.net/projects/musket/) |
| [_ntCard_](https://www.bcgsc.ca/resources/software/ntcard) | - | > 1.2 | [github.com/bcgsc/ntCard](https://github.com/bcgsc/ntCard) |
- | [_ROCK_](https://research.pasteur.fr/en/software/rock) | - | &ge; 1.9.3 | [gitlab.pasteur.fr/vlegrand/ROCK](https://gitlab.pasteur.fr/vlegrand/ROCK) |
+ | [_ROCK_](https://research.pasteur.fr/en/software/rock) | - | &ge; 2.1 | [gitlab.pasteur.fr/vlegrand/ROCK](https://gitlab.pasteur.fr/vlegrand/ROCK) |
</div>
@@ -125,7 +125,7 @@ Run _fqCleanER_ without option to read the following documentation:
-b <string> base name for output files (mandatory option)
-a <infile> to set a file containing every alien oligonucleotide sequence (one per line) to
be clipped during step 'T' (see below)
- -a <string> one or several key words (separated with commas), each corresponding to a set
+ -a <string> one or several key words (separated with commas), each corresponding to a set
of alien oligonucleotide sequences to be clipped during step 'T' (see below):
POLY nucleotide homopolymers
NEXTERA Illumina Nextera index Kits
@@ -138,22 +138,22 @@ Run _fqCleanER_ without option to read the following documentation:
TRUSEQ_SMALLRNA Illumina TruSeq Small RNA Kits
Note that these sets of alien sequences are not exhaustive and will never
replace the exact oligos used for library preparation (default: "POLY")
- -a AUTO to perform de novo inference of 3' alien oligonucleotide sequence(s) of at
- least 20 nucleotide length; selected sequences are completed with those from
- "POLY" (see above)
- -A <infile> to set sequence or k-mer model file(s) to carry out contaminant read removal
- during step 'C'; several comma-separated file names can be specified; allowed
+ -a AUTO to perform de novo inference of 3' alien oligonucleotide sequence(s) of at
+ least 20 nucleotide length; selected sequences are completed with those from
+ "POLY" (see above)
+ -A <infile> to set sequence or k-mer model file(s) to carry out contaminant read removal
+ during step 'C'; several comma-separated file names can be specified; allowed
file extensions: .fa, .fasta, .fna, .kmr or .kmz (default: phiX174 genome)
-d <string> displays the alien oligonucleotide sequences corresponding to the specified key
word(s); see option -a for the list of available key words
- -q <int> quality score threshold; all bases with Phred score below this threshold are
+ -q <int> quality score threshold; all bases with Phred score below this threshold are
considered as non-confident (default: 15)
-l <int> minimum required length for a read (default: half the average read length)
- -p <int> maximum allowed percentage of non-confident bases (as ruled by option -q) per
+ -p <int> maximum allowed percentage of non-confident bases (as ruled by option -q) per
read (default: 50)
-c <int> minimum allowed coverage depth for step 'L' or 'N' (default: 4)
-C <int> maximum allowed coverage depth for step 'R' or 'N' (default: 90)
- -s <string> a sequence of tasks to be iteratively performed, each being defined by one of
+ -s <string> a sequence of tasks to be iteratively performed, each being defined by one of
the following uppercase characters:
C discarding [C]ontaminating reads (as ruled by option -A)
E correcting sequencing [E]rrors
@@ -199,7 +199,7 @@ Run _fqCleanER_ without option to read the following documentation:
<span style="color:navy; font-size:1.1em;">**[T]**</span> &nbsp; Trimming and clipping (`-s T`) are performed using [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) (Criscuolo and Brisse 2013). Clipping is carried out based on the specified alien oligonucleotides (option `-a`), where alien oligonucleotide sequences can be (i) set using precomputed standard library names, (ii) specified via user-defined FASTA-formatted file, or (iii) directly estimated from the input files using [_AlienDiscover_](https://gitlab.pasteur.fr/GIPhy/AlienDiscover) (option `-a AUTO`). When step T is run without setting option `-a`, clipping is carried out with the four homopolymers (`POLY`) as alien oligonucleotides. Trimming is carried out by deleting 5' and 3' regions containing many non-confident bases, where a base is considered as non-confident when its Phred score is lower than a Phred score threshold (set using option `-q`; default: 15). After trimming/clipping an HTS read, it can be discarded when the number of remaining bases is lower than a specified length threshold (option `-l`; default: half the average read length) or when the percentage of remaining non-confident bases is higher than another specified threshold (option `-p`; default: 50%). Note that when HTS read discarding breaks PE, singletons are written into dedicated output files ( _.S.fastq_ file extension).
- * Each predefined set of alien oligonucleotide sequences can be displayed using option `-d`. Some sets of alien oligonucleotide sequences are derived from _'Illumina Adapter Sequences'_ [Document # 1000000002694 v16](https://emea.support.illumina.com/downloads/illumina-adapter-sequences-document-1000000002694.html), i.e. options `-a NEXTERA` (_Nextera DNA Indexes_), `-a IUDI` (_IDT for Illumina UD Indexes_), `-a AMPLISEQ` (_AmpliSeq for Illumina Panels_), `-a TRUSIGHT_PANCANCER` (_TruSight RNA Pan-Cancer Panel_), `-a TRUSEQ_UD` (_IDT for Illumina-TruSeq DNA and RNA UD Indexes_), `-a TRUSEQ_CD` (_TruSeq DNA and RNA CD Indexes_), `-a TRUSEQ_SINGLE` (_TruSeq Single Indexes_), and `-a TRUSEQ_SMALLRNA` (_TruSeq Small RNA_). <br> <sup><sub>**[Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works created by Illumina customers are authorized for use with Illumina instruments and products only. All other uses are strictly prohibited.]**</sub></sup>
+ * Each predefined set of alien oligonucleotide sequences can be displayed using option `-d`. Some sets of alien oligonucleotide sequences are derived from _'Illumina Adapter Sequences'_ [Document # 1000000002694 v20](https://support-docs.illumina.com/SHARE/AdapterSequences/Content/SHARE/FrontPages/AdapterSeq.htm), i.e. options `-a NEXTERA` (_Nextera DNA Indexes_), `-a IUDI` (_IDT for Illumina UD Indexes_), `-a AMPLISEQ` (_AmpliSeq for Illumina Panels_), `-a TRUSIGHT_PANCANCER` (_TruSight RNA Pan-Cancer Panel_), `-a TRUSEQ_UD` (_IDT for Illumina-TruSeq DNA and RNA UD Indexes_), `-a TRUSEQ_CD` (_TruSeq DNA and RNA CD Indexes_), `-a TRUSEQ_SINGLE` (_TruSeq Single Indexes_), and `-a TRUSEQ_SMALLRNA` (_TruSeq Small RNA_). <br> <sup><sub>**[Oligonucleotide sequences © 2021-2025 Illumina, Inc. All rights reserved. Derivative works created by Illumina customers are authorized for use with Illumina instruments and products only. All other uses are strictly prohibited.]**</sub></sup>
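
The read-discarding rule described for step T (options `-q`, `-l` and `-p`) can be illustrated with a minimal Python sketch. It is not taken from _AlienTrimmer_ or _fqCleanER_: the function names are hypothetical, Phred+33 quality encoding is assumed, and only the final keep/discard decision on an already trimmed read is modelled, not the trimming or clipping itself.

```python
# Illustrative sketch only; not AlienTrimmer's or fqCleanER's actual code.
# Assumption: quality strings use Phred+33 encoding.
# All function names below are hypothetical.

def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in quality]


def percent_non_confident(quality, q=15):
    """Percentage of bases whose Phred score is below q (cf. option -q)."""
    if not quality:
        return 100.0  # an empty read has no confident base
    low = sum(1 for score in phred_scores(quality) if score < q)
    return 100.0 * low / len(quality)


def keep_read(quality, min_length, q=15, max_percent=50.0):
    """Return True when a trimmed/clipped read passes both filters.

    A read is discarded when it is shorter than min_length (cf. option -l)
    or when its percentage of non-confident bases is higher than
    max_percent (cf. option -p).
    """
    if len(quality) < min_length:
        return False
    return percent_non_confident(quality, q) <= max_percent
```

With Phred+33, `I` encodes a score of 40 and `#` a score of 2, so `keep_read("IIIIIIII", min_length=5)` keeps the read, whereas `keep_read("II######", min_length=5)` discards it because 75% of its bases are non-confident. The sketch takes `min_length` as an explicit argument; the documented default for `-l` (half the average read length) would have to be computed from the input reads beforehand.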
## References