diff --git a/README.md b/README.md index 71ddc6006f656b79bf85fe6a833a960ae5e69ea9..ca9648c16dd6f3ea6604fc555d6b70ea93a43e30 100644 --- a/README.md +++ b/README.md @@ -4,23 +4,24 @@ _fqCleanER_ (fastq Cleaning and Enhancing Routine) is a command line tool writte Eight standard HTS read processing steps can be carried out using _fqCleanER_: -+ contaminating HTS read removal, using [_AlienRemover_](https://gitlab.pasteur.fr/GIPhy/AlienRemover), +  ❶ contaminating HTS read removal, using [_AlienRemover_](https://gitlab.pasteur.fr/GIPhy/AlienRemover), -+ sequencing error correction, using [_Musket_](http://musket.sourceforge.net/homepage.htm) (Liu et al. 2013), +  ❷ sequencing error correction, using [_Musket_](http://musket.sourceforge.net/homepage.htm) (Liu et al. 2013), -+ HTS read deduplication, using [_fqduplicate_]( ) from the [_fqtools_](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) package, +  ❸ HTS read deduplication, using _fqduplicate_ from the [fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) package, -+ low-coverage HTS read removal, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK), +  ❹ low-coverage HTS read removal, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK), -+ digital normalization, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK), +  ❺ digital normalization, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK), -+ paired-ends HTS read merging, using [_FLASh_](https://ccb.jhu.edu/software/FLASH/) (Magoc and Salzberg 2011), +  ❻ paired-ends HTS read merging, using [_FLASh_](https://ccb.jhu.edu/software/FLASH/) (Magoc and Salzberg 2011), -+ high-coverage (redundant) HTS read reduction, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK), +  ❼ high-coverage (redundant) HTS read reduction, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK), -+ HTS read trimming and clipping, using [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) (Criscuolo and Brisse 2013). 
+  ❽ HTS read trimming and clipping, using [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) (Criscuolo and Brisse 2013). All these steps can be performed in any order on up to three paired- and/or single-end FASTQ files (compressed or not). + _fqCleanER_ runs on UNIX, Linux and most OS X operating systems. @@ -35,19 +36,19 @@ You will need to install the required programs listed in the following table, or | program | package | version | sources | |:-------------------------------------------------------------------------------------------------------------------------------- |:----------------------------------------------------------------:| --------:|:--------------------------------------------------------------------------------------------------------- | | [_gawk_](https://www.gnu.org/software/gawk/) | - | > 4.0.0 | [ftp.gnu.org/gnu/gawk](http://ftp.gnu.org/gnu/gawk/) | -| | | | | +| |||| | [_bzip2_](https://sourceware.org/bzip2/) | - | > 1.0.0 | [sourceware.org/bzip2/downloads.html](https://sourceware.org/bzip2/downloads.html) | -| [_DSRC_](http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&project=dsrc&subpage=about) | - | >= 2.0 | [github.com/lrog/dsrc](https://github.com/lrog/dsrc) | +| [_DSRC_](http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&project=dsrc&subpage=about) | - | ≥ 2.0 | [github.com/lrog/dsrc](https://github.com/lrog/dsrc) | | [_gzip_](https://www.gnu.org/software/gzip/) | - | > 1.5.0 | [ftp.gnu.org/gnu/gzip](https://ftp.gnu.org/gnu/gzip/) | -| | | | | -| [_AlienRemover_](https://gitlab.pasteur.fr/GIPhy/AlienRemover) | - | >= 1.0 | [gitlab.pasteur.fr/GIPhy/AlienRemover](https://gitlab.pasteur.fr/GIPhy/AlienRemover) | +| |||| +| [_AlienRemover_](https://gitlab.pasteur.fr/GIPhy/AlienRemover) | - | ≥1.0 | [gitlab.pasteur.fr/GIPhy/AlienRemover](https://gitlab.pasteur.fr/GIPhy/AlienRemover) | | [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) | - | > 2.0 | 
[gitlab.pasteur.fr/GIPhy/AlienTrimmer](https://gitlab.pasteur.fr/GIPhy/AlienTrimmer) | -| [_FLASh_](https://ccb.jhu.edu/software/FLASH/) | - | > 1.2.10 | [sourceforge.net/projects/flashpage/](https://sourceforge.net/projects/flashpage/) | -| _fqconvert_ <br> _fqduplicate_ <br> _fqextract_ <br> _fqstats_ | [fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) | 1.1a | [ftp.pasteur.fr/pub/gensoft/projects/fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) | -| [_minion_](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/reaper-latest/doc/minion.html) | [kraken](https://www.ebi.ac.uk/research/enright/software/kraken) | 15-065 | [wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/) | -| [_Musket_](http://musket.sourceforge.net/homepage.htm) | - | >= 1.1 | [sourceforge.net/projects/musket](https://sourceforge.net/projects/musket/) | +| [_FLASh_](https://ccb.jhu.edu/software/FLASH/) | - | > 1.2.10 | [sourceforge.net/projects/flashpage](https://sourceforge.net/projects/flashpage/) | +| _fqconvert_ <br> _fqduplicate_ <br> _fqextract_ <br> _fqstats_ | [fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) | ≥ 1.1a | [ftp.pasteur.fr/pub/gensoft/projects/fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) | +| [_minion_](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/reaper-latest/doc/minion.html) | [kraken](https://www.ebi.ac.uk/research/enright/software/kraken) | 15-065 | [wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/) | +| [_Musket_](http://musket.sourceforge.net/homepage.htm) | - | ≥ 1.1 | [sourceforge.net/projects/musket](https://sourceforge.net/projects/musket/) | | [_ntCard_](https://github.com/bcgsc/ntCard) | - | > 1.2 | [github.com/bcgsc/ntCard](https://github.com/bcgsc/ntCard) | -| [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK) | - | > 1.0 | 
[gitlab.pasteur.fr/vlegrand/ROCK](https://gitlab.pasteur.fr/vlegrand/ROCK) | +| [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK) | - | ≥ 1.9.3 | [gitlab.pasteur.fr/vlegrand/ROCK](https://gitlab.pasteur.fr/vlegrand/ROCK) | </div> @@ -71,8 +72,8 @@ chmod +x fqCleanER.sh ./fqCleanER.sh [options] ``` -**D.** If at least one of the required program (see Requirements) is not available on your `$PATH` variable (or if one compiled binary has a different default name), _fqCleanER_ will exit with an error message. -When running _fqCleanER_ without option, a documentation should be displayed; otherwise, the name of the missing program is displayed. +**D.** If at least one of the required programs (see [Dependencies](#dependencies)) is not available on your `$PATH` variable (or if one compiled binary has a different default name), _fqCleanER_ will exit with an error message. +When running _fqCleanER_ without any option, the documentation should be displayed; otherwise, the name of the missing program is displayed before exiting. In such a case, edit the file `fqCleanER.sh` and indicate the local path to the corresponding binary(ies) within the code block `REQUIREMENTS` (approximately lines 70-200). For each required program, the table below reports the corresponding variable assignment instruction to edit (if needed) within the code block `REQUIREMENTS` @@ -94,7 +95,7 @@ For each required program, the table below reports the corresponding variable as </div> Note that depending on the installation of some required programs, the corresponding variable can be assigned with complex commands. -For example, as _AlienTrimmer_ is a Java tool that can be run using a Java virtual machine, the executable jar file `AlienTrimmer.jar` can be used by _fqCleanER_ by editing the corresponding variable assignment instruction as follows: `ALIENTRIMMER_BIN="java -jar AlienTrimmer.jar"`. 
+For example, as _AlienTrimmer_ is a Java tool that can be run using a Java virtual machine, the executable jar file `AlienTrimmer.jar` can be used by _fqCleanER_ after editing the corresponding variable assignment instruction as follows: `ALIENTRIMMER_BIN="java -jar AlienTrimmer.jar"`. ## Usage @@ -163,7 +164,7 @@ Run _fqCleanER_ without option to read the following documentation: -z <string> compressed output file(s) using gzip ("gz"), bzip2 ("bz2") or DSRC ("dsrc") (default: not compressed) -t <int> number of threads (default: 12) - -w <dir> tmp directory (default: $TMPDIR, otherwise /tmp) + -w <dir> tmp directory (default: \$TMPDIR, otherwise /tmp) -h prints this help and exit EXAMPLES: @@ -178,7 +179,7 @@ Run _fqCleanER_ without option to read the following documentation: * Output files are defined by a specified prefix (mandatory option `-b`) and written in a specified output directory (mandatory option `-o`). Output files can be compressed using [_gzip_](https://www.gnu.org/software/gzip/), [_bzip2_](https://sourceware.org/bzip2/) or [_DSRC_](http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&project=dsrc&subpage=about) (option `-z`). -* Temporary files are written into a dedicated directory created into the `$TMPDIR` directory (when defined, otherwise `tmp/`). When possible, it is highly recommended to set a temp directory with large capacity (option `-w`). +* Temporary files are written into a dedicated directory created into the `$TMPDIR` directory (when defined, otherwise `/tmp`). When possible, it is highly recommended to set a temp directory with large capacity (option `-w`). * The cleaning/enhancing steps can be specified using option `-s` in any order. The same step can be specified several times (e.g. `-s DTDNEN`). @@ -188,11 +189,11 @@ Run _fqCleanER_ without option to read the following documentation: **[E]** Sequencing error correction (`-s E`) is performed using [_Musket_](http://musket.sourceforge.net/homepage.htm) (Liu et al. 
2013) with _k_-mer length _k_ = 21. This step generally requires quite important running times and will benefit from a large number of threads (option `-t`). - **[L][N][R]** These three steps (`-s L`, `-s N`, `-s R`, respectively) are related to the digital normalization procedure (Brown et al. 2012), performed using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK) with _k_-mer length _k_ = 25. Given a lower-bound and a upper-bound coverage depth thresholds (options `-c` and `-C`, respectively), the digital normalization selects a subset of HTS reads such that every sequenced base has a coverage depth between these two bounds. When setting a moderate upper-bound (that is lower than the overall average coverage depth; default: `-C 90`), every sequenced base from the selected HTS read subset is expected to have a coverage depth close to this bound. When setting a small lower-bound (default: `-c 4`), all HTS reads corresponding to a sequenced region with coverage depth lower than this bound will be discarded (e.g. artefactual or erroneous HTS read, low-coverage contaminating HTS read). Step N (`-s N`) uses the two bounds (options `-C` and `-c`), whereas steps L and R (`-s L` and `-s R`, respectively) use only the lower- and upper-bounds, respectively. + **[L][N][R]** These three steps (`-s L`, `-s N`, `-s R`, respectively) are related to the digital normalization procedure (e.g. Brown et al. 2012, Wedemeyer et al. 2017, Durai and Schulz 2019), performed using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK) with _k_-mer length _k_ = 25. Given lower-bound and upper-bound coverage depth thresholds (options `-c` and `-C`, respectively), the digital normalization selects a subset of HTS reads such that every sequenced base has a coverage depth between these two bounds. 
When setting a moderate upper-bound (that is lower than the overall average coverage depth; default: `-C 90`), every sequenced base from the selected HTS read subset is expected to have a coverage depth close to this bound. When setting a small lower-bound (default: `-c 4`), all HTS reads corresponding to a sequenced region with coverage depth lower than this bound will be discarded (e.g. artefactual or erroneous HTS read, low-coverage contaminating HTS read). Step N (`-s N`) uses the two bounds (options `-C` and `-c`), whereas steps L/R (`-s L` and `-s R`, respectively) use only the lower-/upper-bound, respectively. **[M]** PE HTS read merging (`-s M`, only with PE input files) is performed using [_FLASh_](https://ccb.jhu.edu/software/FLASH/) (Magoc and Salzberg 2011) when the insert size is shorter than the sum of the two paired HTS read lengths. When using this step, dedicated output files are written (_.M.fastq_ file extension). - **[T]** Trimming and clipping (`-s T`) are performed using [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) (Criscuolo and Brisse 2013). Clipping is carried out based on the specified alien oligonucleotides (option `-a`), where alien oligonucleotide sequences can be (i) set using precomputed standard library names, (ii) specified via user-defined FASTA-formatted file, or (iii) directly estimated from the input files using [_minion_](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/reaper-latest/doc/minion.html) (option `-a AUTO`). When step T is run without setting option `-a`, clipping is carried out with the four homopolymers as alien oligonucleotides. Trimming is carried out by deleting 5' and 3' regions containing many non-confident bases, where a base is considered as non-confident when its Phred score is lower than a Phred score threshold (set using option `-q`; default: 15). 
After trimming/clipping an HTS read, it can be discarded when the number of remaining bases is lower than a specified threshold (option `-l`; default: 50 bases) or when the percentage of remaining non-confident bases is higher than another specified threshold (option `-p`; default: 50%). Note that when HTS read discarding breaks a PE, singletons are written into dedicated output files (_.S.fastq_ file extension). + **[T]** Trimming and clipping (`-s T`) are performed using [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) (Criscuolo and Brisse 2013). Clipping is carried out based on the specified alien oligonucleotides (option `-a`), where alien oligonucleotide sequences can be (i) set using precomputed standard library names, (ii) specified via user-defined FASTA-formatted file, or (iii) directly estimated from the input files using [_minion_](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/reaper-latest/doc/minion.html) (option `-a AUTO`). When step T is run without setting option `-a`, clipping is carried out with the four homopolymers as alien oligonucleotides. Trimming is carried out by deleting 5' and 3' regions containing many non-confident bases, where a base is considered as non-confident when its Phred score is lower than a Phred score threshold (set using option `-q`; default: 15). After trimming/clipping an HTS read, it can be discarded when the number of remaining bases is lower than a specified length threshold (option `-l`; default: 50 bases) or when the percentage of remaining non-confident bases is higher than another specified threshold (option `-p`; default: 50%). Note that when HTS read discarding breaks PE, singletons are written into dedicated output files (_.S.fastq_ file extension). * Each predefined set of alien oligonucleotide sequences can be displayed using option `-d`. 
Some sets of alien oligonucleotide sequences are derived from _'Illumina Adapter Sequences'_ [Document # 1000000002694 v16](https://emea.support.illumina.com/downloads/illumina-adapter-sequences-document-1000000002694.html), i.e. options `-a NEXTERA` (_Nextera DNA Indexes_), `-a IUDI` (_IDT for Illumina UD Indexes_), `-a AMPLISEQ` (_AmpliSeq for Illumina Panels_), `-a TRUSIGHT_PANCANCER` (_TruSight RNA Pan-Cancer Panel_), `-a TRUSEQ_UD` (_IDT for Illumina-TruSeq DNA and RNA UD Indexes_), `-a TRUSEQ_CD` (_TruSeq DNA and RNA CD Indexes_), `-a TRUSEQ_SINGLE` (_TruSeq Single Indexes_), and `-a TRUSEQ_SMALLRNA` (_TruSeq Small RNA_). <br> <sup><sub>**[Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works created by Illumina customers are authorized for use with Illumina instruments and products only. All other uses are strictly prohibited.]**</sub></sup> @@ -201,11 +202,15 @@ Run _fqCleanER_ without option to read the following documentation: Brown TC, Howe A, Zhang Q, Pyrkosz AB, Brom TH (2012) _A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data_. [arXiv:1203.4802](https://arxiv.org/abs/1203.4802). -Criscuolo A, Brisse S (2013) _AlienTrimmer: a tool to quickly and accurately trim off multiple short contaminant sequences from high-throughput sequencing reads_. Genomics, 102(5-6):500-506. [doi:10.1016/j.ygeno.2013.07.011](https://doi.org/10.1016/j.ygeno.2013.07.011). +Criscuolo A, Brisse S (2013) _AlienTrimmer: a tool to quickly and accurately trim off multiple short contaminant sequences from high-throughput sequencing reads_. **Genomics**, 102(5-6):500-506. [doi:10.1016/j.ygeno.2013.07.011](https://doi.org/10.1016/j.ygeno.2013.07.011). + +Durai DA, Schulz MH (2019) _Improving in-silico normalization using read weights_. **Scientific Reports**, 9:5133. [doi:10.1038/s41598-019-41502-9](https://doi.org/10.1038/s41598-019-41502-9). 
+ +Liu Y, Schröder J, Schmidt B (2013) _Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data_. **Bioinformatics**, 29(3):308-315. [doi:10.1093/bioinformatics/bts690](https://doi.org/10.1093/bioinformatics/bts690). -Liu Y, Schröder J, Schmidt B (2013) _Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data_. Bioinformatics, 29(3):308-315. [doi:10.1093/bioinformatics/bts690](https://doi.org/10.1093/bioinformatics/bts690). +Magoc T, Salzberg S (2011) _FLASH: Fast length adjustment of short reads to improve genome assemblies_. **Bioinformatics**, 27:21:2957-2963. [doi:10.1093/bioinformatics/btr507](https://doi.org/10.1093/bioinformatics/btr507). -Magoc T, Salzberg S (2011) _FLASH: Fast length adjustment of short reads to improve genome assemblies_. Bioinformatics, 27:21:2957-2963. [doi:10.1093/bioinformatics/btr507](https://doi.org/10.1093/bioinformatics/btr507). +Roguski L, Deorowicz S (2014) _DSRC 2: Industry-oriented compression of FASTQ files_. **Bioinformatics**, 30(15):2213-2215. [doi:10.1093/bioinformatics/btu208](https://doi.org/10.1093/bioinformatics/btu208). -Roguski L, Deorowicz S (2014) _DSRC 2: Industry-oriented compression of FASTQ files_. Bioinformatics, 30(15):2213-2215. [doi:10.1093/bioinformatics/btu208](https://doi.org/10.1093/bioinformatics/btu208). +Wedemeyer A, Kliemann L, Srivastav A, Schielke C, Reusch TB, Rosenstiel P (2017) _An improved filtering algorithm for big read datasets and its application to single-cell assembly_. **BMC Bioinformatics**, 18:324. [doi:10.1186/s12859-017-1724-7](https://doi.org/10.1186/s12859-017-1724-7). 
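The README hunks above correct the fallback temporary directory from `tmp/` to `/tmp` and document the order "option `-w`, then `$TMPDIR`, then `/tmp`". A minimal sketch of that resolution order is given below. The function name `resolve_tmpdir` is hypothetical and does not come from `fqCleanER.sh`; treating an empty `$TMPDIR` as undefined is also an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from fqCleanER.sh): prints the directory in
# which temporary files would be created.
#   $1  optional directory, standing in for the value passed to option -w
resolve_tmpdir() {
  local requested="${1:-}"
  if [ -n "$requested" ]; then
    printf '%s\n' "$requested"        # an explicit -w value takes precedence
  else
    printf '%s\n' "${TMPDIR:-/tmp}"   # $TMPDIR when set and non-empty, otherwise /tmp
  fi
}
```

Calling `resolve_tmpdir` without argument mimics a run without `-w`; passing a directory mimics `-w <dir>`.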
diff --git a/fqCleanER.sh b/fqCleanER.sh index 7d4c0baff853b671ea9ae68ffae1b0c1b13168b3..d46fa2a0842dc427042cecb75ae4a9a02d67b3ed 100755 --- a/fqCleanER.sh +++ b/fqCleanER.sh @@ -32,7 +32,7 @@ # = VERSIONS = # # ============ # # # - VERSION=21.05ac # + VERSION=21.06ac # # + complete updating of the code # # # # VERSION=6.03.181008ac # @@ -146,11 +146,11 @@ # # # -- AlienRemover: discarding alien reads ----------------------------------------------------------------- # # # - ALIENREMOVER_BIN="AlienRemover"; + ALIENREMOVER_BIN=AlienRemover; if [ ! $(command -v $ALIENREMOVER_BIN) ]; then echo "$ALIENREMOVER_BIN not found" >&2 ; exit 1 ; fi ALIENREMOVER_STATIC_OPTIONS="-c 0.15 "; ALIENREMOVER="$ALIENREMOVER_BIN $ALIENREMOVER_STATIC_OPTIONS"; - K_ALIENREMOVER=25; + K_ALIENREMOVER=25; # k-mer length for AlienRemover # # # -- ntCard: estimating occurrences of distinct canonical k-mers ------------------------------------------ # # # @@ -203,8 +203,10 @@ CCCCCCCCCCCCCCCCCC EOF # # # -- NEXTERA ---------------------------------------------------------------------------------------------- # -# derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 3-4, 16-17) -# https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf +# derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 2-3, 28-30) # +# > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works # +# created by Illumina customers are authorized for use with Illumina instruments and # +# products only. All other uses are strictly prohibited. 
# # # read -r -d '' NEXTERA <<-'EOF' >poly-A @@ -310,8 +312,10 @@ AATGATACGGCGACCACCGAGATCTACACttatgcgaTCGTCGGCAGCGTC EOF # # # -- IUDI: Illumina Unique Dual Indexes ------------------------------------------------------------------- # -# derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 3-16) -# https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf +# derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 2-27) # +# > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works # +# created by Illumina customers are authorized for use with Illumina instruments and # +# products only. All other uses are strictly prohibited. # # # read -r -d '' IUDI <<-'EOF' >poly-A @@ -1865,8 +1869,10 @@ AATGATACGGCGACCACCGAGATCTACACggtggaatacTCGTCGGCAGCGTC EOF # # # -- AMPLISEQ --------------------------------------------------------------------------------------------- # -# derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 17-19) -# https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf +# derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 31-33) # +# > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works # +# created by Illumina customers are authorized for use with Illumina instruments and # +# products only. All other uses are strictly prohibited. # # # read -r -d '' AMPLISEQ <<-'EOF' >poly-A @@ -1974,8 +1980,10 @@ AATGATACGGCGACCACCGAGATCTACACgtattatgTCGTCGGCAGCGTCAGATGTGTATAAGAGACAG EOF # # # -- TRUSIGHT_PANCANCER: TruSight RNA Pan-Cancer Panel ---------------------------------------------------- # -# derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 
25-26) -# https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf +# derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 42-44) # +# > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works # +# created by Illumina customers are authorized for use with Illumina instruments and # +# products only. All other uses are strictly prohibited. # # # read -r -d '' TRUSIGHT_PANCANCER <<-'EOF' >poly-A @@ -2035,8 +2043,10 @@ GATCGGAAGAGCACACGTCTGAACTCCAGTCACattcctTTATCTCGTATGCCGTCTTCTGCTTG EOF # # # -- TRUSEQ_UD: TruSeq DNA/RNA UD indexes ----------------------------------------------------------------- # -# derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 27-30) -# https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf +# derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 45-50) # +# > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works # +# created by Illumina customers are authorized for use with Illumina instruments and # +# products only. All other uses are strictly prohibited. # # # read -r -d '' TRUSEQ_UD <<-'EOF' >poly-A @@ -2434,8 +2444,10 @@ AATGATACGGCGACCACCGAGATCTACACgtgtagacACACTCTTTCCCTACACGACGCTCTTCCGATCT EOF # # # -- TRUSEQ_CD: TruSeq DNA/RNA CD Indexes ----------------------------------------------------------------- # -# derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 30-31) -# https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf +# derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 
50-51) # +# > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works # +# created by Illumina customers are authorized for use with Illumina instruments and # +# products only. All other uses are strictly prohibited. # # # read -r -d '' TRUSEQ_CD <<-'EOF' >poly-A @@ -2489,8 +2501,10 @@ AATGATACGGCGACCACCGAGATCTACACgtactgacACACTCTTTCCCTACACGACGCTCTTCCGATCT EOF # # # -- TRUSEQ_SINGLE: TruSeq Single Indexes ----------------------------------------------------------------- # -# derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 31-33) -# https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf +# derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 51-53) # +# > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works # +# created by Illumina customers are authorized for use with Illumina instruments and # +# products only. All other uses are strictly prohibited. # # # read -r -d '' TRUSEQ_SINGLE <<-'EOF' >poly-A @@ -2554,8 +2568,10 @@ GATCGGAAGAGCACACGTCTGAACTCCAGTCACattcctTTATCTCGTATGCCGTCTTCTGCTTG EOF # # # -- TRUSEQ_SMALLRNA: TruSeq Small RNA -------------------------------------------------------------------- # -# derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 34-36) -# https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf +# derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 54-58) # +# > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works # +# created by Illumina customers are authorized for use with Illumina instruments and # +# products only. All other uses are strictly prohibited. 
# # # read -r -d '' TRUSEQ_SMALLRNA <<-'EOF' >poly-A @@ -2875,7 +2891,7 @@ randfile() { echo $rdf ; } # # -# -- randfileext ------------------------------------------------------------------------------------------ # +# -- randfilesfx ------------------------------------------------------------------------------------------ # # >> creates random files from specified basename $1 and specified extensions in $2 # # next returns the basename of the created files # # >> example: randfileext /tmp/foo fastq,fq,1.fq #
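The `REQUIREMENTS` hunk earlier in this diff checks each binary with `command -v` and exits when it is missing, while the README notes that a variable such as `ALIENTRIMMER_BIN` may hold a multi-word command (`java -jar AlienTrimmer.jar`). The sketch below shows one way such a check could handle both cases by testing only the first word of the command string. The function name `require_program` is hypothetical and is not part of `fqCleanER.sh`; the error message simply mirrors the `... not found` wording visible in the hunk.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of fqCleanER.sh): returns 0 when the first word
# of the command string $1 resolves to an executable, and 1 otherwise.
require_program() {
  local cmd="$1"
  local exe="${cmd%% *}"   # first word only, e.g. "java" from "java -jar AlienTrimmer.jar"
  if ! command -v "$exe" >/dev/null 2>&1; then
    echo "$exe not found" >&2
    return 1
  fi
}
```

A caller could then write `require_program "$ALIENTRIMMER_BIN" || exit 1`, which keeps the exit-on-missing behaviour of the original one-liner while quoting the variable.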