21.06

Commit e8e62869 authored 3 years ago by Alexis CRISCUOLO
--- a/README.md
+++ b/README.md
@@ -4,23 +4,24 @@ _fqCleanER_ (fastq Cleaning and Enhancing Routine) is a command line tool writte

 Eight standard HTS read processing steps can be carried out using _fqCleanER_:

-+ contaminating HTS read removal, using [_AlienRemover_](https://gitlab.pasteur.fr/GIPhy/AlienRemover),
+&emsp; &#10102; &nbsp; contaminating HTS read removal, using [_AlienRemover_](https://gitlab.pasteur.fr/GIPhy/AlienRemover),

-+ sequencing error correction, using [_Musket_](http://musket.sourceforge.net/homepage.htm) (Liu et al. 2013),
+&emsp; &#10103; &nbsp; sequencing error correction, using [_Musket_](http://musket.sourceforge.net/homepage.htm) (Liu et al. 2013),

-+ HTS read deduplication, using [_fqduplicate_]( ) from the [_fqtools_](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) package,
+&emsp; &#10104; &nbsp; HTS read deduplication, using _fqduplicate_ from the [fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/) package,

-+ low-coverage HTS read removal, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK),
+&emsp; &#10105; &nbsp; low-coverage HTS read removal, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK),

-+ digital normalization, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK),
+&emsp; &#10106; &nbsp; digital normalization, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK),

-+ paired-ends HTS read merging, using [_FLASh_](https://ccb.jhu.edu/software/FLASH/) (Magoc and Salzberg 2011),
+&emsp; &#10107; &nbsp; paired-ends HTS read merging, using [_FLASh_](https://ccb.jhu.edu/software/FLASH/) (Magoc and Salzberg 2011),

-+ high-coverage (redundant) HTS read reduction, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK),
+&emsp; &#10108; &nbsp; high-coverage (redundant) HTS read reduction, using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK),

-+ HTS read trimming and clipping, using [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) (Criscuolo and Brisse 2013).
+&emsp; &#10109; &nbsp; HTS read trimming and clipping, using [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) (Criscuolo and Brisse 2013).

 All these steps can be performed in any order on up to three paired- and/or single-end FASTQ files (compressed or not). 
+
 _fqCleanER_ runs on UNIX, Linux and most OS X operating systems.


@@ -35,19 +36,19 @@ You will need to install the required programs listed in the following table, or
 | program                                                                                                                          | package                                                          | version  | sources                                                                                                   |
 |:-------------------------------------------------------------------------------------------------------------------------------- |:----------------------------------------------------------------:| --------:|:--------------------------------------------------------------------------------------------------------- |
 | [_gawk_](https://www.gnu.org/software/gawk/)                                                                                     | -                                                                | > 4.0.0  | [ftp.gnu.org/gnu/gawk](http://ftp.gnu.org/gnu/gawk/)                                                      |
-| | | | |
+| ||||
 | [_bzip2_](https://sourceware.org/bzip2/)                                                                                         | -                                                                | > 1.0.0  | [sourceware.org/bzip2/downloads.html](https://sourceware.org/bzip2/downloads.html)                        |
-| [_DSRC_](http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&project=dsrc&subpage=about)                                     | -                                                                | >= 2.0   | [github.com/lrog/dsrc](https://github.com/lrog/dsrc)                                                      |
+| [_DSRC_](http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&project=dsrc&subpage=about)                                     | -                                                                | &ge; 2.0   | [github.com/lrog/dsrc](https://github.com/lrog/dsrc)                                                      |
 | [_gzip_](https://www.gnu.org/software/gzip/)                                                                                     | -                                                                | > 1.5.0  | [ftp.gnu.org/gnu/gzip](https://ftp.gnu.org/gnu/gzip/)                                                     |
-| | | | |
-| [_AlienRemover_](https://gitlab.pasteur.fr/GIPhy/AlienRemover)                                                                   | -                                                                | >= 1.0   | [gitlab.pasteur.fr/GIPhy/AlienRemover](https://gitlab.pasteur.fr/GIPhy/AlienRemover)                      |
+| ||||
+| [_AlienRemover_](https://gitlab.pasteur.fr/GIPhy/AlienRemover)                                                                   | -                                                                | &ge;1.0   | [gitlab.pasteur.fr/GIPhy/AlienRemover](https://gitlab.pasteur.fr/GIPhy/AlienRemover)                      |
 | [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/)                                                          | -                                                                | > 2.0    | [gitlab.pasteur.fr/GIPhy/AlienTrimmer](https://gitlab.pasteur.fr/GIPhy/AlienTrimmer)                      |
-| [_FLASh_](https://ccb.jhu.edu/software/FLASH/)                                                                                   | -                                                                | > 1.2.10 | [sourceforge.net/projects/flashpage/](https://sourceforge.net/projects/flashpage/)                        |
-| _fqconvert_ <br> _fqduplicate_ <br> _fqextract_ <br> _fqstats_                                                                   | [fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/)   | 1.1a     | [ftp.pasteur.fr/pub/gensoft/projects/fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/)        |
-| [_minion_](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/reaper-latest/doc/minion.html)                                  | [kraken](https://www.ebi.ac.uk/research/enright/software/kraken) | 15-065   | [wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/) |
-| [_Musket_](http://musket.sourceforge.net/homepage.htm)                                                                           | -                                                                | >= 1.1   | [sourceforge.net/projects/musket](https://sourceforge.net/projects/musket/)                               |
+| [_FLASh_](https://ccb.jhu.edu/software/FLASH/)                                                                                   | -                                                                | > 1.2.10 | [sourceforge.net/projects/flashpage](https://sourceforge.net/projects/flashpage/)                        |
+| _fqconvert_ <br> _fqduplicate_ <br> _fqextract_ <br> _fqstats_                                                                   | [fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/)   | &ge; 1.1a     | [ftp.pasteur.fr/pub/gensoft/projects/fqtools](http://ftp.pasteur.fr/pub/gensoft/projects/fqtools/)        |
+| [_minion_](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/reaper-latest/doc/minion.html)                                  | [kraken](https://www.ebi.ac.uk/research/enright/software/kraken) | 15-065   | [wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/) |
+| [_Musket_](http://musket.sourceforge.net/homepage.htm)                                                                           | -                                                                | &ge; 1.1   | [sourceforge.net/projects/musket](https://sourceforge.net/projects/musket/)                               |
 | [_ntCard_](https://github.com/bcgsc/ntCard)                                                                                      | -                                                                | > 1.2    | [github.com/bcgsc/ntCard](https://github.com/bcgsc/ntCard)                                                |
-| [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK)                                                                                | -                                                                | > 1.0    | [gitlab.pasteur.fr/vlegrand/ROCK](https://gitlab.pasteur.fr/vlegrand/ROCK)                                |
+| [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK)                                                                                | -                                                                | &ge; 1.9.3    | [gitlab.pasteur.fr/vlegrand/ROCK](https://gitlab.pasteur.fr/vlegrand/ROCK)                                |

 </div>

@@ -71,8 +72,8 @@ chmod +x fqCleanER.sh
 ./fqCleanER.sh  [options]
 ```

-**D.** If at least one of the required program (see Requirements) is not available on your `$PATH` variable (or if one compiled binary has a different default name), _fqCleanER_ will exit with an error message.
-When running _fqCleanER_ without option, a documentation should be displayed; otherwise, the name of the missing program is displayed.
+**D.** If at least one of the required program (see [Dependencies](#dependencies)) is not available on your `$PATH` variable (or if one compiled binary has a different default name), _fqCleanER_ will exit with an error message.
+When running _fqCleanER_ without option, a documentation should be displayed; otherwise, the name of the missing program is displayed before exiting.
 In such a case, edit the file `fqCleanER.sh` and indicate the local path to the corresponding binary(ies) within the code block `REQUIREMENTS` (approximately lines 70-200).
 For each required program, the table below reports the corresponding variable assignment instruction to edit (if needed) within the code block `REQUIREMENTS`

@@ -94,7 +95,7 @@ For each required program, the table below reports the corresponding variable as
 </div>

 Note that depending on the installation of some required programs, the corresponding variable can be assigned with complex commands. 
-For example, as _AlienTrimmer_ is a Java tool that can be run using a Java virtual machine, the executable jar file `AlienTrimmer.jar` can be used by _fqCleanER_ by editing the corresponding variable assignment instruction as follows: `ALIENTRIMMER_BIN="java -jar AlienTrimmer.jar"`.
+For example, as _AlienTrimmer_ is a Java tool that can be run using a Java virtual machine, the executable jar file `AlienTrimmer.jar` can be used by _fqCleanER_ after editing the corresponding variable assignment instruction as follows: `ALIENTRIMMER_BIN="java -jar AlienTrimmer.jar"`.


 ## Usage
@@ -163,7 +164,7 @@ Run _fqCleanER_ without option to read the following documentation:
  -z <string>   compressed output  file(s) using  gzip ("gz"),  bzip2 ("bz2")  or DSRC ("dsrc")
                (default: not compressed)
  -t <int>      number of threads (default: 12)
-  -w <dir>      tmp directory (default: $TMPDIR, otherwise /tmp)
+  -w <dir>      tmp directory (default: \$TMPDIR, otherwise /tmp)
  -h            prints this help and exit

 EXAMPLES:
@@ -178,7 +179,7 @@ Run _fqCleanER_ without option to read the following documentation:

 * Output files are defined by a specified prefix (mandatory option `-b`) and written in a specified output directory (mandatory option `-o`). Output files can be compressed using [_gzip_](https://www.gnu.org/software/gzip/), [_bzip2_](https://sourceware.org/bzip2/) or [_DSRC_](http://sun.aei.polsl.pl/REFRESH/index.php?page=projects&project=dsrc&subpage=about) (option `-z`).

-* Temporary files are written into a dedicated directory created into the `$TMPDIR` directory (when defined, otherwise `tmp/`). When possible, it is highly recommended to set a temp directory with large capacity (option `-w`).
+* Temporary files are written into a dedicated directory created into the `$TMPDIR` directory (when defined, otherwise `/tmp`). When possible, it is highly recommended to set a temp directory with large capacity (option `-w`).

 * The cleaning/enhancing steps can be specified using option `-s` in any order. The same step can be specified several times (e.g. `-s DTDNEN`).

@@ -188,11 +189,11 @@ Run _fqCleanER_ without option to read the following documentation:

  **[E]** &nbsp; Sequencing error correction (`-s E`) is performed using [_Musket_](http://musket.sourceforge.net/homepage.htm) (Liu et al. 2013) with _k_-mer length _k_ = 21. This step generally requires quite important running times and will benefit from a large number of threads (option `-t`).
  
-  **[L][N][R]** &nbsp; These three steps (`-s L`, `-s N`, `-s R`, respectively) are related to the digital normalization procedure (Brown et al. 2012), performed using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK) with _k_-mer length _k_ = 25. Given a lower-bound and a upper-bound coverage depth thresholds (options `-c` and `-C`, respectively), the digital normalization selects a subset of HTS reads such that every sequenced base has a coverage depth between these two bounds. When setting a moderate upper-bound (that is lower than the overall average coverage depth; default: `-C 90`), every sequenced base from the selected HTS read subset is expected to have a coverage depth close to this bound. When setting a small lower-bound (default: `-c 4`), all HTS reads corresponding to a sequenced region with coverage depth lower than this bound will be discarded (e.g. artefactual or erroneous HTS read, low-coverage contaminating HTS read). Step N (`-s N`) uses the two bounds (options `-C` and `-c`), whereas steps L and R (`-s L` and `-s R`, respectively) use only the lower- and upper-bounds, respectively.
+  **[L][N][R]** &nbsp; These three steps (`-s L`, `-s N`, `-s R`, respectively) are related to the digital normalization procedure (e.g. Brown et al. 2012, Wedemeyer et al. 2017, Durai and Schulz 2019), performed using [_ROCK_](https://gitlab.pasteur.fr/vlegrand/ROCK) with _k_-mer length _k_ = 25. Given a lower-bound and a upper-bound coverage depth thresholds (options `-c` and `-C`, respectively), the digital normalization selects a subset of HTS reads such that every sequenced base has a coverage depth between these two bounds. When setting a moderate upper-bound (that is lower than the overall average coverage depth; default: `-C 90`), every sequenced base from the selected HTS read subset is expected to have a coverage depth close to this bound. When setting a small lower-bound (default: `-c 4`), all HTS reads corresponding to a sequenced region with coverage depth lower than this bound will be discarded (e.g. artefactual or erroneous HTS read, low-coverage contaminating HTS read). Step N (`-s N`) uses the two bounds (options `-C` and `-c`), whereas steps L/R (`-s L` and `-s R`, respectively) use only the lower-/upper-bound, respectively.
  
  **[M]** &nbsp; PE HTS read merging (`-s M`, only with PE input files) is performed using [_FLASh_](https://ccb.jhu.edu/software/FLASH/) (Magoc and Salzberg 2011) when the insert size is shorter than the sum of the two paired HTS read lengths. When using this step, dedicated output files are written (_.M.fastq_ file extension).
  
-  **[T]** &nbsp; Trimming and clipping (`-s T`) are performed using [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) (Criscuolo and Brisse 2013). Clipping is carried out based on the specified alien oligonucleotides (option `-a`), where alien oligonucleotide sequences can be (i) set using precomputed standard library names, (ii) specified via user-defined FASTA-formatted file, or (iii) directly estimated from the input files using [_minion_](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/reaper-latest/doc/minion.html) (option `-a AUTO`). When step T is run without setting option `-a`, clipping is carried out with the four homopolymers as alien oligonucleotides. Trimming is carried out by deleting 5' and 3' regions containing many non-confident bases, where a base is considered as non-confident when its Phred score is lower than a Phred score threshold (set using option `-q`; default: 15). After trimming/clipping an HTS read, it can be discarded when the number of remaining bases is lower than a specified threshold (option `-l`; default: 50 bases) or when the percentage of remaining non-confident bases is higher than another specified threshold (option `-p`; default: 50%). Note that when HTS read discarding breaks a PE, singletons are written into dedicated output files (_.S.fastq_ file extension).
+  **[T]** &nbsp; Trimming and clipping (`-s T`) are performed using [_AlienTrimmer_](https://research.pasteur.fr/en/software/alientrimmer/) (Criscuolo and Brisse 2013). Clipping is carried out based on the specified alien oligonucleotides (option `-a`), where alien oligonucleotide sequences can be (i) set using precomputed standard library names, (ii) specified via user-defined FASTA-formatted file, or (iii) directly estimated from the input files using [_minion_](http://wwwdev.ebi.ac.uk/enright-dev/kraken/reaper/src/reaper-latest/doc/minion.html) (option `-a AUTO`). When step T is run without setting option `-a`, clipping is carried out with the four homopolymers as alien oligonucleotides. Trimming is carried out by deleting 5' and 3' regions containing many non-confident bases, where a base is considered as non-confident when its Phred score is lower than a Phred score threshold (set using option `-q`; default: 15). After trimming/clipping an HTS read, it can be discarded when the number of remaining bases is lower than a specified length threshold (option `-l`; default: 50 bases) or when the percentage of remaining non-confident bases is higher than another specified threshold (option `-p`; default: 50%). Note that when HTS read discarding breaks PE, singletons are written into dedicated output files (_.S.fastq_ file extension).


 * Each predefined set of alien oligonucleotide sequences can be displayed using option `-d`. Some sets of alien oligonucleotide sequences are derived from _'Illumina Adapter Sequences'_  [Document # 1000000002694 v16](https://emea.support.illumina.com/downloads/illumina-adapter-sequences-document-1000000002694.html), i.e. options `-a NEXTERA` (_Nextera DNA Indexes_), `-a  IUDI` (_IDT for Illumina UD Indexes_), `-a AMPLISEQ` (_AmpliSeq for Illumina Panels_), `-a TRUSIGHT_PANCANCER` (_TruSight RNA Pan-Cancer Panel_), `-a TRUSEQ_UD` (_IDT for Illumina-TruSeq DNA and RNA UD Indexes_), `-a TRUSEQ_CD` (_TruSeq DNA and RNA CD Indexes_), `-a TRUSEQ_SINGLE` (_TruSeq Single Indexes_), and `-a TRUSEQ_SMALLRNA` (_TruSeq Small RNA_). <br> <sup><sub>**[Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works created by Illumina customers are authorized for use with Illumina instruments and products only. All other uses are strictly prohibited.]**</sub></sup>
@@ -201,11 +202,15 @@ Run _fqCleanER_ without option to read the following documentation:

 Brown TC, Howe A, Zhang Q, Pyrkosz AB, Brom TH (2012) _A Reference-Free Algorithm for Computational Normalization of Shotgun Sequencing Data_. [arXiv:1203.4802](https://arxiv.org/abs/1203.4802).

-Criscuolo A, Brisse S (2013) _AlienTrimmer: a tool to quickly and accurately trim off multiple short contaminant sequences from high-throughput sequencing reads_. Genomics, 102(5-6):500-506. [doi:10.1016/j.ygeno.2013.07.011](https://doi.org/10.1016/j.ygeno.2013.07.011).
+Criscuolo A, Brisse S (2013) _AlienTrimmer: a tool to quickly and accurately trim off multiple short contaminant sequences from high-throughput sequencing reads_. **Genomics**, 102(5-6):500-506. [doi:10.1016/j.ygeno.2013.07.011](https://doi.org/10.1016/j.ygeno.2013.07.011).
+
+Durai DA, Schulz MH (2019) _Improving in-silico normalization using read weights_. **Scientific Reports**, 9:5133. [doi:10.1038/s41598-019-41502-9](https://doi.org/10.1038/s41598-019-41502-9).
+
+Liu Y, Schröder J, Schmidt B (2013) _Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data_. **Bioinformatics**, 29(3):308-315. [doi:10.1093/bioinformatics/bts690](https://doi.org/10.1093/bioinformatics/bts690).

-Liu Y, Schröder J, Schmidt B (2013) _Musket: a multistage k-mer spectrum-based error corrector for Illumina sequence data_. Bioinformatics, 29(3):308-315. [doi:10.1093/bioinformatics/bts690](https://doi.org/10.1093/bioinformatics/bts690).
+Magoc T, Salzberg S (2011) _FLASH: Fast length adjustment of short reads to improve genome assemblies_. **Bioinformatics**, 27:21:2957-2963. [doi:10.1093/bioinformatics/btr507](https://doi.org/10.1093/bioinformatics/btr507).

-Magoc T, Salzberg S (2011) _FLASH: Fast length adjustment of short reads to improve genome assemblies_. Bioinformatics, 27:21:2957-2963. [doi:10.1093/bioinformatics/btr507](https://doi.org/10.1093/bioinformatics/btr507).
+Roguski L, Deorowicz S (2014) _DSRC 2: Industry-oriented compression of FASTQ files_. **Bioinformatics**, 30(15):2213-2215. [doi:10.1093/bioinformatics/btu208](https://doi.org/10.1093/bioinformatics/btu208).

-Roguski L, Deorowicz S (2014) _DSRC 2: Industry-oriented compression of FASTQ files_. Bioinformatics, 30(15):2213-2215. [doi:10.1093/bioinformatics/btu208](https://doi.org/10.1093/bioinformatics/btu208).
+Wedemeyer A, Kliemann L, Srivastav A, Schielke C, Reusch TB, Rosenstiel P (2017) _An improved filtering algorithm for big read datasets and its application to single-cell assembly_. **BMC Bioinformatics**, 18:324. [doi:10.1186/s12859-017-1724-7](https://doi.org/10.1186/s12859-017-1724-7).
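
The `$PATH` check described in installation step **D** of the README hunk above can be sketched as follows. This is a minimal illustration, not the code of `fqCleanER.sh`: the function name `missing_programs` is hypothetical, and only the use of `command -v` is taken from the script itself. The program names in the example call (`gawk`, `gzip`, `bzip2`) come from the Dependencies table.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of fqCleanER.sh): print the name of every
# program passed as argument that cannot be resolved on $PATH.
missing_programs() {
  local prog
  for prog in "$@"; do
    # 'command -v' succeeds only when the name resolves to something runnable
    if ! command -v "$prog" >/dev/null 2>&1; then
      echo "$prog"
    fi
  done
}

# Example call with three of the required programs listed in the README.
missing=$(missing_programs gawk gzip bzip2)
if [ -n "$missing" ]; then
  # Report on stderr, as fqCleanER.sh does for a missing binary
  echo "not found:" $missing >&2
fi
```

Collecting all missing names before reporting, rather than exiting at the first one, lets a user see everything that has to be installed (or re-pointed in the `REQUIREMENTS` block) in a single run.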

--- a/fqCleanER.sh
+++ b/fqCleanER.sh
@@ -32,7 +32,7 @@
 # = VERSIONS =                                                                                               #
 # ============                                                                                               #
 #                                                                                                            #
-  VERSION=21.05ac                                                                                            #
+  VERSION=21.06ac                                                                                            #
 # + complete updating of the code                                                                            #
 #                                                                                                            #
 # VERSION=6.03.181008ac                                                                                      #
@@ -146,11 +146,11 @@
 #                                                                                                            #
 # -- AlienRemover: discarding alien reads -----------------------------------------------------------------  #
 #                                                                                                            #
-  ALIENREMOVER_BIN="AlienRemover";
+  ALIENREMOVER_BIN=AlienRemover;
  if [ ! $(command -v $ALIENREMOVER_BIN) ]; then echo "$ALIENREMOVER_BIN not found"       >&2 ; exit 1 ; fi
  ALIENREMOVER_STATIC_OPTIONS="-c 0.15 ";
  ALIENREMOVER="$ALIENREMOVER_BIN $ALIENREMOVER_STATIC_OPTIONS";
-  K_ALIENREMOVER=25;
+  K_ALIENREMOVER=25; # k-mer length for AlienRemover
 #                                                                                                            #
 # -- ntCard: estimating occurrences of distinct canonical k-mers ------------------------------------------  #
 #                                                                                                            #
@@ -203,8 +203,10 @@ CCCCCCCCCCCCCCCCCC
 EOF
 #                                                                                                            #
 # -- NEXTERA ----------------------------------------------------------------------------------------------  #
-#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 3-4, 16-17)
-#  https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf
+#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 2-3, 28-30)                   #
+#  > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works                   #
+#    created by Illumina  customers are authorized  for use with  Illumina instruments and                   #
+#    products only. All other uses are strictly prohibited.                                                  #
 #                                                                                                            #
 read -r -d '' NEXTERA <<-'EOF'
 >poly-A	  
@@ -310,8 +312,10 @@ AATGATACGGCGACCACCGAGATCTACACttatgcgaTCGTCGGCAGCGTC
 EOF
 #                                                                                                            #
 # -- IUDI: Illumina Unique Dual Indexes -------------------------------------------------------------------  #
-#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 3-16)
-#  https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf
+#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 2-27)                         #
+#  > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works                   #
+#    created by Illumina  customers are authorized  for use with  Illumina instruments and                   #
+#    products only. All other uses are strictly prohibited.                                                  #
 #                                                                                                            #
 read -r -d '' IUDI <<-'EOF'
 >poly-A	  
@@ -1865,8 +1869,10 @@ AATGATACGGCGACCACCGAGATCTACACggtggaatacTCGTCGGCAGCGTC
 EOF
 #                                                                                                            #
 # -- AMPLISEQ ---------------------------------------------------------------------------------------------  #
-#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 17-19)
-#  https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf
+#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 31-33)                        #
+#  > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works                   #
+#    created by Illumina  customers are authorized  for use with  Illumina instruments and                   #
+#    products only. All other uses are strictly prohibited.                                                  #
 #                                                                                                            #
 read -r -d '' AMPLISEQ <<-'EOF'
 >poly-A	  
@@ -1974,8 +1980,10 @@ AATGATACGGCGACCACCGAGATCTACACgtattatgTCGTCGGCAGCGTCAGATGTGTATAAGAGACAG
 EOF
 #                                                                                                            #
 # -- TRUSIGHT_PANCANCER: TruSight RNA Pan-Cancer Panel ----------------------------------------------------  #
-#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 25-26)
-#  https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf
+#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 42-44)                        #
+#  > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works                   #
+#    created by Illumina  customers are authorized  for use with  Illumina instruments and                   #
+#    products only. All other uses are strictly prohibited.                                                  #
 #                                                                                                            #
 read -r -d '' TRUSIGHT_PANCANCER <<-'EOF'
 >poly-A	  
@@ -2035,8 +2043,10 @@ GATCGGAAGAGCACACGTCTGAACTCCAGTCACattcctTTATCTCGTATGCCGTCTTCTGCTTG
 EOF
 #                                                                                                            #
 # -- TRUSEQ_UD: TruSeq DNA/RNA UD indexes -----------------------------------------------------------------  #
-#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 27-30)
-#  https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf
+#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 45-50)                        #
+#  > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works                   #
+#    created by Illumina  customers are authorized  for use with  Illumina instruments and                   #
+#    products only. All other uses are strictly prohibited.                                                  #
 #                                                                                                            #
 read -r -d '' TRUSEQ_UD <<-'EOF'
 >poly-A	  
@@ -2434,8 +2444,10 @@ AATGATACGGCGACCACCGAGATCTACACgtgtagacACACTCTTTCCCTACACGACGCTCTTCCGATCT
 EOF
 #                                                                                                            #
 # -- TRUSEQ_CD: TruSeq DNA/RNA CD Indexes -----------------------------------------------------------------  #
-#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 30-31)
-#  https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf
+#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 50-51)                        #
+#  > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works                   #
+#    created by Illumina  customers are authorized  for use with  Illumina instruments and                   #
+#    products only. All other uses are strictly prohibited.                                                  #
 #                                                                                                            #
 read -r -d '' TRUSEQ_CD <<-'EOF'
 >poly-A	  
@@ -2489,8 +2501,10 @@ AATGATACGGCGACCACCGAGATCTACACgtactgacACACTCTTTCCCTACACGACGCTCTTCCGATCT
 EOF
 #                                                                                                            #
 # -- TRUSEQ_SINGLE: TruSeq Single Indexes -----------------------------------------------------------------  #
-#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 31-33)
-#  https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf
+#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 51-53)                        #
+#  > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works                   #
+#    created by Illumina  customers are authorized  for use with  Illumina instruments and                   #
+#    products only. All other uses are strictly prohibited.                                                  #
 #                                                                                                            #
 read -r -d '' TRUSEQ_SINGLE <<-'EOF'
 >poly-A	  
@@ -2554,8 +2568,10 @@ GATCGGAAGAGCACACGTCTGAACTCCAGTCACattcctTTATCTCGTATGCCGTCTTCTGCTTG
 EOF
 #                                                                                                            #
 # -- TRUSEQ_SMALLRNA: TruSeq Small RNA --------------------------------------------------------------------  #
-#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v14; pp. 34-36)
-#  https://emea.support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/experiment-design/illumina-adapter-sequences-1000000002694-14.pdf
+#  derived from: Illumina Adapter Sequences (Document # 1000000002694 v16; pp. 54-58)                        #
+#  > Oligonucleotide sequences © 2021 Illumina, Inc. All rights reserved. Derivative works                   #
+#    created by Illumina  customers are authorized  for use with  Illumina instruments and                   #
+#    products only. All other uses are strictly prohibited.                                                  #
 #                                                                                                            #
 read -r -d '' TRUSEQ_SMALLRNA <<-'EOF'
 >poly-A	  
@@ -2875,7 +2891,7 @@ randfile() {
  echo $rdf ;
 }
 #                                                                                                            #
-# -- randfileext ------------------------------------------------------------------------------------------  #
+# -- randfilesfx ------------------------------------------------------------------------------------------  #
 # >> creates random files from specified basename $1 and specified extensions in $2                          #
 #    next returns the basename of the created files                                                          #
 # >> example:               randfileext /tmp/foo fastq,fq,1.fq                                               #