diff --git a/README.md b/README.md
index 89846469b74d4d0a81e8871fe7adde94870dd73c..b39bc39b06014a0dc7235f3ff5d4d2a8bfaa8338 100644
--- a/README.md
+++ b/README.md
@@ -122,8 +122,7 @@ OPTIONS:
 
 * By default, _ROCK_ uses _k_-mers of length _k_ = 25 (option `-k`). Increasing this length is not recommanded when dealing with large FASTQ files (e.g. average coverage depth > 500x from genome size > 1 Gbps), as the total number of canonical _k_-mers can quickly grow, therefore implying a very large CMS (i.e. many hashing functions) to maintains low FPP (e.g. ≤ 0.05). Using small _k_-mers (e.g. _k_ < 21) is also not recommanded, as this can negatively affect the overall specificity (i.e. too many identical _k_-mers arising from different sequenced genome region).
 
-* All _ROCK_ steps are based on the usage of valid _k_-mers, i.e. _k_-mers that do not contain any undetermined base `N`. Valid _k_-mers can also be determined by bases associated to a Phred score greater than a specified threshold (option `-q`; Phred +33 offset, default: 0). A minimum number of valid _k_-mers can be specified to consider a SE/PE HTS read(s) (option `-m`; default: 1). All SE/PE HTS read(s) that do not contain enough valid _k_-mers are written into FASTQ file(s) with extension _.undetermined.fastq_.
-
+* All _ROCK_ steps are based on the usage of valid _k_-mers, i.e. _k_-mers that do not contain any undetermined base `N`. Valid _k_-mers can also be determined by bases associated to a Phred score greater than a specified threshold (option `-q`; Phred +33 offset, default: 0). A minimum number of valid _k_-mers can be specified to consider a SE/PE HTS read(s) (option `-m`; default: 1).
 
 ## References
 