  1. Feb 24, 2020
  2. Nov 18, 2019
  3. Jun 08, 2018
  4. May 18, 2018
  5. Feb 02, 2018
    • Pipeline to process iCLIP data. · 53909846
      Blaise Li authored
      Currently goes from demultiplexing to mapping, via trimming and
      deduplicating. The mapping is performed on three read types:
      - adapt_nodedup (the adaptor was found, and the reads were trimmed but
        not deduplicated)
      - adapt_deduped (the adaptor was found, and the reads were trimmed and
        deduplicated)
      - noadapt_deduped (the adaptor was not found, and the reads were trimmed
        and deduplicated)
      
      The trim_and_dedup script currently assumes that two low-diversity zones
      are present, and ignores them for deduplication:
      
      NNNNNGCACTANNNWWW[YYYY]NNNN
      1---5 : 5' UMI
           6--11: barcode (lower diversity)
                12-14: UMI
                  15-17: AT(or GC?)-rich (low diversity)
                      [fragment]
                             -4 -> -1: 3' UMI
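
      The layout above can be turned into a deduplication key as in the
      following minimal sketch. It is not the actual trim_and_dedup script:
      the function names (split_read, dedup_key, deduplicate) are
      hypothetical, and it assumes that reads are plain strings that have
      already been adaptor-trimmed, so that the last 4 bases are the 3' UMI.

```python
def split_read(seq):
    """Split a read into the zones of the assumed layout.

    Layout (1-based positions): 1-5 5' UMI, 6-11 barcode, 12-14 UMI,
    15-17 low-diversity zone, then the fragment, then a 4-base 3' UMI.
    """
    if len(seq) < 17 + 4:
        raise ValueError("read too short for the assumed layout")
    return {
        "umi5": seq[0:5],        # positions 1-5
        "barcode": seq[5:11],    # positions 6-11 (lower diversity)
        "umi_mid": seq[11:14],   # positions 12-14
        "low_div": seq[14:17],   # positions 15-17 (low diversity)
        "fragment": seq[17:-4],  # everything between the two ends
        "umi3": seq[-4:],        # positions -4 to -1
    }


def dedup_key(seq):
    """Build a key that ignores the two low-diversity zones."""
    zones = split_read(seq)
    return (zones["umi5"], zones["umi_mid"], zones["fragment"], zones["umi3"])


def deduplicate(seqs):
    """Keep the first read seen for each key, preserving input order."""
    seen = set()
    kept = []
    for seq in seqs:
        key = dedup_key(seq)
        if key not in seen:
            seen.add(key)
            kept.append(seq)
    return kept
```

      In this sketch, two reads that differ only in the barcode or in the
      15-17 zone share a key and collapse to one read, whereas a single
      mismatch in the 3' UMI is enough to keep both.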
      
      Deduplicating on the full read, including its 3' end, may be a problem,
      because the end of the reads tends to be of lower quality: reads that
      differ only by a sequencing error there are not recognised as duplicates
      and so end up over-represented. That is why we decided to also look at
      the non-deduplicated reads.
  6. Jan 18, 2018
  7. Aug 03, 2017
  8. Aug 02, 2017
  9. Aug 01, 2017
  10. Apr 12, 2017