- Feb 24, 2020 (Blaise Li)
- Nov 18, 2019 (Blaise Li)
- Jun 08, 2018 (Blaise Li): When low-quality zones are kept for deduplication, they later need to be trimmed.
- May 18, 2018 (Blaise Li): The iCLIP data should now have better quality, and deduplication should be more efficient when those zones are taken into account.
- Feb 02, 2018 (Blaise Li): Currently goes from demultiplexing to mapping, via trimming and deduplicating. The mapping is performed on three read types:
  - adapt_nodedup (the adaptor was found, and the reads were trimmed but not deduplicated)
  - adapt_deduped (the adaptor was found, and the reads were trimmed and deduplicated)
  - noadapt_deduped (the adaptor was not found, and the reads were trimmed and deduplicated)

  The trim_and_dedup script currently assumes that two low-diversity zones are present, and ignores them for deduplication:

      NNNNNGCACTANNNWWW[YYYY]NNNN
      1-5  : 5' UMI
      6-11 : barcode (lower diversity)
      12-14: UMI
      15-17: AT (or GC?)-rich (low diversity)
      [fragment]
      -4 -> -1: 3' UMI

  It may be a problem to deduplicate taking the ends of the reads into account, since they tend to be of lower quality: reads with errors would then be over-represented. That is why we decided to also look at the non-deduplicated reads.
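A minimal Python sketch of how the zones described above might be separated when building a deduplication key. The helper `extract_umis` is hypothetical (the actual trim_and_dedup script may work differently); it follows the assumed layout and skips the barcode (positions 6-11) and the low-diversity zone (positions 15-17), as the message suggests.

```python
def extract_umis(read):
    """Split a read according to the assumed layout:
    5' UMI (1-5) + barcode (6-11) + UMI (12-14) + low-diversity zone
    (15-17) + fragment + 3' UMI (last 4 nt).

    Returns (dedup_key, fragment), where dedup_key concatenates only
    the UMI zones, skipping the two low-diversity zones.
    """
    umi5 = read[0:5]       # positions 1-5: 5' UMI
    umi_mid = read[11:14]  # positions 12-14: UMI
    # Positions 6-11 (barcode) and 15-17 (AT/GC-rich) have low
    # diversity and are ignored when building the deduplication key.
    fragment = read[17:-4]
    umi3 = read[-4:]       # last 4 nt: 3' UMI
    return umi5 + umi_mid + umi3, fragment
```

Reads sharing the same `dedup_key` and fragment would then be collapsed into one representative during deduplication.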
- Jan 18, 2018 (Blaise Li)
- Aug 03, 2017 (Blaise Li): The code in the snakefile produced weird results.
- Aug 02, 2017
- Aug 01, 2017 (Blaise Li)
- Apr 12, 2017 (Blaise Li): There are two deduplication flows: one for the reads with the adapter, and one for the reads without it.
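The split between the two flows could be sketched as follows. This is a hypothetical simplification, not the pipeline's actual code: `split_by_adapter` uses exact substring matching, whereas real adapter trimmers typically tolerate mismatches and partial matches.

```python
def split_by_adapter(reads, adapter):
    """Route each read into one of the two deduplication flows,
    depending on whether the adapter sequence is found in it.
    Reads containing the adapter are trimmed at its first occurrence.
    """
    with_adapt, without_adapt = [], []
    for read in reads:
        pos = read.find(adapter)
        if pos == -1:
            without_adapt.append(read)
        else:
            with_adapt.append(read[:pos])
    return with_adapt, without_adapt
```

Each of the two returned lists would then go through its own deduplication step, matching the adapt/noadapt read types mentioned in the Feb 02, 2018 entry.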