Metagenomics
snakemake
Commit 2c621522, authored Feb 10, 2020 by Kenzo-Hugo Hillion
♻
start Snakefile to split fasta
parent
e3b2c07b
Changes
2
tools/utils/split_fasta/Snakefile
0 → 100644
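# Sequences per chunk and output file prefix, read from the optional
# 'split_fasta' section of the config, with defaults.
__split_fasta_number_sequences = config.get('split_fasta', {}).get('number_sequences', 1000000)
__split_fasta_prefix = config.get('split_fasta', {}).get('prefix', 'seq_chunk_')
# Expected 5-digit chunk suffixes (00000, 00001, ...). 9898412 is hard-coded
# (presumably the total number of sequences in the input).
EXPECTED_EXT = [f"{i:05d}" for i in range(0, int(9898412/__split_fasta_number_sequences) + 1)]

# __split_fasta_input and __split_fasta_output are not defined in this file;
# they must be set before this Snakefile is included.
rule split_fasta:
    """
    Split a FASTA file with the desired number of sequences per chunk
    """
    input:
        __split_fasta_input
    output:
        __split_fasta_output
    params:
        # awk linearizes each record to two lines (header + sequence),
        # so a chunk of N sequences is 2 * N lines.
        n_lines = __split_fasta_number_sequences * 2,
        prefix = __split_fasta_prefix
    shell:
        """
        cat {input} | awk '/^>/ {{if(N>0) printf("\\n"); printf("%s\\n",$0);++N;next;}} {{ printf("%s",$0);}} END {{printf("\\n");}}' | split -l {params.n_lines} -a 5 -d - {params.prefix}
        """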
tools/utils/split_fasta/config_example.yaml
0 → 100644
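The contents of this file are not shown here. Going only by the keys the Snakefile reads (`split_fasta`, `number_sequences`, `prefix`) and the defaults it falls back to, the lookup can be sketched in plain Python as below. The `config` dict and its values (`500`, `part_`) are hypothetical stand-ins for whatever Snakemake loads from a config file, and `split_fasta_settings` is an illustrative helper, not part of the commit.

```python
# Hypothetical stand-in for the `config` dict Snakemake builds from a config file.
config = {"split_fasta": {"number_sequences": 500, "prefix": "part_"}}


def split_fasta_settings(config):
    """Mirror the two config.get(...) lookups from the Snakefile."""
    section = config.get("split_fasta", {})
    number_sequences = section.get("number_sequences", 1000000)
    prefix = section.get("prefix", "seq_chunk_")
    return number_sequences, prefix


# Values come from the config when present...
custom = split_fasta_settings(config)
# ...and fall back to the Snakefile defaults when the section is missing.
defaults = split_fasta_settings({})
```

Because both lookups use `.get` with a default, an empty config is valid and yields chunks of 1000000 sequences named `seq_chunk_00000`, `seq_chunk_00001`, and so on.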
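For readers less familiar with the `awk | split` pipeline in the `split_fasta` rule, the following is a minimal Python sketch of the same idea: join wrapped sequence lines so each record is one header plus one sequence, then group records into numbered chunks. The function names are illustrative and not part of the commit. Two deliberate differences from the shell version: sequence lines appearing before the first header are dropped here, and `expected_suffixes` uses ceiling division, whereas the Snakefile's `int(total / n) + 1` lists one extra suffix when the total is an exact multiple of the chunk size.

```python
from itertools import islice


def linearize_fasta(lines):
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header = None
    parts = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line, []
        elif header is not None:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def split_records(records, number_sequences, prefix="seq_chunk_"):
    """Yield (chunk_name, records) with 5-digit numeric suffixes, as `split -a 5 -d` names files."""
    it = iter(records)
    index = 0
    while True:
        chunk = list(islice(it, number_sequences))
        if not chunk:
            break
        yield f"{prefix}{index:05d}", chunk
        index += 1


def expected_suffixes(total_sequences, number_sequences):
    """List the chunk suffixes for a known total, using ceiling division."""
    n_chunks = -(-total_sequences // number_sequences)
    return [f"{i:05d}" for i in range(n_chunks)]


# Tiny in-memory example: three records, two per chunk.
fasta = [">a\n", "AC\n", "GT\n", ">b\n", "TT\n", ">c\n", "G\n"]
chunks = dict(split_records(linearize_fasta(fasta), number_sequences=2))
```

Here `chunks` maps `seq_chunk_00000` to the first two records and `seq_chunk_00001` to the remaining one, which is the grouping the rule obtains by splitting every `2 * number_sequences` lines.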