Commit c0a5aaf5 authored by Amine GHOZLANE

Add README.md

parent 65490369
# Convert MetaPhlAn marker_counts output to a raw count matrix
1. Analyse your samples with MetaPhlAn:
The `marker_counts` analysis type (`-t marker_counts`) is required to output the count per marker gene:
```
metaphlan --input_type fastq --bowtie2db metaphlan_db -t marker_counts -o sample_count.tsv metagenome_1.fastq,metagenome_2.fastq
```
2. Aggregate your counts at the SGB level
Counts can be aggregated at the SGB (species-level genome bin) level with the following command:
```
python3 aggregate_SBG.py sample_count.tsv mpa_vOct22_CHOCOPhlAnSGB_202212_SGB_len.txt.gz sample_name sample_aggregated.tsv
```
The counts are normalized by marker gene length and rescaled to a default length of 1000.
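The length normalization can be illustrated with a minimal Python sketch. This is not the code of the aggregation script: the function names, the in-memory data layout, and the choice to sum normalized counts per SGB are all assumptions made for illustration.

```python
from collections import defaultdict


def normalize_count(count, marker_length, target_length=1000):
    """Rescale a raw marker count to a common reference length."""
    return count * target_length / marker_length


def aggregate_by_sgb(marker_counts, marker_info, target_length=1000):
    """Sum length-normalized counts per SGB.

    marker_counts: {marker_name: raw_count}
    marker_info:   {marker_name: (sgb_id, marker_length)}  (hypothetical layout)
    Summing per SGB is an assumption; the actual script may combine
    markers differently.
    """
    totals = defaultdict(float)
    for marker, count in marker_counts.items():
        sgb, length = marker_info[marker]
        totals[sgb] += normalize_count(count, length, target_length)
    return dict(totals)
```

Under these assumptions, a marker of length 500 with 10 reads contributes 20 to its SGB, while a marker of length 1500 with 3 reads contributes 2.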
3. Build the count matrix and the taxonomy matrix
The aggregated files from all samples can be combined into a count matrix and a taxonomy matrix with the following command:
```
python3 build_matrix.py sample1_aggregated.tsv sample2_aggregated.tsv output_counts.tsv output_taxonomy.tsv
```
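Conceptually, building the count matrix means taking the union of all SGBs seen across samples and filling in zero where a sample has no count. The sketch below shows that idea on in-memory dictionaries. The function name and data layout are hypothetical, and it does not reproduce the file formats or the behaviour of `build_matrix.py`.

```python
def build_count_matrix(samples):
    """Combine per-sample SGB counts into one matrix.

    samples: {sample_name: {sgb_id: count}}  (hypothetical layout)
    Returns (header, rows), with one row per SGB and one column per sample.
    SGBs missing from a sample get a count of 0.
    """
    sample_names = list(samples)
    # Union of every SGB observed in at least one sample, in sorted order.
    sgbs = sorted({sgb for counts in samples.values() for sgb in counts})
    header = ["SGB"] + sample_names
    rows = [
        [sgb] + [samples[name].get(sgb, 0) for name in sample_names]
        for sgb in sgbs
    ]
    return header, rows
```

For example, `build_count_matrix({"s1": {"SGB1": 5}, "s2": {"SGB2": 3}})` yields one row per SGB, with a zero in the column of the sample where that SGB was not detected.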