metaphlan_script

Convert metaphlan marker_counts output to raw count matrix

    1. Analyse your samples with metaphlan:

    The -t marker_counts analysis type is required to output the count per marker gene:

    metaphlan --input_type fastq --bowtie2db metaphlan_db -t marker_counts -o sample_count.tsv metagenome_1.fastq,metagenome_2.fastq

    2. Aggregate your counts at SGB level

    The aggregation at SGB level can be performed with the following command:

    python3 aggregate_SBG.py sample_count.tsv mpa_vOct22_CHOCOPhlAnSGB_202212_SGB_len.txt.gz sample_name sample_aggregated.tsv

    The counts are normalized by the length of each marker gene and rescaled to a default reference length of 1000.
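
As an illustration of the length normalization described above, here is a minimal sketch. It is not taken from aggregate_SBG.py: the function names, the in-memory data layout, and the choice to sum normalized marker counts per SGB are all assumptions made for the example.

```python
from collections import defaultdict


def normalize_count(count, marker_length, target_length=1000):
    """Rescale a raw marker count to a common reference length."""
    return count * target_length / marker_length


def aggregate_by_sgb(marker_counts, marker_info, target_length=1000):
    """Sum length-normalized marker counts per SGB.

    marker_counts: {marker_id: raw_count}
    marker_info:   {marker_id: (sgb_id, marker_length)}  # hypothetical layout
    Summing per SGB is an assumption; the real script may aggregate differently.
    """
    totals = defaultdict(float)
    for marker, count in marker_counts.items():
        sgb, length = marker_info[marker]
        totals[sgb] += normalize_count(count, length, target_length)
    return dict(totals)


# Toy data, invented for illustration only.
counts = {"m1": 10, "m2": 4, "m3": 6}
info = {"m1": ("SGB1", 500), "m2": ("SGB1", 2000), "m3": ("SGB2", 1000)}
result = aggregate_by_sgb(counts, info)
print(result)
```

With this scheme, 10 reads on a 500-long marker count as much as 20 reads on a 1000-long marker, so short and long markers contribute on the same scale.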

    3. Build the count matrix and the taxonomy matrix

    The aggregated files of all samples can then be combined into a single count matrix and a taxonomy matrix with the following command:

    python3 build_matrix.py sample1_aggregated.tsv sample2_aggregated.tsv ... output_counts.tsv output_taxonomy.tsv
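
The idea of combining per-sample tables into one count matrix can be sketched as follows. This is a hedged illustration, not the code of build_matrix.py: the per-sample data structure, the "SGB" header label, and filling absent SGBs with 0 are assumptions, and the taxonomy matrix is not covered here.

```python
import csv
import io


def build_count_matrix(samples):
    """Combine per-sample counts into one matrix (rows: SGBs, columns: samples).

    samples: {sample_name: {sgb_id: count}}  # hypothetical in-memory layout
    SGBs missing from a sample are filled with 0 (an assumption).
    """
    sample_names = sorted(samples)
    sgb_ids = sorted({sgb for counts in samples.values() for sgb in counts})
    header = ["SGB"] + sample_names
    rows = [
        [sgb] + [samples[name].get(sgb, 0) for name in sample_names]
        for sgb in sgb_ids
    ]
    return header, rows


def write_tsv(header, rows, handle):
    """Write the matrix as tab-separated text to an open file-like object."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)


# Toy data, invented for illustration only.
samples = {
    "sample1": {"SGB1": 22.0, "SGB2": 6.0},
    "sample2": {"SGB2": 3.5, "SGB3": 1.0},
}
header, rows = build_count_matrix(samples)
buffer = io.StringIO()
write_tsv(header, rows, buffer)
print(buffer.getvalue())
```

Taking the union of SGB identifiers across samples keeps every SGB seen in at least one sample, so the resulting matrix has the same rows for all samples.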