Commit 99fc9fe8 authored by fltesson

Add Cas genomic locus and distribution

parent 303fea3d
1 merge request: !154 Change distrib genomic
Pipeline #118722 passed
---
title: CRISPR-Cas
layout: article
tableColumns:
  article:
    doi: 10.1038/nrmicro3569
    abstract: |
      The evolution of CRISPR-cas loci, which encode adaptive immune systems in archaea and bacteria, involves rapid changes, in particular numerous rearrangements of the locus architecture and horizontal transfer of complete loci or individual modules. These dynamics complicate straightforward phylogenetic classification, but here we present an approach combining the analysis of signature protein families and features of the architecture of cas loci that unambiguously partitions most CRISPR-cas loci into distinct classes, types and subtypes. The new classification retains the overall structure of the previous version but is expanded to now encompass two classes, five types and 16 subtypes. The relative stability of the classification suggests that the most prevalent variants of CRISPR-Cas systems are already known. However, the existence of rare, currently unclassifiable variants implies that additional types and subtypes remain to be characterized.
---
# CRISPR-Cas
## Example of genomic structure
CRISPR-Cas systems are classified into 6 types :ref{doi=10.1038/s41579-019-0299-x}.
Each type is divided into subtypes. For example, Type I CRISPR-Cas comprises 7 subtypes: I-A to I-G.
Here is an example of each of the 6 types, found in the RefSeq database:
![cas_class1-subtype-i-e](/cas/CAS_Class1-Subtype-I-E.svg){max-width=750px}
The CAS_Class1-Subtype-I-E system in *Citrobacter sp. RHBSTW-00017* (GCF_013797615.1, NZ_CP056899) is composed of 8 proteins: cas3_I_5 (WP_103284157.1), cas8e_I-E_1 (HV037_RS05730), cse2gr11_I-E_2 (HV037_RS05735), cas7_I-E_2 (HV037_RS05740), cas5_I-E_3 (HV037_RS05745), cas6e_I_II_III_IV_V_VI_1 (HV037_RS05750), cas1_I-E_1 (HV037_RS05755) and cas2_I-E_2 (HV037_RS05760).
![cas_class2-subtype-ii-a](/cas/CAS_Class2-Subtype-II-A.svg){max-width=750px}
The CAS_Class2-Subtype-II-A system in *Streptococcus agalactiae* (GCF_001190885.1, NZ_CP011329) is composed of 4 proteins: cas9_II-A_II-B_II-C_3 (SAH002_RS04760), cas1_I_II_III_IV_V_VI_5 (SAH002_RS04765), cas2_I_II_III_IV_V_VI_6 (SAH002_RS04770) and csn2_II-A_4 (SAH002_RS04775).
![cas_class1-subtype-iii-a](/cas/CAS_Class1-Subtype-III-A.svg){max-width=750px}
The CAS_Class1-Subtype-III-A system in *Mycobacterium tuberculosis* (GCF_014900005.1, NZ_CP041828) is composed of 9 proteins: cas2_I_II_III_IV_V_VI_5 (FPJ80_RS14760), cas1_I_II_III_IV_V_VI_8 (FPJ80_RS14765), csm6_III_2 (FPJ80_RS14770), csm5gr7_III-A_3 (FPJ80_RS14775), csm4gr5_III-A_3 (FPJ80_RS14780), csm3gr7_III-A_1 (FPJ80_RS14785), csm2gr11_III-A_1 (FPJ80_RS14790), cas10_III_7 (FPJ80_RS14795) and cas6_I_II_III_IV_V_VI_15 (FPJ80_RS14800).
![cas_class1-subtype-iv-a](/cas/CAS_Class1-Subtype-IV-A.svg){max-width=750px}
The CAS_Class1-Subtype-IV-A system in *Shigella flexneri* (GCF_022353685.1, NZ_CP054978) is composed of 5 proteins: csf1gr8_IV-A_3 (WP_038989757.1), cas6e_I_II_III_IV_V_VI_3 (WP_038989755.1), csf4_IV-A_1 (WP_016947078.1), csf3gr5_IV-A_1 (WP_004181864.1) and csf2gr7_IV-A_1 (WP_029505552.1).
![cas_class2-subtype-v-a](/cas/CAS_Class2-Subtype-V-A.svg){max-width=750px}
The CAS_Class2-Subtype-V-A system in *Francisella tularensis* (GCF_001865695.1, NZ_CP016635) is composed of 4 proteins: cas2_I_II_III_IV_V_VI_3 (N894_RS07580), cas1_I_II_III_IV_V_VI_1 (N894_RS07585), cas4_V_1 (N894_RS07590) and cas12a_V-A_4 (N894_RS07595).
![cas_class2-subtype-vi-a](/cas/CAS_Class2-Subtype-VI-A.svg){max-width=750px}
The CAS_Class2-Subtype-VI-A system in *Leptotrichia shahii* (GCF_008327825.1, NZ_AP019827) is composed of 3 proteins: cas13a_VI-A_1 (F1564_RS00570), cas1_I_II_III_IV_V_VI_5 (F1564_RS00575) and cas2_I_II_III_IV_V_VI_11 (F1564_RS00580).
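Each description above pairs a protein model name with a locus tag or protein accession in parentheses. As a minimal sketch, assuming that `name (identifier)` layout holds, the pairs can be extracted with a regular expression. The helper `parse_components` is hypothetical and is not part of the wiki or of any detection tool.

```python
import re

# Assumption: each component is written as "<model name> (<locus tag or accession>)".
# Model names are taken to contain only letters, digits, underscores and hyphens.
COMPONENT = re.compile(r"([\w-]+)\s+\(([^)]+)\)")


def parse_components(description):
    """Return (model name, identifier) pairs found in a system description."""
    return COMPONENT.findall(description)


# Protein list copied from the Leptotrichia shahii VI-A example.
text = (
    "cas13a_VI-A_1 (F1564_RS00570) "
    "cas1_I_II_III_IV_V_VI_5 (F1564_RS00575) "
    "cas2_I_II_III_IV_V_VI_11 (F1564_RS00580)"
)
components = parse_components(text)
for name, identifier in components:
    print(name, identifier)
```

The number of pairs returned can be compared with the protein count stated in each sentence as a quick consistency check.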
## Distribution of the system among prokaryotes
Among the 22,803 complete genomes of RefSeq, CRISPR-Cas is detected in 8,581 genomes (37.63 %).
The system was detected in 2905 different species.
![cas](/cas/Distribution_Cas.svg){max-width=750px}
Proportion of genomes encoding the CRISPR-Cas system for the 14 phyla with more than 50 genomes in the RefSeq database.
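The overall detection rate quoted in this section follows directly from the two genome counts. The short sketch below reproduces that arithmetic. The variable names are illustrative only, and the counts are the ones stated in the text.

```python
# Counts taken from the text of this section.
total_genomes = 22803        # complete RefSeq genomes analysed
genomes_with_system = 8581   # genomes in which the system is detected

# Share of genomes encoding the system, as a percentage.
percentage = 100 * genomes_with_system / total_genomes
print(f"{percentage:.2f} %")
```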