Anti-CRISPR
Anti-CRISPR (Acr) is a group of proteins found in phages that inhibit the normal activity of CRISPR-Cas, the immune system of certain bacteria. CRISPR consists of genomic sequences found in prokaryotic organisms; these sequences derive from bacteriophages that previously infected the cell and are used to defend it against further viral attacks. Anti-CRISPR is the result of an evolutionary process in phages that prevents their genomes from being destroyed by the prokaryotic cells they infect.
Before the discovery of this protein family, acquiring mutations was the only known way for phages to avoid CRISPR-Cas-mediated destruction, by reducing the binding affinity between the phage and CRISPR. Nonetheless, bacteria have mechanisms to retarget the mutant bacteriophage, a process called "priming adaptation". As far as researchers currently know, anti-CRISPR is therefore the most effective way to ensure the survival of phages throughout the infection of bacteria.
History
Anti-CRISPR systems were first observed in Pseudomonas aeruginosa prophages, which disabled the type I-F CRISPR–Cas system characteristic of some strains of this bacterium. Analysis of the genomic sequences of these phages revealed genes encoding five different anti-CRISPR proteins: AcrF1, AcrF2, AcrF3, AcrF4 and AcrF5. None of these proteins was found to disrupt either the expression of Cas genes or the assembly of CRISPR molecules, so it was thought that these type I-F anti-CRISPR proteins directly affected CRISPR–Cas interference. Further investigation confirmed this hypothesis with the discovery of four other proteins, which were shown to impede Pseudomonas aeruginosa
Later on, it was seen that phages producing such proteins also encoded a putative transcriptional regulator named Aca1, located very close to the anti-CRISPR genes in the genome. This regulatory protein is thought to be responsible for anti-CRISPR gene expression during the infectious cycle of the phage, so both types of proteins seem to work together as a single mechanism.
Subsequent studies found an amino-acid sequence similar to that of Aca1, leading to the discovery of Aca2, a new family of Aca proteins. Because of their genomic proximity to Aca2, five new groups of type I-F anti-CRISPR proteins were also identified: AcrF6, AcrF7, AcrF8, AcrF9 and AcrF10. These proteins were not only present in Pseudomonas aeruginosa
In 2016, bioinformatic tools led to the discovery of the AcrIIC1, AcrIIC2 and AcrIIC3 protein families in Neisseria meningitidis. These proteins were the first inhibitors of type II CRISPR–Cas to be found. A year later, a study confirmed the presence of type II-A CRISPR–Cas9 inhibitors in Listeria monocytogenes. Two of those proteins were shown to also be effective against the type II-A CRISPR system of Streptococcus pyogenes.
All this research has resulted in the discovery of 21 different anti-CRISPR protein families, although other inhibitors may exist given how quickly phages mutate. More research is therefore needed to unravel the complexity of anti-CRISPR systems.
Types
Anti-CRISPR genes can be found in different parts of the phage DNA: in the capsid and tail regions and at the extreme end. Moreover, many mobile genetic elements (MGEs) have been found to carry two or even three Acr genes in a single operon, which suggests that these genes could have been exchanged between MGEs. Like all proteins, Acr family proteins are produced by transcription and translation of their genes. They are classified according to the type of CRISPR-Cas system they inhibit, because each anti-CRISPR protein inhibits a specific CRISPR-Cas system. Although not many anti-CRISPR proteins have been discovered, these are the ones found so far:
| Anti-CRISPR protein family | Characterized member | CRISPR system inhibited | Number of amino acids |
| --- | --- | --- | --- |
| AcrE1 | JBD5‑34 | I‑E | 100 |
| AcrE2 | JBD88a‑32 | I‑E | 84 |
| AcrE3 | DMS3‑30 | I‑E | 68 |
| AcrE4 | D3112‑31 | I‑E | 52 |
| AcrF1 | JBD30‑35 | I‑F | 78 |
| AcrF2 | D3112‑30 | I‑F | 90 |
| AcrF3 | JBD5‑35 | I‑F | 139 |
| AcrF4 | JBD26‑37 | I‑F | 100 |
| AcrF5 | JBD5‑36 | I‑F | 79 |
| AcrF6 | AcrF6Pae | I‑E and I‑F | 100 |
| AcrF7 | AcrF7Pae | I‑F | 67 |
| AcrF8 | AcrF8ZF40 | I‑F | 92 |
| AcrF9 | AcrF9Vpa | I‑F | 68 |
| AcrF10 | AcrF10Sxi | I‑F | 97 |
| AcrIIA1 | AcrIIA1Lmo | II‑A | 149 |
| AcrIIA2 | AcrIIA2Lmo | II‑A | 123 |
| AcrIIA3 | AcrIIA3Lmo | II‑A | 125 |
| AcrIIA4 | AcrIIA4Lmo | II‑A | 87 |
| AcrIIC1 | AcrIIC1Nme | II‑C | 85 |
| AcrIIC2 | AcrIIC2Nme | II‑C | 123 |
| AcrIIC3 | AcrIIC3Nme | II‑C | 116 |
So far, genes encoding anti-CRISPR proteins have been found in myophages, siphophages, putative conjugative elements and pathogenicity islands.
Attempts have been made to find common genetic features surrounding anti-CRISPR genes, but without success. Nevertheless, an aca gene has been observed immediately downstream of anti-CRISPR genes.
The first Acr protein families to be discovered were AcrF1, AcrF2, AcrF3, AcrF4 and AcrF5. These inhibitors are mainly found in Pseudomonas phages, which are capable of infecting Pseudomonas aeruginosa strains possessing a type I‑F CRISPR–Cas system. Then, in another study, the AcrE1, AcrE2, AcrE3 and AcrE4 protein families were found to inhibit the type I‑E CRISPR–Cas system in Pseudomonas aeruginosa.
Later on, AcrF6, AcrF7, AcrF8, AcrF9 and AcrF10 protein families, which were also able to inhibit type I‑F CRISPR–Cas, were found to be very common in Pseudomonadota MGEs.
The first inhibitors of a type II CRISPR–Cas system were then discovered: AcrIIC1, AcrIIC2 and AcrIIC3, which block the type II‑C CRISPR–Cas9 activity of Neisseria meningitidis.
Finally, AcrIIA1, AcrIIA2, AcrIIA3 and AcrIIA4 were found. These protein families have the ability to inhibit the type II‑A CRISPR–Cas system of Listeria monocytogenes.
Acr family proteins are named as follows: first the type of system inhibited, then a number identifying the protein family, and finally the source of the specific anti-CRISPR protein. For example, AcrF9Vpa is active against the type I-F CRISPR–Cas system, was the ninth anti-CRISPR described for this system, and is encoded in an integrated MGE in a Vibrio parahaemolyticus genome.
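The naming convention can be illustrated with a small parser. This is only a sketch: the function name `parse_acr_name`, the regular expression and the returned fields are illustrative assumptions rather than part of any official nomenclature tool. It also assumes, as in the table above, that names without a Roman numeral (AcrE, AcrF) refer to type I systems.

```python
import re
from typing import NamedTuple


class AcrName(NamedTuple):
    system: str   # CRISPR-Cas type inhibited, e.g. "I-F"
    family: int   # order of description for that system
    source: str   # source abbreviation, empty if absent


# Hypothetical pattern: optional Roman numeral, subtype letter, family number, source tag.
_ACR_PATTERN = re.compile(r"^Acr([IVX]*)([A-Z])(\d+)([A-Za-z0-9]*)$")


def parse_acr_name(name: str) -> AcrName:
    match = _ACR_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognised Acr name: {name!r}")
    numeral, subtype, number, source = match.groups()
    # Assumption: a missing numeral means a type I system (AcrF -> I-F).
    return AcrName(f"{numeral or 'I'}-{subtype}", int(number), source)


print(parse_acr_name("AcrF9Vpa"))
print(parse_acr_name("AcrIIA4Lmo"))
```

Under these assumptions, `AcrF9Vpa` is split into system I-F, family 9 and source Vpa, matching the worked example in the text.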
Structure
As described above, there is a wide spectrum of anti-CRISPR proteins, but few of them have been studied in depth. One of the most studied and best-defined Acrs is AcrIIA4, which inhibits Cas9, thus blocking the II-A CRISPR-Cas system of Streptococcus pyogenes.
AcrIIA4
The structure of the protein was solved using nuclear magnetic resonance (NMR); it contains 87 residues and its molecular weight is 10.182 kDa. AcrIIA4 contains:
- 3 antiparallel β-strands that form a β-sheet. Their 14 residues represent 16.1% of the total number of amino acids.
- 3 α-helices.
- 1 3₁₀ helix placed between the first and second β-strands, which starts at residue 22 and ends at residue 25. The total helical part is composed of 40 residues, which is 50.6% of the protein.
- Loops joining the different secondary structures.
The secondary structures are well defined, as the three α-helices are packed near the three β-strands. Strikingly, between the β3 strand and the α2 and α3 helices there is a hydrophobic core, formed by a cluster of aromatic side chains held together by non-covalent interactions such as pi stacking. Moreover, as it is an acidic protein, there is a high concentration of negatively charged residues in the loops between β3 and α2, between α2 and α3, and in the first part of α3. These residues may play an important role in the inhibition of Cas9, as the negative charges might mimic the phosphates of nucleic acids.
AcrF1
Another Acr, AcrF1, has been studied less extensively than AcrIIA4, although its structure is well described. It inhibits the I-F CRISPR-Cas system of Pseudomonas aeruginosa. Maxwell et al. solved the 3D structure using NMR. The protein contains 78 residues, which interact to form secondary structures. The structure of AcrF1 consists of two anti-parallel α-helices and a β-sheet containing four anti-parallel β-strands. This β-sheet lies on the opposite side from the α-helical part, which creates a hydrophobic core formed of 13 amino acids. Turns can also be found in different parts of the protein, for instance joining the β-strands.
Three surface residues participate actively in the active site of AcrF1: two tyrosines and one glutamic acid. Substituting them with alanine causes a 100-fold decrease in the activity of the protein, and a 10⁷-fold decrease when Y6 is mutated.
The combination of structural elements in the protein appears to be unusual: Maxwell et al. conducted a DALI search for structurally similar proteins and found no informative similarities.
Function
Avoiding destruction of the phage DNA
The principal function of anti-CRISPR proteins is to interact with specific components of CRISPR-Cas systems, such as the effector nucleases, to avoid the destruction of the phage DNA. When a phage introduces its DNA into a prokaryotic cell, the cell usually detects a sequence known as the "target", which activates the CRISPR-Cas immune system. However, the presence of an initial sequence encoding Acr proteins prevents destruction of the phage: Acr proteins are produced before the target sequence is read, so the CRISPR-Cas system is blocked before it can mount a response.
The process starts with the CRISPR locus being transcribed into crRNAs. The crRNAs combine with Cas proteins to form a ribonucleoprotein complex called Cascade. This complex surveys the cell for sequences complementary to the crRNA. When such a sequence is found, the Cas3 nuclease is recruited to Cascade and the target DNA from the phage is cleaved. When AcrF1 and AcrF2 are present, however, they interact with Cas7f and Cas8f-Cas5f, respectively, and prevent binding to the phage DNA. Moreover, cleavage of the target is prevented by the binding of AcrF3 to Cas3.
The majority of Acr genes are located next to anti-CRISPR-associated (aca) genes, which encode proteins with a helix-turn-helix DNA-binding motif. Aca genes are conserved, and researchers are using them to identify Acr genes, but the function of the proteins they encode is not entirely clear. The Acr-associated promoter drives high levels of Acr transcription just after the phage DNA is injected into the bacterium; afterward, Aca proteins repress this transcription. Without this repression, the constant transcription of the gene would be lethal to the phage. Aca activity is therefore essential to ensure the survival of the phage.
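The type I-F interference steps and the points at which AcrF1, AcrF2 and AcrF3 act can be summarised as a simple lookup. The sketch below is a toy illustration only: the names `ACR_TARGETS`, `STEPS` and `interference_outcome` are hypothetical, and the model ignores dosage, timing and every Acr not mentioned in the paragraph above.

```python
# Acr protein -> (Cas component bound, interference step blocked), as described in the text.
ACR_TARGETS = {
    "AcrF1": ("Cas7f", "DNA binding"),
    "AcrF2": ("Cas8f-Cas5f", "DNA binding"),
    "AcrF3": ("Cas3", "DNA cleavage"),
}

# Order in which the steps occur: Cascade binds the target, then Cas3 cleaves it.
STEPS = ["DNA binding", "DNA cleavage"]


def interference_outcome(acrs_present):
    """Return the first blocked step, or report that the phage DNA is cleaved."""
    for step in STEPS:
        blockers = sorted(
            acr for acr in acrs_present
            if acr in ACR_TARGETS and ACR_TARGETS[acr][1] == step
        )
        if blockers:
            return f"blocked at {step} by {', '.join(blockers)}"
    return "phage DNA cleaved"


print(interference_outcome([]))
print(interference_outcome(["AcrF3"]))
print(interference_outcome(["AcrF3", "AcrF1"]))
```

The earliest blocked step is reported because, in this simplified view, Cas3 is never recruited if Cascade cannot bind the target.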
Phage-phage cooperation
Moreover, it has been verified that bacteria with CRISPR-Cas systems remain partially immune to phages carrying Acr. Consequently, initial phage infections may be aborted without overcoming CRISPR immunity, but phage-phage cooperation can progressively boost Acr production and promote immunosuppression. This may increase the vulnerability of the host cell to reinfection and finally allow a second phage to infect successfully and spread. This cooperation creates an epidemiological tipping point at which, depending on the initial density of Acr-phages and the strength of CRISPR/Acr binding, phages can either be eliminated or give rise to a phage epidemic. If the starting levels of phages are high enough, the density of immunosuppressed hosts reaches a critical point where there are more successful infections than unsuccessful ones, and an epidemic begins. If this point is not reached, phage extinction occurs and immunosuppressed hosts recover their initial state.
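The tipping-point behaviour can be illustrated with a deliberately simplified discrete-time model. Everything in this sketch is an assumption made for illustration: the function `simulate`, its update rules and all parameter values (`a`, `d`, `b`, `r`) are hypothetical and are not taken from any published model. It only shows how the outcome can depend on the initial phage density.

```python
def simulate(p0, steps=100, a=0.1, d=0.1, b=5.0, r=0.1):
    """Classify the outcome for an initial phage density p0.

    a: adsorption rate, d: phage decay rate, b: burst size from a
    successful infection, r: recovery rate of immunosuppressed hosts.
    All values are illustrative assumptions.
    """
    p = p0    # phage density
    s = 0.0   # fraction of hosts that are currently immunosuppressed
    for _ in range(steps):
        contacted = min(1.0, a * p)  # fraction of hosts infected this step
        # Phages are lost to adsorption and decay; only infections of
        # immunosuppressed hosts succeed and release new phages.
        new_p = p * (1.0 - a - d) + b * a * p * s
        # Failed infections leave the host immunosuppressed; hosts recover at rate r.
        new_s = s + contacted * (1.0 - s) - r * s
        p, s = new_p, new_s
    return "epidemic" if p > p0 else "extinction"


print(simulate(0.01))  # low starting density
print(simulate(10.0))  # high starting density
```

With these assumed parameters, phages spread only once more than 40% of hosts are immunosuppressed ((a + d) / (a · b) = 0.4), so a low starting density dies out before reaching that point while a high one crosses it. The strength of CRISPR/Acr binding mentioned in the text is not modelled here.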
Phage immune evasion
It has become clear that Acr proteins play an important role in allowing phage immune evasion, though it is still unclear how the synthesis of anti-CRISPR proteins can overcome the host's CRISPR-Cas system, which can destroy the phage genome within minutes of infection.
Mechanisms
Of all the anti-CRISPR proteins discovered so far, mechanisms have been described for only 15. These mechanisms can be divided into three types: crRNA loading interference, DNA binding blockage and DNA cleavage prevention.
CrRNA loading interference
The crRNA loading interference mechanism has mainly been associated with the AcrIIC2 protein family. To block Cas9 activity, it prevents the correct assembly of the crRNA-Cas9 complex.
DNA binding blockage
AcrIIC2 has been shown not to be the only protein capable of blocking DNA binding: 11 other Acr family proteins can also do so. Among them are AcrIF1, AcrIF2 and AcrIF10, which act on different subunits of the Cascade effector complex of the type I-F CRISPR-Cas system and prevent DNA from binding to the complex. Furthermore, AcrIIC3 prevents DNA binding by promoting dimerization of Cas9, and AcrIIA2 mimics DNA, thereby blocking the PAM recognition residues and consequently preventing dsDNA recognition and binding.