TALE-likes
Transcription Activator-Like Effector-Likes are a group of bacterial DNA binding proteins named for the first and still best-studied group, the TALEs of Xanthomonas bacteria. TALEs are important factors in the plant diseases caused by Xanthomonas bacteria, but are known primarily for their role in biotechnology as programmable DNA binding proteins, particularly in the context of TALE nucleases. TALE-likes have additionally been found in many strains of the Ralstonia solanacearum bacterial species complex, in Paraburkholderia rhizoxinica strain HKI 454, and in two unknown marine bacteria. Whether or not all these proteins form a single phylogenetic grouping is as yet unclear.
The unifying feature of the TALE-likes are their tandem arrays of DNA binding repeats. These repeats are, with few exceptions, 33-35 amino acids in length, and composed of two alpha-helices on either side of a flexible loop containing the DNA base binding residues and with neighbouring repeats joined by flexible linker loops. Evidence for this common structure comes in part from solved crystal structures of TALEs and a Burkholderia TALE-like, but also from the conservation of the code that all TALE-likes use to recognise DNA-sequences. In fact, TALE, RipTAL, and BAT repeats can be mixed and matched to generate functional DNA-binding proteins with varying affinity.
TALEs
TALEs are the first identified, best-studied and largest group within the TALE-likes. TALEs are found throughout the bacterial genus Xanthomonas, comprising mostly plant pathogens. Those TALEs which have been studied have all been shown to be secreted as part of the Type III secretion system into host plant cells. Once inside the host cell they translocate to the nucleus, bind specific DNA sequences within host promoters and turn on downstream genes. Every part of this process is thought to be conserved across all TALEs. The single meaningful difference between individual TALEs, based on current understanding, is the specific DNA sequence that each TALE binds. TALEs from even closely related strains differ in the composition of repeats that make up their DNA binding domain. Repeat composition determines DNA binding preference. In particular position 13 of each repeat confers the DNA base preference of each repeat. During early research it was noted that almost all the differences between repeats of a single TALE repeat array are found in positions 12 and 13 and this finding led to the hypothesis that these residues determine base preference. In fact repeat positions 12 and 13, referred to jointly as the Repeat Variable Diresidue are commonly said to confer base specificity despite clear evidence that position 13 is the base determining residue. In addition to the repeat domain TALEs also possess a number of conserved features in the domains flanking the repeats. These include domains for type-III-secretion, nuclear localization and transcriptional activation. This allows TALEs to carry out their biological role as effector proteins secreted into host plant cells to activate expression of specific host genes.Diversity and evolution
Whilst the RVD positions are commonly the only variable positions within a single TALE repeat array, there are more differences when comparing repeat arrays of different TALEs. The diversity of TALEs across the Xanthomonas genus is considerable, but a particularly striking finding is that the evolutionary history one arrives at by comparing repeat compositions differs from that found when comparing non-repeat sequences. Repeat arrays of TALEs are thought to evolve rapidly, with a number of recombinatorial processes suggested to shape repeat array evolution. Recombination of TALE repeat arrays has been demonstrated in a forced-selection experiment. This evolutionary dynamism is thought to be made possible by the very high sequence identity of TALE repeats, which is a unique feature of TALEs as opposed to other TALE-likes.T-zero
Another unique feature of TALEs is a set of four repeat structures at the N-terminal flank of the core repeat array. These structures, termed non-canonical or degenerate repeats have been shown to be vital for DNA binding, though all but one do not contact DNA bases and thus make no contribution to sequence preference. The one exception is repeat -1, which encodes a fixed T-zero preference to all TALEs. This means that the target sequences of TALEs are always preceded by a thymine base. This is thought to be common to all TALEs, with the possible exception of TalC from Xanthomonas oryzae pv. oryzae strain AXO1947.RipTALs
Discovery and molecular properties
It was noted in the 2002 publication of the genome of reference strain Ralstonia solanacearum GMI1000 that its genome encodes a protein similar to Xanthomonas TALEs. Based on similar domain structure and repeat sequences it was presumed that this gene and homologs in other Ralstonia strains would encode proteins with the same molecular properties as TALEs, including sequence-specific DNA binding. In 2013 this was confirmed by two studies. These genes and the proteins they encode are referred to as RipTALs in line with the standard nomenclature of Ralstonia effectors. Whilst the DNA binding code of the core repeats is conserved with TALEs, RipTALs do not share the T-zero preference, instead they have a strict G-zero requirement. In addition repeats within a single RipTAL repeat array have multiple sequence differences beyond the RVD positions, unlike the near-identical repeats of TALEs.RipTALs have been found in all four phylotypes of R. solanacearum, making it an ancestral feature of this clade. Despite differences in the flanking domains, the sequences their RVDs target are highly similar.