16S ribosomal RNA
16S ribosomal RNA is the RNA component of the 30S subunit of a prokaryotic ribosome. It binds to the Shine-Dalgarno sequence and provides most of the SSU structure.
The genes coding for it are referred to as 16S rRNA genes and are used in reconstructing phylogenies, due to the slow rates of evolution of this region of the gene. Carl Woese and George E. Fox were two of the people who pioneered the use of 16S rRNA in phylogenetics in 1977. Multiple sequences of the 16S rRNA gene can exist within a single bacterium.
Terminology
The descriptor 16S refers to the size of these ribosomal subunits as reflected indirectly by the speed at which they sediment when samples are centrifuged. Thus 16S means 16 Svedberg units.
Functions
- Like the large ribosomal RNA, it has a structural role, acting as a scaffold defining the positions of the ribosomal proteins.
- The 3′-end contains the anti-Shine-Dalgarno sequence, which binds upstream of the AUG start codon on the mRNA. The 3′-end of 16S RNA also binds to the proteins S1 and S21, which are known to be involved in the initiation of protein synthesis.
- Interacts with 23S, aiding in the binding of the two ribosomal subunits
- Stabilizes correct codon-anticodon pairing in the A-site by forming a hydrogen bond between the N1 atom of adenine residues 1492 and 1493 and the 2′-OH group of the mRNA backbone.
Universal primers
The most common primer pair was devised by Weisburg et al. and is currently referred to as 27F and 1492R. Some applications, however, require shorter amplicons; for example, 454 sequencing with Titanium chemistry can use the primer pair 27F-534R, which covers V1 to V3.
Often 8F is used rather than 27F. The two primers are almost identical, but 27F has an M (the IUPAC code for A or C) where 8F has a C: AGAGTTTGATCMTGGCTCAG for 27F, compared with AGAGTTTGATCCTGGCTCAG for 8F.
| Primer name | Sequence |
| 8F | AGA GTT TGA TCC TGG CTC AG |
| 27F | AGA GTT TGA TCM TGG CTC AG |
| 336R | ACT GCT GCS YCC CGT AGG AGT CT |
| 337F | GAC TCC TAC GGG AGG CWG CAG |
| 518R | GTA TTA CCG CGG CTG CTG G |
| 533F | GTG CCA GCM GCC GCG GTA A |
| 785F | GGA TTA GAT ACC CTG GTA |
| 806R | GGA CTA CVS GGG TAT CTA AT |
| 907R | CCG TCA ATT CCT TTR AGT TT |
| 928F | TAA AAC TYA AAK GAA TTG ACG GG |
| 1100F | YAA CGA GCG CAA CCC |
| 1100R | GGG TTG CGC TCG TTG |
| U1492R | GGT TAC CTT GTT ACG ACT T |
| 1492R | CGG TTA CCT TGT TAC GAC TT |
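Several primers in the table contain IUPAC degenerate codes such as M (A or C), so each one stands for a small pool of concrete sequences. The following is a minimal Python sketch of how such a primer could be turned into a regular expression and how its degeneracy could be counted. The function names `primer_to_regex` and `degeneracy` are hypothetical helpers invented for this illustration, and the sketch only handles exact matches on one strand.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def primer_to_regex(primer: str) -> re.Pattern:
    """Compile a primer written with IUPAC codes into a regex.

    Hypothetical helper: spaces are ignored, unambiguous bases are kept
    as literals, and degenerate codes become character classes.
    """
    parts = []
    for code in primer.replace(" ", "").upper():
        options = IUPAC[code]
        parts.append(options if len(options) == 1 else f"[{options}]")
    return re.compile("".join(parts))


def degeneracy(primer: str) -> int:
    """Count how many concrete sequences a degenerate primer represents."""
    count = 1
    for code in primer.replace(" ", "").upper():
        count *= len(IUPAC[code])
    return count


# Sequences copied from the primer table.
PRIMER_8F = "AGA GTT TGA TCC TGG CTC AG"
PRIMER_27F = "AGA GTT TGA TCM TGG CTC AG"

pattern_27f = primer_to_regex(PRIMER_27F)

# 27F covers the 8F sequence because M matches both A and C.
covers_8f = pattern_27f.fullmatch(PRIMER_8F.replace(" ", "")) is not None
```

Under these assumptions, 27F expands to two concrete sequences, one of which is identical to 8F, which is why the two primers are often treated as interchangeable.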
PCR and NGS applications
In addition to highly conserved primer binding sites, 16S rRNA gene sequences contain hypervariable regions that can provide species-specific signature sequences useful for identification of bacteria. As a result, 16S rRNA gene sequencing has become prevalent in medical microbiology as a rapid and cheap alternative to phenotypic methods of bacterial identification. Although it was originally used to identify bacteria, 16S sequencing was subsequently found to be capable of reclassifying bacteria into completely new species, or even genera.
It has also been used to describe new species that have never been successfully cultured.
With third-generation sequencing coming to many labs, simultaneous identification of thousands of 16S rRNA sequences is possible within hours, allowing metagenomic studies, for example of gut flora. In samples collected from patients with confirmed infections, 16S rRNA next-generation sequencing demonstrated enhanced detection in 40% of cases compared to traditional culture methods; moreover, pre-sampling antibiotic consumption did not significantly affect the sensitivity of 16S NGS.
Hypervariable regions
The bacterial 16S gene contains nine hypervariable regions, ranging from about 30 to 100 base pairs long, that are involved in the secondary structure of the small ribosomal subunit. The degree of conservation varies widely between hypervariable regions, with more conserved regions correlating to higher-level taxonomy and less conserved regions to lower levels, such as genus and species. While the entire 16S sequence allows for comparison of all hypervariable regions, at approximately 1,500 base pairs long it can be prohibitively expensive for studies seeking to identify or characterize diverse bacterial communities. These studies commonly utilize the Illumina platform, which produces reads at rates 50-fold and 12,000-fold less expensive than 454 pyrosequencing and Sanger sequencing, respectively. While cheaper and allowing for deeper community coverage, Illumina sequencing only produces reads 75–250 base pairs long, and has no established protocol for reliably assembling the full gene in community samples. Full hypervariable regions can be assembled from a single Illumina run, however, making them ideal targets for the platform.
While 16S hypervariable regions can vary dramatically between bacteria, the 16S gene as a whole maintains greater length homogeneity than its eukaryotic counterpart, which can make alignments easier. Additionally, the 16S gene contains highly conserved sequences between hypervariable regions, enabling the design of universal primers that can reliably produce the same sections of the 16S sequence across different taxa. Although no hypervariable region can accurately classify all bacteria from domain to species, some can reliably predict specific taxonomic levels. Many community studies select semi-conserved hypervariable regions like the V4 for this reason, as it can provide resolution at the phylum level as accurately as the full 16S gene.
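The idea that conserved primer sites bracket a variable stretch can be illustrated with a small in-silico extraction. The Python sketch below locates a forward primer and the reverse complement of a reverse primer on a template and returns everything between and including them. It is an illustration only: the helper names are hypothetical, the template is a made-up toy sequence rather than a real 16S gene, only the given strand is searched, and no mismatches are tolerated. The 533F and 806R sequences are taken from the primer table in this article.

```python
import re

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Complement table that also covers the degenerate codes.
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq: str) -> str:
    """Reverse-complement a sequence that may contain IUPAC codes."""
    return seq.replace(" ", "").upper().translate(COMPLEMENT)[::-1]


def to_regex(primer: str) -> str:
    """Turn each IUPAC code into a regex character class."""
    return "".join(f"[{IUPAC[c]}]" for c in primer.replace(" ", "").upper())


def extract_amplicon(template: str, forward: str, reverse: str):
    """Return the template stretch from the forward primer site through
    the reverse primer site (hypothetical helper), or None if either
    site is missing."""
    pattern = re.compile(
        to_regex(forward) + ".*?" + to_regex(reverse_complement(reverse))
    )
    match = pattern.search(template.upper())
    return match.group(0) if match else None


PRIMER_533F = "GTG CCA GCM GCC GCG GTA A"
PRIMER_806R = "GGA CTA CVS GGG TAT CTA AT"

# Toy template, invented for this example: flanks + primer sites + insert.
FORWARD_SITE = "GTGCCAGCAGCCGCGGTAA"    # 533F with M resolved to A
INSERT = "ACGTACGTAC"                    # stand-in for a variable region
REVERSE_SITE = "ATTAGATACCCGGGTAGTCC"   # one sequence matched by revcomp(806R)
template = "TTTT" + FORWARD_SITE + INSERT + REVERSE_SITE + "AAAA"

amplicon = extract_amplicon(template, PRIMER_533F, PRIMER_806R)
```

Because the primer sites are written as character classes, the same two primers would pick out the corresponding stretch from any template that carries any sequence covered by the degenerate codes, which is the property that makes such primers usable across taxa.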
While lesser-conserved regions struggle to classify new species when higher order taxonomy is unknown, they are often used to detect the presence of specific pathogens. In one study by Chakravorty et al. in 2007, the authors characterized the V1–V8 regions of a variety of pathogens in order to determine which hypervariable regions would be most useful to include for disease-specific and broad assays. Amongst other findings, they noted that the V3 region was best at identifying the genus for all pathogens tested, and that V6 was the most accurate at differentiating species between all CDC-watched pathogens tested, including anthrax.
While 16S hypervariable region analysis is a powerful tool for bacterial taxonomic studies, it struggles to differentiate between closely related species. In the families Enterobacteriaceae, Clostridiaceae, and Peptostreptococcaceae, species can share up to 99% sequence similarity across the full 16S gene. As a result, the V4 sequences can differ by only a few nucleotides, leaving reference databases unable to reliably classify these bacteria at lower taxonomic levels. By limiting 16S analysis to select hypervariable regions, these studies can fail to observe differences in closely related taxa and group them into single taxonomic units, therefore underestimating the total diversity of the sample. Furthermore, bacterial genomes can house multiple 16S genes, with the V1, V2, and V6 regions containing the greatest intraspecies diversity. While not the most precise method of classifying bacterial species, analysis of the hypervariable regions remains one of the most useful tools available to bacterial community studies.
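The loss of resolution described above can be sketched with a simple percent-identity calculation. In the Python example below, two toy sequences differ at a single position, giving 99% identity over their full length, yet they are indistinguishable inside a chosen sub-region. The sequences, the region boundaries, and the function name `percent_identity` are all invented for this illustration; the sketch assumes the sequences are already aligned to equal length and treats every differing column, including gaps, as a mismatch.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of identical positions between two pre-aligned,
    equal-length sequences (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Two made-up 100-nucleotide sequences that differ only at index 10.
seq_a = "ACGT" * 25
seq_b = seq_a[:10] + "T" + seq_a[11:]

# Identity across the full toy sequence.
full_identity = percent_identity(seq_a, seq_b)

# Identity inside an arbitrary window standing in for one
# hypervariable region; the differing position lies outside it.
region = slice(40, 60)
region_identity = percent_identity(seq_a[region], seq_b[region])
```

Here the full-length comparison still separates the two sequences, while the region-only comparison would merge them into one taxonomic unit, mirroring how region-restricted studies can underestimate diversity.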