Promoter (genetics)


In genetics, a promoter is a sequence of DNA to which proteins bind to initiate transcription of a single RNA transcript from the DNA downstream of the promoter. The RNA transcript may encode a protein, or can have a function in and of itself, such as tRNA or rRNA. Promoters are located near the transcription start sites of genes, upstream on the DNA.
Promoters can be about 100–1000 base pairs long. Their sequence depends strongly on the gene and its transcription product, the type or class of RNA polymerase recruited to the site, and the species of organism.

Overview

For transcription to take place, the enzyme that synthesizes RNA, known as RNA polymerase, must attach to the DNA near a gene. Promoters contain specific DNA sequences such as response elements that provide a secure initial binding site for RNA polymerase and for proteins called transcription factors that recruit RNA polymerase. These transcription factors have specific activator or repressor sequences of corresponding nucleotides that attach to specific promoters and regulate gene expression.
  • In bacteria: The promoter is recognized by RNA polymerase and an associated sigma factor, which in turn are often brought to the promoter DNA by an activator protein's binding to its own DNA binding site nearby.
  • In eukaryotes: The process is more complicated, and at least seven different factors are necessary for the binding of an RNA polymerase II to the promoter.
  • In archaea: The promoter resembles a eukaryotic one, though it is much simpler. It contains BRE and TATA elements, which are recognized by TFB and TBP, respectively.
Promoters represent critical elements that can work in concert with other regulatory regions to direct the level of transcription of a given gene.
A promoter is induced in response to changes in abundance or conformation of regulatory proteins in a cell, which enable activating transcription factors to recruit RNA polymerase.
Given the short sequences of most promoter elements, promoters can rapidly evolve from random sequences. For instance, in E. coli, ~60% of random sequences can evolve expression levels comparable to the wild-type lac promoter with only one mutation, and ~10% of random sequences can serve as active promoters even without evolution.

Identification of relative location

As promoters are typically immediately adjacent to the gene in question, positions in the promoter are designated relative to the transcriptional start site, where transcription of DNA begins for a particular gene.
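The numbering scheme can be made concrete with a small sketch. It assumes the widely used convention, not spelled out above, that the transcription start site is numbered +1 and the base immediately upstream is -1, with no position 0. The function names and the 0-based `tss_index` argument are illustrative only.

```python
def promoter_coordinate(index: int, tss_index: int) -> int:
    """Convert a 0-based sequence index to a position relative to the TSS.

    Assumed convention: the TSS is +1, the base just upstream is -1,
    and position 0 does not exist.
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def sequence_index(position: int, tss_index: int) -> int:
    """Inverse mapping: TSS-relative position back to a 0-based index."""
    if position == 0:
        raise ValueError("position 0 is not used; the TSS is +1")
    return tss_index + (position - 1 if position > 0 else position)
```

With the start site at index 50 of a sequence, for example, `sequence_index(-35, 50)` gives index 15, and `promoter_coordinate(15, 50)` maps it back to -35.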

Elements

Bacterial

In bacteria, the promoter contains two short sequence elements approximately 10 and 35 nucleotides upstream from the transcription start site.
  • The sequence at -10 has the consensus sequence TATAAT.
  • The sequence at -35 has the consensus sequence TTGACA.
  • The above consensus sequences, while conserved on average, are not found intact in most promoters. On average, only 3 to 4 of the 6 base pairs in each consensus sequence are found in any given promoter. Few natural promoters have been identified to date that possess intact consensus sequences at both the -10 and -35; artificial promoters with complete conservation of the -10 and -35 elements have been found to transcribe at lower frequencies than those with a few mismatches with the consensus.
  • The optimal spacing between the -35 and -10 sequences is 17 bp. The spacer sequence affects promoter strength by up to 600-fold.
  • Some promoters contain one or more upstream promoter element subsites.
  • The transcription start site has the consensus sequence YRY.
The above promoter sequences are recognized only by RNA polymerase holoenzyme containing sigma-70. RNA polymerase holoenzymes containing other sigma factors recognize different core promoter sequences.
             ← upstream          downstream →
5'-XXXXXXXPPPPPPXXXXXXPPPPPPXXXXGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGXXXX-3'
          -35         -10       Gene to be transcribed
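As a rough illustration of how the sigma-70 elements described above might be located in a sequence, the following sketch counts matches to the two consensus hexamers over a window of spacer lengths. The scoring scheme, the function names, and the 15–19 bp spacer window are assumptions made for this example. It ignores the spacer sequence and upstream promoter elements and is not a validated promoter predictor.

```python
MINUS_35 = "TTGACA"   # consensus at -35, from the text
MINUS_10 = "TATAAT"   # consensus at -10, from the text
OPTIMAL_SPACER = 17   # optimal spacing in bp, from the text


def matches(hexamer: str, consensus: str) -> int:
    """Count positions at which a hexamer agrees with the consensus."""
    return sum(a == b for a, b in zip(hexamer, consensus))


def scan(seq: str, spacer_range=range(15, 20)):
    """List candidate sites as (score, start_of_minus35, spacer), best first.

    score is the number of consensus matches across both hexamers (0-12).
    spacer_range is a hypothetical search window, not a value from the text.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 6 + 1):
        for spacer in spacer_range:
            j = i + 6 + spacer          # start of the candidate -10 hexamer
            if j + 6 > len(seq):
                break                   # longer spacers would also run off the end
            score = matches(seq[i:i + 6], MINUS_35) + matches(seq[j:j + 6], MINUS_10)
            hits.append((score, i, spacer))
    # Highest score first; ties broken by closeness to the 17 bp spacing.
    hits.sort(key=lambda h: (-h[0], abs(h[2] - OPTIMAL_SPACER), h[1]))
    return hits
```

Because most natural promoters match only part of each consensus, a practical scan would keep all candidates above some threshold rather than only perfect matches.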

Probability of occurrence of each nucleotide

-10 sequence:   T     A     T     A     A     T
                77%   76%   60%   61%   56%   82%

-35 sequence:   T     T     G     A     C     A
                69%   79%   61%   56%   54%   54%
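The per-position frequencies can be related to the number of consensus matches expected in a typical promoter. The sketch below sums the frequencies to obtain the expected number of matches per hexamer, and multiplies them to estimate how often a hexamer would match the consensus perfectly. The second calculation assumes that the positions vary independently, which is a simplification introduced here and not a claim from the table.

```python
import math

# Frequency of the consensus base at each position, from the table above.
MINUS_10_FREQ = [0.77, 0.76, 0.60, 0.61, 0.56, 0.82]  # T A T A A T
MINUS_35_FREQ = [0.69, 0.79, 0.61, 0.56, 0.54, 0.54]  # T T G A C A


def expected_matches(freq):
    """Expected number of consensus matches (linearity of expectation)."""
    return sum(freq)


def perfect_match_probability(freq):
    """Chance that all six positions match.

    ASSUMPTION: positions are treated as independent.
    """
    return math.prod(freq)


for name, freq in (("-10", MINUS_10_FREQ), ("-35", MINUS_35_FREQ)):
    print(name, round(expected_matches(freq), 2),
          round(perfect_match_probability(freq), 3))
```

Under these assumptions the expected number of matches is about 4.1 for the -10 element and about 3.7 for the -35 element, broadly in line with the 3 to 4 matches per hexamer quoted earlier. A perfect match would be expected in roughly 10% and 5% of promoters, respectively.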

Bidirectional (prokaryotic)

Promoters can be located very close to one another in the DNA. Such "closely spaced promoters" have been observed in the DNA of all life forms, from humans to prokaryotes, and are highly conserved, which suggests that they may provide some advantage.
These pairs of promoters can be positioned in divergent, tandem, and convergent directions. They can also be regulated by transcription factors and differ in various features, such as the nucleotide distance between them, the two promoter strengths, etc.
The most important aspect of two closely spaced promoters is that they will, most likely, interfere with each other. Several studies have explored this using both analytical and stochastic models. There are also studies that measured gene expression in synthetic genes or from one to a few genes controlled by bidirectional promoters.
More recently, one study measured most genes controlled by tandem promoters in E. coli. In that study, two main forms of interference were measured. One is when an RNAP is on the downstream promoter, blocking the movement of RNAPs elongating from the upstream promoter. The other is when the two promoters are so close that when an RNAP sits on one of the promoters, it blocks any other RNAP from reaching the other promoter. These events are possible because the RNAP occupies several nucleotides when bound to the DNA, including in transcription start sites.
Similar events occur when the promoters are in divergent and convergent formations. The possible events also depend on the distance between them.
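The divergent, tandem, and convergent arrangements described above can be told apart from the positions and strands of the two transcription start sites. The sketch below does this, and adds a deliberately crude occlusion check for tandem pairs. The function names are illustrative, and `footprint_bp`, the length of DNA covered by a bound RNAP, is a hypothetical parameter that the caller must supply. No value for it is given in the text.

```python
def classify_pair(tss_a: int, strand_a: str, tss_b: int, strand_b: str) -> str:
    """Classify two promoters as 'tandem', 'divergent' or 'convergent'.

    Coordinates are positions on the forward strand; strands are '+' or '-'.
    """
    for strand in (strand_a, strand_b):
        if strand not in ("+", "-"):
            raise ValueError("strand must be '+' or '-'")
    if strand_a == strand_b:
        return "tandem"          # both transcribe in the same direction
    if tss_a == tss_b:
        raise ValueError("opposite-strand promoters need distinct coordinates")
    # Look at the promoter with the smaller coordinate (the 'left' one).
    left_strand = strand_a if tss_a < tss_b else strand_b
    # A left promoter on '+' transcribes toward the other one: convergent.
    return "convergent" if left_strand == "+" else "divergent"


def tandem_footprints_overlap(tss_a: int, tss_b: int, footprint_bp: int) -> bool:
    """Crude occlusion test for two promoters on the same strand.

    ASSUMPTION: each bound RNAP covers footprint_bp bases, placed identically
    relative to its own start site, so two footprints overlap whenever the
    start sites are fewer than footprint_bp bases apart.
    """
    return abs(tss_a - tss_b) < footprint_bp
```

This captures only the geometry of occlusion. The other interference mechanism, in which an RNAP bound at the downstream promoter blocks polymerases elongating from the upstream one, depends on binding and elongation kinetics and would need a stochastic model.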

Eukaryotic

Gene promoters are typically located upstream of the gene and can have regulatory elements several kilobases away from the transcriptional start site. In eukaryotes, the transcriptional complex can cause the DNA to bend back on itself, which allows for placement of regulatory sequences far from the actual site of transcription. Eukaryotic RNA-polymerase-II-dependent promoters can contain a TATA box, which is recognized by the general transcription factor TATA-binding protein (TBP), and a B recognition element (BRE), which is recognized by the general transcription factor TFIIB. The TATA element and BRE typically are located close to the transcriptional start site.
Eukaryotic promoter regulatory sequences typically bind proteins called transcription factors that are involved in the formation of the transcriptional complex. An example is the E-box, which binds transcription factors in the basic helix-loop-helix family. Some promoters that are targeted by multiple transcription factors might achieve a hyperactive state, leading to increased transcriptional activity.
  • Core promoter – the minimal portion of the promoter required to properly initiate transcription
      • Includes the transcription start site and elements directly upstream
      • A binding site for RNA polymerase
          • RNA polymerase I: transcribes genes encoding 18S, 5.8S and 28S ribosomal RNAs
          • RNA polymerase II: transcribes genes encoding messenger RNA and certain small nuclear RNAs and microRNA
          • RNA polymerase III: transcribes genes encoding transfer RNA, 5S ribosomal RNAs and other small RNAs
      • General transcription factor binding sites, e.g. TATA box, B recognition element.
      • Many other elements/motifs may be present. There is no such thing as a set of "universal elements" found in every core promoter.
  • Proximal promoter – the proximal sequence upstream of the gene that tends to contain primary regulatory elements
      • Approximately 250 base pairs upstream of the start site
      • Specific transcription factor binding sites
  • Distal promoter – the distal sequence upstream of the gene that may contain additional regulatory elements, often with a weaker influence than the proximal promoter
      • Anything further upstream
      • Specific transcription factor binding sites
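The three regions listed above can be expressed as a simple lookup on position relative to the transcription start site. In this sketch the 250 bp proximal boundary comes from the list, while the 40 bp core boundary is a placeholder chosen purely for illustration, since the list does not give a size for the core promoter. The function name and the label for downstream positions are also assumptions.

```python
def promoter_region(position: int,
                    core_upstream: int = 40,       # HYPOTHETICAL boundary
                    proximal_upstream: int = 250   # approximate, from the list
                    ) -> str:
    """Name the promoter region containing a TSS-relative position.

    Positions are negative upstream of the start site, which is +1;
    position 0 is not used.
    """
    if position == 0:
        raise ValueError("position 0 is not used; the TSS is +1")
    if position > 1:
        return "downstream"      # beyond the start site, outside this scheme
    if position >= -core_upstream:
        return "core"            # start site and elements directly upstream
    if position >= -proximal_upstream:
        return "proximal"
    return "distal"              # anything further upstream
```

The boundaries are not sharp in practice, so the cut-offs are best treated as adjustable parameters.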

    Mammalian promoters

Up-regulated expression of genes in mammals is initiated when signals are transmitted to the promoters associated with the genes. Promoter DNA sequences may include different elements such as CpG islands, a TATA box, an initiator (Inr), upstream and downstream TFIIB recognition elements (BREu and BREd), and a downstream core promoter element (DPE). The presence of multiple methylated CpG sites in CpG islands of promoters causes stable silencing of genes. However, the presence or absence of the other elements has relatively small effects on gene expression in experiments. Two sequences, the TATA box and Inr, caused small but significant increases in expression. The BREu and the BREd elements significantly decreased expression by 35% and 20%, respectively, and the DPE element had no detected effect on expression.
Cis-regulatory modules that are localized in DNA regions distant from the promoters of genes can have very large effects on gene expression, with some genes undergoing up to 100-fold increased expression due to such a cis-regulatory module. These cis-regulatory modules include enhancers, silencers, insulators and tethering elements. Among this constellation of elements, enhancers and their associated transcription factors have a leading role in the regulation of gene expression.
Enhancers are regions of the genome that are major gene-regulatory elements. Enhancers control cell-type-specific gene expression programs, most often by looping through long distances to come into physical proximity with the promoters of their target genes. In a study of brain cortical neurons, 24,937 loops were found, bringing enhancers to promoters. Multiple enhancers, each often tens or hundreds of thousands of nucleotides distant from their target genes, loop to their target gene promoters and coordinate with each other to control expression of their common target gene.
The schematic illustration in this section shows an enhancer looping around to come into close physical proximity with the promoter of a target gene. The loop is stabilized by a dimer of a connector protein, with one member of the dimer anchored to its binding motif on the enhancer and the other member anchored to its binding motif on the promoter. Several cell-function-specific transcription factors generally bind to specific motifs on an enhancer, and a small combination of these enhancer-bound transcription factors, when brought close to a promoter by a DNA loop, governs the level of transcription of the target gene. Mediator communicates regulatory signals from enhancer DNA-bound transcription factors directly to the RNA polymerase II enzyme bound to the promoter.
Enhancers, when active, are generally transcribed from both strands of DNA with RNA polymerases acting in two different directions, producing two enhancer RNAs (eRNAs) as illustrated in the figure. An inactive enhancer may be bound by an inactive transcription factor. Phosphorylation of the transcription factor may activate it, and that activated transcription factor may then activate the enhancer to which it is bound. An activated enhancer begins transcription of its RNA before activating a promoter to initiate transcription of messenger RNA from its target gene.