Gene expression


Gene expression is the process by which the information contained within a gene is used to produce a functional gene product, such as a protein or a functional RNA molecule. This process involves multiple steps, including the transcription of the gene's sequence into RNA. For protein-coding genes, this RNA is further translated into a chain of amino acids that folds into a protein, while for non-coding genes, the resulting RNA itself serves a functional role in the cell. Gene expression enables cells to utilize the genetic information in genes to carry out a wide range of biological functions. While expression levels can be regulated in response to cellular needs and environmental changes, some genes are expressed continuously with little variation.

Mechanism

Transcription

The production of an RNA copy from a DNA strand is called transcription. It is performed by RNA polymerases, which add one ribonucleotide at a time to a growing RNA strand according to the base-pairing (complementarity) rules of the nucleotide bases. Apart from possible copying errors, this RNA is complementary to the template 3′ → 5′ DNA strand, with uracil taking the place of thymine in the RNA.
In bacteria, transcription is carried out by a single type of RNA polymerase, which needs to bind a DNA sequence called a Pribnow box with the help of the sigma factor protein to start transcription. In eukaryotes, transcription is performed in the nucleus by three types of RNA polymerases, each of which needs a special DNA sequence called the promoter and a set of DNA-binding proteins (transcription factors) to initiate the process. RNA polymerase I is responsible for transcription of ribosomal RNA genes. RNA polymerase II transcribes all protein-coding genes and also some non-coding RNAs. RNA polymerase III transcribes 5S rRNA, transfer RNA genes, and some small non-coding RNAs. Transcription ends when the polymerase encounters a sequence called the terminator.
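The base-pairing rule described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of RNA polymerase: the function name `transcribe` and the example sequence are hypothetical, and the template strand is assumed to be written in the 3′ → 5′ direction so that the returned RNA reads 5′ → 3′.

```python
# Complementary RNA base for each DNA template base.
# Adenine in the template pairs with uracil (not thymine) in the RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Return the RNA (5'->3') complementary to a DNA template written 3'->5'."""
    rna = []
    for base in template_3to5.upper():
        if base not in DNA_TO_RNA:
            raise ValueError(f"unexpected DNA base: {base!r}")
        rna.append(DNA_TO_RNA[base])  # one ribonucleotide added per step
    return "".join(rna)


print(transcribe("TACGGCATT"))  # AUGCCGUAA
```

The sketch ignores promoters, terminators, and copying errors; it only shows how the template sequence determines the RNA sequence.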

mRNA processing

While transcription of prokaryotic protein-coding genes creates messenger RNA that is ready for translation into protein, transcription of eukaryotic genes leaves a primary transcript of RNA (pre-RNA), which first has to undergo a series of modifications to become a mature RNA. The types and steps of maturation vary between coding and non-coding pre-RNAs; for example, even though pre-RNA molecules for both mRNA and tRNA undergo splicing, the steps and machinery involved are different. The processing of non-coding RNA is described below.
The processing of pre-mRNA includes 5′ capping, a set of enzymatic reactions that add 7-methylguanosine (m7G) to the 5′ end of the pre-mRNA and thus protect the RNA from degradation by exonucleases. The m7G cap is then bound by the cap-binding complex heterodimer, which aids in mRNA export to the cytoplasm and also protects the RNA from decapping.
Another modification is 3′ cleavage and polyadenylation. These occur if a polyadenylation signal sequence is present in the pre-mRNA, usually between the protein-coding sequence and the terminator. The pre-mRNA is first cleaved, and then a series of ~200 adenines is added to form the poly(A) tail, which protects the RNA from degradation. The poly(A) tail is bound by multiple poly(A)-binding proteins, which are necessary for mRNA export and translation re-initiation. In the inverse process of deadenylation, poly(A) tails are shortened by the CCR4-Not 3′-5′ exonuclease, which often leads to full transcript decay.
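The cleavage and polyadenylation step can be sketched as a string operation. This is a rough illustration under stated assumptions: the signal `AAUAAA` is the commonly cited canonical polyadenylation signal but is not named in the text above, the 20-nucleotide distance to the cleavage site is an arbitrary illustrative value, and the function name is hypothetical. Taking the first occurrence of the signal is also a simplification.

```python
POLY_A_SIGNAL = "AAUAAA"  # assumed canonical signal; not stated in the text above
CLEAVAGE_OFFSET = 20      # assumed distance (nt) from signal end to cleavage site
TAIL_LENGTH = 200         # approximate tail length given in the text


def cleave_and_polyadenylate(pre_mrna: str) -> str:
    """Cleave downstream of the first signal and append a run of adenines."""
    start = pre_mrna.find(POLY_A_SIGNAL)
    if start == -1:
        return pre_mrna  # no signal present: transcript is left unmodified
    # Cut a fixed distance after the signal, or at the end if the RNA is shorter.
    cut = min(start + len(POLY_A_SIGNAL) + CLEAVAGE_OFFSET, len(pre_mrna))
    return pre_mrna[:cut] + "A" * TAIL_LENGTH


pre = "AUGGCCUAA" + "GCU" * 3 + "AAUAAA" + "C" * 30
mature = cleave_and_polyadenylate(pre)
print(len(pre), len(mature))  # 54 244
```

In the example, the last 10 nucleotides of the transcript are discarded by cleavage before the 200-adenine tail is added.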
Figure: Illustration of exons and introns in pre-mRNA and the formation of mature mRNA by splicing. The UTRs are non-coding parts of exons at the ends of the mRNA.
A very important modification of eukaryotic pre-mRNA is RNA splicing. The majority of eukaryotic pre-mRNAs consist of alternating segments called exons and introns. During splicing, an RNA-protein catalytic complex known as the spliceosome catalyzes two transesterification reactions, which remove an intron, release it in the form of a lariat structure, and splice the neighbouring exons together. In certain cases, some introns or exons can be either removed or retained in the mature mRNA. This so-called alternative splicing creates a series of different transcripts originating from a single gene. Because these transcripts can potentially be translated into different proteins, splicing extends the complexity of eukaryotic gene expression and the size of a species' proteome.
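The effect of splicing and of exon skipping, one form of alternative splicing, can be sketched by treating a pre-mRNA as an ordered list of labelled segments. The segment names, sequences, and the `splice` function below are hypothetical and chosen only for illustration; the sketch does not model the spliceosome or splice-site recognition.

```python
def splice(segments, skipped_exons=frozenset()):
    """Join exons in order, dropping introns and any exons named in skipped_exons."""
    return "".join(
        seq
        for name, kind, seq in segments
        if kind == "exon" and name not in skipped_exons
    )


# Hypothetical pre-mRNA: three exons separated by two introns.
pre_mrna = [
    ("E1", "exon", "AUGGCU"),
    ("I1", "intron", "GUAAGUUUUCAG"),
    ("E2", "exon", "GAACCG"),
    ("I2", "intron", "GUGAGUCCCUAG"),
    ("E3", "exon", "UUCUAA"),
]

all_exons = splice(pre_mrna)             # E1 + E2 + E3
exon_skipped = splice(pre_mrna, {"E2"})  # E1 + E3
print(all_exons)     # AUGGCUGAACCGUUCUAA
print(exon_skipped)  # AUGGCUUUCUAA
```

The two outputs are different transcripts derived from the same pre-mRNA, which is how a single gene can give rise to more than one mature mRNA.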
Extensive RNA processing may be an evolutionary advantage made possible by the nucleus of eukaryotes. In prokaryotes, transcription and translation happen together, whilst in eukaryotes, the nuclear membrane separates the two processes, giving time for RNA processing to occur.

Non-coding RNA maturation

In most organisms, non-coding genes are transcribed as precursors that undergo further processing. Ribosomal RNAs are often transcribed as a pre-rRNA that contains one or more rRNAs. The pre-rRNA is cleaved and modified at specific sites by approximately 150 different small nucleolus-restricted RNA species, called snoRNAs. SnoRNAs associate with proteins, forming snoRNPs. While the snoRNA part base-pairs with the target RNA and thus positions the modification at a precise site, the protein part performs the catalytic reaction. In eukaryotes in particular, a snoRNP called RNase MRP cleaves the 45S pre-rRNA into the 28S, 5.8S, and 18S rRNAs. The rRNA and RNA processing factors form large aggregates called the nucleolus.
In the case of transfer RNA, for example, the 5′ sequence is removed by RNase P, whereas the 3′ end is removed by the tRNase Z enzyme and the non-templated 3′ CCA tail is added by a nucleotidyl transferase. MicroRNAs (miRNAs) are first transcribed as primary transcripts, or pri-miRNAs, with a cap and poly(A) tail, and are processed in the cell nucleus by the enzymes Drosha and Pasha into short, 70-nucleotide stem-loop structures known as pre-miRNAs. After export, the pre-miRNA is processed into mature miRNA in the cytoplasm by the endonuclease Dicer, which also initiates the formation of the RNA-induced silencing complex, which includes the Argonaute protein.
Even snRNAs and snoRNAs themselves undergo a series of modifications before they become part of a functional RNP complex. This is done either in the nucleoplasm or in specialized compartments called Cajal bodies. Their bases are methylated or pseudouridylated by a group of small Cajal body-specific RNAs (scaRNAs), which are structurally similar to snoRNAs.

Translation

For some non-coding RNAs, the mature RNA is the final gene product. In the case of messenger RNA, the RNA is an information carrier coding for the synthesis of one or more proteins. An mRNA carrying a single protein sequence is monocistronic, whilst an mRNA carrying multiple protein sequences is polycistronic.
Every mRNA consists of three parts: a 5′ untranslated region (5′ UTR), a protein-coding region or open reading frame (ORF), and a 3′ untranslated region (3′ UTR). The coding region carries the information for protein synthesis, encoded in nucleotide triplets according to the genetic code. Each triplet of nucleotides in the coding region is called a codon and corresponds to a binding site complementary to an anticodon triplet in transfer RNA. Transfer RNAs with the same anticodon sequence always carry the same type of amino acid. The ribosome helps each transfer RNA bind to the messenger RNA, takes the amino acid from it, and chains the amino acids together in the order of the triplets in the coding region, producing a polypeptide that has not yet folded. Each mRNA molecule is translated into many protein molecules, on average ~2800 in mammals.
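The codon-by-codon reading of the coding region can be sketched as a table lookup. The example below builds the standard genetic code from its conventional 64-letter summary (bases ordered U, C, A, G; `*` marks a stop codon) and translates from the first AUG to the first stop codon. The function name and the example mRNA are hypothetical, and picking the first AUG as the start is a simplification of real translation initiation.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UUU, UUC, UUA, UUG, UCU, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon: nothing is translated
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends the coding region
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Two untranslated nucleotides on each side stand in for the UTRs.
print(translate("GGAUGGCUGAACCGUUCUAAGG"))  # MAEPF
```

The output uses one-letter amino acid codes; the nucleotides before AUG and after the stop codon are not translated, mirroring the untranslated regions described above.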
In prokaryotes, translation generally occurs at the point of transcription, often using a messenger RNA that is still being synthesized. In eukaryotes, translation can occur in a variety of regions of the cell, depending on where the protein is needed. Major locations are the cytoplasm for soluble cytoplasmic proteins and the membrane of the endoplasmic reticulum for proteins that are exported from the cell or inserted into a cell membrane. Proteins that are to be produced at the endoplasmic reticulum are recognised part-way through the translation process. This is governed by the signal recognition particle, a ribonucleoprotein complex that binds to the ribosome and directs it to the endoplasmic reticulum when it finds a signal peptide on the growing amino acid chain.

Regulation

Regulation of gene expression is the control of the amount and timing of appearance of the functional product of a gene. Control of expression is vital to allow a cell to produce the gene products it needs when it needs them; in turn, this gives cells the flexibility to adapt to a variable environment, external signals, damage to the cell, and other stimuli. More generally, gene regulation gives the cell control over all structure and function, and is the basis for cellular differentiation, morphogenesis and the versatility and adaptability of any organism.
Numerous terms are used to describe types of genes depending on how they are regulated; these include:
  • A constitutive gene is a gene that is transcribed continually as opposed to a facultative gene, which is only transcribed when needed.
  • A housekeeping gene is a gene that is required to maintain basic cellular function and so is typically expressed in all cell types of an organism. Examples include actin, GAPDH and ubiquitin. Some housekeeping genes are transcribed at a relatively constant rate and these genes can be used as a reference point in experiments to measure the expression rates of other genes.
  • A facultative gene is a gene only transcribed when needed as opposed to a constitutive gene.
  • An inducible gene is a gene whose expression is either responsive to environmental change or dependent on the position in the cell cycle.
Any step of gene expression may be modulated, from the DNA-RNA transcription step to post-translational modification of a protein. The stability of the final gene product, whether it is RNA or protein, also contributes to the expression level of the gene: an unstable product results in a low expression level. In general, gene expression is regulated through changes in the number and type of interactions between molecules that collectively influence transcription of DNA and translation of RNA.
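The link between product stability and expression level can be illustrated with a deliberately simple model that is not taken from the text above: assume a product is made at a constant rate and lost by first-order decay. Its steady-state abundance is then the synthesis rate divided by the decay rate constant. The function name, parameter names, and numbers below are hypothetical.

```python
import math


def steady_state_level(synthesis_rate: float, half_life: float) -> float:
    """Steady-state abundance under constant synthesis and first-order decay."""
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    decay_rate = math.log(2) / half_life  # first-order decay constant
    return synthesis_rate / decay_rate


# Same synthesis rate (molecules per hour), different half-lives (hours).
stable = steady_state_level(synthesis_rate=10.0, half_life=8.0)
unstable = steady_state_level(synthesis_rate=10.0, half_life=0.5)
print(stable > unstable)  # True
```

Under this assumption, two products made at the same rate differ in abundance in proportion to their half-lives, so the less stable one settles at the lower level.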
Some simple examples of where gene expression is important are: