RNA-binding protein


RNA-binding proteins (RBPs) are proteins that bind to double- or single-stranded RNA in cells and participate in forming ribonucleoprotein complexes.
RBPs contain various structural motifs, such as the RNA recognition motif (RRM), the dsRNA-binding domain, zinc fingers and others.
They are found in both the cytoplasm and the nucleus. However, since most mature RNA is exported from the nucleus relatively quickly, most RBPs in the nucleus exist as complexes of protein and pre-mRNA called heterogeneous nuclear ribonucleoprotein particles (hnRNPs).
RBPs have crucial roles in various cellular processes, including cellular function, transport and localization. They play an especially prominent role in the post-transcriptional control of RNAs, including splicing, polyadenylation, mRNA stabilization, mRNA localization and translation. Eukaryotic cells express diverse RBPs, each with unique RNA-binding activity and protein–protein interactions. According to the Eukaryotic RBP Database, there are 2961 genes encoding RBPs in humans. During evolution, the diversity of RBPs increased greatly as the number of introns increased. This diversity enabled eukaryotic cells to use RNA exons in various arrangements, giving rise to a unique ribonucleoprotein (RNP) for each RNA. Although RBPs have a crucial role in the post-transcriptional regulation of gene expression, relatively few have been studied systematically. It has now become clear that RNA–RBP interactions play important roles in many biological processes across organisms.

Structure

Many RBPs have modular structures and are composed of multiple repeats of just a few specific basic domains that often have limited sequences. Different RBPs contain these domains arranged in varying combinations. A specific protein's recognition of a specific RNA has evolved through the rearrangement of these few basic domains. Each basic domain recognizes RNA, but many of these proteins require multiple copies of one of the common domains to function.

Diversity

As nuclear RNA emerges from RNA polymerase, the transcripts are immediately covered with RNA-binding proteins that regulate every aspect of RNA metabolism and function, including RNA biogenesis, maturation, transport, cellular localization and stability. All RBPs bind RNA, but they do so with different RNA-sequence specificities and affinities, which allows the RBPs to be as diverse as their targets and functions. These targets include mRNA, which codes for proteins, as well as a number of functional non-coding RNAs (ncRNAs). ncRNAs almost always function as ribonucleoprotein complexes and not as naked RNAs. They include microRNAs, small interfering RNAs and spliceosomal small nuclear RNAs (snRNAs).

Function

RNA processing and modification

Alternative splicing

Alternative splicing is a mechanism by which different forms of mature mRNA are generated from the same gene. It is a regulatory mechanism in which variation in the incorporation of exons into mRNA leads to the production of more than one related protein, thus expanding the possible genomic outputs. RBPs function extensively in the regulation of this process. Some binding proteins, such as the neuron-specific RNA-binding protein NOVA1, control the alternative splicing of a subset of hnRNA by recognizing and binding to a specific sequence in the RNA. These proteins then recruit spliceosomal proteins to the target site. SR proteins are also well known for their role in alternative splicing through the recruitment of components that form the spliceosome, namely U1 snRNP and U2AF. However, RBPs are also part of the spliceosome itself. The spliceosome is a complex of snRNA and protein subunits and acts as the mechanical agent that removes introns and ligates the flanking exons. Besides forming the core spliceosome complex, RBPs also bind to cis-acting RNA elements that influence exon inclusion or exclusion during splicing. These sites are referred to as exonic splicing enhancers, exonic splicing silencers, intronic splicing enhancers and intronic splicing silencers. Depending on where they bind, RBPs work as splicing silencers or enhancers.
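How exon inclusion and exclusion expand the output of a single gene can be illustrated with a minimal Python sketch. It assumes a toy gene in which some exons are constitutive and others are "cassette" exons that may be skipped. The function name, exon names and sequences are hypothetical and chosen only for illustration; real splicing decisions depend on the enhancer and silencer elements described above and are not modelled here.

```python
from itertools import product

def enumerate_isoforms(exons, cassette):
    """Return every mature mRNA obtainable by including or skipping cassette exons.

    exons    -- ordered list of (name, sequence) pairs (hypothetical toy data)
    cassette -- set of exon names that may be skipped; all others are always kept
    """
    optional = [name for name, _ in exons if name in cassette]
    isoforms = {}
    # Try every include/skip combination for the optional exons.
    for choices in product([True, False], repeat=len(optional)):
        included = dict(zip(optional, choices))
        # Constitutive exons are absent from `included`, so they default to True.
        kept = [(n, s) for n, s in exons if included.get(n, True)]
        label = "-".join(n for n, _ in kept)
        isoforms[label] = "".join(s for _, s in kept)
    return isoforms

# Toy gene with four exons; E2 and E3 are treated as cassette exons.
exons = [("E1", "AUGGCU"), ("E2", "GGAUCC"), ("E3", "UUCGAA"), ("E4", "UAA")]
isoforms = enumerate_isoforms(exons, cassette={"E2", "E3"})
for label, sequence in isoforms.items():
    print(label, sequence)
```

With two cassette exons the toy gene yields four isoforms, which shows how quickly the number of possible transcripts grows as more exons become optional.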

RNA editing

The most extensively studied form of RNA editing involves the ADAR protein. ADAR modifies mRNA transcripts post-transcriptionally by changing their nucleotide content: it catalyzes the enzymatic conversion of adenosine to inosine. This process effectively changes the RNA sequence from that encoded by the genome and extends the diversity of the gene products. The majority of RNA editing occurs in non-coding regions of RNA; however, some protein-encoding transcripts have been shown to be edited, resulting in a change in the amino acid sequence of the protein. An example is the glutamate receptor mRNA, in which a glutamine codon is converted to an arginine codon, changing the functionality of the protein.
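The effect of adenosine-to-inosine editing on a coding sequence can be sketched in a few lines of Python. The sketch relies on the fact that inosine is read as guanosine during translation, so editing the middle position of a CAG (glutamine) codon makes it read as CGG (arginine). The transcript, the function names and the four-entry codon table are hypothetical simplifications for illustration, not a model of how ADAR selects its targets.

```python
# Minimal codon table: only the codons used in this toy example.
CODON_TABLE = {"AUG": "Met", "CAG": "Gln", "CGG": "Arg", "UUC": "Phe"}

def edit_a_to_i(rna, position):
    """Replace the adenosine at a 0-based position with inosine (written 'I')."""
    if rna[position] != "A":
        raise ValueError(f"no adenosine at position {position}")
    return rna[:position] + "I" + rna[position + 1:]

def translate(rna):
    """Translate codon by codon, reading inosine as guanosine."""
    read = rna.replace("I", "G")
    return [CODON_TABLE[read[i:i + 3]] for i in range(0, len(read) - 2, 3)]

transcript = "AUGCAGUUC"               # hypothetical transcript: Met-Gln-Phe
edited = edit_a_to_i(transcript, 4)    # position 4 is the A of the CAG codon

print(translate(transcript))  # ['Met', 'Gln', 'Phe']
print(translate(edited))      # ['Met', 'Arg', 'Phe']
```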

Polyadenylation

Polyadenylation is the addition of a "tail" of adenylate residues to an RNA transcript about 20 bases downstream of the AAUAAA sequence within the three prime untranslated region. Polyadenylation of mRNA has a strong effect on its nuclear transport, translation efficiency and stability. All of these, as well as the process of polyadenylation itself, depend on the binding of specific RBPs. With few exceptions, all eukaryotic mRNAs are processed to receive 3' poly(A) tails of about 200 nucleotides. One of the necessary protein complexes in this process is CPSF. CPSF binds to the AAUAAA sequence and, together with another protein called poly(A)-binding protein, recruits and stimulates the activity of poly(A) polymerase. Poly(A) polymerase is inactive on its own and requires the binding of these other proteins to function properly.
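The positional logic described above (a signal sequence, a cleavage point about 20 bases downstream, and a tail of about 200 adenylates) can be sketched as a simple string operation in Python. This is a minimal illustration, not a model of the protein machinery: the function name and the example transcript are hypothetical, the 20 and 200 defaults are the approximate figures from the text, and using the last AAUAAA in the transcript as a stand-in for the signal in the 3' untranslated region is a simplifying assumption.

```python
def polyadenylate(transcript, signal="AAUAAA", offset=20, tail_length=200):
    """Cleave `offset` nt downstream of the signal and append a poly(A) tail.

    Returns None if the signal is absent or the transcript ends before the
    cleavage point. The last occurrence of the signal is used (assumption).
    """
    start = transcript.rfind(signal)
    if start == -1:
        return None
    cleavage = start + len(signal) + offset
    if cleavage > len(transcript):
        return None
    # Everything downstream of the cleavage point is discarded.
    return transcript[:cleavage] + "A" * tail_length

# Hypothetical pre-mRNA: coding part, spacer, signal, 20 nt, downstream region.
pre_mrna = "AUGGCUUAA" + "GCGC" + "AAUAAA" + "C" * 20 + "GUGUGUGU"
mature = polyadenylate(pre_mrna)

print(len(pre_mrna), len(mature))
print(mature[:45] + "...")
```

Exposing the offset and tail length as parameters reflects that both values are approximate in the text rather than fixed constants.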

Export

After processing is complete, mRNA needs to be transported from the cell nucleus to the cytoplasm. This is a three-step process: a cargo-carrier complex is generated in the nucleus, the complex is translocated through the nuclear pore complex, and the cargo is finally released into the cytoplasm. The carrier is subsequently recycled. The TAP/NXF1:p15 heterodimer is thought to be the key player in mRNA export. Over-expression of TAP in Xenopus laevis frogs increases the export of transcripts that are otherwise inefficiently exported. However, TAP needs adaptor proteins because it is unable to interact directly with mRNA. The Aly/REF protein binds to the mRNA and recruits TAP.

mRNA localization

mRNA localization is critical for the regulation of gene expression because it allows spatially regulated protein production. Through mRNA localization, proteins are translated at their intended site in the cell. This is especially important during early development, when rapid cell cleavages give different cells various combinations of mRNA, which can then lead to drastically different cell fates. RBPs are critical in the localization of this mRNA, which ensures that proteins are translated only in their intended regions. One of these proteins is ZBP1. ZBP1 binds to beta-actin mRNA at the site of transcription and moves with the mRNA into the cytoplasm. It then localizes this mRNA to the lamella region of several asymmetric cell types, where it can be translated. In 2008 it was proposed that FMRP was involved in the stimulus-induced localization of several dendritic mRNAs in the neuronal dendrites of cultured hippocampal neurons. More recent studies of FMRP-bound RNAs present in microdissected dendrites of CA1 hippocampal neurons revealed no changes in localization in wild-type versus FMRP-null mouse brains.

Translation

Translational regulation provides a rapid mechanism to control gene expression. Rather than acting at the transcriptional level, it acts on mRNA that has already been transcribed by controlling the recruitment of ribosomes. This allows rapid generation of proteins when a signal activates translation. In addition to its role in the localization of beta-actin mRNA, ZBP1 is also involved in the translational repression of beta-actin mRNA by blocking translation initiation. ZBP1 must be removed from the mRNA to allow the ribosome to bind properly and translation to begin.

Protein–RNA interactions

RNA-binding proteins exhibit highly specific recognition of their RNA targets by recognizing their sequences, structures, motifs and RNA modifications. This specific binding allows RNA-binding proteins to distinguish their targets and regulate a variety of cellular functions by controlling the generation, maturation and lifespan of the RNA transcript. The interaction begins during transcription; some RBPs remain bound to the RNA until degradation, whereas others bind only transiently to regulate RNA splicing, processing, transport and localization. Cross-linking immunoprecipitation methods are used to stringently identify the direct RNA-binding sites of RNA-binding proteins in a variety of tissues and organisms. In this section, three classes of the most widely studied RNA-binding domains will be discussed.

RNA-recognition motif (RRM)

The RNA recognition motif, the most common RNA-binding motif, is a small protein domain of 75–85 amino acids that forms a four-stranded β-sheet packed against two α-helices. This recognition motif plays a role in numerous cellular functions, especially mRNA/rRNA processing, splicing, translation regulation, RNA export and RNA stability. Ten structures of an RRM have been identified through NMR spectroscopy and X-ray crystallography. These structures illustrate the intricacy of protein–RNA recognition by the RRM, as it entails RNA–RNA and protein–protein interactions in addition to protein–RNA interactions. Despite their complexity, all ten structures share some common features. In every case the four-stranded β-sheet, the main protein surface of the RRM, was found to interact with the RNA, usually contacting two or three nucleotides in a specific manner. In addition, strong RNA-binding affinity and specificity towards variation are achieved through interactions between the inter-domain linker and the RNA and between the RRMs themselves. This plasticity explains why the RRM is the most abundant domain and why it plays an important role in various biological functions.