Circular RNA


In molecular biology, circular ribonucleic acid is a type of single-stranded RNA which, unlike linear RNA, forms a covalently closed continuous loop. In circular RNA, the 3' and 5' ends normally present in an RNA molecule have been joined together. This feature confers numerous properties to circular RNA, many of which have only recently been identified.
Many types of circular RNA arise from otherwise protein-coding genes. Some circular RNAs have been shown to code for proteins, and some types have also recently shown potential as gene regulators. The biological function of most circular RNAs remains unclear.
Because circular RNAs do not have 5' or 3' ends, they are resistant to exonuclease-mediated degradation and are presumably more stable than most linear RNAs in cells. Circular RNA has been linked to some diseases, such as cancer.

RNA splicing

In contrast to genes in bacteria, eukaryotic genes are split by non-coding sequences called introns. In eukaryotes, as a gene is transcribed from DNA into a messenger RNA transcript, intervening introns are removed, leaving only exons in the mature mRNA, which can subsequently be translated to produce the protein product. The spliceosome, a protein-RNA complex located in the nucleus, catalyzes splicing in the following manner:
  1. The spliceosome recognizes an intron, which is flanked by specific sequences at its 5' and 3' ends, known as a donor splice site and an acceptor splice site, respectively.
  2. The 5' splice site sequence is then subjected to a nucleophilic attack by a downstream sequence called the branch point, resulting in a circular structure called a lariat.
  3. The free 5' exon then attacks the 3' splice site, joining the two exons and releasing a structure known as an intron lariat. The intron lariat is subsequently de-branched and quickly degraded.
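
The net result of the three steps above can be sketched in a few lines of Python. This is only an illustration of the outcome (introns removed, exons joined), not of the lariat chemistry. The function name `splice`, the half-open coordinate convention, and the toy sequence are assumptions made for the example; the check for GU at the 5' end and AG at the 3' end of each intron reflects the most common splice site dinucleotides and ignores rarer intron classes.

```python
from typing import List, Tuple


def splice(pre_mrna: str, introns: List[Tuple[int, int]]) -> str:
    """Remove introns, given as half-open (start, end) intervals, and join the exons.

    Hypothetical helper: models only the end product of splicing.
    """
    exons = []
    cursor = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        # Simplifying assumption: every intron begins with GU and ends with AG.
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"interval {start}-{end} lacks GU...AG boundaries")
        exons.append(pre_mrna[cursor:start])  # exon upstream of this intron
        cursor = end                          # continue after the intron
    exons.append(pre_mrna[cursor:])           # last exon
    return "".join(exons)


# Toy transcript: exon 1, one 13-nucleotide intron, exon 2.
pre_mrna = "AUGGCC" + "GUAAGUCUAACAG" + "UUUGAA"
mature = splice(pre_mrna, [(6, 19)])
```

Here `mature` is `"AUGGCCUUUGAA"`, the two exons joined in their genomic order.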

Alternative splicing

Alternative splicing is a phenomenon through which one RNA transcript can yield different protein products based on which segments are considered "introns" and "exons" during a splicing event. Although not specific to humans, it is a partial explanation for the fact that humans and other much simpler species have similar numbers of genes. One of the most striking examples of alternative splicing is in the Drosophila DSCAM gene, which can give rise to approximately 30 thousand distinct alternatively spliced isoforms.

Non-canonical splicing

Exon scrambling

Exon scrambling, also called exon shuffling, describes an event in which exons are spliced in a "non-canonical" order. There are three ways in which exon scrambling can occur:
  1. Tandem exon duplication in the genome, which often occurs in cancers.
  2. Trans-splicing, in which two RNA transcripts fuse, resulting in a linear transcript containing exons that, for example, may be derived from genes encoded on two different chromosomes. Trans-splicing is very common in C. elegans.
  3. A splice donor site being joined to a splice acceptor site further upstream in the primary transcript, yielding a circular transcript.
The notion that circularized transcripts are byproducts from imperfect splicing is supported by the low abundance and the lack of sequence conservation of most circRNAs, but has been challenged.
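
The third route has a simple signature in transcript coordinates: the splice donor lies downstream of the acceptor it is joined to. The following Python sketch shows that test. The `Junction` type and `classify_junction` function are hypothetical names, and the sketch assumes a single gene on the plus strand with positions increasing from 5' to 3'.

```python
from typing import NamedTuple


class Junction(NamedTuple):
    """One observed exon-exon junction (hypothetical layout, plus strand only)."""
    donor: int     # position of the 5' splice site that was used
    acceptor: int  # position of the 3' splice site it was joined to


def classify_junction(junction: Junction) -> str:
    """Label a junction by the relative order of its two splice sites."""
    if junction.acceptor > junction.donor:
        # Donor joined to a downstream acceptor: ordinary linear splicing.
        return "canonical"
    # Donor joined to an acceptor further upstream: scrambled exon order.
    return "back-splice"


examples = [Junction(donor=100, acceptor=500), Junction(donor=900, acceptor=300)]
labels = [classify_junction(j) for j in examples]
```

A scrambled junction found this way is only a candidate circle. Tandem duplication and trans-splicing can produce the same exon order in a linear molecule, which is why experimental tests of circularity are still needed.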

Alu elements impact circRNA splicing

Repetitive Alu sequences represent approximately 10% of the human genome. Alu elements located in the flanking introns of protein-coding genes, adjacent to the first and last exons that form a circRNA, influence circRNA formation. The flanking intronic Alu elements must be complementary, because complementarity enables RNA pairing across the two introns, which in turn facilitates circRNA synthesis.
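
What "complementary" means here can be shown with a minimal Python sketch that scores how well an upstream intronic sequence pairs with a downstream one read in the antiparallel direction. The function names and toy sequences are assumptions for illustration. The score counts only Watson-Crick pairs in an ungapped alignment, which is far cruder than a real RNA folding or alignment tool.

```python
# Watson-Crick partners for RNA bases.
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement(rna: str) -> str:
    """Return the reverse complement of an RNA sequence."""
    return "".join(PAIR[base] for base in reversed(rna))


def pairing_fraction(upstream: str, downstream: str) -> float:
    """Fraction of positions where `upstream` pairs with `downstream` antiparallel.

    Hypothetical score: ungapped, Watson-Crick pairs only.
    """
    length = min(len(upstream), len(downstream))
    if length == 0:
        return 0.0
    # Base i of the upstream strand faces base i counted from the 3' end
    # of the downstream strand.
    matches = sum(PAIR[u] == d for u, d in zip(upstream, reversed(downstream)))
    return matches / length


upstream_repeat = "GGCUCA"                          # toy stand-in for an Alu segment
inverted_copy = reverse_complement(upstream_repeat)  # same repeat, opposite orientation
same_orientation_copy = upstream_repeat              # same repeat, same orientation
```

With these toy inputs, `pairing_fraction(upstream_repeat, inverted_copy)` is 1.0, while the copy in the same orientation pairs poorly, which mirrors why inverted repeats in the two flanking introns are the ones that bring the splice sites together.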

Impact of RNA editing on circRNA formation

RNAs can undergo base modification by RNA editing after transcription. RNA editing occurs mainly in Alu elements of protein-coding genes.
ADAR1 and ADAR2 have been shown to regulate circRNA biogenesis in cancer cells in a bidirectional manner, repressing or promoting the formation of different circRNAs through both RNA editing–dependent and –independent mechanisms. Editing of key adenosines in reverse complementary matches (RCMs) was shown to stabilize or destabilize base‑pairing and RNA secondary structure, thereby promoting or repressing circRNA formation, respectively. Furthermore, RNA editing can affect the binding of splicing factors, adding a further layer of regulation to circRNA production.
Another study demonstrated that A-to-I RNA editing in upstream and downstream intronic Alu elements flanking the back-splice site can reduce the formation of circRNAs in the human heart. In the failing human heart, a predominant reduction in A-to-I RNA editing leads to an increased formation of circRNAs, which is presumably mediated by better complementary pairing of the RNA of the Alu elements flanking the back-splice site.
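
The two directions of this effect can be illustrated with a toy duplex in Python. The sketch assumes that inosine pairs like guanosine, so an edited adenosine facing U loses its pair while one facing C gains a pair, and it treats the weaker I·U wobble as unpaired for simplicity. The function names and sequences are hypothetical and are not taken from the studies described above.

```python
from typing import Iterable

# Simplified pairing rules: Watson-Crick pairs plus I-C (assumption: I-U counts as unpaired).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("I", "C"), ("C", "I")}


def edit_a_to_i(strand: str, positions: Iterable[int]) -> str:
    """Convert the adenosines at the given 0-based positions to inosine (I)."""
    targets = set(positions)
    return "".join(
        "I" if index in targets and base == "A" else base
        for index, base in enumerate(strand)
    )


def paired_count(strand: str, partner: str) -> int:
    """Count paired positions; `partner` is written 3'->5' so index i faces index i."""
    return sum((a, b) in PAIRS for a, b in zip(strand, partner))


strand = "GAACG"   # toy intronic segment, 5'->3'
partner = "CUCGC"  # opposing segment, 3'->5'; position 2 is an A-C mismatch

unedited = paired_count(strand, partner)                        # 4 pairs
destabilized = paired_count(edit_a_to_i(strand, [1]), partner)  # A-U becomes I-U: 3 pairs
stabilized = paired_count(edit_a_to_i(strand, [2]), partner)    # A-C becomes I-C: 5 pairs
```

Under these assumptions the same editing event can weaken or strengthen the duplex depending on which base the edited adenosine faces.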

Characteristics of circular RNA

Early discoveries of circRNAs

Early discoveries of circular RNAs led to the belief that they lacked significance due to their rarity. These early discoveries included the analysis of genes such as DCC and Sry, and the more recent discovery of the human non-coding RNA ANRIL, all of which expressed circular isoforms. CircRNA-producing genes such as the human ETS-1 gene, the human and rat cytochrome P450 genes, the rat androgen binding protein gene, and the human dystrophin gene were also discovered.

Genome-wide identification of circRNAs

Scrambled isoforms and circRNAs

In 2012, in an effort initially aimed at identifying cancer-specific exon scrambling events, scrambled exons were discovered in large numbers in both normal and cancer cells. Scrambled exon isoforms were found to comprise about 10% of the total transcript isoforms in leukocytes, and 2,748 scrambled isoforms were identified in HeLa and H9 embryonic stem cells. Additionally, about 1 in 50 expressed genes produced scrambled transcript isoforms at least 10% of the time. Tests used to recognize circularity included treating samples with RNase R, an enzyme that degrades linear but not circular RNAs, and testing for the presence of poly-A tails, which are not present in circular molecules. Overall, 98% of scrambled isoforms were found to represent circRNAs, which were abundant and located in the cytoplasm.
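
The logic of the two circularity tests can be sketched as a simple filter in Python. The `Candidate` fields, the `looks_circular` function, and the retention threshold of 0.5 are hypothetical choices made for illustration; they are not the criteria used in the 2012 study.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Evidence for one scrambled isoform (hypothetical record layout)."""
    name: str
    mock_count: float      # normalized junction reads in the untreated sample
    rnase_r_count: float   # normalized junction reads after RNase R treatment
    in_polya_sample: bool  # True if detected in poly-A-selected RNA


def looks_circular(candidate: Candidate, min_retention: float = 0.5) -> bool:
    """Apply both tests: survives RNase R and lacks a poly-A tail.

    `min_retention` is an arbitrary illustrative cutoff.
    """
    if candidate.mock_count <= 0:
        return False  # no baseline signal to compare against
    retention = candidate.rnase_r_count / candidate.mock_count
    return retention >= min_retention and not candidate.in_polya_sample


candidates = [
    Candidate("isoform_a", mock_count=40.0, rnase_r_count=38.0, in_polya_sample=False),
    Candidate("isoform_b", mock_count=40.0, rnase_r_count=2.0, in_polya_sample=True),
]
circular_names = [c.name for c in candidates if looks_circular(c)]
```

In this toy input only `isoform_a` passes, because it is retained after RNase R digestion and is absent from the poly-A fraction.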

Discovery of a higher abundance of circRNAs

In 2013, a higher abundance of circRNAs was discovered. Human fibroblast RNA was treated with RNase R to enrich for circular RNAs, followed by the categorization of circular transcripts based on their abundance. Approximately 1 in 8 expressed genes were found to produce detectable levels of circRNAs, including those of low abundance, which was significantly higher than previously suspected, and was attributed to greater sequencing depth.

CircRNA tissue specificity and antagonist activity

At the same time, a computational method for detecting circRNAs was developed, enabling de novo detection and extensive validation of circRNAs in humans, mice, and C. elegans. The expression of circRNAs was often found to be specific to a tissue or developmental stage. Additionally, circRNAs were found to be able to act as antagonists of microRNAs (miRNAs), which interfere with the translation of mRNAs, as exemplified by the circRNA CDR1as, which has miRNA binding sites.

CircRNAs and ENCODE Ribozero RNA-seq data

In 2014, human circRNAs were identified and quantified from ENCODE Ribozero RNA-seq data. Most circRNAs were found to be minor splice isoforms and to be expressed in only a few cell types, with 7,112 human circRNAs having circular fractions of at least 10%. CircRNAs were also found to be no more conserved than their linear controls and, according to ribosome profiling, were not translated. As previously noted, circRNAs have the ability to act as antagonists of miRNAs, a role also described as acting as microRNA sponges. Aside from CDR1as, very few circRNAs were found to have the potential to act as microRNA sponges. As a whole, the majority of circular RNAs were found to be inconsequential side-products of imperfect splicing.
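
A circular fraction can be computed as in the Python sketch below. The definition used here, back-splice junction reads divided by the sum of back-splice and linear junction reads at the same splice sites, is an assumption made for illustration, as are the function name and the read counts.

```python
from typing import Dict, Tuple


def circular_fraction(circular_reads: int, linear_reads: int) -> float:
    """Share of junction reads that support the circular isoform.

    Assumed definition: circular / (circular + linear).
    """
    total = circular_reads + linear_reads
    if total == 0:
        raise ValueError("no junction reads observed")
    return circular_reads / total


# Hypothetical counts: name -> (back-splice reads, linear junction reads).
counts: Dict[str, Tuple[int, int]] = {
    "circ_example_1": (25, 75),
    "circ_example_2": (1, 99),
}

# Keep circRNAs whose circular fraction is at least 10%.
above_threshold = [
    name
    for name, (circular, linear) in counts.items()
    if circular_fraction(circular, linear) >= 0.10
]
```

With these counts only `circ_example_1` (fraction 0.25) passes the 10% cutoff, while `circ_example_2` (fraction 0.01) would be classed as a minor isoform.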

CircRNAs and CIRCexplorer

In the same year, CIRCexplorer, a tool used to identify thousands of circRNAs in humans without RNase R RNA-seq data, was developed. The vast majority of identified highly expressed exonic circular RNAs were found to be processed from exons located in the middle of RefSeq genes, suggesting that circular RNA formation is generally coupled to RNA splicing. Most circular RNAs were determined to contain multiple exons, most commonly two to three. Exons from circRNAs with only one circularized exon were found to be much longer than those from circRNAs with multiple circularized exons, indicating that processing may prefer a certain length to maximize exon circularization. The introns flanking circularized exons generally contain high Alu densities that can form inverted repeated Alu pairs (IRAlus). IRAlus, either convergent or divergent, are juxtaposed across flanking introns of circRNAs in a parallel way with similar distances to adjacent exons. IRAlus and other non-repetitive but complementary sequences were also found to promote circular RNA formation. On the other hand, exon circularization efficiency was determined to be affected by competition between alternative RNA pairings, so that competing pairings lead to alternative circularization. Finally, both exon circularization and its regulation were found to be evolutionarily dynamic.

Genome-wide calling of circRNA in Alzheimer's disease cases

The Cruchaga lab performed the first large-scale analyses of circRNA in Alzheimer's disease and demonstrated the role of circRNAs in health and disease. A total of 148 circRNAs were found to be significantly associated, in multiple datasets, with Alzheimer's disease status and clinical dementia rating at death after false discovery rate correction. The expression of circRNAs was independent of the linear form, and circRNA expression was also corrected for cell proportion. CircRNAs were also found to be co-expressed with known causal Alzheimer's genes, such as APP and PSEN1, suggesting that some circRNAs may also be part of the causal pathway. Altogether, circRNA brain expression was found to explain more about Alzheimer's clinical manifestations than the number of APOε4 alleles, suggesting that circRNAs could be used as a potential biomarker for Alzheimer's.