CSTF2


Cleavage stimulation factor 64 kDa subunit (CSTF-64) is a protein that in humans is encoded by the CSTF2 gene. The gene is located on the X chromosome and has an autosomal paralog, CSTF2T, which lies on chromosome 10 in humans (chromosome 19 in mice) and codes for the protein CSTF-64τ.
This gene encodes a nuclear protein with an RRM domain. The protein is a member of the cleavage stimulation factor complex that is involved in the 3' end cleavage and polyadenylation of pre-mRNAs. Specifically, this protein binds GU-rich elements within the 3'-untranslated region of mRNAs.

Tissue distribution

CSTF2 has broad distribution across human tissues, with generally higher mRNA and protein levels in hematopoietic/immune tissues, reproductive organs, and many epithelial organs. It is detectable in essentially all sampled normal tissues, but is particularly enriched in testis, ovary, and various mucosal/secretory epithelia. CSTF2 expression is also observed in subsets of brain and immune cells. Across tumors, CSTF2 expression is frequently elevated relative to matched normal tissues in multiple cancer types, consistent with enrichment in highly proliferative cell populations.

Testes

Cleavage stimulation factor 2 is encoded by an X-chromosomal gene and is transcriptionally silenced during male meiosis by meiotic sex chromosome inactivation. As a result, CSTF2 is not expressed in meiotic and postmeiotic male germ cells, although it remains detectable in other testicular cell types. Instead, its autosomal paralog, CSTF2T, is expressed in these germ cells and serves as the predominant CSTF2 family member there.

Central nervous system

Multiple CSTF2 family isoforms are expressed in the mammalian nervous system, including CSTF-64, CSTF-64τ, and βCSTF-64. βCSTF-64 is broadly expressed across all regions of the brain and in peripheral nerves in vertebrates. Minor levels of CSTF-64τ are also detectable in immune cells.

Immune system

CSTF2 is expressed in immune cells, including B lymphocytes, where its abundance varies with differentiation state and activation status. Minor expression of CSTF-64τ has also been reported in immune cell populations.

Structure

The CSTF2 mRNA isolated from HeLa cells is roughly 2,000 nucleotides long, excluding the poly(A) tail. In the translated protein, the RNA recognition motif (RRM) lies at the N-terminus and folds around a multi-stranded β-sheet. This domain recognizes GU-rich RNA sequences.
In addition to the RRM, CSTF2 contains a hinge domain and a C-terminal domain. The hinge domain consists of approximately 40–50 amino acid residues and mediates binding to symplekin, a scaffold protein associated with the cleavage and polyadenylation specificity factor (CPSF) complex. The C-terminal domain remains comparatively under-researched; however, when CPSF is bound to cleavage stimulation factor 3 and joined to the rest of the cleavage stimulation factor complex through the hinge domain of CSTF2, the assembled complexes are transported into the nucleus.
Human CSTF2 and CSTF3 share strong homology with the corresponding proteins in yeast and Drosophila, indicating evolutionary conservation of both structure and function. Notably, the human cleavage stimulation factor 3 subunit is highly similar to the product of a Drosophila modifier gene, suggesting conserved roles in gene regulation and cell growth.

Function

Cleavage stimulation factor 2 assembles with cleavage stimulation factor 1 and cleavage stimulation factor 3 to form the heterotrimeric cleavage stimulation factor (CSTF) complex. CSTF is an essential component of mRNA maturation because it regulates pre-mRNA cleavage and polyadenylation. Within the complex, cleavage stimulation factor 2 binds pre-mRNA near the 3′ end of the transcript, recognizing G/U-rich sequence elements downstream of the cleavage site through its ribonucleoprotein-type RNA-binding domain. This interaction helps define the cleavage site and enables recruitment of additional processing machinery.
Cleavage and polyadenylation require coordinated binding of multiple complexes on both sides of the cleavage site. The cleavage and polyadenylation specificity factor (CPSF) complex, comprising CPSF-160, CPSF-100, CPSF-73, CPSF-30, FIP1, and WDR33, binds upstream, while CSTF binds downstream. Together, these complexes regulate cleavage efficiency and contribute to determining poly(A) tail length. All three CSTF subunits must be assembled for cleavage and polyadenylation to occur, and CSTF acts as the primary regulatory unit of the process. Without cleavage stimulation factor 2 bound to the U/G-rich RNA sequence, neither cleavage nor polyadenylation occurs, even though CSTF does not itself cleave RNA. This requirement applies to both constitutive and alternative cleavage and polyadenylation events. Additionally, the C-terminal domain of cleavage stimulation factor 3 modulates RNA recognition by the RRM of cleavage stimulation factor 2 without directly contacting the RNA-binding domain.
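The upstream/downstream arrangement described above can be illustrated with a short Python sketch. It is not a published prediction method. It assumes the canonical AAUAAA hexamer as the upstream signal recognized by CPSF, and the function names, the 15-nucleotide spacer, the 30-nucleotide downstream window, and the 70% G+U cutoff are illustrative choices rather than values taken from the sources for this article.

```python
import re


def gu_fraction(rna: str) -> float:
    """Return the fraction of bases in `rna` that are G or U (0.0 if empty)."""
    if not rna:
        return 0.0
    return sum(base in "GU" for base in rna) / len(rna)


def find_candidate_sites(rna, signal="AAUAAA", spacer=15, window=30, min_gu=0.7):
    """List (signal_start, cleavage_pos, gu) for signals followed by a GU-rich window.

    All numeric defaults are hypothetical illustration values:
    `spacer` is the assumed distance from the end of the signal to the
    cleavage position, and `window` is the stretch downstream of that
    position that is scored for G+U content.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input as well
    hits = []
    # A lookahead pattern finds overlapping occurrences of the signal.
    for match in re.finditer(f"(?={re.escape(signal)})", rna):
        cleavage = match.start() + len(signal) + spacer
        downstream = rna[cleavage:cleavage + window]
        if len(downstream) < window:
            continue  # not enough sequence left to score
        gu = gu_fraction(downstream)
        if gu >= min_gu:
            hits.append((match.start(), cleavage, gu))
    return hits


# Synthetic sequence: signal at index 4, GU repeat starting at index 25.
example = "CCCC" + "AAUAAA" + "C" * 15 + "GU" * 15 + "CC"
print(find_candidate_sites(example))
```

A signal followed by a window that falls below the G+U cutoff is not reported, which mirrors the requirement that both the upstream and the downstream element be present for processing.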
Cleavage stimulation factor 2, and potentially cleavage stimulation factor 3, also participate in histone pre-mRNA processing through the heat-labile factor. CSTF2 is regulated by the cell cycle, and depletion of CSTF2 slows progression through the S-phase. This effect results from impaired pre-mRNA cleavage and maturation, which inhibits transcriptional completion. Histone RNA processing, including cleavage and maturation of non-polyadenylated RNA, is similarly dependent on CSTF2 levels during the transition from the G1 phase to the S phase.
Cleavage stimulation factor 2 is also essential for regulated gene expression. In its absence, cleavage stimulation factors 1 and 3 cannot independently regulate gene expression because their association with pre-mRNA is unstable. When CSTF2 is present, gene expression increases substantially, largely through enhanced recruitment and activity of cleavage stimulation factor 3. The RNA recognition motif of CSTF2 performs two functions: recognition of the U/G-rich downstream sequence element and association with cleavage stimulation factor 3 to regulate localization of the CSTF complex between the nucleus and cytoplasm. The RRM of CSTF2, as well as its yeast counterpart Rna15, does not bind a strict consensus sequence but instead interacts with U/G-rich regions through resonance structure–dependent binding mechanisms.
CSTF2 functions as a core component of the cleavage stimulation factor complex, which regulates cleavage and polyadenylation of pre-mRNA 3′ ends. Through modulation of polyadenylation site selection, CSTF2 influences transcript stability, localization, and translational efficiency, particularly in rapidly dividing or highly differentiated cells.

Spermatogenesis

In mammals, CSTF2T functionally replaces CSTF2 during spermatogenesis. Mouse models lacking CSTF-64τ show severe defects in spermatogenesis, indicating that CSTF2 family proteins are essential for proper germ cell development. Although CSTF2 and CSTF-64τ regulate a relatively limited set of target genes, loss of either significantly reduces expression of spermatogenesis-associated transcripts, leading to impaired testicular development and infertility.

Nervous system polyadenylation

The βCSTF-64 isoform contains an additional 49 amino acids within the proline/glycine-rich domain and is generated by inclusion of nervous-system-specific exons from the CSTF2 gene. βCSTF-64 differs functionally from canonical CSTF-64 in its response to polyadenylation signal strength, showing reduced sensitivity to strong polyadenylation sites. This property enables fine regulation of alternative polyadenylation in neurons and contributes to nervous-system-specific transcript diversity.

Immunoglobulin membrane-to-secreted switch

In B cells, CSTF2 levels directly influence immunoglobulin heavy-chain mRNA processing. CSTF2 regulates the choice between upstream and downstream cleavage and polyadenylation sites, thereby determining whether immunoglobulin transcripts encode secreted or membrane-bound antibody forms. Increased CSTF2 favors usage of upstream polyadenylation sites, promoting production of secreted immunoglobulins, whereas lower CSTF2 levels favor membrane-bound forms.
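The qualitative dependence described above can be pictured with a toy Python sketch. It is not a fitted or published model: the simple saturating curve, the function names, and the `half_max` parameter are assumptions made only to show the direction of the effect, and the input is a relative CSTF2 level in arbitrary units.

```python
def upstream_site_usage(cstf2_level: float, half_max: float = 1.0) -> float:
    """Toy saturating curve for the fraction of transcripts cleaved upstream.

    `half_max` (hypothetical) is the CSTF2 level at which half of the
    transcripts use the upstream site. Both values are in arbitrary units.
    """
    if cstf2_level < 0 or half_max <= 0:
        raise ValueError("cstf2_level must be >= 0 and half_max must be > 0")
    return cstf2_level / (cstf2_level + half_max)


def isoform_fractions(cstf2_level: float, half_max: float = 1.0) -> dict:
    """Split transcripts into secreted (upstream site) and membrane (downstream site)."""
    secreted = upstream_site_usage(cstf2_level, half_max)
    return {"secreted": secreted, "membrane": 1.0 - secreted}


# Higher CSTF2 shifts the balance toward the secreted form in this toy curve.
for level in (0.25, 1.0, 4.0):
    print(level, isoform_fractions(level))
```

In this sketch, raising the CSTF2 level moves the output toward the secreted form and lowering it moves the output toward the membrane-bound form, which is the only behavior the text supports. The exact shape of the curve is arbitrary.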

Overlap between CSTF-64 and CSTF-64τ

CSTF-64τ, the paralog of cleavage stimulation factor 2, can also identify cleavage sites for alternative polyadenylation, and the functions of the two proteins overlap. Alternative polyadenylation changes the mRNA sequence and can thereby alter the function of the affected gene products. Like CSTF-64, CSTF-64τ consists of an N-terminal RNA recognition motif, a hinge domain, a P/G-rich domain, and a C-terminal domain. The two proteins differ in their affinity for symplekin, a nuclear scaffolding protein required for recruiting other polyadenylation machinery to the cleavage site. Cleavage stimulation factor 2 binds symplekin with higher affinity than CSTF-64τ does. This difference is attributed to the size and presence of the P/G-rich domain in CSTF-64τ, which blocks the hinge domain from binding symplekin. Both proteins bind U/G-rich sequence elements downstream of the cleavage site and play important roles in defining cleavage and polyadenylation sites. The two paralogs also negatively regulate each other.
If cleavage stimulation factor 2 is present in the nucleus, it binds the rest of the cleavage stimulation factor complex and the U/G-rich region. In its absence, CSTF-64τ binds instead if it is present. The paralogs are therefore redundant in their ability to bind both the rest of the cleavage stimulation factor complex and the RNA sequence itself. CSTF-64τ can also regulate alternative cleavage and polyadenylation in the same manner as cleavage stimulation factor 2. However, if both CSTF-64τ and cleavage stimulation factor 2 are depleted in the cell, the rate of cleavage and polyadenylation drops sharply.

Regulation of alternative polyadenylation

Alternative polyadenylation arises when a transcript contains more than one poly(A) signal sequence. A polyadenylation site has a stronger effect on gene expression if it has a higher affinity for processing factors such as the cleavage and polyadenylation specificity factor (CPSF) complex and the cleavage stimulation factor complex. Polyadenylation and cleavage sites are determined in part by both the CPSF complex and cleavage stimulation factor 2, which targets proximal potential sites containing a U/G-rich region capable of binding its RNA recognition motif. If both cleavage stimulation factor 2 and its CSTF-64τ variant are depleted, there is stronger selection for distal polyadenylation sites.
The resulting alternative polyadenylation lengthens or shortens mature mRNAs, altering sequence elements such as the 3' untranslated region (UTR). Variation in alternative polyadenylation has been linked to markers of certain stages of human cancers and other diseases. In colon cancer tissues specifically, cleavage and polyadenylation specificity factor 3 and cleavage stimulation factor 2 are expressed at higher levels than in normal tissue. Four other cleavage and polyadenylation factors were also overexpressed in cancerous tissue, in contrast to the normal regulation of alternative polyadenylation factor gene expression. The exact relation of these post-transcriptional factors to oncogenesis is not yet characterized.
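The effect of site choice on 3' UTR length amounts to simple coordinate arithmetic, sketched below in Python. The function names and all coordinates are hypothetical and do not describe any real transcript. Positions are assumed to be 0-based offsets along a single mature mRNA, with the 3' UTR measured from the end of the stop codon to the cleavage site.

```python
def utr_length(stop_codon_end: int, cleavage_site: int) -> int:
    """Length of the 3' UTR, from the end of the stop codon to the cleavage site."""
    if cleavage_site < stop_codon_end:
        raise ValueError("cleavage site must lie downstream of the stop codon")
    return cleavage_site - stop_codon_end


def classify_switch(stop_codon_end: int, site_before: int, site_after: int) -> str:
    """Describe how the 3' UTR changes when site usage moves between two sites."""
    before = utr_length(stop_codon_end, site_before)
    after = utr_length(stop_codon_end, site_after)
    if after > before:
        return "lengthening"
    if after < before:
        return "shortening"
    return "unchanged"


# Hypothetical transcript: stop codon ends at 1000, proximal site at 1200,
# distal site at 2500.
STOP_END, PROXIMAL, DISTAL = 1000, 1200, 2500
print(utr_length(STOP_END, PROXIMAL), utr_length(STOP_END, DISTAL))
print(classify_switch(STOP_END, PROXIMAL, DISTAL))
```

A shift from the proximal to the distal site, as described for depletion of both CSTF2 family members, corresponds to the "lengthening" case in this sketch.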

Interactions

CSTF2 has been shown to interact with CSTF3, SUB1, SYMPK, BARD1 and BRCA1.