C20orf27


UPF0687 protein C20orf27 is a protein that in humans is encoded by the C20orf27 gene. It is expressed in most human tissues. One study found that the protein regulates the cell cycle, apoptosis, and tumorigenesis by promoting activation of the NF-κB pathway.

Gene

The UPF0687 protein C20orf27 has four other aliases: Chromosome 20 Open Reading Frame 27, Hypothetical Protein LOC54976, C20orf27, and FLJ20550. The gene is located on the minus strand at 20p13 and consists of 7 exons and 12 introns. The most recent annotation places C20orf27 between 3,753,499 bp and 3,768,388 bp on chromosome 20.
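The coordinates above imply the approximate size of the locus. The following is a minimal sketch of that arithmetic, assuming the two positions are 1-based and inclusive (the text does not state the convention); the helper name `span_length` is illustrative.

```python
# Coordinates quoted in the text for C20orf27 on chromosome 20.
# Assumption: both positions are 1-based and inclusive.
GENE_START = 3_753_499
GENE_END = 3_768_388


def span_length(start: int, end: int) -> int:
    """Return the number of bases in the closed interval [start, end]."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


gene_length_bp = span_length(GENE_START, GENE_END)
print(f"C20orf27 locus spans {gene_length_bp:,} bp")
```

Under a 0-based, half-open convention the result would be one base shorter, so the figure should be read as approximate.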

Transcription

Known isoforms

The C20orf27 gene has five transcript isoforms: transcript variants 1, 2, 3, 4, and X1.
Transcript variant 1 is 1327 bases long, has 6 exons, and encodes the longest protein isoform.
Transcript variant 2 is 1252 bases long. Compared to transcript variant 1, it keeps the reading frame and the 6 exons but uses an alternative splice site in the coding region.
Transcript variant 3 is 1706 bases long and has 6 exons. It uses an alternative splice site in the coding region and differs in the 5' UTR, but it maintains the reading frame of transcript variant 1. Despite their difference in size, variants 2 and 3 encode the same protein isoform, which is shorter than the isoform encoded by transcript variant 1.
Transcript variant 4 is 1457 bases long and has 6 exons. Compared to variant 1, it uses an alternative 5'-most exon and an alternative splice site. Because an upstream ORF is predicted to interfere with its translation, transcript variant 4 does not encode any protein.
The information on transcript variant X1 comes from the GRCh38.p13 Primary Assembly. This variant is 1195 bases long, and its number of exons remains unknown.

Proteins

Physical features

Human C20orf27 has three known protein isoforms.
Isoform 1 has 199 amino acid residues, isoform 2 has 174, and isoform X1 has 154. All three isoforms contain the same domain, DUF4517, whose function has not yet been characterized.
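The relationship between the transcript variants and the protein isoforms can be summarized in a small lookup structure. This is only a sketch: the transcript lengths and isoform sizes are taken from the text, but the pairing of variants 2 and 3 with "isoform 2" and of variant X1 with "isoform X1" is an assumption based on the matching names, and `transcripts_by_isoform` is a hypothetical helper.

```python
from collections import defaultdict

# (transcript variant, length in bases, encoded protein isoform or None).
# Lengths are from the text; the variant-to-isoform pairing is an assumption.
TRANSCRIPTS = [
    ("variant 1", 1327, "isoform 1"),
    ("variant 2", 1252, "isoform 2"),
    ("variant 3", 1706, "isoform 2"),
    ("variant 4", 1457, None),  # predicted not to encode a protein
    ("variant X1", 1195, "isoform X1"),
]

# Isoform sizes in amino acid residues, as stated in the text.
ISOFORM_LENGTH_AA = {"isoform 1": 199, "isoform 2": 174, "isoform X1": 154}


def transcripts_by_isoform(transcripts):
    """Group transcript variant names by the protein isoform they encode."""
    groups = defaultdict(list)
    for name, _length_bases, isoform in transcripts:
        groups[isoform].append(name)
    return dict(groups)


groups = transcripts_by_isoform(TRANSCRIPTS)
for isoform, variants in groups.items():
    label = isoform if isoform is not None else "no protein"
    size = ISOFORM_LENGTH_AA.get(isoform)
    suffix = f" ({size} aa)" if size is not None else ""
    print(f"{label}{suffix}: {', '.join(variants)}")
```

Grouping by isoform makes it easy to see that five transcripts collapse onto three proteins, with one transcript predicted to be non-coding.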
The predicted isoelectric point of unmodified protein C20orf27 is 6.89.
The proportion of each amino acid in C20orf27 is close to its average proportion among human proteins. Overall, positively charged residues outnumber negatively charged residues. Protein C20orf27 has no high-scoring hydrophobic regions, no highly charged regions, and no transmembrane regions.
SAPS predicts two repetitive structures. The first is an amino acid alphabet structure with a core block length of 4, which occurs 15 times in human protein C20orf27. The second is an 11-letter reduced alphabet structure with a core block length of 8, which is predicted to occur 8 times. There are no predicted clusters of amino acid multiples.

Post-translational modifications

The predicted molecular weight of C20orf27 is 21.6 kDa. A Western blot of protein C20orf27 with a polyclonal antibody gives an experimental molecular weight of about 22 kDa. The close agreement suggests that the protein carries relatively few post-translational modifications.
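The comparison between the predicted and observed weights can be checked with a back-of-the-envelope calculation. The sketch below assumes the figures refer to the 199-residue isoform 1 and uses the common rule of thumb of roughly 110 Da per amino acid residue; that average is an approximation and not the method used to obtain the 21.6 kDa prediction. The function names are illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue (an approximation).
AVG_RESIDUE_MASS_DA = 110.0


def rough_mass_kda(n_residues: int, avg_residue_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Estimate protein mass in kDa from residue count alone."""
    return n_residues * avg_residue_da / 1000.0


def percent_difference(observed: float, predicted: float) -> float:
    """Relative difference of observed vs. predicted, in percent."""
    return (observed - predicted) / predicted * 100.0


# Assumption: the weights quoted in the text refer to isoform 1 (199 residues).
rough = rough_mass_kda(199)
shift = percent_difference(observed=22.0, predicted=21.6)

print(f"rough estimate from residue count: {rough:.2f} kDa")
print(f"observed vs. predicted shift: {shift:.1f}%")
```

A shift of only a few percent is within the resolution of a typical gel, which is why the Western blot result is read as evidence against heavy modification rather than proof that none exists.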
There is no predicted signal peptide or cleavage site.
There are many predicted phosphorylation sites along the sequence of protein C20orf27, including four sites for protein kinase A, two sites for protein kinase C, three sites for casein kinase 2, one site for ribosomal S6 kinase, one site for cGMP-dependent protein kinase (protein kinase G), and one site for ataxia-telangiectasia mutated (ATM) serine/threonine protein kinase.
Protein C20orf27 is also predicted to have other post-translational modification sites, including five palmitoylation sites, one C-mannosylation site, and two sumoylation sites.

Structure

Three stretches of beta sheet, spanning amino acids 62 to 67, 76 to 87, and 92 to 100, are predicted with the highest confidence by CFSSP and Phyre2. A model predicted by I-TASSER shows that the tertiary structure of human protein C20orf27 is composed largely of beta sheets, which is consistent with the CFSSP and Phyre2 predictions.

Subcellular localization

This protein is expected to be found in the cytosol and the nucleus, but not in nucleoli. Additional computational analysis predicts that it is most likely to be in the cytosol.

Expression

Protein C20orf27 is expressed ubiquitously across human tissues. Microarray-assessed tissue expression patterns suggest that the caudate nucleus has the highest expression of C20orf27.
Besides the caudate nucleus, C20orf27 expression ranks in the top 25% among 100 proteins in the pons, fetal brain, BM-CD105+ endothelial cells, BM-CD34+ cells, bone marrow, adipocytes, uterus corpus, 721 B lymphoblasts, PB-CD56+ NK cells, BM-CD33+ myeloid cells, colorectal adenocarcinoma, chronic myelogenous leukemia K-562, lymphoblastic leukemia, and promyelocytic leukemia HL-60.
In situ hybridization data suggest that the expression of C20orf27 in airway epithelial cells (AECs) may be correlated with chronic lung diseases. When AECs are treated with IL-13, a cytokine expressed by CD4 T helper cells, they begin to secrete excess mucus, and excess mucus secretion in the airway is a hallmark of chronic lung diseases.

Regulation of expression

Gene level expression

There are three promoter regions in gene C20orf27.
Five transcription factors have been found to bind the promoter region of gene C20orf27: MITF, JUN, ZNF282, FOXA1, and TCF7L2.
Genomatix predicts additional transcription factor binding sites. The matrix families predicted with the highest matrix similarity include EGR/nerve growth factor induced protein C and related factors, GC-box factors SP1/GC, Krueppel-like transcription factors, Myc-associated zinc fingers, vertebrate homologues of enhancer of split complex, E-box binding factors, E2F-myc activator/cell cycle regulators, and the BED subclass of zinc-finger proteins.

Transcript level regulation

The predicted miRNA binding sites in the 3' end of C20orf27 mRNA whose sequences are also evolutionarily conserved are those for hsa-miR-7856-5p, hsa-miR-671-5p, hsa-miR-4768, hsa-miR-6791-3p, hsa-miR-6829-3p, hsa-miR-548d-3p, hsa-miR-548-3p, hsa-miR-548z, and hsa-miR-548h-3p.
Three stem loops are conserved across the different predicted models. Counting from the 5' end of C20orf27 mRNA, they span base 1 to base 27, base 56 to base 74, and base 116 to base 130.
The mRNA of C20orf27 has about 23 predicted binding sites for mRNA binding proteins whose sequences are also evolutionarily conserved. These mRNA binding proteins are BRUNOL5, BRUNOL6, PCBP2, TARDBP, MBNL1, CUG-BP, PCBP3, PTBP1, RBM5, SRSF1, HNRNPH2, FMR1, HNRNPF, LIN28A, CPEB4, HNRNPC, HNRNPCL1, HNRNPM, HuR, RALY, PABPC1, PABPC4, SART3, and SRSF10.

Function and clinical significance

Interacting proteins

Interactors of protein C20orf27 found in Y2H screens are replicase polyprotein 1ab from coronavirus and the human proteins RAIYL, PHKB, and FERMT2. Replicase polyprotein 1ab transcribes and replicates viral RNAs, and it contains the proteinases responsible for cleaving the polyprotein. The functions of RAIYL, PHKB, and FERMT2 remain unknown.
Other interactors discovered by pull-down assays include PPP1CA, PPP1CB, PPP1CC, PPP1R7, PSME3, RBFOX2, and DMWD. PPP1CA, PPP1CB, PPP1CC, and PPP1R7 have similar functions: they are involved in the regulation of a variety of cellular processes, such as cell division, glycogen metabolism, muscle contractility, protein synthesis, and HIV-1 viral transcription. PSME3 facilitates the MDM2-p53/TP53 interaction, which promotes ubiquitination- and MDM2-dependent proteasomal degradation of p53/TP53. This limits p53/TP53 accumulation and inhibits apoptosis after DNA damage, and PSME3 might also play a role in cell cycle regulation. RBFOX2 regulates alternative splicing events by binding to 5'-UGCAUGU-3' elements. The function of DMWD is unknown.
Together, these interactions suggest that protein C20orf27 plays a role in cell cycle regulation, cell proliferation and differentiation, and cell survival.

Clinical significance

Human protein C20orf27 and its variants have not been discovered to be associated with any diseases or disorders.

Homology and evolution history

Paralogs

There are no known paralogs.

Orthologs

There are more than 281 known orthologs for this gene, ranging from primates to invertebrates.
The most closely related orthologs are selected from primates and other mammals, with sequence similarity ranging from 75% to 100%. The moderately related orthologs are selected from fishes and birds, with sequence similarity ranging from 55% to 75%. The most distantly related orthologs are selected from invertebrates and Trichoplax, with sequence similarity ranging from 40% to 55%. The conserved amino acids are shown in bold in the conceptual translation diagram.
| Gene name | Genus and species | Taxonomic group | Common name | Accession number | Protein length | Sequence identity | Sequence similarity | MYA |
|---|---|---|---|---|---|---|---|---|
| C20orf27 | Homo sapiens | Primates | Human | NP_001034229.1 | 199 aa | 100% | 100% | 0 |
| C20orf27 | Macaca mulatta | Primates | Old World Monkey | AFE71948.1 | 197 aa | 98.50% | 99% | 29.44 |
| C20orf27 | Mus musculus | Rodentia | House mouse | NP_001298067.1 | 177 aa | 79.40% | 82.90% | 90 |
| C20orf27 | Rhinolophus ferrumequinum | Chiroptera | Greater horseshoe bat | XP_032951391.1 | 174 aa | 82.40% | 85.40% | 96 |
| C20orf27 | Condylura cristata | Eulipotyphla | Star-nosed mole | XP_012583921.1 | 184 aa | 76.60% | 79.90% | 96 |
| C20orf27 | Dromaius novaehollandiae | Casuariiformes | Emu | XP_025975497.1 | 174 aa | 65.30% | 72.90% | 312 |
| C20orf27 | Gopherus evgoodei | Testudines | Gopher Tortoises | XP_030419106.1 | 176 aa | 61.80% | 73.90% | 312 |
| C20orf27 | Strigops habroptila | Psittaciformes | Kākāpō | XP_030348224.1 | 174 aa | 59.30% | 70.40% | 312 |
| C20orf27 | Thamnophis elegans | Scaled reptiles | Western terrestrial garter snake | XP_032094251.1 | 174 aa | 56.80% | 68.30% | 312 |
| C20orf27 | Taeniopygia guttata | Passeriformes | Zebra finch | NP_001232719.1 | 176 aa | 56.70% | 67.50% | 312 |
| C20orf27 | Xenopus tropicalis | Frogs | Western clawed frog | NP_001007504.1 | 174 aa | 59.30% | 72.40% | 351.8 |
| C20orf27 | Scophthalmus maximus | Pleuronectiformes | Turbot | AWP06390.1 | 179 aa | 29.70% | 43.20% | 435 |
| C20orf27 | Callorhinchus milii | Chimaera | Australian ghostshark | XP_007906148.1 | 179 aa | 54.00% | 63.90% | 473 |
| C20orf27 | Petromyzon marinus | Petromyzontiformes | Sea lamprey | XP_032806447.1 | 173 aa | 47.60% | 55.80% | 615 |
| C20orf27 | Anneissia japonica | Comatulida | Comasterids | XP_033124803.1 | 184 aa | 26.60% | 46.30% | 684 |
| C20orf27 | Ixodes scapularis | Ixodida | Deer tick | XP_002403181.1 | 165 aa | 28.80% | 44.20% | 797 |
| C20orf27 | Limulus polyphemus | Xiphosura | Atlantic horseshoe crab | XP_022257482.1 | 173 aa | 28.80% | 44.50% | 797 |
| C20orf27 | Crassostrea gigas | Ostreida | Pacific oyster | XP_011438297.1 | 162 aa | 24.90% | 44.00% | 797 |
| C20orf27 | Drosophila subobscura | Fly | Fruit fly | XP_034657203.1 | 179 aa | 24.70% | 37.70% | 797 |
| C20orf27 | Nematostella vectensis | Sea anemone | Starlet sea anemone | XP_001627979.1 | 169 aa | 30.30% | 43.30% | 824 |
| C20orf27 | Trichoplax | Trichoplax | Trichoplax | RDD38604.1 | 166 aa | 22.3% | 40.0% | 1017 |
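
The similarity ranges described for the orthologs can be applied to the table programmatically. The sketch below copies the species names and sequence similarity values from the table and assigns each ortholog to a band. The band names and the exact cut-offs (75% and 55%, with everything lower treated as distant) are assumptions chosen to mirror the ranges in the text, and `similarity_band` is an illustrative helper.

```python
from collections import Counter

# (species, sequence similarity to the human protein in %), copied from the table.
ORTHOLOGS = [
    ("Homo sapiens", 100.0),
    ("Macaca mulatta", 99.0),
    ("Mus musculus", 82.9),
    ("Rhinolophus ferrumequinum", 85.4),
    ("Condylura cristata", 79.9),
    ("Dromaius novaehollandiae", 72.9),
    ("Gopherus evgoodei", 73.9),
    ("Strigops habroptila", 70.4),
    ("Thamnophis elegans", 68.3),
    ("Taeniopygia guttata", 67.5),
    ("Xenopus tropicalis", 72.4),
    ("Scophthalmus maximus", 43.2),
    ("Callorhinchus milii", 63.9),
    ("Petromyzon marinus", 55.8),
    ("Anneissia japonica", 46.3),
    ("Ixodes scapularis", 44.2),
    ("Limulus polyphemus", 44.5),
    ("Crassostrea gigas", 44.0),
    ("Drosophila subobscura", 37.7),
    ("Nematostella vectensis", 43.3),
    ("Trichoplax", 40.0),
]


def similarity_band(similarity_pct: float) -> str:
    """Assign a band using assumed cut-offs of 75% and 55% similarity."""
    if similarity_pct >= 75.0:
        return "close"
    if similarity_pct >= 55.0:
        return "moderate"
    return "distant"


band_counts = Counter(similarity_band(sim) for _, sim in ORTHOLOGS)
# Entries that fall under the 40% lower bound quoted for distant orthologs.
below_40 = [name for name, sim in ORTHOLOGS if sim < 40.0]

print(dict(band_counts))
print("below 40% similarity:", below_40)
```

Binning the rows this way also highlights entries that sit outside the stated ranges, such as a listed ortholog whose similarity falls below the 40% lower bound.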