C6orf47
C6ORF47 is a gene. In humans, it is on chromosome 6.
Gene
General Information
In humans, chromosome 6 open reading frame 47 (C6ORF47) is a single-exon gene that spans 2,481 nucleotides and encodes a 294-amino-acid protein.
Location
In humans, this gene is located on the minus strand at 6p21.33.
Gene Expression
Human C6ORF47 is expressed ubiquitously across tissues. The gene is also over-expressed in the colon, urinary bladder, ovary, and pancreas. NCBI GEO Profiles shows that C6ORF47 RNA is expressed ubiquitously, ranging from low expression to high expression in a few tissues such as the salivary gland and cerebellum.
Research by Pontus Boström et al. examined C6ORF47 mRNA expression using microarray data from macrophages from 4 healthy donors. The goal of the study was to investigate whether hypoxia influences the accumulation of lipids in macrophages, which would help determine whether the lipid-loaded macrophages in atherosclerotic lesions are there because of the hypoxic regions. Human macrophages exposed to hypoxia for 24 hours showed increased formation of cytosolic lipid droplets and increased triglyceride accumulation. The results indicated that hypoxic regions in atherosclerotic lesions could contribute to the formation of lipid-loaded macrophages and the accumulation of triglycerides. The expression data show that C6ORF47 expression was almost 6 times greater under non-hypoxic conditions than under hypoxic conditions, suggesting that C6ORF47 likely neither contributes to lipid accumulation nor takes part in an essential process, since its expression decreased. Under hypoxic conditions, likely only essential processes remain active, which would explain the decrease in C6ORF47 expression.
Transcription Factors
Below is a short list of transcription factors that bind the promoter region, defined here as the 5' UTR plus 500 nucleotides upstream. Bioline software was used for the double-stranded DNA sequence. The UCSC Genome Browser was used to identify the transcription factors and binding sites listed in the table below.
| Transcription factor | Generalized function |
| --- | --- |
| KLF17, Krüppel-like factor 17 | Regulates gene expression, influencing cell differentiation and development. |
| PROX1, Prospero homeobox 1 | Regulates lymphatic development, cell differentiation, and organogenesis processes. |
| WT1, Wilms' tumor 1 | Regulates kidney development, cell growth, and tissue differentiation processes. |
| GATA1, GATA binding protein 1 | Controls red blood cell development and regulates hematopoiesis processes. |
| THRB, Thyroid hormone receptor beta | Regulates thyroid hormone signaling, influencing metabolism and growth regulation. |
| ZNF454, Zinc Finger Protein 454 | Regulates gene expression, potentially influencing cell differentiation and development. |
| SP9, Specificity Protein 9 | Regulates cartilage development and skeletal patterning during embryogenesis. |
| EGR3, Early Growth Response 3 | Regulates gene expression involved in neuronal activity and immune response. |
| SOX4, SRY-box transcription factor 4 | Regulates cell fate, development, and differentiation in multiple tissues. |
| EBF1, Early B-cell Factor 1 | Regulates B cell differentiation and immune system development. |
| ZNF669, Zinc Finger Protein 669 | Regulates gene expression, potentially involved in development and differentiation. |
| KLF1, Krüppel-like factor 1 | Regulates red blood cell development and hemoglobin expression. |
| STAT3, Signal Transducer and Activator of Transcription 3 | Regulates immune response, cell survival, and inflammation processes. |
| ZIC3, Zinc Finger of the Cerebellum 3 | Regulates brain and heart development, influencing neuronal patterning and function. |
| NHLH2, Nescient Helix-Loop-Helix 2 | Regulates neural differentiation and development, influencing nervous system patterning. |
| ZNF454, Zinc Finger Protein 454 | Involved in transcriptional regulation, potentially affecting gene expression and development. |
| EBF2, Early B-cell Factor 2 | Regulates adipocyte differentiation and energy metabolism, influencing fat tissue development. |
| ZNF42, Zinc Finger Protein 42 | Involved in regulating gene expression and cellular differentiation processes. |
| ERF::FIGLA, ETS2 Repressor Factor and Factor in the Germline Alpha | Transcription factor complex that regulates ovarian development and folliculogenesis. |
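The promoter window described before the table (the 5' UTR plus 500 nucleotides upstream, on the minus strand) can be expressed as a coordinate calculation. The following is a minimal sketch under stated assumptions, not the workflow used for this analysis: the helper names `promoter_window` and `reverse_complement` and the example coordinates are hypothetical, and only the 500-nucleotide upstream length and the minus-strand orientation come from the text.

```python
# Hypothetical helpers; coordinates are 1-based and inclusive.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement, i.e. the minus-strand reading of `seq`."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_window(tss, cds_start, strand, upstream=500):
    """Genomic interval covering the 5' UTR plus `upstream` nucleotides.

    tss:       genomic position of the first transcribed base
    cds_start: genomic position of the first coding base
    """
    if strand == "-":
        # On the minus strand, "upstream" means higher genomic coordinates,
        # and the 5' UTR runs from the TSS down to just before the CDS.
        return (cds_start + 1, tss + upstream)
    if strand == "+":
        return (tss - upstream, cds_start - 1)
    raise ValueError("strand must be '+' or '-'")


# Made-up coordinates for illustration only (not the real C6ORF47 locus).
start, end = promoter_window(tss=10_000, cds_start=9_800, strand="-")
window_length = end - start + 1  # 200 nt of 5' UTR + 500 nt upstream
```

For a minus-strand gene, the sequence fetched for this interval from the reference (plus) strand would then be passed through `reverse_complement` so that it reads 5' to 3' in the direction of transcription.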
Single-Nucleotide Polymorphisms (SNPs)
The accompanying table illustrates 3 SNPs that occur within the CDS, 5' UTR, and 3' UTR. These SNPs were found using Variation Viewer and were chosen for their location within the C6ORF47 gene. Variation Viewer showed no pathogenic SNPs, only large deletions that include many genes.
Protein
Basic Information
- The encoded protein weighs 31,579 daltons.
- EMBL-EBI SAPS found the human C6ORF47 protein to have an isoelectric point of 5.95.
- C6ORF47 protein was shown to be slightly more abundant than half of the proteins present in the human body.
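As a quick plausibility check on the figures above, the reported molecular weight can be divided by the protein length to give the average mass per residue. This is only a sketch of the arithmetic. The rule-of-thumb value of roughly 110 Da per residue for a typical protein is an outside assumption and does not come from the tools cited here.

```python
# Values taken from the article text.
MOLECULAR_WEIGHT_DA = 31_579  # reported protein mass in daltons
LENGTH_AA = 294               # reported protein length in amino acids

# Assumption: a commonly quoted rule of thumb for average residue mass.
TYPICAL_RESIDUE_MASS_DA = 110

avg_residue_mass = MOLECULAR_WEIGHT_DA / LENGTH_AA
estimated_mass = LENGTH_AA * TYPICAL_RESIDUE_MASS_DA

print(f"average residue mass: {avg_residue_mass:.1f} Da")
print(f"rule-of-thumb mass for {LENGTH_AA} residues: {estimated_mass} Da")
```

The computed average of about 107.4 Da per residue is close to the rule-of-thumb value, so the reported mass and length are mutually consistent.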
Family
The C6ORF47 protein belongs to the family of proteins referred to as MHC proteins, which are encoded in the MHC region, a band on the short arm of chromosome 6 at 6p21.3 that spans 3.6 megabases. The generalized function of MHC molecules is to bind peptide fragments from pathogens and display them on the cell surface for recognition by T cells. The C6ORF47 protein is considered an MHC class III protein. MHC class III proteins are poorly defined structurally and functionally. The MHC class III region is noted to contain genes for cytokines and heat shock proteins. It was recently found that genes encoded in the telomeric part of the MHC class III region appear to be involved in specific and global inflammatory responses.
Primary
Human C6ORF47 mRNA encodes a 294-amino-acid protein. SAPS showed that the C6ORF47 protein is enriched in leucine, proline, and glycine compared with other human proteins. It also showed a significantly lower amount of isoleucine, as well as lower amounts of valine, tyrosine, threonine, phenylalanine, and asparagine, than is typical of human proteins. Motif Scan found repeats of leucine residues spaced seven amino acids apart, as in a basic leucine zipper, and these repeats are conserved in mammalian orthologs of the C6ORF47 protein.
Secondary
PredictProtein predicted that the secondary structure of the human C6ORF47 protein is 35.4% helix, 2.4% strand, and 62.2% loop.
Tertiary
The PSORT II prediction tool showed three transmembrane segments at amino acids 182-198, 222-238, and 246-262 of the human C6ORF47 protein. All of the mammalian orthologs examined show similar transmembrane regions, except the platypus.
Because the other C6ORF47 orthologs are mostly much shorter than the mammalian sequences, the predicted cleavage site is usually slightly higher, while the transmembrane segments vary with the length of the protein sequence. One or two transmembrane segments were found in reptiles, one of the two amphibians, and one fish ortholog, but three transmembrane segments remain by far the most common arrangement among orthologs.
PSORT II predicted that the C6ORF47 protein is localized to the endoplasmic reticulum. DeepLoc software supports this prediction, giving a probability of about 86.12% for endoplasmic reticulum localization. It also supports the earlier predictions by PSORT II and SOSUI that the human C6ORF47 protein is a transmembrane protein.
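The predicted transmembrane segments above can be captured in a small data structure to derive segment lengths, the loops between segments, and whether a given residue falls inside a segment. This is a minimal sketch: the `Segment` type and the helper functions are hypothetical names introduced for illustration, and only the residue ranges and the protein length come from the text.

```python
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# PSORT II predictions for the human protein, as reported in the text.
TM_SEGMENTS = [Segment(182, 198), Segment(222, 238), Segment(246, 262)]
PROTEIN_LENGTH = 294


def segment_containing(position: int) -> Optional[Segment]:
    """Return the transmembrane segment containing `position`, if any."""
    for seg in TM_SEGMENTS:
        if seg.start <= position <= seg.end:
            return seg
    return None


def loops_between(segments: List[Segment]) -> List[Segment]:
    """Return the stretches lying between consecutive segments."""
    ordered = sorted(segments)
    return [Segment(a.end + 1, b.start - 1) for a, b in zip(ordered, ordered[1:])]


for seg in TM_SEGMENTS:
    print(f"TM {seg.start}-{seg.end}: {seg.length} residues")
for loop in loops_between(TM_SEGMENTS):
    print(f"loop {loop.start}-{loop.end}: {loop.length} residues")
```

Each predicted segment is 17 residues long, and all three lie in the C-terminal half of the 294-residue protein.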
Post-Translational Modifications
Phosphorylation sites at amino acids 34, 35, 71, and 90 of the human C6ORF47 protein are experimentally verified, according to NCBI. Sites 34 and 35 are predicted to be phosphorylated by casein kinase II.
Endoplasmic reticulum signals ensure the protein remains in the endoplasmic reticulum, aiding proper folding, quality control, and trafficking.
Sumoylation attaches SUMO proteins to targets, regulating nuclear transport, transcription, DNA repair, and protein stability. Sumoylation was found at amino acids 75, 114, and 147.
O-linked β-N-acetylglucosamine modifies serine/threonine residues, dynamically regulating signaling, transcription, and protein-protein interactions. It was found at amino acid 60.
Interactions
FGFR3: An interaction between C6ORF47 and FGFR3 was found via a two-hybrid assay with a medium average detection confidence. The interaction is listed in the BioGRID interaction database and derives from an August 2022 large-scale dataset in which this interaction was scored individually and all other interactions globally.
Fibroblast growth factor receptor 3 (FGFR3) is part of the fibroblast growth factor receptor family, whose members share similar structures and functions. FGFR3 spans the cell membrane, with one end remaining inside the cell while the other end projects from the outer surface of the cell. It is a tyrosine-protein kinase that acts as a cell-surface receptor for fibroblast growth factors and plays an essential role in cell proliferation, angiogenesis, differentiation, and apoptosis. FGFR3 interacts with growth factors outside the cell and receives signals that regulate growth and development within the cell. It is also known to play an important role in cartilage development in the growth plate.
Homology
Orthologs
The C6ORF47 gene is estimated to have first appeared approximately 563 million years ago in lampreys. C6ORF47 was found in ray-finned fish, cartilaginous fish, lampreys, and lobe-finned fish, but not in hagfish, suggesting that the gene may have been inserted in lampreys. C6ORF47 is restricted to vertebrates, with no trace of it before vertebrates; lampreys are the most distant lineage in which it was found.
A global alignment of the human C6ORF47 protein with the sharpnose sevengill shark C6ORF47 protein showed two noticeably large gaps corresponding to amino acids 44-62 and 153-173 of the human protein. These gaps were present in all vertebrate lineages until rodents and rabbits. A second global alignment, of the human C6ORF47 protein with the Pacific pocket mouse C6ORF47 protein, shows that these gaps are no longer present, indicating a possible insertion of these regions into the protein in mammals. Notably, the Pacific pocket mouse C6ORF47 protein was one of the least related rodent sequences in the orthologs table, and it still showed neither of these 2 large gaps when aligned with the human C6ORF47 protein sequence.
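Gap regions like those described above can be located by scanning a pairwise alignment for runs of gap characters in one sequence and reporting them in the residue coordinates of the other. The following is a minimal sketch, not the alignment tool used for this analysis: `gap_runs` is a hypothetical helper and the two aligned strings are a toy example, not the real human or shark sequences. Only the two human residue ranges come from the text.

```python
def gap_runs(aligned_query, aligned_ref, gap="-"):
    """Return (start, end) reference positions (1-based, inclusive) for each
    run of columns where the query has a gap opposite a reference residue."""
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have equal length")
    runs = []
    ref_pos = 0       # number of reference residues consumed so far
    run_start = None  # reference position where the current run began
    for q, r in zip(aligned_query, aligned_ref):
        if r != gap:
            ref_pos += 1
        query_gap = q == gap and r != gap
        if query_gap and run_start is None:
            run_start = ref_pos
        elif not query_gap and run_start is not None:
            # The run ended on the last reference residue before this column.
            end = ref_pos - 1 if r != gap else ref_pos
            runs.append((run_start, end))
            run_start = None
    if run_start is not None:
        runs.append((run_start, ref_pos))
    return runs


# Toy alignment for illustration only.
toy_ref = "MKTLLPGAAS"
toy_query = "MK---PG-AS"
toy_runs = gap_runs(toy_query, toy_ref)

# Lengths of the two regions reported for the human protein.
HUMAN_GAP_REGIONS = [(44, 62), (153, 173)]
gap_lengths = [end - start + 1 for start, end in HUMAN_GAP_REGIONS]
```

For the reported human coordinates, the two regions are 19 and 21 residues long.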
Paralogs
No paralogs were found for the human C6ORF47 gene.
Conserved Regions
The promoter region was found to have many stretches of nucleotides that are conserved across mammalian orthologs, including transcription factor binding sites for at least one SP9 site, NHLH2 and ERF::FIGLA, ZNF454, EBF1 and EBF2, NR5A2, ZNF423, STAT3, and ZNF42.
Multiple sequence alignments of C6ORF47 orthologs showed that many amino acids on the C-terminal side of the protein are conserved, while there is much less conservation on the N-terminal side. This is likely due to the protein containing a large disordered region on the N-terminal side.
The 3' UTR was found to contain 9 conserved areas. The table below lists all conserved areas that were found for C6ORF47.