C16orf58


Chromosome 16 open reading frame 58 (C16orf58), also known as FLJ13638, is a protein that in humans is encoded by the C16orf58 gene. The gene is 18,892 bp long, with an mRNA of 2,760 bp and a protein sequence of 468 amino acids. The protein contains a conserved domain of unknown function, DUF647. No function has been determined for this gene yet, but the protein is predicted to reside in the endoplasmic reticulum.

Species distribution

C16orf58 is widely conserved, with orthologs found as far back as plants and fungi. However, no ortholog has been found in reptiles, birds, or amphibians. The table below lists some, but not all, of the hits found using BLAST; the low-identity hits with large E-values, such as those for chicken and frog, are not significant matches.
| Species | Organism common name | Sequence identity | E-value | Length (aa) | Gene common name |
| Homo sapiens | Human | 100% | 0.0 | 468 | C16orf58 |
| Equus caballus | Horse | 85% | 0.0 | 468 | PREDICTED: similar to UPF0420 protein C16orf58 |
| Canis familiaris | Dog | 85% | 0.0 | 485 | similar to CG10338-PA |
| Mus musculus | Mouse | 81% | 0.0 | 466 | cDNA sequence BC017158 |
| Monodelphis domestica | Opossum | 65% | 3e−160 | 466 | PREDICTED: hypothetical protein |
| Danio rerio | Zebrafish | 53% | 4e−112 | 432 | hypothetical protein LOC555936 |
| Drosophila melanogaster | Fly | 40% | 3e−69 | 395 | CG10338 |
| Arabidopsis thaliana | Thale cress | 37% | 2e−68 | 403 | Contains similarity to CG10338 gene product from Drosophila melanogaster |
| Gallus gallus | Chicken | 25% | 0.36 | 1434 | protein tyrosine phosphatase, receptor type, U |
| Xenopus tropicalis | Frog | 31% | 3.4 | 268 | Stk19 protein |
| Saccharomyces cerevisiae | Yeast | 25% | 0.21 | 1578 | YDL140Cp-like protein |
| Caenorhabditis elegans | Nematode | 19% | 3.0 | 414 | hypothetical protein M18.6 |
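As a rough illustration of how the E-values in the table separate strong matches from weak ones, the sketch below splits the listed hits at a cutoff. The cutoff of 1e-10 and the helper name `split_hits` are assumptions chosen for illustration; the source does not state what threshold, if any, was used.

```python
# (species, percent identity, E-value) as read from the table above.
HITS = [
    ("Homo sapiens", 100, 0.0),
    ("Equus caballus", 85, 0.0),
    ("Canis familiaris", 85, 0.0),
    ("Mus musculus", 81, 0.0),
    ("Monodelphis domestica", 65, 3e-160),
    ("Danio rerio", 53, 4e-112),
    ("Drosophila melanogaster", 40, 3e-69),
    ("Arabidopsis thaliana", 37, 2e-68),
    ("Gallus gallus", 25, 0.36),
    ("Xenopus tropicalis", 31, 3.4),
    ("Saccharomyces cerevisiae", 25, 0.21),
    ("Caenorhabditis elegans", 19, 3.0),
]

# Assumption: a hypothetical cutoff, not one taken from the source.
E_VALUE_CUTOFF = 1e-10


def split_hits(hits, cutoff):
    """Split hits into (significant, weak) lists by comparing E-value to cutoff."""
    significant = [h for h in hits if h[2] <= cutoff]
    weak = [h for h in hits if h[2] > cutoff]
    return significant, weak


significant, weak = split_hits(HITS, E_VALUE_CUTOFF)
print("significant:", [species for species, _, _ in significant])
print("weak:", [species for species, _, _ in weak])
```

With this cutoff the chicken, frog, yeast, and nematode hits fall into the weak group, while the mammalian, zebrafish, fly, and thale cress hits are retained.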

Protein interactions

Although its function is still unknown, C16orf58 has been shown to interact with three different proteins.

Structure

Although several tools can predict protein structure, C16orf58 does not yet have a known structure. The protein is predicted to have at least one transmembrane domain, and possibly more. The sequence contains several extended stretches of uncharged amino acids, which could be transmembrane domains or hydrophobic cores. The map below shows the charge of each amino acid in the protein sequence: + for positive, - for negative, and 0 for uncharged. The number at the start of each row is the position of the first residue in that row. Note the large segments of uncharged amino acids, which are conserved in distant orthologs.
1 00--000-00 000-00000- 0+00+000-0 0000-0000+ 00000+0000 +0-0+-00-0
61 0000000000 0000000000 000-0000-0 000000-000 0000000000 0000000000
121 0000+00000 0000000+-0 00000+0000 00+00+0-00 0+00+000-0 00-00000-0
181 0000000000 000000000+ 0000000000 +00000000+ +0000-000+ -000-00000
241 0000000000 0000000000 0000000000 000000+00+ 0000-000-0 +0+000+000
301 0+0-00-000 00+0-00000 0000000000 0000+00000 0-00000-00 0-000000-0
361 0000000000 0+000+000+ 0000000000 000-00000- 0--0+0+0+0 00++-00000
421 +-00-00-00 00+00+000- 000+0-+000 -00-0+0000 000-++00
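A charge map of this kind can be produced from any protein sequence and then scanned for long uncharged runs. The sketch below is a minimal illustration, not the method used for the map above: it assumes lysine and arginine count as positive, aspartate and glutamate as negative, and every other residue (including histidine) as uncharged, since the source does not state its scheme. The function names, the 20-residue minimum, and the toy peptide are all hypothetical.

```python
import re

# Assumption: K/R positive, D/E negative, all other residues uncharged.
POSITIVE = set("KR")
NEGATIVE = set("DE")


def charge_map(sequence):
    """Return one character per residue: '+', '-' or '0'."""
    chars = []
    for residue in sequence.upper():
        if residue in POSITIVE:
            chars.append("+")
        elif residue in NEGATIVE:
            chars.append("-")
        else:
            chars.append("0")
    return "".join(chars)


def uncharged_runs(charges, min_length=20):
    """Return (start, end, length) for each run of '0' at least min_length long.

    Positions are 1-based and inclusive, matching the row labels of the map.
    """
    runs = []
    for match in re.finditer(r"0+", charges):
        length = match.end() - match.start()
        if length >= min_length:
            runs.append((match.start() + 1, match.end(), length))
    return runs


# Hypothetical toy peptide, NOT the C16orf58 sequence.
toy = "MKDE" + "A" * 22 + "RK"
toy_map = charge_map(toy)
print(toy_map)
print(uncharged_runs(toy_map))
```

A long uncharged run is only a hint of a possible membrane-spanning or buried segment; a hydropathy-based predictor would be needed to support that interpretation.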