Transmembrane protein 251


Transmembrane protein 251, also known as C14orf109 or UPF0694, is a protein that in humans is encoded by the TMEM251 gene. One notable feature of this protein is the presence of proline residues in one of its predicted transmembrane domains; such residues are a determinant of the intramitochondrial sorting of inner membrane proteins.

Gene

The TMEM251 gene, also known as lysosomal enzyme trafficking factor (LYSET), is located on the plus strand of human chromosome 14, at 14q32.12. The gene is 1,277 base pairs long. It contains 3 distinct introns, and transcription produces six different mRNAs that appear to differ by truncation of the 3' end. Two transcript variants encode the TMEM251 protein, the longer being 169 base pairs in length and the shorter 131 base pairs. The first transcript variant encodes a shorter predicted protein, while the second encodes a protein with a longer N-terminus. Both consist of two exons that include the entire coding sequence for the TMEM251 protein.

Promoter

According to Genomatix's ElDorado program, the promoter region of TMEM251 is predicted to be 680 base pairs in length. The promoter region starts 500 base pairs upstream of the 5’ UTR of the TMEM251 mRNA transcript and contains part of this 5’ UTR.

Transcription Factors

Various transcription factors are predicted to bind within the conserved parts of the promoter region, on both the plus and minus strands. The transcription factors with the highest matrix scores include NKX homeodomain factors, GATA-binding factors, two-handed zinc finger, E2F transcription factor, and T-box transcription factors. No vertebrate TATA binding protein factors, RNA polymerase transcription factor II B, or CCAAT enhancer binding proteins were found.

Protein

The TMEM251 protein is 169 amino acids in length. Its molecular weight is 18,747 daltons, and its isoelectric point is 8.38. It is classified as a type IV multi-pass membrane protein because it spans the membrane twice in alpha-helical configuration, with its N-terminal domains targeted to the lumen. The TMEM251 protein contains a domain of unknown function, part of the domain family DUF4583, spanning amino acids 35 to 160. TMEM251 has two isoforms, TMEM251.1 and TMEM251.2.

Composition

Leucine is the most abundant amino acid by volume. TMEM251 has a very low abundance of cysteine, asparagine, and aspartic acid. It has one negative charge cluster, from amino acids 67 to 82. No repeats are identified. The same patterns are observed in TMEM251's primate orthologs.
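
The kind of composition analysis described above can be sketched in a few lines of Python. This is only an illustration: the function name and the toy sequence are hypothetical, the sequence is not the real TMEM251 sequence, and the sketch counts residues by frequency rather than by any volume-weighted measure.

```python
from collections import Counter


def amino_acid_composition(sequence: str) -> dict[str, float]:
    """Return the fraction of each residue in a one-letter protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    total = len(seq)
    # most_common() orders residues from most to least frequent.
    return {aa: n / total for aa, n in counts.most_common()}


# Hypothetical toy sequence, NOT the real TMEM251 sequence.
toy_sequence = "MLLLALLPLLVDEEKR"
composition = amino_acid_composition(toy_sequence)
most_abundant = max(composition, key=composition.get)
```

Applied to a real sequence retrieved from a protein database, the same dictionary would show which residues are over- or under-represented.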

Tissue expression

In the human body, microarray-assessed tissue expression patterns show TMEM251 to be highly expressed in ascites, bladder, bone, embryonic tissue, intestine, and skin. In terms of clinical relevance, TMEM251 is expressed in breast carcinoma, dendritic cell line, hepatocellular carcinoma, neuroblastoma, glioblastoma, adult B-acute lymphoblastic leukemia, and blood mononuclear tissues. Over-expression of the TMEM251 gene has not been linked as a causal factor in any of these disease states.
The conditions under which TMEM251 expression rises include occupational benzene exposure, acute cold exposure, macular degeneration and dermal fibroblast, and asthma. These microarray-assessed samples have a low percentage rank on NCBI GEO. The conditions under which TMEM251 expression falls include infantile-onset Pompe disease, caseous tuberculosis granulomas, and endurance exercise training. These samples have a relatively high percentage rank.

Homology and Evolution

TMEM251 has no paralogs in humans. It does have orthologs in other eukaryotes. Orthologs have been found in vertebrates, from primates to fish, but not in bacteria, plants, or fungi. The following table shows a small selection of orthologs found using BLAST and BLAT searches, sorted by percent identity. It is by no means a comprehensive list, but it does show the diversity of species in which TMEM251 orthologs are found.
Genus and species              Common name             Date of divergence  Length  Identity  E-value  Notes
Pan troglodytes                Chimpanzee              6.3 MYA             169 aa  99%       1e-121   5’ and 3’ are not truncated
Nomascus leucogenys            Gibbon                  20.4 MYA            169 aa  97%       1e-119   5’ truncated
Pteropus alecto                Black flying bat        94.2 MYA            175 aa  97%       1e-119   5’ truncated
Dasypus novemcinctus           Armadillo               104.2 MYA           163 aa  97%       3e-115   5’ truncated
Canis lupus familiaris         Dog                     94.2 MYA            163 aa  97%       7e-115   5’ truncated
Odobenus rosmarus divergens    Walrus                  94.2 MYA            169 aa  96%       3e-118   5’ truncated
Ictidomys tridecemlineatus     Ground squirrel         92.3 MYA            169 aa  96%       3e-118   5’ truncated
Tinamus guttatus               White-throated tinamou  296 MYA             131 aa  92%       1e-85    5’ truncated
Xenopus (Silurana) tropicalis  Western clawed frog     371.2 MYA           130 aa  85%       2e-81    5’ truncated
Corvus cornix cornix           Hooded crow             296 MYA             171 aa  84%       4e-91    5’ truncated
Danio rerio                    Zebrafish               not given           141 aa  69%       9e-63    5’ truncated
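
As a rough illustration of what the identity column measures, the sketch below computes percent identity from two already-aligned sequences. The function name and the toy alignment are hypothetical, and the convention used here (ignoring gapped columns) is an assumption; alignment tools such as BLAST may instead divide by the full alignment length, so values can differ slightly.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of identical residues over the ungapped columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where neither sequence has a gap character.
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment, not real TMEM251 ortholog data.
toy_identity = percent_identity("MLLA-LP", "MLIAGLP")
```

The toy alignment has six ungapped columns, five of which match.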

The TMEM251 gene is estimated to have arisen around 400 million years ago, since the most distant orthologs are found in fish, which diverged from the human lineage at around that time. The size of the gene family, a set of similar genes formed by duplication of an original gene, is around 120 genes. Gene duplication, resulting in paralogous genes, is estimated to have occurred approximately 371.2 million years ago.

Post-Translational Modifications

Using various tools at ExPASy, the following are possible post-translational modifications for TMEM251:
All post-translational modifications are conserved in vertebrates.

Protein Secondary Structure

Using various tools at ExPASy, TMEM251 secondary structure consists of the following:
It is predicted to have two transmembrane helices, each 23 amino acids in length. The average hydrophobicity is predicted to be 0.19.
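
An average hydrophobicity value of this kind can be approximated by averaging a per-residue hydropathy scale over the sequence. The sketch below assumes the Kyte-Doolittle scale, which may not be the scale the prediction tool used, and scans 23-residue windows to locate the most hydrophobic stretch. The function names and the toy sequence are hypothetical, and the toy sequence is not the real TMEM251 sequence.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def average_hydropathy(sequence: str) -> float:
    """Mean Kyte-Doolittle hydropathy over the whole sequence."""
    seq = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def window_means(sequence: str, window: int = 23) -> list[float]:
    """Mean hydropathy of every full-length window, in sequence order."""
    seq = sequence.upper()
    if window > len(seq):
        return []
    return [
        average_hydropathy(seq[start:start + window])
        for start in range(len(seq) - window + 1)
    ]


# Hypothetical toy sequence: a 23-residue leucine stretch flanked by charged residues.
toy_sequence = "KDE" + "L" * 23 + "KDE"
means = window_means(toy_sequence, window=23)
best_start = means.index(max(means)) + 1  # 1-based start of the most hydrophobic window
```

Dedicated predictors combine such hydropathy profiles with other features, so this window scan should be read as a simplified stand-in rather than a reproduction of their method.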


Figure 3: TMEM251 predicted secondary structure from SOSUI.

Mutation

TMEM251 has a multitude of mutations in its 5'UTR, coding sequence, and 3'UTR. The majority of the mutations observed are missense mutations.