FAM199X


Family with sequence similarity 199, X-linked is a protein that in humans is encoded by the FAM199X gene. The gene has orthologs in most vertebrates, including most mammals, birds, amphibians, and fish, with some more distant homologs in invertebrates. In humans, it is most highly expressed in the brain and thyroid. The gene has been linked to some genetic disorders, such as Pelizaeus–Merzbacher disease, and to some cancers, such as stomach cancer, but the role of FAM199X in those diseases is not yet well understood.

Gene

FAM199X is located on the plus strand of the long arm of the X chromosome at Xq22.2. It spans approximately 30,000 bases and contains six exons. The gene lies next to an enhancer called LOC130068517, also known as ATAC-STARR-Seq Lymphoblastoid Active Region 29826.

Expression

Expression is ubiquitous and consistently high across many tissues. It is highest in the cerebellum, followed by tissues related to hormone secretion, including the thyroid, prostate, and kidney. These results were checked against distant orthologs of FAM199X, which had similar expression profiles. Expression was especially high in the cerebellum, thalamus, and epididymis, and below expected levels in adipose, bladder, heart, liver, fetal lung, skin, ileum, and stomach tissue. Within the cell, FAM199X is expressed in the nucleus and endoplasmic reticulum. Six eQTLs were identified within the promoter sequence, and half of them were related to the thyroid or respiratory system.

Protein Localization

FAM199X is localized to the nucleus and cytoplasm.

mRNA

Four transcript variants of FAM199X produce two protein isoforms. The four transcript variants are FAM199X-X1 variant 1 with 7498 nucleotides, FAM199X-X1 variant 2 with 7495 nucleotides, FAM199X-X2 variant 3 with 7179 nucleotides, and FAM199X-X2 variant 4 with 7171 nucleotides. There are six exons in FAM199X-X1 variants and five exons in FAM199X-X2 variants.
FAM199X has two isoforms, each annotated with six exons and each encoded by two transcript variants. Isoform X1 is 345 amino acids long, while isoform X2 is 205 amino acids long.
The 3' untranslated region of FAM199X is unusually large, spanning 6124 nucleotides.
Transcript variant | Accession # | mRNA length (nt) | Exons | Protein isoform | Accession # | Protein length (aa) | Isoelectric point
Variant 1 | XM_005262079.4 | 7495 | 6 | Isoform X1 | XP_005262136.1 | 345 | 4.84
Variant 2 | XM_054326467.1 | 7498 | 6 | Isoform X1 | XP_054182442.1 | 345 | 4.84
Variant 3 | XM_047441826.1 | 7179 | 6 | Isoform X2 | XP_047297782.1 | 205 | 9.07
Variant 4 | XM_054326468.1 | 7171 | 6 | Isoform X2 | XP_054182443.1 | 205 | 9.07
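
As a rough illustration of how large the 3' untranslated region is relative to the rest of the transcript, the following Python sketch compares the 6124-nucleotide 3' UTR with the transcript length listed for XM_005262079.4 and with the coding length implied by the 345-residue isoform X1. The coding length assumes three nucleotides per residue plus one stop codon, and all variable and function names are illustrative only.

```python
# Values quoted in this section.
TRANSCRIPT_NT = 7495  # transcript length listed for XM_005262079.4
UTR3_NT = 6124        # 3' UTR length stated in the text
PROTEIN_AA = 345      # length of isoform X1


def cds_length(protein_aa: int) -> int:
    """Coding length in nucleotides: 3 per residue plus one stop codon (assumption)."""
    return 3 * protein_aa + 3


def share(part_nt: int, whole_nt: int) -> float:
    """Fraction of the transcript occupied by a region."""
    return part_nt / whole_nt


cds_nt = cds_length(PROTEIN_AA)
utr3_share = share(UTR3_NT, TRANSCRIPT_NT)
cds_share = share(cds_nt, TRANSCRIPT_NT)

print(f"Coding sequence: {cds_nt} nt ({cds_share:.1%} of transcript)")
print(f"3' UTR:          {UTR3_NT} nt ({utr3_share:.1%} of transcript)")
```

Under these assumptions the 3' UTR accounts for most of the transcript, while the coding sequence is a comparatively small share.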

Evolutionary History

Homologs

FAM199X has several highly conserved orthologs among mammals, birds, reptiles, amphibians, and fish, and less conserved orthologs in other chordates and arachnids. The most distant ortholog detected is in the common house spider, Parasteatoda tepidariorum.

Paralogs

FAM199X has no paralogs.

Evolution

FAM199X is estimated to have arisen around 708 million years ago, which is when the species with the oldest known ortholog, Parasteatoda tepidariorum, diverged from the human lineage. The evolution of FAM199X was slow, with a rate of protein divergence close to that of cytochrome c, a highly conserved protein.

Genus and species | Common name | Taxonomy | Date of divergence (MYA) | Accession # | Sequence length (aa) | Identity (%) | Similarity (%)
Homo sapiens | Human | Primates: Great Apes | 0 | NM_207318.4 | 388 | 100 | 100
Macaca mulatta | Indochinese rhesus macaque | Primates: Old World Monkey | 28.8 | NP_001180862.1 | 388 | 100 | 100
Plecturocebus cupreus | Coppery titi monkey | Primates: New World Monkey | 43 | KAL0588686.1 | 423 | 99 | 100
Mus musculus | House mouse | Mammals: Rodent | 87 | NP_666373.1 | 388 | 98 | 97
Pteropus vampyrus | Large flying fox | Mammals: Chiroptera/Megabat | 94 | XP_011379288.1 | 388 | 99 | 99
Ornithorhynchus anatinus | Platypus | Mammals: Monotremes | 180 | XP_028923647.1 | 390 | 93 | 95
Chelydra serpentina | Common snapping turtle | Reptiles: Testudines/Turtles | 319 | KAG6937128.1 | 381 | 90 | 92
Eublepharis macularius | Leopard gecko | Reptiles: Squamata | 319 | XP_054853091.1 | 384 | 87 | 89
Gallus gallus | Red junglefowl | Birds: Aves/Galliformes | 319 | XP_003641135.2 | 381 | 90 | 92
Ranitomeya variabilis | Zimmerman's poison frog | Amphibian: Anura | 352 | XP_077141317.1 | 375 | 89 | 92
Erpetoichthys calabaricus | Reedfish | Fish: Ray-finned | 429 | XP_028671530.1 | 377 | 85 | 89
Collichthys lucidus | Spinyhead croaker | Fish: Ray-finned | 429 | TKS72713.1 | 396 | 78 | 83
Leucoraja erinaceus | Little skate | Fish: Cartilaginous | 462 | XP_055499985.1 | 378 | 81 | 86
Pristis pectinata | Smalltooth sawfish | Fish: Cartilaginous | 462 | XP_051876775.1 | 378 | 81 | 86
Lethenteron reissneri | Asiatic brook lamprey | Jawless Vertebrate: Petromyzontida | 563 | XP_061419915.1 | 441 | 59 | 69
Branchiostoma lanceolatum | Common lancelet | Invertebrate: Cephalochordata | 581 | CAH1254916.1 | 356 | 33 | 51
Nematostella vectensis | Starlet sea anemone | Invertebrate: Cnidaria | 685 | XP_001632934.1 | 344 | 31 | 44
Ixodes scapularis | Deer tick | Invertebrate: Arachnida | 686 |  | 353 | 31 | 49
Magallana gigas | Pacific oyster | Invertebrate: Mollusk | 708 | XP_011447698.3 | 317 | 30 | 47
Parasteatoda tepidariorum | Common house spider | Invertebrate: Arachnida | 708 | XP_042905439.1 | 273 | 28 | 44
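
The rate of protein divergence mentioned above can be explored from the ortholog table. The Python sketch below is a minimal illustration, not the method used to produce the figures in this article: it assumes a Poisson correction for multiple substitutions at the same site, m = -ln(identity) × 100, applied to a subset of the divergence dates and identities listed in the table. The article does not say which correction was used, and no cytochrome c values are included here because none are given.

```python
import math

# (species, million years since divergence from humans, percent identity to human)
# Values copied from the ortholog table; only a subset of rows is shown.
ORTHOLOGS = [
    ("Mus musculus", 87, 98),
    ("Ornithorhynchus anatinus", 180, 93),
    ("Gallus gallus", 319, 90),
    ("Ranitomeya variabilis", 352, 89),
    ("Erpetoichthys calabaricus", 429, 85),
    ("Leucoraja erinaceus", 462, 81),
    ("Lethenteron reissneri", 563, 59),
    ("Branchiostoma lanceolatum", 581, 33),
    ("Parasteatoda tepidariorum", 708, 28),
]


def corrected_divergence(identity_pct: float) -> float:
    """Poisson-corrected divergence (assumed formula): changes per 100 residues."""
    return -math.log(identity_pct / 100.0) * 100.0


for species, mya, identity in ORTHOLOGS:
    m = corrected_divergence(identity)
    print(f"{species:28s} {mya:4d} MYA  identity {identity:3d}%  m = {m:6.1f}")
```

Plotting m against the date of divergence, alongside the same calculation for a reference protein, is one common way to compare how quickly different proteins have changed over time.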

Protein

The protein contains 388 amino acids and has a molecular weight of about 43 kDa with an isoelectric point of 4.95. There are two protein isoforms of FAM199X, FAM199X-X1 and FAM199X-X2. FAM199X-X1 is 345 amino acids long with a molecular weight of 38.61 kDa, and FAM199X-X2 is 205 amino acids long with a molecular weight of 22.8 kDa. FAM199X has a protein motif for cytomegalovirus protein US29. It also contains cleavage sites for N-arginine dibasic convertase, MAPK, and BRCA1. N-arginine dibasic convertase is an enzyme found in the brain that converts prohormones to hormones, but it has not been extensively studied. MAPK and BRCA1 have both been implicated in cancer; BRCA1 acts as a tumor suppressor, and mutations in it can increase the risk of some cancers. FAM199X is also rich in serine, with a serine content about two standard deviations above that of other human proteins.
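
The molecular weights quoted above can be sanity-checked from sequence length alone. The Python sketch below assumes a rule-of-thumb average residue mass of about 110 Da, which is a general approximation and not a value taken from this article. An exact weight would require the actual amino acid sequence, so small differences from the reported values are expected.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average mass per residue


def estimate_mass_kda(length_aa: int, avg_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from its length in amino acids."""
    return length_aa * avg_da / 1000.0


# (label, length in amino acids, molecular weight in kDa as reported in the text)
REPORTED = [
    ("388 aa protein", 388, 43.0),
    ("FAM199X-X1", 345, 38.61),
    ("FAM199X-X2", 205, 22.8),
]

for label, length_aa, reported_kda in REPORTED:
    estimate = estimate_mass_kda(length_aa)
    print(f"{label:15s} estimated {estimate:5.2f} kDa, reported {reported_kda:5.2f} kDa")
```

The length-based estimates fall within about 1 kDa of the reported values, which suggests the quoted weights are consistent with the stated lengths.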

Post-translational modifications

FAM199X has several regions of interest, including a disordered region, a NET domain, sites of N-linked glycosylation, N-myristoylation, and C-mannosylation, and two proven phosphorylation sites. NET stands for N-terminal extra-terminal domain. This domain is thought to be related to bromodomain proteins and is used for protein binding. N-linked glycosylation is the attachment of an oligosaccharide, generally to membrane-associated or secreted proteins, which supports the possibility that FAM199X is secreted. N-myristoylation is the attachment of a fatty acid to the protein, which could provide a site for association with the plasma membrane. C-mannosylation has many roles, including intercellular transport and structural stability.

Tertiary structure

The tertiary structure of FAM199X shows a globular region with alpha helices and beta strands within the first 300 amino acids. The last 88 amino acids form a large arm containing a possible protein-binding domain, which is formed by an alpha helix within the most conserved region of the FAM199X protein.

Protein Interaction

FAM199X associates with three proteins of note: WDR5, P, and M. Proteins P and M are viral proteins, while WDR5 is a high-scoring interactor associated with the kidney and brain. Proteins P and M are related to influenza and SARS-CoV-2. There is also evidence that FAM199X associates with Nipah virus.

Variants

No variants were found to be pathogenic. The majority of variants were uncategorized and occurred at very low frequency.

Clinical Significance

It has been suggested that FAM199X could be involved in various diseases and viral infections, including influenza, SARS-CoV-2, Nipah virus, and cancer. It was speculated that FAM199X plays a role in Pelizaeus–Merzbacher disease, but no supporting results have been found. It has also been suggested that FAM199X could be secreted both via the conventional secretory pathway associated with the endocrine system and via unconventional secretion.