FAM200B


FAM200B is a protein that in humans is encoded by the FAM200B gene. The gene encodes a 657-amino-acid protein. FAM200B is a large intracellular protein with no well-defined functional domains. No experimentally determined structures are available, although predicted tertiary structures exist. FAM200B is expressed ubiquitously at moderate levels, with relatively higher expression in the brain and thymus. Although its function remains unknown, its predicted nuclear localization and expression pattern suggest that FAM200B may be involved in regulatory processes such as gene expression or protein-protein interactions.

Gene

FAM200B, also known as C4orf54, is a protein-coding gene located on chromosome 4p15.32. The gene spans 4,287 nucleotides and contains two exons. It has multiple transcript variants that encode two protein isoforms.

Transcripts

The canonical FAM200B transcript is NM_001145191.2, which spans approximately 4.3 kb and consists of two exons, with the second exon comprising nearly the entire coding sequence. Multiple transcript variants of FAM200B encode two protein isoforms (see Table 1).
Table 1: Transcript and protein isoforms of the human FAM200B gene.
Transcript | Length (nt) | Protein | Length (aa) | Isoform
NM_001145191.2 | 4,287 | NP_001138663.1 | 657 | MANE Select
XM_017008048.2 | 4,397 | XP_016863537.1 | 657 | X1
XM_024453999.2 | 3,822 | XP_024309767.1 | 657 | X1
XM_024454000.2 | 3,818 | XP_024309768.1 | 657 | X1
XM_024454001.2 | 3,942 | XP_024309769.1 | 657 | X1
XM_024454003.2 | 3,938 | XP_024309771.1 | 657 | X1
XM_024454005.2 | 3,803 | XP_024309773.1 | 657 | X1
XM_024454006.2 | 3,799 | XP_024309774.1 | 657 | X1
XM_024454008.2 | 3,749 | XP_024309776.1 | 480 | X2
XM_024454009.2 | 3,869 | XP_024309777.1 | 480 | X2
XM_024454010.2 | 3,979 | XP_024309778.1 | 480 | X2
XM_024454011.2 | 3,730 | XP_024309779.1 | 480 | X2
XM_047450103.1 | 4,464 | XP_047306059.1 | 657 | X1
XM_047450104.1 | 4,517 | XP_047306060.1 | 657 | X1
XM_047450106.1 | 4,342 | XP_047306062.1 | 657 | X1
XM_047450107.1 | 4,378 | XP_047306063.1 | 657 | X1
XM_047450108.1 | 3,889 | XP_047306064.1 | 657 | X1
XM_047450109.1 | 3,885 | XP_047306065.1 | 657 | X1
XM_047450110.1 | 4,480 | XP_047306066.1 | 657 | X1
XM_047450112.1 | 4,840 | XP_047306068.1 | 657 | X1
XM_047450113.1 | 3,816 | XP_047306069.1 | 480 | X2
XM_047450114.1 | 3,926 | XP_047306070.1 | 480 | X2
XM_047450115.1 | 3,859 | XP_047306071.1 | 480 | X2
XM_047450117.1 | 3,804 | XP_047306073.1 | 480 | X2
XM_054349762.1 | 4,559 | XP_054205737.1 | 657 | X1
XM_054349763.1 | 4,612 | XP_054205738.1 | 657 | X1
XM_054349764.1 | 4,394 | XP_054205739.1 | 657 | X1
XM_054349765.1 | 4,339 | XP_054205740.1 | 657 | X1
XM_054349766.1 | 4,375 | XP_054205741.1 | 657 | X1
XM_054349767.1 | 3,819 | XP_054205742.1 | 657 | X1
XM_054349768.1 | 3,815 | XP_054205743.1 | 657 | X1
XM_054349769.1 | 3,984 | XP_054205744.1 | 657 | X1
XM_054349770.1 | 3,980 | XP_054205745.1 | 657 | X1
XM_054349771.1 | 4,037 | XP_054205746.1 | 657 | X1
XM_054349772.1 | 4,033 | XP_054205747.1 | 657 | X1
XM_054349773.1 | 4,477 | XP_054205748.1 | 657 | X1
XM_054349774.1 | 4,837 | XP_054205749.1 | 657 | X1
XM_054349775.1 | 3,800 | XP_054205750.1 | 657 | X1
XM_054349776.1 | 3,796 | XP_054205751.1 | 657 | X1
XM_054349777.1 | 3,746 | XP_054205752.1 | 480 | X2
XM_054349778.1 | 3,911 | XP_054205753.1 | 480 | X2
XM_054349779.1 | 3,964 | XP_054205754.1 | 480 | X2
XM_054349780.1 | 4,074 | XP_054205755.1 | 480 | X2
XM_054349781.1 | 4,021 | XP_054205756.1 | 480 | X2
XM_054349782.1 | 3,856 | XP_054205757.1 | 480 | X2
XM_054349783.1 | 3,727 | XP_054205758.1 | 480 | X2
XM_054349784.1 | 3,801 | XP_054205759.1 | 480 | X2

Protein

Human FAM200B encodes two protein isoforms: a longer 657-amino-acid isoform and a shorter 480-amino-acid isoform (see Table 1). The canonical transcript is NM_001145191.2, while the remaining variants are predicted models. The predicted molecular weight is ~76.0 kDa and the approximate isoelectric point (pI) is 8.33. The amino acid composition is enriched for Leu, Ser, Lys, and Glu. FAM200B lacks low-complexity regions, long tandem repeats, and significant charge clusters, and its charged residues are distributed evenly throughout the sequence. Only short, localized periodic motifs were detected, consistent with a soluble intracellular protein lacking large repetitive domains. Although no experimentally validated domains have been defined, homology analyses across vertebrate orthologs identified conserved C2H2- and BED-type zinc finger motifs in FAM200B. Secondary structure analysis predicts a mixture of α-helices and β-strands concentrated in a conserved central region of the protein. The predicted tertiary structure suggests that FAM200B adopts a mostly globular fold with a well-structured core and more flexible N- and C-terminal regions.
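Values such as molecular weight and amino acid composition follow directly from the primary sequence. As a minimal sketch (not the tool used for the figures reported here), the Python below sums average residue masses, adds one water for the free termini, and tallies residue frequencies. The function names and the short peptide are hypothetical placeholders, not the FAM200B sequence.

```python
from collections import Counter

# Average residue masses in daltons (free amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free N and C termini


def molecular_weight(sequence: str) -> float:
    """Approximate average mass in daltons of an unmodified polypeptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER_MASS


def composition(sequence: str) -> dict:
    """Fraction of each residue in the sequence, most common first."""
    counts = Counter(sequence.upper())
    total = len(sequence)
    return {aa: n / total for aa, n in counts.most_common()}


# Hypothetical toy peptide for illustration only (not FAM200B).
toy = "MLSKELLSKE"
print(f"{molecular_weight(toy) / 1000:.3f} kDa")
print(composition(toy))
```

As a sanity check on scale, the common rule of thumb of roughly 110 Da per residue gives about 72 kDa for a 657-residue chain, the same order as the ~76.0 kDa predicted for FAM200B.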

Gene level regulation

FAM200B is ubiquitously expressed at moderate levels across human tissues, with relatively higher expression in brain and thymus. Promoter analysis identified ETC and ETV5::FOXJ1 motifs as high-scoring transcription factor binding sites. Both transcription factors are known to function in neural development and brain-related regulatory pathways, making them biologically plausible candidates given the higher expression of FAM200B in brain tissue.

Protein level regulation

FAM200B is predicted to be a soluble nuclear protein, with no signal peptide or transmembrane domains and no evidence of secretion or membrane insertion. Post-translational modification predictions indicate multiple serine, threonine, and tyrosine phosphorylation sites and multiple SUMOylation sites, while no evidence was found for lipid anchor attachment, relevant glycosylation, or N-terminal acetylation.
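SUMOylation predictors commonly start from the consensus motif ψ-K-x-[D/E], where ψ is a large hydrophobic residue and K is the modified lysine. The sketch below illustrates that kind of scan in simplified form. It is not the predictor used for the results above, and the hydrophobic set `[VILMF]`, the function name, and the toy sequence are assumptions made for illustration.

```python
import re

# Simplified consensus: hydrophobic residue, lysine, any residue, acidic residue.
SUMO_CONSENSUS = re.compile(r"[VILMF]K.[DE]")


def find_sumo_motifs(sequence: str) -> list:
    """Return (1-based lysine position, motif) for each consensus match."""
    hits = []
    for match in SUMO_CONSENSUS.finditer(sequence.upper()):
        # match.start() is the 0-based index of the hydrophobic residue,
        # so the lysine sits at 1-based position start + 2.
        hits.append((match.start() + 2, match.group(0)))
    return hits


# Hypothetical toy sequence (not FAM200B).
print(find_sumo_motifs("MAVKTEGSLKQDPP"))
```

A motif match only flags a candidate lysine. Dedicated predictors weigh additional sequence context, so a scan like this tends to over-report sites.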

Homology

FAM200B is a vertebrate-specific gene with a conserved paralog, FAM200A, indicating a stable gene family structure across evolution. The two human paralogs, FAM200B and FAM200A, share 79.79% sequence identity and both contain the conserved Domain of Unknown Function 4371 (DUF4371), supporting a common evolutionary origin and functional similarity. Comparative genome analysis shows that FAM200B orthologs are present throughout vertebrates, including mammals, birds, reptiles, amphibians, and bony fishes, with no clear homologs detected in invertebrate lineages, suggesting emergence during early vertebrate evolution.
The earliest identifiable FAM200B orthologs occur in Actinopterygii, indicating that the gene originated before the divergence of bony fish and tetrapods approximately 420-450 million years ago. Across vertebrates, the number of family members has remained stable at two paralogs, although distant orthologs show moderate differences in transcript length, exon composition, and alternative splicing patterns. Despite this divergence, the overall sequence and the conserved DUF4371 core are maintained.
Table 2: 20 orthologs of the FAM200B protein in organisms including mammals, birds, reptiles, amphibians, and bony fish.
# | Clade | Genus, Species | Common Name | Taxonomic Group | Divergence Date (MYA) | Accession Number | Query Cover (%) | Sequence Length (aa) | Sequence Identity (%) | Sequence Similarity (%)
 | Mammalia | Homo sapiens | Human | Primates | 0 | NP_001138663.1 | 100 | 657 | 100 | 100
1 | Mammalia | Pan troglodytes | Chimpanzee | Apes | 6.4 | XP_001139775.1 | 100 | 573 | 99 | 99
2 | Mammalia | Papio anubis | Olive baboon | Primates | 28.8 | XP_017814067.1 | 100 | 657 | 97 | 98
3 | Mammalia | Canis lupus familiaris | Dog | Carnivora | 94 | XP_038335570.0 | 88 | 813 | 91 | 95
4 | Mammalia | Monodelphis domestica | Gray short-tailed opossum | Marsupials | 160 | XP_056673701.1 | 75 | 748 | 34 | 54
5 | Reptilia | Natator depressus | Flatback sea turtle | Testudines | 319 | XP_074809886.1 | 91 | 655 | 34 | 56
8 | Reptilia | Chelonia mydas | Green sea turtle | Testudines | 319 | XP_043379535.1 | 98 | 624 | 34 | 57
6 | Aves | Oxyura jamaicensis | Ruddy duck | Aves | 319 | XP_035169477.1 | 87 | 564 | 27 | 47
7 | Aves | Caloenas nicobarica | Nicobar pigeon | Aves | 319 | XP_065484009.1 | 89 | 604 | 25 | 44
9 | Amphibia | Pleurodeles waltl | Iberian ribbed newt | Urodela | 352 | XP_069075336.1 | 92 | 617 | 38 | 59
10 | Amphibia | Dendrobates tinctorius | Poison dart frog | Anura | 352 | XP_073431629.1 | 89 | 625 | 38 | 59
11 | Amphibia | Ascaphus truei | Tailed frog | Anura | 352 | XP_075472991.1 | 100 | 638 | 37 | 57
12 | Amphibia | Rhinatrema bivittatum | | Gymnophiona | 352 | XP_029452623.1 | 92 | 598 | 35 | 56
13 | Osteichthyes | Trichomycterus rosablanca | Cave catfish | Siluriformes | 426 | XP_062844886.1 | 91 | 614 | 43 | 64
14 | Osteichthyes | Astyanax mexicanus | Mexican tetra | Characiformes | 429 | XP_049334409.1 | 91 | 598 | 42 | 64
15 | Osteichthyes | Anoplopoma fimbria | Sablefish | Scorpaeniformes | 429 | XP_054473507.1 | 85 | 552 | 42 | 64
16 | Osteichthyes | Eleginops maclovinus | Patagonian blennie | Perciformes | 429 | XP_063763934.1 | 99 | 692 | 38 | 59
17 | Osteichthyes | Centroberyx gerrardi | Bright redfish | Beryciformes | 429 | XP_071783535.1 | 91 | 544 | 38 | 58
18 | Osteichthyes | Carassius auratus | Goldfish | Cypriniformes | 429 | XP_026126532.1 | 95 | 633 | 37 | 58
19 | Osteichthyes | Triplophysa rosa | | Cypriniformes | 429 | XP_057204124.1 | 94 | 633 | 39 | 59
20 | Osteichthyes | Megalobrama amblycephala | Wuchang bream | Cypriniformes | 429 | XP_048064547.1 | 85 | 551 | 42 | 63
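The identity percentages in Table 2, like the paralog comparison with FAM200A, come from pairwise alignments. As a rough sketch of how such a value can be defined (the exact denominator varies between tools, and this is not the program used to build the table), the Python below counts identical residues over the columns of two pre-aligned, gapped sequences. The function name and the example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns that hold identical residues.

    Both inputs must already be aligned: equal length, '-' for gaps.
    Columns where both sequences have a gap are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment (not real FAM200B or FAM200A data).
print(percent_identity("MKT-LLSE", "MRTALL-E"))
```

Here gapped columns count against identity. Some tools instead divide by the length of the shorter sequence or by ungapped columns only, which is one reason identity figures from different programs are not directly comparable.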

Function

FAM200B encodes a conserved intracellular protein of unknown function. Sequence and structural analyses indicate that FAM200B lacks catalytic motifs, signal peptides, and transmembrane domains, which suggests that it does not function as an enzyme, secreted factor, or membrane protein. Its predicted nuclear localization, the presence of regulatory post-translational modification sites, and a limited number of zinc finger-like motifs support a role in regulatory processes, possibly involving protein-protein interactions.

Interacting proteins

Interaction analysis identified few biologically plausible binding partners for FAM200B, most notably ANKRD45 and C1orf198. ANKRD45 contains ankyrin repeat domains that mediate protein-protein interactions, supporting a role for FAM200B within regulatory complexes, while C1orf198 is an uncharacterized protein associated with nuclear and regulatory proteins, suggesting that the two may belong to the same protein complex. Other predicted partners lack compatible localization or functional context. Together, these predictions suggest that FAM200B acts as a nuclear regulatory protein that functions through protein-protein interactions within a protein complex.

Clinical significance

FAM200B has no established association with human disease, and no pathogenic variants have been reported. However, its expression under specific cellular stressors suggests that FAM200B may function as a modifier gene influencing the strength, timing, or cellular context of disease-related pathways.