FAM200B
FAM200B is a protein that in humans is encoded by the FAM200B gene. The gene encodes a 657-amino-acid protein. FAM200B is a large intracellular protein with no well-defined functional domains. No experimentally determined structure is available, although predicted tertiary structures exist. FAM200B is expressed moderately and ubiquitously across tissues, with relatively higher expression in the brain and thymus. Although its function remains unknown, its predicted nuclear localization and expression pattern suggest that FAM200B may be involved in regulatory processes such as gene expression or protein-protein interactions.
Gene
FAM200B, also known as C4orf54, is a protein-coding gene located on chromosome 4p15.32. The gene spans 4,287 nucleotides and contains two exons. It has multiple transcript variants that encode two protein isoforms.
Transcripts
The canonical FAM200B transcript is NM_001145191.2, which is approximately 4.3 kb long and consists of two exons, with the second exon comprising nearly the entire coding sequence. The remaining transcript variants, together with the canonical transcript, encode two protein isoforms (see Table 1).
Table 1: Transcript and protein isoforms of the human FAM200B gene.
| Transcript accession | Transcript length (nt) | Protein accession | Protein length (aa) | Isoform |
| NM_001145191.2 | 4,287 | NP_001138663.1 | 657 | MANE Select |
| XM_017008048.2 | 4,397 | XP_016863537.1 | 657 | X1 |
| XM_024453999.2 | 3,822 | XP_024309767.1 | 657 | X1 |
| XM_024454000.2 | 3,818 | XP_024309768.1 | 657 | X1 |
| XM_024454001.2 | 3,942 | XP_024309769.1 | 657 | X1 |
| XM_024454003.2 | 3,938 | XP_024309771.1 | 657 | X1 |
| XM_024454005.2 | 3,803 | XP_024309773.1 | 657 | X1 |
| XM_024454006.2 | 3,799 | XP_024309774.1 | 657 | X1 |
| XM_024454008.2 | 3,749 | XP_024309776.1 | 480 | X2 |
| XM_024454009.2 | 3,869 | XP_024309777.1 | 480 | X2 |
| XM_024454010.2 | 3,979 | XP_024309778.1 | 480 | X2 |
| XM_024454011.2 | 3,730 | XP_024309779.1 | 480 | X2 |
| XM_047450103.1 | 4,464 | XP_047306059.1 | 657 | X1 |
| XM_047450104.1 | 4,517 | XP_047306060.1 | 657 | X1 |
| XM_047450106.1 | 4,342 | XP_047306062.1 | 657 | X1 |
| XM_047450107.1 | 4,378 | XP_047306063.1 | 657 | X1 |
| XM_047450108.1 | 3,889 | XP_047306064.1 | 657 | X1 |
| XM_047450109.1 | 3,885 | XP_047306065.1 | 657 | X1 |
| XM_047450110.1 | 4,480 | XP_047306066.1 | 657 | X1 |
| XM_047450112.1 | 4,840 | XP_047306068.1 | 657 | X1 |
| XM_047450113.1 | 3,816 | XP_047306069.1 | 480 | X2 |
| XM_047450114.1 | 3,926 | XP_047306070.1 | 480 | X2 |
| XM_047450115.1 | 3,859 | XP_047306071.1 | 480 | X2 |
| XM_047450117.1 | 3,804 | XP_047306073.1 | 480 | X2 |
| XM_054349762.1 | 4,559 | XP_054205737.1 | 657 | X1 |
| XM_054349763.1 | 4,612 | XP_054205738.1 | 657 | X1 |
| XM_054349764.1 | 4,394 | XP_054205739.1 | 657 | X1 |
| XM_054349765.1 | 4,339 | XP_054205740.1 | 657 | X1 |
| XM_054349766.1 | 4,375 | XP_054205741.1 | 657 | X1 |
| XM_054349767.1 | 3,819 | XP_054205742.1 | 657 | X1 |
| XM_054349768.1 | 3,815 | XP_054205743.1 | 657 | X1 |
| XM_054349769.1 | 3,984 | XP_054205744.1 | 657 | X1 |
| XM_054349770.1 | 3,980 | XP_054205745.1 | 657 | X1 |
| XM_054349771.1 | 4,037 | XP_054205746.1 | 657 | X1 |
| XM_054349772.1 | 4,033 | XP_054205747.1 | 657 | X1 |
| XM_054349773.1 | 4,477 | XP_054205748.1 | 657 | X1 |
| XM_054349774.1 | 4,837 | XP_054205749.1 | 657 | X1 |
| XM_054349775.1 | 3,800 | XP_054205750.1 | 657 | X1 |
| XM_054349776.1 | 3,796 | XP_054205751.1 | 657 | X1 |
| XM_054349777.1 | 3,746 | XP_054205752.1 | 480 | X2 |
| XM_054349778.1 | 3,911 | XP_054205753.1 | 480 | X2 |
| XM_054349779.1 | 3,964 | XP_054205754.1 | 480 | X2 |
| XM_054349780.1 | 4,074 | XP_054205755.1 | 480 | X2 |
| XM_054349781.1 | 4,021 | XP_054205756.1 | 480 | X2 |
| XM_054349782.1 | 3,856 | XP_054205757.1 | 480 | X2 |
| XM_054349783.1 | 3,727 | XP_054205758.1 | 480 | X2 |
| XM_054349784.1 | 3,801 | XP_054205759.1 | 480 | X2 |
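The rows of Table 1 can be tallied programmatically to see how many transcripts map to each protein length. The following is a minimal Python sketch that works on a three-row excerpt of the table; the names TABLE_EXCERPT, parse_rows, and count_by_protein_length are illustrative and not part of any database tool.

```python
from collections import Counter

# Three-row excerpt of Table 1 (illustrative); the full table can be
# pasted in the same pipe-delimited form.
TABLE_EXCERPT = """\
| Transcript | Length | Protein | Length | Isoform |
| NM_001145191.2 | 4,287 | NP_001138663.1 | 657 | MANE Select |
| XM_017008048.2 | 4,397 | XP_016863537.1 | 657 | X1 |
| XM_024454008.2 | 3,749 | XP_024309776.1 | 480 | X2 |
"""


def parse_rows(text):
    """Parse pipe-delimited rows into dictionaries, skipping the header."""
    rows = []
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 5:
            continue  # not a five-column data row
        transcript, tx_len, protein, aa_len, isoform = cells
        tx_len = tx_len.replace(",", "")  # drop thousands separators
        if not (tx_len.isdigit() and aa_len.isdigit()):
            continue  # header row or malformed numbers
        rows.append({
            "transcript": transcript,
            "transcript_length_nt": int(tx_len),
            "protein": protein,
            "protein_length_aa": int(aa_len),
            "isoform": isoform,
        })
    return rows


def count_by_protein_length(rows):
    """Count how many transcripts encode each protein length."""
    return Counter(row["protein_length_aa"] for row in rows)


if __name__ == "__main__":
    parsed = parse_rows(TABLE_EXCERPT)
    for length, n in sorted(count_by_protein_length(parsed).items()):
        print(f"{length} aa: {n} transcript(s)")
```

Applied to all 47 rows of Table 1, the same tally gives 31 transcripts encoding the 657-amino-acid protein and 16 encoding the 480-amino-acid protein.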
Protein
Human FAM200B encodes two protein isoforms: a longer 657-amino-acid isoform and a shorter 480-amino-acid isoform (see Table 1). The canonical transcript is NM_001145191.2, while the remaining variants are predicted models. The predicted molecular weight is approximately 76.0 kDa and the predicted isoelectric point (pI) is approximately 8.33. The amino acid composition is enriched for Leu, Ser, Lys, and Glu. FAM200B lacks low-complexity regions, long tandem repeats, and significant charge clusters; charged residues are distributed evenly throughout the sequence. Only short, localized periodic motifs were detected, consistent with a soluble intracellular protein lacking large repetitive domains. Although no experimentally validated domains have been defined, homology analyses across vertebrate orthologs identified conserved C2H2- and BED-type zinc finger motifs in FAM200B. Secondary structure analysis predicts a mixture of α-helices and β-strands, concentrated in a conserved central region of the protein. The predicted tertiary structure suggests that FAM200B adopts a mostly globular fold with a well-structured core and more flexible N- and C-terminal regions.
Gene level regulation
FAM200B is expressed ubiquitously at moderate levels across human tissues, with relatively higher expression in the brain and thymus. Promoter analysis identified ETC and ETV5::FOXJ1 motifs as high-scoring transcription factor binding sites. Both transcription factors are known to function in neural development and brain-related regulatory pathways, which makes them biologically plausible regulators given the higher expression of FAM200B in brain tissue.
Protein level regulation
FAM200B is predicted to be a soluble nuclear protein, with no signal peptide or transmembrane domains and no evidence of secretion or membrane insertion. Post-translational modification predictions indicate multiple serine, threonine, and tyrosine phosphorylation sites and multiple SUMOylation sites; no evidence was found for lipid anchor attachment, relevant glycosylation, or N-terminal acetylation.
Homology
FAM200B is a vertebrate-specific gene with a conserved paralog, FAM200A, indicating a stable gene family structure across evolution. The two human paralogs, FAM200B and FAM200A, share 79.79% sequence identity and both contain the conserved Domain of Unknown Function 4371 (DUF4371), supporting a common evolutionary origin and functional similarity. Comparative genome analysis shows that FAM200B orthologs are present throughout vertebrates, including mammals, birds, amphibians, and bony fishes, with no clear homologs detected in invertebrate lineages, suggesting emergence during early vertebrate evolution.
The earliest identifiable FAM200B orthologs occur in Actinopterygii, indicating that the gene originated before the divergence of bony fish and tetrapods approximately 420-450 million years ago. Across vertebrates, the number of family members has remained stable at two paralogs, although moderate differences in transcript length, exon composition, and alternative splicing patterns are observed in distant orthologs. Despite this divergence, the overall sequence and the conserved DUF4371 core are maintained.
Table 2: Twenty orthologs of the human FAM200B protein in mammals, reptiles, birds, amphibians, and bony fish.
| # | Clade | Genus, species | Common name | Taxonomic group | Divergence date (MYA) | Accession number | Query cover (%) | Sequence length (aa) | Sequence identity (%) | Sequence similarity (%) |
| | Mammalia | Homo sapiens | Human | Primates | 0 | NP_001138663.1 | 100 | 657 | 100 | 100 |
| 1 | Mammalia | Pan troglodytes | Chimpanzee | Apes | 6.4 | XP_001139775.1 | 100 | 573 | 99 | 99 |
| 2 | Mammalia | Papio anubis | Olive baboon | Primates | 28.8 | XP_017814067.1 | 100 | 657 | 97 | 98 |
| 3 | Mammalia | Canis lupus familiaris | Dog | Carnivora | 94 | XP_038335570.0 | 88 | 813 | 91 | 95 |
| 4 | Mammalia | Monodelphis domestica | Gray short-tailed opossum | Marsupials | 160 | XP_056673701.1 | 75 | 748 | 34 | 54 |
| 5 | Reptilia | Natator depressus | Flatback sea turtle | Testudines | 319 | XP_074809886.1 | 91 | 655 | 34 | 56 |
| 8 | Reptilia | Chelonia mydas | Green sea turtle | Testudines | 319 | XP_043379535.1 | 98 | 624 | 34 | 57 |
| 6 | Aves | Oxyura jamaicensis | Ruddy duck | Aves | 319 | XP_035169477.1 | 87 | 564 | 27 | 47 |
| 7 | Aves | Caloenas nicobarica | Nicobar pigeon | Aves | 319 | XP_065484009.1 | 89 | 604 | 25 | 44 |
| 9 | Amphibia | Pleurodeles waltl | Iberian ribbed newt | Urodela | 352 | XP_069075336.1 | 92 | 617 | 38 | 59 |
| 10 | Amphibia | Dendrobates tinctorius | Poison dart frog | Anura | 352 | XP_073431629.1 | 89 | 625 | 38 | 59 |
| 11 | Amphibia | Ascaphus truei | Tailed frog | Anura | 352 | XP_075472991.1 | 100 | 638 | 37 | 57 |
| 12 | Amphibia | Rhinatrema bivittatum | | Gymnophiona | 352 | XP_029452623.1 | 92 | 598 | 35 | 56 |
| 13 | Osteichthyes | Trichomycterus rosablanca | Cave catfish | Siluriformes | 426 | XP_062844886.1 | 91 | 614 | 43 | 64 |
| 14 | Osteichthyes | Astyanax mexicanus | Mexican tetra | Characiformes | 429 | XP_049334409.1 | 91 | 598 | 42 | 64 |
| 15 | Osteichthyes | Anoplopoma fimbria | Sablefish | Scorpaeniformes | 429 | XP_054473507.1 | 85 | 552 | 42 | 64 |
| 16 | Osteichthyes | Eleginops maclovinus | Patagonian blennie | Perciformes | 429 | XP_063763934.1 | 99 | 692 | 38 | 59 |
| 17 | Osteichthyes | Centroberyx gerrardi | Bright redfish | Beryciformes | 429 | XP_071783535.1 | 91 | 544 | 38 | 58 |
| 18 | Osteichthyes | Carassius auratus | Goldfish | Cypriniformes | 429 | XP_026126532.1 | 95 | 633 | 37 | 58 |
| 19 | Osteichthyes | Triplophysa rosa | | Cypriniformes | 429 | XP_057204124.1 | 94 | 633 | 39 | 59 |
| 20 | Osteichthyes | Megalobrama amblycephala | Wuchang bream | Cypriniformes | 429 | XP_048064547.1 | 85 | 551 | 42 | 63 |
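The identity values in Table 2 can be averaged per clade to make the overall pattern easier to read. The following is a minimal Python sketch under stated assumptions: the values are copied from Table 2, the human query row is excluded, and the names ORTHOLOGS and mean_identity_by_clade are illustrative.

```python
from collections import defaultdict
from statistics import mean

# (clade, species, divergence date in MYA, percent identity to human FAM200B),
# copied from Table 2. The human query row is left out on purpose.
ORTHOLOGS = [
    ("Mammalia", "Pan troglodytes", 6.4, 99),
    ("Mammalia", "Papio anubis", 28.8, 97),
    ("Mammalia", "Canis lupus familiaris", 94, 91),
    ("Mammalia", "Monodelphis domestica", 160, 34),
    ("Reptilia", "Natator depressus", 319, 34),
    ("Reptilia", "Chelonia mydas", 319, 34),
    ("Aves", "Oxyura jamaicensis", 319, 27),
    ("Aves", "Caloenas nicobarica", 319, 25),
    ("Amphibia", "Pleurodeles waltl", 352, 38),
    ("Amphibia", "Dendrobates tinctorius", 352, 38),
    ("Amphibia", "Ascaphus truei", 352, 37),
    ("Amphibia", "Rhinatrema bivittatum", 352, 35),
    ("Osteichthyes", "Trichomycterus rosablanca", 426, 43),
    ("Osteichthyes", "Astyanax mexicanus", 429, 42),
    ("Osteichthyes", "Anoplopoma fimbria", 429, 42),
    ("Osteichthyes", "Eleginops maclovinus", 429, 38),
    ("Osteichthyes", "Centroberyx gerrardi", 429, 38),
    ("Osteichthyes", "Carassius auratus", 429, 37),
    ("Osteichthyes", "Triplophysa rosa", 429, 39),
    ("Osteichthyes", "Megalobrama amblycephala", 429, 42),
]


def mean_identity_by_clade(rows):
    """Return the mean percent identity for each clade."""
    grouped = defaultdict(list)
    for clade, _species, _divergence_mya, identity in rows:
        grouped[clade].append(identity)
    return {clade: mean(values) for clade, values in grouped.items()}


if __name__ == "__main__":
    for clade, value in mean_identity_by_clade(ORTHOLOGS).items():
        print(f"{clade}: {value:.1f}% mean identity")
```

With these values, the two bird orthologs average 26% identity while the eight bony fish orthologs average about 40%, so identity to the human protein does not fall steadily with divergence date in this sample.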