TMEM202


Transmembrane protein 202 is encoded by the gene TMEM202 and is a member of the Claudin2 superfamily. Human paralogs include LIMP2, GSG1, CLDND2, and NKG7. The specific function of TMEM202 has largely yet to be elucidated, but other Claudin2 superfamily proteins play important roles in paracellular transport by contributing to the structure of tight junctions. In S. scrofa, TMEM202 has been found to aid in sperm motility, fertilization, and spermatogenesis.

Gene

Location

The TMEM202 gene spans 10,063 nucleotides on the forward strand of chromosome 15, at 15q23-q24.1. The most complete isoform of TMEM202 contains five exons. The genetic neighborhood is sparse: HEXA is the gene nearest to TMEM202 and lies ~23,000 nucleotides upstream.

Expression

In H. sapiens, TMEM202 shows low expression that is restricted to the testis.

RNA and Transcriptional variants

There exist four transcriptional variants of TMEM202 mRNA.
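TMEM202 Transcript Variant | Accession Number | mRNA length (nt) | 5'UTR length (nt) | Exon 1 | Exon 2 | Exon 3 | Exon 4 | Exon 5
Transcript variant 1 | NM_001080462 | 1318 | 22 | Y | Y | Y | Y | Y
Isoform X1 | XM_011521497 | 1326 | 147 | X | * | Y | Y | Y
Isoform X2 | XM_024449910 | 1186 | 22 | Y | Y | Y | X | Y
Isoform X3 | XM_011521499 | 1132 | 169 | X | X | Y | Y | Y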

'Y' indicates the presence of the exon in the specific transcript variant, whereas 'X' denotes absence of the exon. '*' represents partial loss of exon two in isoform X1.

TMEM202 protein

TMEM202 is classified as a multi-pass protein as it consists of four transmembrane domains with helical structure and one disordered region. TMEM202 protein is a member of the Claudin2 superfamily. Claudin proteins often form protein-protein interactions in epithelial and endothelial tissue, contributing to the scaffolding of the tight junction.
TMEM202 Protein Variant | Accession Number | Protein Length (aa) | Approx. Molecular Weight Prior to PTMs (kDa) | Exons present | Similarity (%) | Internal Composition
TMEM202 Variant 1 | NP_001073931.1 | 273 | ~31.2 | 1-5 | 100 | L: ; W:
TMEM202 Isoform X1 | XP_011519799.1 | 234 | ~26.8 | Part of 2; 3-5 | 85.7 | L: ; W:
TMEM202 Isoform X2 | XP_024305678.1 | 229 | ~26.1 | 1-3, 5 | 83.9 | W: ; A: ; D:
TMEM202 Isoform X3 | XP_011519801.1 | 162 | ~18.5 | 3-5 | 58.6 | N/A

Similarity was calculated using pairwise sequence alignments in which each isoform was aligned to variant 1. The approximate molecular weight was calculated prior to post-translational modifications, such as cleavage of the sequence or binding of additional molecules. Emboss SAPS was used to determine the relative composition of residues compared with its database of H. sapiens samples. One mark denotes one standard deviation from the average, whereas two marks indicate two standard deviations. Proteins so marked can be classified as rich or poor in that amino acid.

TMEM202 Secondary Structure

The figure TMEM202 Secondary Domains shows the predicted secondary structure of TMEM202, along with exon boundaries.

Gene Level Regulation

Expression pattern

An RNA-seq study of normal adult human tissue found that expression of TMEM202 is restricted to the testis. An additional RNA-seq study of human fetal tissue found that TMEM202 is present in intestinal, lung, and stomach tissues at week 10 of gestation and in adrenal tissue at week 20.

Promoter region and transcription factor binding

The figure Annotated Promoter Sequence of Human TMEM202 shows potential binding locations for the most compatible transcription factors. The promoter region was defined as the 500 nucleotides directly upstream of the transcription start site.
TMEM202 Transcription Factors Potentially Binding in Promoter Region.
Transcription Factor Name | Binding Site Location | Binding Score | Function of Transcription Factor | Reason for Identification
NFIC | 1 - Minus | 421 | Individually capable of activating transcription | Function; proximity to start codon; high conservation
ZSCAN4 | 49 - Minus; 134 - Plus | 461; 609 | Regulates embryonic stem cell pluripotency | Function relates to microarray studies; high binding score; appears twice in the promoter region
ZBTB24 | 75 - Minus | 554 | Involved in BMP2-induced transcription* | High binding score
TBP | 106 - Plus | 421 | Aids in the initiation of RNA polymerase II-dependent transcription | Function; high conservation
DUXA | 109 - Plus; 109 - Minus | 476; 476 | Acts as a repressor | Function; binding sites on both the plus and minus strand at the same location; overlapping binding sequence with Tfcp2l1; appears twice in the promoter region
Tfcp2l1 | 115 - Plus | 446 | Facilitates establishment and maintenance of pluripotency in embryonic stem cells | Function relates to microarray studies; overlapping binding sequence with DUXA, ELF3
ELF3 | 122 - Plus | 400 | Acts as an activator | Function; overlapping binding sequence with Tfcp2l1
TBX20 | 124 - Minus | 449 | Acts as a transcriptional activator and repressor | Function; overlapping binding sequence with EOMES
EOMES | 125 - Minus | 418 | Acts as a transcriptional activator playing a crucial role during development | Function; overlapping binding sequence with TBX20
Prdm5 | 148 - Minus | 427 | Acts as a transcriptional repressor | Function
MEIS2 | 178 - Plus | 405 | Transcriptional regulation - stabilization of the homeoprotein-DNA complex | Function; overlapping binding sequence with FOXH1; moderate conservation
FOXH1 | 182 - Minus | 481 | Acts as an activator | Function; overlapping binding sequence with MEIS2; moderate conservation
KLF9 | 205 - Plus | 403 | Selectively activates transcription when bound to GC box promoter elements | Function; overlapping binding sequence with NR1D1
NR1D1 | 205 - Minus | 405 | Acts as a transcriptional repressor | Function; overlapping binding sequence with KLF9
Sox11 | 221 - Plus | 421 | Acts as an activator | Function; high conservation
ONECUT3 | 245 - Minus | 508 | Acts as an activator | Function; high binding score; overlapping binding sequence with ONECUT1
ONECUT1 | 247 - Minus | 346 | Acts as an activator | Function; overlapping binding sequence with ONECUT3; moderate conservation
FOXD3 | 270 - Minus; 364 - Minus | 446; 467 | Acts as a transcriptional activator and repressor | Function; appears twice in the promoter region; overlapping binding sequence with HOXB13; high conservation
HOXB13 | 355 - Plus | 450 | Involved in the developmental regulatory system that provides cells with positional identities along the anterior-posterior axis | Function; overlapping binding sequence with FOXD3
Neurod2 | 395 - Minus | 403 | Involved in neuronal determination | Function

* Indicates hypothesized function. A higher binding score suggests a higher likelihood that the transcription factor will bind to the sequence when present; binding scores were calculated by the JASPAR database. This list is not exhaustive: it presents the transcription factors with the highest binding scores or potentially illuminating qualities within the 500-base promoter region of TMEM202.

Protein level regulation

Post-translational modifications

Predicted Post-Translational Modifications of TMEM202.
Type of PTM | Number of Sites | Location | Source | Notes
Generic phosphorylation | 29 | S13, S39, S254 | NetPhos | Locations denoted were three highest scored
Phosphorylation | 2 | Y25, S245 | PhosphoPlus |
GlcNAc O-glycosylation | 2 | T107, S245 | DictyOGlyc |
N-myristoylation | 2 | G61, G145 | MyHit |
Acetylation | 1 | K264 | PhosphoPlus |
N-linked glycosylation | 1 | N225 | NetNGlyc |
N-linked glycosylation | 1 | N225 | ELM |
Arginine and lysine propeptide cleavage sites | - | - | ProP |
C-mannosylation sites | - | - | NetCGlyc |
GPI Anchors | - | - | NetGPI | Likelihood not anchored by GPI - 0.993
O-GalNAc glycosylation sites | - | - | NetOGlyc |

Different types of PTMs were predicted using a variety of computational systems. Two locations, S245 and N225, were predicted by two unique systems.

TMEM202 homology and evolution

Paralogs of TMEM202

Four paralogs of TMEM202 (LIMP2, GSG1, CLDND2, and NKG7) have been identified and are outlined in the table below.
Paralogs of H. sapiens TMEM202.
TMEM202 Paralogs | Genus and Species | Name | Common Name | Accession Number | Sequence Length | Sequence Identity to Human Protein | Sequence Similarity to Human Protein | Sequence Divergence | Corrected Sequence Divergence
Protein of Interest | Homo sapiens | Transmembrane Protein 202 | TMEM202 | NP_001073931.1 | 273 | 100 | 100 | 0 | 0
Paralogs | Homo sapiens | Lens fiber membrane intrinsic protein isoform 2 | LIMP2 | NP_085915.2 | 215 | 14.5 | 28.6 | 85.5 | 193.1021537
Paralogs | Homo sapiens | Protein NKG7 isoform 1 | NKG7 | NP_005592.1 | 165 | 14.3 | 27.9 | 85.7 | 194.4910649
Paralogs | Homo sapiens | Claudin domain-containing protein 2 | CLDND2 | NP_689566.1 | 167 | 14.2 | 25.4 | 85.8 | 195.1928221
Paralogs | Homo sapiens | Germ cell-specific gene 1 protein isoform 1 | GSG1 | NP_112579.2 | 285 | 8.5 | 15.3 | 91.5 | 246.5104022

Sequences characterized as paralogs were aligned to TMEM202 via pairwise sequence alignment (PSA). The PSA gave percent identity and similarity values, from which the sequence divergence and corrected sequence divergence values were calculated. Estimated median date of divergence is based on predicted values found on TimeTree.
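The divergence columns can be reproduced from percent identity. The following is a minimal Python sketch, assuming that sequence divergence is 100 minus percent identity and that corrected sequence divergence is 100 × ln(100 / identity). Both formulas are inferred from the tabulated values rather than stated in the text, and the function names are hypothetical.

```python
import math


def sequence_divergence(identity_pct: float) -> float:
    """Uncorrected divergence: percent of aligned positions that differ (assumed formula)."""
    return 100.0 - identity_pct


def corrected_divergence(identity_pct: float) -> float:
    """Corrected divergence, assumed here to be 100 * ln(100 / identity)."""
    if not 0.0 < identity_pct <= 100.0:
        raise ValueError("identity must be in the interval (0, 100]")
    return 100.0 * math.log(100.0 / identity_pct)


# Percent identity to human TMEM202, taken from the paralog table.
paralog_identity = {"LIMP2": 14.5, "NKG7": 14.3, "CLDND2": 14.2, "GSG1": 8.5}

for name, identity in paralog_identity.items():
    print(
        f"{name}: divergence {sequence_divergence(identity):.1f}, "
        f"corrected {corrected_divergence(identity):.1f}"
    )
```

Under these assumptions the corrected value grows without bound as identity approaches zero, which is why the most distant sequences in the table have corrected divergence well above 100.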

Orthologs of TMEM202

Orthologs were found using NCBI BLAST by searching for proteins with sequences similar to the H. sapiens TMEM202 protein. All taxa with sequenced proteins, including bacteria, fungi, and plants, were searched; however, bony fish were the most distantly related group found to have orthologous proteins. Orthologs were also found in amphibians, birds, reptiles, and mammals. Orthologs are characterized in the table below:
Orthologs of H. sapiens TMEM202.
TMEM202 | Taxonomic Group | Genus and Species | Common Name | Accession Number | Median Date of Divergence (MYA) | Sequence Length | Sequence Identity to Human Protein | Sequence Similarity to Human Protein | Sequence Divergence | Corrected Sequence Divergence
Mammals | Primates | Homo sapiens | Human | NP_001073931.1 | 0 | 273 | 100.0 | 100.0 | 0.0 | 0.0
Mammals | Lagomorphs | Ochotona princeps | American pika | XP_004594772 | 87 | 263 | 58.3 | 71.4 | 41.7 | 54.0
Mammals | Perissodactyla | Diceros bicornis minor | South-central black rhinoceros | XP_058396452.1 | 94 | 271 | 76.6 | 82.4 | 23.4 | 26.7
Mammals | Artiodactyls | Sus scrofa | Wild boar | XP_005666231.1 | 94 | 271 | 68.9 | 79.1 | 31.1 | 37.3
Mammals | Chiroptera | Rousettus aegyptiacus | Egyptian fruit bat | XP_015982472.2 | 94 | 289 | 63.7 | 75.8 | 36.3 | 45.1
Mammals | Eulipotyphla | Condylura cristata | Star-nosed mole | XP_004687557.1 | 94 | 267 | 61.7 | 73.7 | 38.3 | 48.3
Mammals | Proboscideans | Elephas maximus indicus | Indian elephant | XP_049761646.1 | 99 | 266 | 73.6 | 82.1 | 26.4 | 30.7
Mammals | Diprotodontia | Vombatus ursinus | Common wombat | XP_027713789.1 | 160 | 288 | 42.6 | 55.1 | 57.4 | 85.3
Mammals | Carnivorous marsupials | Sarcophilus harrisii | Tasmanian devil | XP_031812217.1 | 160 | 324 | 38.4 | 50.8 | 61.6 | 95.7
Reptilia | Testudines | Chelydra serpentina | Common snapping turtle | KAG6921653.1 | 319 | 190 | 18.3 | 33.1 | 81.7 | 169.8
Reptilia | Squamata | Podarcis lilfordi | Lilford's wall lizard | CAI5781213.1 | 319 | 193 | 18.0 | 32.5 | 82.0 | 171.5
Reptilia | Squamata | Naja naja | Indian cobra | KAG8143384.1 | 319 | 150 | 17.0 | 27.8 | 83.0 | 177.2
Reptilia | Testudines | Chrysemys picta bellii | Painted turtle | XP_065430698.1 | 319 | 484 | 12.5 | 20.5 | 87.5 | 207.9
Aves | Psittaciformes | Strigops habroptila | Kākāpō | XP_030329880.1 | 319 | 320 | 14.2 | 20.3 | 85.8 | 195.2
Aves | Gruiformes | Grus japonensis | Red-crowned crane | GAB0205531.1 | 319 | 238 | 11.3 | 19.8 | 88.7 | 218.0
Aves | Passeriformes | Pseudopodoces humilis | Ground tit | XP_005533382 | 319 | 641 | 7.7 | 11.2 | 92.3 | 256.4
Amphibia | Anura | Eleutherodactylus coqui | Common coquí | XP_066461729.1 | 352 | 192 | 20.1 | 31.9 | 79.9 | 160.4
Amphibia | Anura | Ascaphus truei | Coastal tailed frog | MEE6481875.1 | 352 | 168 | 15.7 | 27.2 | 84.3 | 185.2
Amphibia | Urodela | Pleurodeles waltl | Iberian ribbed newt | KAJ1132505.1 | 352 | 179 | 14.9 | 28.4 | 85.1 | 190.4
Vertebrata | Cypriniformes | Cyprinus carpio | Common carp | KTG06534.1 | 429 | 224 | 15.8 | 30.6 | 84.2 | 184.5
Vertebrata | Notacanthiformes | Aldrovandia affinis | Gilbert's halosaurid fish | KAJ8399001.1 | 429 | 173 | 14.2 | 28.4 | 85.8 | 195.2

Sequences characterized as orthologs were aligned to TMEM202 via pairwise sequence alignment (PSA). The PSA gave percent identity and similarity values, from which the sequence divergence and corrected sequence divergence values were calculated. Estimated median date of divergence is based on predicted values found on TimeTree.

Evolution of TMEM202

The evolutionary history of TMEM202 appears to have begun approximately 429 million years ago, as orthologs of the protein were found in species of bony fish. Since then, TMEM202 has evolved in amphibians, reptiles, birds, and mammals. This evolution is quantified in the figure Evolutionary History of TMEM202 compared to fibrinogen alpha, cytochrome c, in which the corrected sequence divergence of TMEM202 orthologs was graphed over time in comparison to fibrinogen alpha and cytochrome c.

Evolutionary History of TMEM202 compared to fibrinogen a, cytochrome C.

Fibrinogen alpha is a fast-evolving protein, whereas cytochrome c is a slow-evolving protein. Across its evolution, TMEM202 has accumulated amino acid mutations at a rate similar to that of fibrinogen alpha. For this reason, TMEM202 can be considered a fast-evolving protein. The figure Unrooted Phylogenetic Tree of Homo sapiens TMEM202 and its Orthologous Species helps visualize the divergence of some orthologs from humans. Species were further classified into mammals, birds, amphibians, and reptiles.
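The rate comparison described above amounts to comparing slopes of corrected divergence against time. The following is a minimal Python sketch that estimates that slope for TMEM202 alone, using the ortholog table values and a least-squares line forced through the origin. The fitting method and the function name are assumptions made for illustration; the fibrinogen alpha and cytochrome c data are not given in the text, so they are not included.

```python
# (median date of divergence, corrected sequence divergence) pairs taken
# from the ortholog table; dates are assumed to be in millions of years.
ORTHOLOG_POINTS = [
    (87, 54.0), (94, 26.7), (94, 37.3), (94, 45.1), (94, 48.3),
    (99, 30.7), (160, 85.3), (160, 95.7),
    (319, 169.8), (319, 171.5), (319, 177.2), (319, 207.9),
    (319, 195.2), (319, 218.0), (319, 256.4),
    (352, 160.4), (352, 185.2), (352, 190.4),
    (429, 184.5), (429, 195.2),
]


def slope_through_origin(points):
    """Least-squares slope of y = k * x (no intercept term)."""
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    if sum_xx == 0:
        raise ValueError("at least one point must have a nonzero x value")
    return sum_xy / sum_xx


rate = slope_through_origin(ORTHOLOG_POINTS)
print(f"Corrected divergence per million years: {rate:.2f}")
```

Running the same function on the corresponding points for fibrinogen alpha and cytochrome c would give slopes to compare against; a TMEM202 slope close to that of fibrinogen alpha is what the text describes as fast-evolving.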

TMEM202 interacting proteins

A STRING Network search identified proteins predicted to interact with TMEM202; these are detailed in the table below, along with kinases predicted to phosphorylate TMEM202.

Proteins predicted to interact with TMEM202

Protein Abbreviation | Protein Name | Basis of Identification | Function | Notes
SPATA31D1 | Spermatogenesis-associated protein 31D1 | STRING Network | Might be involved in cell differentiation and spermatogenesis | Predicted to be a part of the cell membrane
CFAP45 | Cilia and flagella associated protein 45 | STRING Network | Enables AMP binding and is involved in establishing left/right asymmetry of the flagella |
SPACA1 | Sperm acrosome membrane-associated protein 1 | STRING Network | Aids in the acrosome expansion and establishment of normal sperm morphology during spermatogenesis | SPACA1 is recognized by anti-sperm antibodies in infertile males
PKC | Protein Kinase C | Predicted site of phosphorylation | Regulates numerous cellular responses including gene expression, protein secretion, cell proliferation, and the inflammatory response |
Cdk5 | Cyclin-dependent kinase 5 | Predicted site of phosphorylation | Plays a pivotal role in brain development and maintenance |
Cdc2 | Cell Division Cycle 2 | Predicted site of phosphorylation | Regulates cell cycle progression | Also known as CDK1, or Cyclin Dependent Kinase 1

Clinical significance and pathology

Mutations - Single Nucleotide Polymorphisms (SNPs)

Many SNPs have been identified in the TMEM202 gene. The table below lists six of the most common SNPs within the promoter region and coding sequence of TMEM202.

SNPs in Human TMEM202

SNP Name | SNP Location | Mutation | Amino Acid Change | Frequency in human population | Significance | Clinical Significance
rs74622826 | 164 bases upstream | G → A | Glycine → Glutamate | 0.25 | In promoter region | None reported in ClinVar
rs74343303 | 193 bases upstream | A → C/G | Leucine → Leucine | 0.01 | In promoter region | None reported in ClinVar
rs114543781 | 284 bases upstream | C → T | Proline → Leucine | 0.42 | In promoter region | None reported in ClinVar
rs112986603 | Base 117 | C → T | 32A → 32V | 0.09 | In coding sequence | None reported in ClinVar
rs16956904 | Base 631 | A → T | 204M → 204L | 3.71 | In coding sequence within transmembrane region 4 | None reported in ClinVar
rs35916586 | Base 679 | G → A/T/C | 219S → 219T / 219S / 219P | 0.01 | In coding sequence | None reported in ClinVar

A small subset of the SNPs may influence TMEM202 function based on where they occur. SNP rs16956904, at base 631, lies within the fourth transmembrane region. Although this is an area of importance, both the reference and variant alleles encode a hydrophobic amino acid (methionine and leucine, respectively), which could explain why no clinical or phenotypic differences have been observed. SNP rs114543781, 284 bases upstream, lies where the transcription factor HOXB13 is predicted to bind. This transcription factor is involved in the developmental regulatory system that provides cells with positional identities along the anterior-posterior axis. SNP rs74343303, 193 bases upstream, may likewise interfere with FOXD3 binding. FOXD3 can act as both a transcriptional activator and a repressor.

Conceptual translation of TMEM202

Key found on page 2.