Gene polymorphism
A gene is said to be polymorphic if more than one allele occupies that gene's locus within a population. In addition to having more than one allele at a specific locus, each allele must also occur in the population at a rate of at least 1% to generally be considered polymorphic.
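The 1% criterion can be illustrated with a short calculation. The following is a minimal sketch, not a standard implementation: the function names and the sample genotypes are hypothetical, and it assumes the criterion is read as "at least two alleles each reach the threshold frequency".

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Return {allele: frequency} for a list of genotypes (tuples of alleles)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def is_polymorphic(genotypes, threshold=0.01):
    """True if at least two alleles each occur at or above the threshold.

    Assumption: this reading of the 1% rule is one common interpretation.
    """
    freqs = allele_frequencies(genotypes)
    common = [allele for allele, f in freqs.items() if f >= threshold]
    return len(common) >= 2

# Hypothetical sample of 100 diploid individuals at one locus.
sample = [("A", "A")] * 97 + [("A", "a")] * 3
print(allele_frequencies(sample))
print(is_polymorphic(sample))
```

In this made-up sample the minor allele "a" accounts for 3 of 200 allele copies (1.5%), so the locus meets the threshold. With only one carrier among 200 individuals it would not.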
Gene polymorphisms can occur in any region of the genome. The majority of polymorphisms are silent, meaning they do not alter the function or expression of a gene. Some polymorphisms are visible. For example, in dogs the E locus can have any of five different alleles, known as E, Em, Eg, Eh, and e. Varying combinations of these alleles contribute to the pigmentation and patterns seen in dog coats.
A polymorphic variant of a gene can lead to abnormal expression of the gene or to the production of an abnormal form of the protein; this abnormality may cause or be associated with disease. For example, a polymorphic variant of the gene encoding the enzyme CYP4A11, in which cytosine replaces thymidine at the gene's nucleotide position 8590, encodes a CYP4A11 protein in which serine replaces phenylalanine at amino acid position 434. This variant protein has reduced enzyme activity in metabolizing arachidonic acid to the blood pressure-regulating eicosanoid, 20-hydroxyeicosatetraenoic acid. A study has shown that humans bearing this variant in one or both of their CYP4A11 genes have an increased incidence of hypertension, ischemic stroke, and coronary artery disease.
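How a single-base change can alter one amino acid can be sketched with the standard genetic code, in which TTT and TTC encode phenylalanine while TCT and TCC encode serine. The example below is illustrative only: the codon used is an assumption (the text does not state the actual codon at position 434 of CYP4A11), and the helper names are hypothetical.

```python
# Subset of the standard genetic code (DNA codons), enough for this illustration.
CODON_TABLE = {"TTT": "Phe", "TTC": "Phe", "TCT": "Ser", "TCC": "Ser"}

def substitute_base(codon, position, new_base):
    """Return the codon with the base at a 0-based position replaced."""
    return codon[:position] + new_base + codon[position + 1:]

def describe_change(codon, position, new_base):
    """Describe the amino acid change caused by a single-base substitution."""
    mutated = substitute_base(codon, position, new_base)
    return f"{CODON_TABLE[codon]} ({codon}) -> {CODON_TABLE[mutated]} ({mutated})"

# Assumed codon: a T-to-C change at the second position turns Phe into Ser.
print(describe_change("TTC", 1, "C"))  # Phe (TTC) -> Ser (TCC)

# A change at the third position of the same codon leaves the amino acid unchanged.
print(describe_change("TTC", 2, "T"))  # Phe (TTC) -> Phe (TTT)
```

The second call shows why many single-base polymorphisms do not change the encoded protein: several codons map to the same amino acid.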
The genes coding for the major histocompatibility complex (MHC) are the most polymorphic genes known. MHC molecules are involved in the immune system and interact with T cells. There are more than 32,000 different alleles of human MHC class I and II genes, and it has been estimated that there are 200 variants at the HLA-B and HLA-DRB1 loci alone.
Some polymorphism may be maintained by balancing selection.
Differences between gene polymorphism and mutation
A rule of thumb that is sometimes used is to classify genetic variants that occur below 1% allele frequency as mutations rather than polymorphisms. However, since polymorphisms may occur at low allele frequency, this is not a reliable way to tell new mutations from polymorphisms. A mutation is a change to an inherited genetic sequence. Whether it is passed on depends on the organism:
- In unicellular organisms, there is no distinction between mutations that are passed on to descendants and those that are not.
- In multicellular organisms that reproduce sexually, most mutations are not passed on to subsequent generations.
A mutation is not limited to a single-nucleotide change: it can also be an insertion or deletion of one or more nucleotides, or a change in the number of times a short or longer sequence is repeated. Polymorphisms that result in a change in fitness are the raw material for evolution by natural selection. All genetic polymorphisms start out as a mutation, but only mutations that are germline and not lethal can spread into a population. Polymorphisms are classified based on what happens at the level of the individual mutation in the DNA sequence, and on what effect the mutation has on the phenotype. Polymorphisms are also classified based on whether the change is in the sequence of the resulting protein or in the regulation of the expression of the gene. Regulatory changes typically, but not always, occur at sites upstream of and adjacent to the gene.
Identification
Types
A polymorphism can be any sequence difference. Examples include:
- Single nucleotide polymorphisms are single-nucleotide changes that occur at a particular location in the genome. The single nucleotide polymorphism is the most common form of genetic variation.
- Small-scale insertions/deletions consist of insertions or deletions of bases in DNA.
- Polymorphic repetitive elements. Active transposable elements can also cause polymorphism by inserting themselves in new locations. For example, repetitive elements of the Alu and LINE1 families cause polymorphisms in the human genome.
- Microsatellites are repeats of 1-6 base pairs of DNA sequence. Microsatellites are commonly used as molecular markers, especially for identifying the relationship between alleles.
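The categories above can be told apart, in a simplified way, by comparing a reference allele with an alternative allele. The sketch below is an illustration under stated assumptions, not an established tool: the function names, the ref/alt string representation, and the example sequences are all hypothetical, and real variant callers handle many more cases.

```python
import re

def classify_variant(ref, alt):
    """Roughly classify a variant from its reference and alternative alleles."""
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNP"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "other"  # identical alleles or same-length multi-base changes

def longest_repeat_run(sequence, unit):
    """Return the largest number of consecutive copies of unit in sequence."""
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)

print(classify_variant("A", "G"))    # SNP
print(classify_variant("A", "ATG"))  # insertion
print(classify_variant("ATG", "A"))  # deletion

# Two made-up microsatellite alleles that differ in the number of CA repeats.
allele_1 = "GG" + "CA" * 4 + "TT"
allele_2 = "GG" + "CA" * 6 + "TT"
print(longest_repeat_run(allele_1, "CA"), longest_repeat_run(allele_2, "CA"))  # 4 6
```

Differences in repeat count like the one between the two made-up alleles are what make microsatellites usable as markers.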
Clinical significance
Lung cancer
Polymorphisms have been discovered in multiple XPD exons. XPD refers to "xeroderma pigmentosum group D" and is involved in a DNA repair mechanism used during DNA replication. XPD works by cutting and removing segments of DNA that have been damaged by agents such as cigarette smoke and other inhaled environmental carcinogens. Asp312Asn and Lys751Gln are the two common polymorphisms of XPD that result in a change in a single amino acid. The Asn and Gln variant alleles have been associated with reduced DNA repair efficiency.
Several studies have been conducted to see if this diminished capacity to repair DNA is related to an increased risk of lung cancer. These studies examined the XPD gene in lung cancer patients of varying age, gender, race, and pack-years. The results were mixed: some studies concluded that individuals who are homozygous for the Asn allele or homozygous for the Gln allele had an increased risk of developing lung cancer, while others found no statistically significant association between either polymorphism and susceptibility to lung cancer in smokers. Research continues to be conducted to determine the relationship between XPD polymorphisms and lung cancer risk.
As a cornerstone of personalized medicine for cancer, sequence analysis is becoming increasingly important for understanding the specific mutations involved in an individual's cancer, as needed to select specific molecular targets such as mutations in various receptors. It is also important for understanding the polymorphisms the individual inherited, which play important roles in diagnosis, prognosis, and treatment. One example is the treatment of leukemia with 6-mercaptopurine, where toxicity largely depends on polymorphisms in multiple genes involved in the drug's metabolism.