Human genetic variation
Human genetic variation is the genetic differences in and among populations. There may be multiple variants of any given gene in the human population, a situation called polymorphism.
No two humans are genetically identical. Even monozygotic twins have infrequent genetic differences due to mutations occurring during development and gene copy-number variation. Differences between individuals, even closely related individuals, are the key to techniques such as genetic fingerprinting.
The human genome has a total length of approximately 3.2 billion base pairs in 46 chromosomes of DNA as well as slightly under 17,000 bp DNA in cellular mitochondria. In 2015, the typical difference between an individual's genome and the reference genome was estimated at 20 million base pairs. As of 2017, there were a total of 324 million known variants from sequenced human genomes.
Comparatively speaking, humans are a genetically homogeneous species. Although a small number of genetic variants are found more frequently in certain geographic regions or in people with ancestry from those regions, this variation accounts for a small portion of human genome variability. The majority of variation exists within the members of each human population. For comparison, rhesus macaques exhibit 2.5-fold greater DNA sequence diversity compared to humans. These rates differ depending on what macromolecules are being analyzed. Chimpanzees have more genetic variance than humans when examining nuclear DNA, but humans have more genetic variance when examining at the level of proteins.
The lack of discontinuities in genetic distances between human populations, absence of discrete branches in the human species, and striking homogeneity of human beings globally, imply that there is no scientific basis for inferring races or subspecies in humans, and for most traits, there is much more variation within populations than between them.
Despite this, modern genetic studies have found substantial average genetic differences across human populations in traits such as skin colour, bodily dimensions, lactose and starch digestion, high-altitude adaptations, drug response, taste receptors, and predisposition to developing particular diseases. The greatest diversity is found within and among populations in Africa, and it gradually declines with increasing distance from the African continent, consistent with the Out of Africa theory of human origins.
The study of human genetic variation has evolutionary significance and medical applications. It can help scientists reconstruct and understand patterns of past human migration. In medicine, study of human genetic variation may be important because some disease-causing alleles occur more often in certain population groups. For instance, the mutation for sickle-cell anemia is more often found in people with ancestry from certain sub-Saharan African, south European, Arabian, and Indian populations, due to the evolutionary pressure from mosquitos carrying malaria in these regions.
Each human has, on average, about 60 new mutations compared to their parents.
Causes of variation
Causes of differences between individuals include independent assortment, the exchange of genes during reproduction and various mutational events.
There are at least three reasons why genetic variation exists between populations. Natural selection may confer an adaptive advantage to individuals in a specific environment if an allele provides a competitive advantage. Alleles under selection are likely to occur only in those geographic regions where they confer an advantage. A second important process is genetic drift, which is the effect of random changes in the gene pool, under conditions where most mutations are neutral. Finally, small migrant populations have statistical differences – called the founder effect – from the overall populations where they originated; when these migrants settle new areas, their descendant population typically differs from their population of origin: different genes predominate and it is less genetically diverse.
In humans, the main cause of genetic differences between populations is genetic drift. Serial founder effects and past small population size may have had an important influence on neutral differences between populations. A second main cause is the high degree of neutrality of most mutations. A small but significant number of genes appear to have undergone recent natural selection, and these selective pressures are sometimes specific to one region.
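The loss of diversity through drift and serial founder effects described above can be illustrated with a minimal sketch. It assumes an idealized Wright-Fisher population (constant size, random mating, no selection, mutation or migration), in which expected heterozygosity falls by a factor of 1 - 1/(2N) per generation. The function names and all numbers are hypothetical and chosen only for illustration; they are not estimates for any real human population.

```python
def expected_heterozygosity(h0: float, population_size: int, generations: int) -> float:
    """Expected heterozygosity after `generations` of pure drift.

    Assumes an idealized diploid Wright-Fisher population of constant size N:
    H_t = H_0 * (1 - 1/(2N))**t
    """
    if population_size <= 0:
        raise ValueError("population_size must be positive")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return h0 * (1.0 - 1.0 / (2.0 * population_size)) ** generations


def serial_founder_effect(h0: float, founder_sizes: list[int],
                          generations_per_step: int) -> list[float]:
    """Track expected heterozygosity through successive founding events.

    Each entry of `founder_sizes` is the (hypothetical) size of a migrant
    group that stays at that size for `generations_per_step` generations
    before the next group splits off.
    """
    trajectory = [h0]
    h = h0
    for size in founder_sizes:
        h = expected_heterozygosity(h, size, generations_per_step)
        trajectory.append(h)
    return trajectory


if __name__ == "__main__":
    # Illustrative values only: three successive small founding groups.
    for step, h in enumerate(serial_founder_effect(0.5, [500, 200, 100], 50)):
        print(f"after founding event {step}: H = {h:.4f}")
```

In this sketch each founding event lowers expected heterozygosity further, and smaller founding groups lower it faster, which matches the qualitative pattern of descendant populations being less diverse than their population of origin.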
Measures of variation
Genetic variation among humans occurs on many scales, from gross alterations in the human karyotype to single nucleotide changes. Chromosome abnormalities are detected in 1 of 160 live human births. Apart from sex chromosome disorders, most cases of aneuploidy result in death of the developing fetus; the most common extra autosomal chromosomes among live births are 21, 18 and 13.
Nucleotide diversity is the average proportion of nucleotides that differ between two individuals. As of 2004, the human nucleotide diversity was estimated to be 0.1% to 0.4% of base pairs. In 2015, the 1000 Genomes Project, which sequenced 2,504 individuals from 26 human populations, found that "a typical genome differs from the reference human genome at 4.1 million to 5.0 million sites … affecting 20 million bases of sequence"; the latter figure corresponds to 0.6% of the total number of base pairs. Nearly all of these sites are small differences, either single nucleotide polymorphisms or brief insertions or deletions in the genetic sequence, but structural variations account for a greater number of base pairs than the SNPs and indels.
As of 2017, the Single Nucleotide Polymorphism Database (dbSNP), which lists SNPs and other variants, listed 324 million variants found in sequenced human genomes.
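The definition of nucleotide diversity given above, the average proportion of nucleotides that differ between two individuals, can be sketched in a few lines. This is a minimal illustration that assumes the sequences are already aligned, of equal length, and free of gaps or ambiguous bases; the function names and the toy sequences are hypothetical.

```python
from itertools import combinations


def pairwise_difference(seq_a: str, seq_b: str) -> float:
    """Proportion of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    mismatches = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return mismatches / len(seq_a)


def nucleotide_diversity(sequences: list[str]) -> float:
    """Mean pairwise proportion of differing sites over all pairs of sequences."""
    pairs = list(combinations(sequences, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(pairwise_difference(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy 10-base sequences, far shorter and far more divergent than real genomes.
    toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC"]
    print(f"nucleotide diversity = {nucleotide_diversity(toy):.4f}")

    # The 0.6% figure quoted above: 20 million bases out of about 3.2 billion.
    print(f"fraction affected = {20_000_000 / 3_200_000_000:.4%}")
```

The same ratio underlies the 0.6% figure in the text: 20 million affected bases divided by roughly 3.2 billion base pairs is about 0.6%.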
Single nucleotide polymorphisms
A single nucleotide polymorphism (SNP) is a difference in a single nucleotide between members of one species that occurs in at least 1% of the population. The 2,504 individuals characterized by the 1000 Genomes Project had 84.7 million SNPs among them. SNPs are the most common type of sequence variation, estimated in 1998 to account for 90% of all sequence variants. Other sequence variations are single base exchanges, deletions and insertions. SNPs occur on average about every 100 to 300 bases and so are the major source of heterogeneity.
A functional, or non-synonymous, SNP is one that affects some factor such as gene splicing or messenger RNA, and so causes a phenotypic difference between members of the species. About 3% to 5% of human SNPs are functional. Neutral, or synonymous, SNPs are still useful as genetic markers in genome-wide association studies, because of their sheer number and their stable inheritance over generations.
A coding SNP is one that occurs inside a gene. There are 105 Human Reference SNPs that result in premature stop codons in 103 genes. This corresponds to 0.5% of coding SNPs. They occur due to segmental duplication in the genome. These SNPs result in loss of protein, yet all these SNP alleles are common and are not purified in negative selection.
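The 1% threshold in the definition of a single nucleotide polymorphism can be applied as a simple frequency check. The sketch below assumes that "occurs in at least 1% of the population" is measured as the frequency of the second most common allele among sampled chromosomes (the minor allele frequency); that reading, the function names, and the sample counts are assumptions made for illustration.

```python
from collections import Counter


def minor_allele_frequency(alleles: list[str]) -> float:
    """Frequency of the second most common allele at one site.

    `alleles` holds one observed base per sampled chromosome.
    Returns 0.0 when fewer than two distinct alleles were observed.
    """
    counts = Counter(alleles)
    if len(counts) < 2:
        return 0.0
    total = sum(counts.values())
    second_most_common_count = counts.most_common(2)[1][1]
    return second_most_common_count / total


def is_polymorphism(alleles: list[str], threshold: float = 0.01) -> bool:
    """True if the minor allele reaches the frequency threshold (1% by default)."""
    return minor_allele_frequency(alleles) >= threshold


if __name__ == "__main__":
    # Hypothetical samples of 1,000 chromosomes each.
    rare_variant = ["A"] * 995 + ["G"] * 5      # minor allele at 0.5%
    common_variant = ["A"] * 980 + ["G"] * 20   # minor allele at 2%
    print(is_polymorphism(rare_variant))
    print(is_polymorphism(common_variant))
```

Under this reading, a single-base difference seen in only 0.5% of sampled chromosomes would be treated as a rare variant rather than a SNP, while one seen in 2% would qualify.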
Structural variation
Structural variation is the variation in structure of an organism's chromosome. Structural variations, such as copy-number variation and deletions, inversions, insertions and duplications, account for much more human genetic variation than single nucleotide diversity. This was concluded in 2007 from analysis of the diploid full sequences of the genomes of two humans: Craig Venter and James D. Watson. This added to the two haploid sequences which were amalgamations of sequences from many individuals, published by the Human Genome Project and Celera Genomics respectively.
According to the 1000 Genomes Project, a typical human has 2,100 to 2,500 structural variations, which include approximately 1,000 large deletions, 160 copy-number variants, 915 Alu insertions, 128 L1 insertions, 51 SVA insertions, 4 NUMTs, and 10 inversions.
Copy number variation
A copy-number variation is a difference in the genome due to the deletion or duplication of large regions of DNA on some chromosome. It is estimated that 0.4% of the genomes of unrelated humans differ with respect to copy number. When copy number variation is included, human-to-human genetic variation is estimated to be at least 0.5%. Copy number variations are inherited but can also arise during development.
A visual map of the regions with high genomic variation in the modern-human reference assembly relative to a Neanderthal of 50k has been built by Pratas et al.
Epigenetics
Epigenetic variation is variation in the chemical tags that attach to DNA and affect how genes get read. The tags, "called epigenetic markings, act as switches that control how genes can be read." At some alleles, the epigenetic state of the DNA, and associated phenotype, can be inherited across generations of individuals.
Genetic variability
Genetic variability is a measure of the tendency of individual genotypes in a population to vary from one another. Variability is different from genetic diversity, which is the amount of variation seen in a particular population. The variability of a trait is how much that trait tends to vary in response to environmental and genetic influences.
Clines
In biology, a cline is a continuum of species, populations, varieties, or forms of organisms that exhibit gradual phenotypic and/or genetic differences over a geographical area, typically as a result of environmental heterogeneity. In the scientific study of human genetic variation, a gene cline can be rigorously defined and subjected to quantitative metrics.
Haplogroups
In the study of molecular evolution, a haplogroup is a group of similar haplotypes that share a common ancestor with a single nucleotide polymorphism mutation. The study of haplogroups provides information about ancestral origins dating back thousands of years.
The most commonly studied human haplogroups are Y-chromosome haplogroups and mitochondrial DNA haplogroups, both of which can be used to define genetic populations. Y-DNA is passed solely along the patrilineal line, from father to son, while mtDNA is passed down the matrilineal line, from mother to both daughters and sons. The Y-DNA and mtDNA may change by chance mutation at each generation.