Histone
In biology, histones are highly basic proteins abundant in lysine and arginine residues that are found in eukaryotic cell nuclei and in most archaeal phyla. They act as spools around which DNA winds to create structural units called nucleosomes. Nucleosomes in turn are wrapped into 30-nanometer fibers that form tightly packed chromatin. Histones prevent DNA from becoming tangled and protect it from DNA damage. In addition, histones play important roles in gene regulation and DNA replication. Without histones, the unwound DNA in chromosomes would be far too long to fit in the nucleus. For example, each human cell has about 1.8 meters of DNA if completely stretched out; however, when wound about histones, this length is reduced to about 9 micrometers of 30 nm diameter chromatin fibers.
There are five families of histones, which are designated H1/H5, H2A, H2B, H3, and H4. The nucleosome core is formed of two H2A-H2B dimers and an H3-H4 tetramer. The tight wrapping of DNA around histones is, to a large degree, a result of electrostatic attraction between the positively charged histones and the negatively charged phosphate backbone of DNA.
Histones may be chemically modified through the action of enzymes to regulate gene transcription. The most common modifications are the methylation of arginine or lysine residues and the acetylation of lysine. Methylation can affect how other proteins, such as transcription factors, interact with the nucleosomes. Lysine acetylation eliminates a positive charge on lysine, thereby weakening the electrostatic attraction between histone and DNA. This results in partial unwinding of the DNA, making it more accessible for gene expression.
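The charge effect of lysine acetylation can be illustrated with a rough count. The sketch below is a simplification, not a physical model: it assigns +1 to each arginine and each unmodified lysine, -1 to each aspartate or glutamate, and ignores histidine and the chain termini. The sequence used (the first 20 residues of the histone H4 N-terminal tail) and the acetylation sites K5, K8, K12 and K16 are assumptions brought in for illustration and are not taken from this article; the function name `tail_charge` is hypothetical.

```python
# Rough net-charge count for a histone tail before and after lysine acetylation.
# Assumption: residues 1-20 of the human histone H4 N-terminal tail.
H4_TAIL = "SGRGKGGKGLGKGGAKRHRK"

# Assumption: commonly described H4 acetylation sites (1-based positions).
ACETYLATED_SITES = frozenset({5, 8, 12, 16})


def tail_charge(sequence, acetylated=frozenset()):
    """Return a crude net charge: +1 per Arg or unmodified Lys, -1 per Asp/Glu.

    Histidine and the chain termini are ignored in this simplified count.
    """
    charge = 0
    for position, residue in enumerate(sequence, start=1):
        if residue == "R":
            charge += 1
        elif residue == "K" and position not in acetylated:
            charge += 1  # acetylated lysines contribute no positive charge
        elif residue in ("D", "E"):
            charge -= 1
    return charge


unmodified = tail_charge(H4_TAIL)
acetylated = tail_charge(H4_TAIL, ACETYLATED_SITES)
print(f"unmodified tail: {unmodified:+d}, acetylated tail: {acetylated:+d}")
```

Under these assumptions the count falls by one for every acetylated lysine, which is the loss of electrostatic attraction to the DNA backbone described above.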
Classes and variants
Five major families of histone proteins exist: H1/H5, H2A, H2B, H3, and H4. Histones H2A, H2B, H3 and H4 are known as the core or nucleosomal histones, while histones H1/H5 are known as the linker histones.
The core histones all exist as dimers, which are similar in that they all possess the histone fold domain: three alpha helices linked by two loops. It is this helical structure that allows for interaction between distinct dimers, particularly in a head-tail fashion. The resulting four distinct dimers then come together to form one octameric nucleosome core, approximately 63 Angstroms in diameter. Around 146 base pairs of DNA wrap around this core particle 1.65 times in a left-handed super-helical turn to give a particle of around 100 Angstroms across. The linker histone H1 binds the nucleosome at the entry and exit sites of the DNA, thus locking the DNA into place and allowing the formation of higher-order structure. The most basic such formation is the 10 nm fiber, or "beads on a string" conformation. This involves the wrapping of DNA around nucleosomes, with approximately 50 base pairs of DNA separating each pair of nucleosomes. Higher-order structures include the 30 nm fiber and the 100 nm fiber; these are the structures found in normal cells. During mitosis and meiosis, the condensed chromosomes are assembled through interactions between nucleosomes and other regulatory proteins.
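The figures quoted above lend themselves to a quick back-of-the-envelope calculation. The sketch below takes the 146 base pairs per core particle, the 1.65 superhelical turns, the roughly 50 base pair linker, and the 1.8 meters of DNA per human cell from this article. The rise of 0.34 nm per base pair for B-form DNA is an assumption brought in from outside the text, and the function name `nucleosome_estimates` is hypothetical. The results are order-of-magnitude estimates only.

```python
# Back-of-the-envelope nucleosome arithmetic from figures quoted in the text.
CORE_BP = 146           # from the text: DNA wrapped around one core particle
SUPERHELICAL_TURNS = 1.65  # from the text: turns of DNA around the core
LINKER_BP = 50          # from the text: approximate DNA between nucleosomes
DNA_LENGTH_M = 1.8      # from the text: stretched-out DNA per human cell
BP_RISE_NM = 0.34       # assumption: rise per base pair of B-form DNA


def nucleosome_estimates():
    """Return rough per-nucleosome and per-cell estimates as a dict."""
    wrapped_length_nm = CORE_BP * BP_RISE_NM        # DNA length held by one core
    bp_per_turn = CORE_BP / SUPERHELICAL_TURNS      # base pairs per superhelical turn
    repeat_bp = CORE_BP + LINKER_BP                 # one nucleosome plus its linker
    total_bp = DNA_LENGTH_M / (BP_RISE_NM * 1e-9)   # base pairs in 1.8 m of DNA
    nucleosome_count = total_bp / repeat_bp         # nucleosomes needed per cell
    return {
        "wrapped_length_nm": wrapped_length_nm,
        "bp_per_turn": bp_per_turn,
        "repeat_bp": repeat_bp,
        "total_bp": total_bp,
        "nucleosome_count": nucleosome_count,
    }


for name, value in nucleosome_estimates().items():
    print(f"{name}: {value:.3g}")
```

With these inputs, each core particle holds roughly 50 nm of DNA within a particle about 10 nm across, and a cell would need on the order of tens of millions of nucleosomes to package all of its DNA.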
Histones are subdivided into canonical replication-dependent histones, whose genes are expressed during the S-phase of the cell cycle, and replication-independent histone variants, which are expressed throughout the cell cycle. In mammals, genes encoding canonical histones are typically clustered along chromosomes in 4 different highly conserved loci, lack introns, and use a stem-loop structure at the 3' end instead of a polyA tail. Genes encoding histone variants are usually not clustered, have introns, and produce mRNAs that are regulated with polyA tails. Complex multicellular organisms typically have a higher number of histone variants providing a variety of different functions. Functionally, histone variants contribute to transcriptional control, epigenetic memory, and DNA repair, serving specialized functions beyond nucleosome packaging and playing distinct roles in chromatin dynamics. For example, H2A.Z is enriched at regulatory elements and promoters of actively transcribed genes, where it modulates nucleosome stability and transcription factor binding. H3.3, a replacement variant of histone H3, is also associated with active transcription and is preferentially deposited at enhancer elements and transcribed gene bodies. Another critical variant, CENPA, replaces H3 in centromeric nucleosomes, providing a structural foundation essential for chromosome segregation.
Variants also play essential roles in DNA repair. Variants such as H2A.X are phosphorylated at sites of DNA damage, marking regions for recruitment of repair proteins. This modification, commonly referred to as γH2A.X, serves as a key signal in the cellular response to double-strand breaks, facilitating efficient DNA repair processes. Defects in histone variant regulation have been linked to genome instability, a hallmark of many cancers and age-related diseases.
Data are accumulating on the roles of diverse histone variants, highlighting the functional links between variants and the delicate regulation of organism development. Histone variant proteins from different organisms, their classification, and variant-specific features can be found in a dedicated database. Several pseudogenes have also been identified, with sequences very close to those of their respective functional orthologs.
The following is a list of human histone proteins, genes and pseudogenes:
| Super family | Family | Replication-dependent genes | Replication-independent genes | Pseudogenes |
| Linker | H1 | H1-1, H1-2, H1-3, H1-4, H1-5, H1-6 | H1-0, H1-7, H1-8, H1-10 | H1-9P, H1-12P |
| Core | H2A | H2AC1, H2AC4, H2AC6, H2AC7, H2AC8, H2AC11, H2AC12, H2AC13, H2AC14, H2AC15, H2AC16, H2AC17, H2AC18, H2AC19, H2AC20, H2AC21, H2AC25 | H2AZ1, H2AZ2, MACROH2A1, MACROH2A2, H2AX, H2AJ, H2AB1, H2AB2, H2AB3, H2AP, H2AL1Q, H2AL3 | H2AC2P, H2AC3P, H2AC5P, H2AC9P, H2AC10P, H2AQ1P, H2AL1MP |
| Core | H2B | H2BC1, H2BC3, H2BC4, H2BC5, H2BC6, H2BC7, H2BC8, H2BC9, H2BC10, H2BC11, H2BC12, H2BC13, H2BC14, H2BC15, H2BC17, H2BC18, H2BC21, H2BC26, H2BC12L | H2BK1, H2BW1, H2BW2, H2BW3P, H2BN1 | H2BC2P, H2BC16P, H2BC19P, H2BC20P, H2BC27P, H2BL1P, H2BW3P, H2BW4P |
| Core | H3 | H3C1, H3C2, H3C3, H3C4, H3C6, H3C7, H3C8, H3C10, H3C11, H3C12, H3C13, H3C14, H3C15, H3-4 | H3-3A, H3-3B, H3-5, H3-7, H3Y1, H3Y2, CENPA | H3C5P, H3C9P, H3P16, H3P44 |
| Core | H4 | H4C1, H4C2, H4C3, H4C4, H4C5, H4C6, H4C7, H4C8, H4C9, H4C11, H4C12, H4C13, H4C14, H4C15 | H4C16 | H4C10P |
Structure
The nucleosome core is formed of two H2A-H2B dimers and an H3-H4 tetramer, forming two nearly symmetrical halves by tertiary structure. The H2A-H2B dimers and H3-H4 tetramer also show pseudodyad symmetry. The four 'core' histones are relatively similar in structure and are highly conserved through evolution, all featuring a 'helix turn helix turn helix' motif. They also share the feature of long 'tails' on one end of the amino acid structure, which are the location of post-translational modification.
Archaeal histones contain only an H3-H4-like dimeric structure made out of a single type of unit. Such dimeric structures can stack into a tall superhelix onto which DNA coils in a manner similar to nucleosome spools. Only some archaeal histones have tails.
The distance between the spools around which eukaryotic cells wind their DNA has been determined to range from 59 to 70 Å.
In all, histones make five types of interactions with DNA:
- Salt bridges and hydrogen bonds between side chains of basic amino acids and phosphate oxygens on DNA
- Helix dipoles from alpha helices in H2B, H3, and H4, which cause a net positive charge to accumulate at the point of interaction with negatively charged phosphate groups on DNA
- Hydrogen bonds between the DNA backbone and the amide group on the main chain of histone proteins
- Nonpolar interactions between the histone and deoxyribose sugars on DNA
- Non-specific minor groove insertions of the H3 and H2B N-terminal tails into two minor grooves each on the DNA molecule
Histones are subject to post-translational modification by enzymes, primarily on their N-terminal tails but also in their globular domains. Such modifications include methylation, citrullination, acetylation, phosphorylation, SUMOylation, ubiquitination, and ADP-ribosylation. These modifications affect the role of histones in gene regulation.
In general, genes that are active have less bound histone, while inactive genes are highly associated with histones during interphase. It also appears that the structure of histones has been evolutionarily conserved, as any deleterious mutations would be severely maladaptive. All histones have a highly positively charged N-terminus with many lysine and arginine residues.
Evolution and species distribution
Core histones are found in the nuclei of eukaryotic cells and in most archaeal phyla, but not in bacteria. The unicellular algae known as dinoflagellates were previously thought to be the only eukaryotes that completely lack histones, but later studies showed that their DNA still encodes histone genes. Unlike the core histones, the lysine-rich linker histone proteins have homologs in bacteria, known as nucleoprotein HC1/HC2.
It has been proposed that core histone proteins are evolutionarily related to the helical part of the extended AAA+ ATPase domain, the C-domain, and to the N-terminal substrate recognition domain of Clp/Hsp100 proteins. Despite the differences in their topology, these three folds share a homologous helix-strand-helix motif. It has also been proposed that they may have evolved from ribosomal proteins, since both are short, basic proteins.
Archaeal histones may well resemble the evolutionary precursors to eukaryotic histones. Histone proteins are among the most highly conserved proteins in eukaryotes, emphasizing their important role in the biology of the nucleus. In contrast, mature sperm cells largely use protamines to package their genomic DNA, most likely because this allows them to achieve an even higher packaging ratio.
Some of the major classes include variant forms. These share amino acid sequence homology and core structural similarity with a specific class of major histones, but also have features of their own that distinguish them from the major histones. These minor histones usually carry out specific functions in chromatin metabolism. For example, the histone H3-like CENPA is associated only with the centromere region of the chromosome. The histone H2A variant H2A.Z is associated with the promoters of actively transcribed genes and is also involved in preventing the spread of silent heterochromatin. Furthermore, H2A.Z has roles in chromatin for genome stability. Another H2A variant, H2A.X, is phosphorylated at S139 in regions around double-strand breaks and marks the region undergoing DNA repair. Histone H3.3 is associated with the body of actively transcribed genes.