Non-proteinogenic amino acids


In biochemistry, non-coded or non-proteinogenic amino acids are distinct from the 22 proteinogenic amino acids, which are naturally encoded in the genome of organisms for the assembly of proteins. However, over 140 non-proteinogenic amino acids occur naturally in proteins, and thousands more may occur in nature or be synthesized in the laboratory. Chemically synthesized amino acids are often referred to as unnatural or non-canonical amino acids. Unnatural amino acids can be prepared synthetically from their native analogs via modifications such as amine alkylation, side-chain substitution, structural bond extension, cyclization, and isosteric replacements within the amino acid backbone. Many non-proteinogenic amino acids are important as:
  • intermediates in biosynthesis,
  • products of post-translational modification in proteins,
  • compounds with a physiological role,
  • natural or man-made pharmacological compounds,
  • components of meteorites or products of prebiotic experiments,
  • neurotransmitters, such as γ-aminobutyric acid, and
  • contributors to cellular bioenergetics, such as creatine.

Definition by negation

Technically, any organic compound with an amine and a carboxylic acid functional group is an amino acid. The proteinogenic amino acids are a small subset of this group: each possesses a central carbon atom bearing an amino group, a carboxyl group, a side chain and an α-hydrogen, arranged in the L (levo) configuration. The exceptions are glycine, which is achiral, and proline, whose amine group is a secondary amine; for traditional reasons proline is frequently referred to as an imino acid, although it contains no imine group.
The genetic code encodes 20 standard amino acids for incorporation into proteins during translation. There are, however, two additional proteinogenic amino acids: selenocysteine and pyrrolysine. These non-standard amino acids do not have a dedicated codon but are inserted in place of a stop codon when a specific sequence is present: a UGA codon together with a SECIS element for selenocysteine, and a UAG codon together with a PYLIS downstream sequence for pyrrolysine. All other amino acids are termed "non-proteinogenic".
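The stop-codon recoding rule described above can be sketched in Python. This is a toy model under stated assumptions rather than a bioinformatics tool: the codon table is deliberately truncated, and the flags `has_secis` and `has_pylis` are hypothetical stand-ins for the presence of a SECIS element or PYLIS sequence, which in a real cell are features of the mRNA itself.

```python
# Toy model of stop-codon recoding. The codon table is deliberately partial,
# and the two flags are hypothetical stand-ins for real mRNA context signals.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCU": "Ala", "AAA": "Lys",
    "UAA": "Stop", "UAG": "Stop", "UGA": "Stop",
}

def translate(mrna, has_secis=False, has_pylis=False):
    """Translate an mRNA string codon by codon, recoding UGA or UAG
    when the corresponding context flag is set."""
    residues = []
    # Ignore any trailing bases that do not form a complete codon.
    for i in range(0, len(mrna) - len(mrna) % 3, 3):
        codon = mrna[i:i + 3]
        if codon == "UGA" and has_secis:
            residues.append("Sec")  # selenocysteine replaces the stop
        elif codon == "UAG" and has_pylis:
            residues.append("Pyl")  # pyrrolysine replaces the stop
        else:
            residue = CODON_TABLE[codon]  # KeyError for codons outside the toy table
            if residue == "Stop":
                break
            residues.append(residue)
    return residues

message = "AUGUUUUGAGGCUAA"
print(translate(message))                  # ['Met', 'Phe']
print(translate(message, has_secis=True))  # ['Met', 'Phe', 'Sec', 'Gly']
```

Without the flag, UGA terminates translation as usual; with it, the same codon is read through as selenocysteine and translation continues to the next stop codon.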
There are various groups of amino acids:
  • 20 standard amino acids
  • 22 proteinogenic amino acids
  • over 80 amino acids created abiotically in high concentrations
  • about 900 amino acids produced by natural pathways
  • over 118 engineered amino acids placed into proteins
These groups overlap, but are not identical. All 22 proteinogenic amino acids are biosynthesised by organisms and some, but not all, of them are also produced abiotically. Some natural amino acids, such as norleucine, are misincorporated translationally into proteins because the protein-synthesis process is not perfectly faithful. Many amino acids, such as ornithine, are metabolic intermediates produced biosynthetically, but not incorporated translationally into proteins. Post-translational modification of amino acid residues in proteins leads to the formation of many proteinaceous, but non-proteinogenic, amino acids. Other amino acids are solely found in abiotic mixes. Over 30 unnatural amino acids have been inserted translationally into proteins in engineered systems, yet are not biosynthetic.
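The way these groups overlap can be illustrated with Python sets. The membership below is only a small sample built from the examples named in this section, not a catalogue, and the set names are hypothetical.

```python
# Illustrative sample only: membership follows the examples named in the
# text, and the set names are hypothetical.
standard = {"glycine", "alanine"}  # 2 of the 20 standard amino acids
proteinogenic = standard | {"selenocysteine", "pyrrolysine"}
biosynthetic = proteinogenic | {"ornithine", "norleucine"}
abiotic = {"glycine", "alanine", "norleucine"}  # sample seen in prebiotic mixes

# Each group is strictly contained in the next one.
assert standard < proteinogenic < biosynthetic

print(sorted(proteinogenic - standard))      # ['pyrrolysine', 'selenocysteine']
print(sorted(biosynthetic - proteinogenic))  # ['norleucine', 'ornithine']
print(sorted(proteinogenic & abiotic))       # ['alanine', 'glycine']
```

Set difference and intersection make the point of the paragraph explicit: the groups are nested or overlapping rather than identical.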

Nomenclature

In addition to the IUPAC numbering system, which differentiates the carbons in an organic molecule by sequentially assigning a number to each one, including the carbon of a carboxylic group, the carbons along the side chain of an amino acid can be labelled with Greek letters. The α-carbon is the central chiral carbon that bears the carboxyl group, the side chain and, in α-amino acids, the amino group. The carbon of the carboxylic group itself is not counted.
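The relationship between the two labelling schemes can be sketched as a short Python function. It assumes the simple case in which the carboxyl carbon is IUPAC position 1; the function name `greek_label` is hypothetical.

```python
GREEK = "αβγδεζ"

def greek_label(iupac_position):
    """Return the Greek-letter label of a carbon from its IUPAC position,
    assuming position 1 is the carboxyl carbon (which gets no letter)."""
    if iupac_position < 2:
        raise ValueError("the carboxyl carbon (C1) is not assigned a Greek letter")
    if iupac_position - 2 >= len(GREEK):
        raise ValueError("position beyond the letters covered by this sketch")
    return GREEK[iupac_position - 2]

# The position of the amino group in the systematic name gives the class.
print(greek_label(2))  # α: 2-aminopropanoic acid (alanine)
print(greek_label(3))  # β: 3-aminopropanoic acid (β-alanine)
print(greek_label(4))  # γ: 4-aminobutanoic acid (GABA)
```

The offset of two reflects the rule stated above: counting starts at the carbon next to the carboxyl group, not at the carboxyl carbon.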

Natural non-L-α-amino acids

Most natural amino acids are α-amino acids in the L configuration, but some exceptions exist.

Non-alpha

Some non-α-amino acids exist in organisms. In these structures, the amine group is displaced further from the carboxylic acid end of the amino acid molecule. Thus a β-amino acid has the amine group bonded to the second carbon away, and a γ-amino acid has it on the third. Examples include β-alanine, GABA, and δ-aminolevulinic acid.
The reason why α-amino acids are used in proteins has been linked to their frequency in meteorites and prebiotic experiments. An initial speculation on the deleterious properties of β-amino acids in terms of secondary structure turned out to be incorrect.

D-amino acids

Some amino acids have the opposite absolute chirality; these D-amino acids are not available from the normal ribosomal translation and transcription machinery. Most bacterial cell walls are formed of peptidoglycan, a polymer composed of amino sugars crosslinked by short oligopeptides that are bridged to each other. The oligopeptide is synthesised non-ribosomally and has several peculiarities, including D-amino acids, generally D-alanine and D-glutamate. A further peculiarity is that the former is racemised by a PLP-binding enzyme, whereas the latter is racemised by a cofactor-independent enzyme. Some variants exist: D-lysine is present in Thermotoga spp., and D-serine is present in certain vancomycin-resistant bacteria.

Without a hydrogen on the α-carbon

All proteinogenic amino acids have at least one hydrogen on the α-carbon. Glycine has two hydrogens, and all others have one hydrogen and one side-chain. Replacement of the remaining hydrogen with a larger substituent, such as a methyl group, distorts the protein backbone.
In some fungi α-aminoisobutyric acid is produced as a precursor to peptides, some of which exhibit antibiotic properties. This compound is similar to alanine, but possesses an additional methyl group on the α-carbon instead of a hydrogen. It is therefore achiral. Another compound similar to alanine without an α-hydrogen is dehydroalanine, which possesses a methylene sidechain. It is one of several naturally occurring dehydroamino acids.
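The chirality argument can be checked with a minimal Python sketch. It assumes that substituents can be compared as plain strings, which ignores finer stereochemical rules; the function name and the dictionary are hypothetical.

```python
def is_stereocentre(substituents):
    """A tetrahedral carbon is a stereocentre only if it carries four
    different substituents. Substituents are compared as plain strings."""
    return len(substituents) == 4 and len(set(substituents)) == 4

# Groups attached to the α-carbon of each compound.
alpha_carbon = {
    "alanine": ["NH2", "COOH", "H", "CH3"],
    "glycine": ["NH2", "COOH", "H", "H"],
    "α-aminoisobutyric acid": ["NH2", "COOH", "CH3", "CH3"],
    "dehydroalanine": ["NH2", "COOH", "=CH2"],  # double bond: three substituents
}

for name, groups in alpha_carbon.items():
    print(name, is_stereocentre(groups))
```

Only alanine passes the test: glycine and α-aminoisobutyric acid each carry a duplicated substituent, and the α-carbon of dehydroalanine is not tetrahedral.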

Twin amino acid stereocentres

A subset of L-α-amino acids is ambiguous as to which of the two ends is the α-carbon. In proteins a cysteine residue can form a disulfide bond with another cysteine residue, thus crosslinking the protein. Two crosslinked cysteines form a cystine molecule. Cysteine and methionine are generally produced by direct sulfurylation, but in some species they can be produced by transsulfuration, where the activated homoserine or serine is fused to a cysteine or homocysteine, forming cystathionine. A similar compound is lanthionine, which can be seen as two alanine molecules joined via a thioether bond and is found in various organisms. Similarly, djenkolic acid, a plant toxin from jengkol beans, is composed of two cysteines connected by a methylene group. Diaminopimelic acid is used both as a bridge in peptidoglycan and as a precursor to lysine.

Prebiotic amino acids and alternative biochemistries

In meteorites and in prebiotic experiments many more amino acids than the twenty standard amino acids are found, several of which are at higher concentrations than the standard ones. It has been conjectured that if amino acid based life were to arise elsewhere in the universe, no more than 75% of the amino acids would be in common. The most notable anomaly is the lack of aminobutyric acid.
Molecule | Electric discharge | Murchison meteorite
glycine | 100 | 100
alanine | 180 | 36
α-amino-n-butyric acid | 61 | 19
norvaline | 14 | 14
valine | 4.4 |
norleucine | 1.4 |
leucine | 2.6 |
isoleucine | 1.1 |
alloisoleucine | 1.2 |
t-leucine | < 0.005 |
α-amino-n-heptanoic acid | 0.3 |
proline | 0.3 | 22
pipecolic acid | 0.01 | 11
α,β-diaminopropionic acid | 1.5 |
α,γ-diaminobutyric acid | 7.6 |
ornithine | < 0.01 |
lysine | < 0.01 |
aspartic acid | 7.7 | 13
glutamic acid | 1.7 | 20
serine | 1.1 |
threonine | 0.2 |
allothreonine | 0.2 |
methionine | 0.1 |
homocysteine | 0.5 |
homoserine | 0.5 |
β-alanine | 4.3 | 10
β-amino-n-butyric acid | 0.1 | 5
β-aminoisobutyric acid | 0.5 | 7
γ-aminobutyric acid | 0.5 | 7
α-aminoisobutyric acid | 7 | 33
isovaline | 1 | 11
sarcosine | 12.5 | 7
N-ethylglycine | 6.8 | 6
N-propylglycine | 0.5 |
N-isopropylglycine | 0.5 |
N-methylalanine | 3.4 | 3
N-ethylalanine | < 0.05 |
N-methyl-β-alanine | 1.0 |
N-ethyl-β-alanine | < 0.05 |
isoserine | 1.2 |
α-hydroxy-γ-aminobutyric acid | 17 |
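The claim that some non-standard amino acids are more abundant than standard ones can be checked against a few rows of the table. The sketch below is illustrative only: it copies a handful of values, assumes that rows with a single value belong to the electric-discharge column, and uses hypothetical names throughout.

```python
# Values copied from the table above. Rows with a single value are assumed
# to belong to the electric-discharge column.
STANDARD = {"valine": 4.4, "leucine": 2.6, "isoleucine": 1.1, "serine": 1.1}
NON_STANDARD = {
    "norleucine": 1.4,
    "α,β-diaminopropionic acid": 1.5,
    "α,γ-diaminobutyric acid": 7.6,
    "isoserine": 1.2,
}

def exceeds(name):
    """Return the sampled standard amino acids that are less abundant
    than the non-standard amino acid `name`."""
    return sorted(s for s, v in STANDARD.items() if v < NON_STANDARD[name])

for name in NON_STANDARD:
    print(name, "exceeds", exceeds(name))
```

In this sample, α,γ-diaminobutyric acid exceeds all four of the standard amino acids listed, while the other three exceed only isoleucine and serine.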

Straight side chain

The genetic code has been described as a frozen accident, and the reason why there is only one standard amino acid with a straight chain, alanine, could simply be redundancy with valine, leucine and isoleucine. However, straight-chained amino acids are reported to form much more stable alpha helices.