Hierarchical editing language for macromolecules
The hierarchical editing language for macromolecules (HELM) is a method of describing complex biological molecules. It is a machine-readable notation that represents the composition and structure of peptides, proteins, oligonucleotides, and related small-molecule linkers.
HELM was developed by a consortium of pharmaceutical companies known as the Pistoia Alliance. Development began in 2008, and in 2012 the notation was published openly and free of charge.
The HELM open source project can be found on GitHub.
Background
The need for a standard representation of biomolecules became apparent as researchers began working on modeling and computational projects involving molecules and engineered biomolecules. No existing language could accurately describe these entities, which requires capturing both their composition and their complex branching and structure. Protein sequence formats can describe large proteins, and mol files can represent simple peptides. However, large, complex research biomolecules are difficult to describe with chemical formats, and peptide formats are not flexible enough to describe non-natural amino acids and other chemistries.
Design
In HELM, molecules are represented at four levels in a hierarchy:
- Complex polymer
- Simple polymer
- Monomer
- Atom
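The four levels above can be pictured as nested data structures. The following Python sketch is illustrative only and is not the HELM notation or any HELM toolkit API: the class names, the `Connection` type, the identifiers such as "PEPTIDE1" and "CHEM1", and the simplified atom lists are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Atom:
    element: str  # lowest level: a single atom, e.g. "C", "N", "O"


@dataclass
class Monomer:
    symbol: str  # short label shown at the sequence level
    atoms: List[Atom] = field(default_factory=list)


@dataclass
class SimplePolymer:
    polymer_id: str  # hypothetical identifier for this chain
    monomers: List[Monomer] = field(default_factory=list)

    def sequence(self) -> str:
        """Zoomed-out view: monomer symbols only."""
        return ".".join(m.symbol for m in self.monomers)


@dataclass
class Connection:
    # Each end is (polymer_id, 1-based monomer position); hypothetical layout.
    source: Tuple[str, int]
    target: Tuple[str, int]


@dataclass
class ComplexPolymer:
    polymers: List[SimplePolymer] = field(default_factory=list)
    connections: List[Connection] = field(default_factory=list)

    def atoms_at(self, polymer_id: str, position: int) -> List[str]:
        """Zoomed-in view: element symbols of one monomer."""
        for polymer in self.polymers:
            if polymer.polymer_id == polymer_id:
                return [a.element for a in polymer.monomers[position - 1].atoms]
        raise KeyError(polymer_id)

    def atom_count(self) -> int:
        """Total number of atoms stored across all simple polymers."""
        return sum(len(m.atoms) for p in self.polymers for m in p.monomers)


def make_monomer(symbol: str, elements: str) -> Monomer:
    # Simplified, illustrative atom lists; not full chemical structures.
    return Monomer(symbol, [Atom(e) for e in elements])


peptide = SimplePolymer(
    "PEPTIDE1",
    [make_monomer("A", "NCCCO"), make_monomer("G", "NCCO"), make_monomer("A", "NCCCO")],
)
linker = SimplePolymer("CHEM1", [make_monomer("X", "CO")])
molecule = ComplexPolymer(
    polymers=[peptide, linker],
    connections=[Connection(("PEPTIDE1", 3), ("CHEM1", 1))],
)

print(peptide.sequence())                # A.G.A
print(molecule.atoms_at("PEPTIDE1", 2))  # ['N', 'C', 'C', 'O']
```

Keeping the atoms inside each monomer, and the monomers inside each simple polymer, lets the same object be read at the sequence level or at the atomic level without converting between formats.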
Adoption
In 2014, ChEMBL announced plans to adopt HELM by the end of that year. The informatics company BIOVIA developed a modified Molfile format called the Self-Contained Sequence Representation. A goal of HELM is to provide a standard that can incorporate such individual attempts to solve the problem, be used universally, and avoid a proliferation of standards.
Tools
An editor tool is needed to visualize and work with biomolecules at the correct level of detail. It must be able to "zoom out" to show a large molecule at the amino-acid sequence level, then "zoom in" to the atomic level at a particular site of conjugation or derivatization. The HELM Editor and HAbE are two client tools that may be released as web-based applications in the future.