Molecular dynamics


Molecular dynamics is a computer simulation method for analyzing the physical movements of atoms and molecules. The atoms and molecules are allowed to interact for a fixed period of time, giving a view of the dynamic "evolution" of the system. In the most common version, the trajectories of atoms and molecules are determined by numerically solving Newton's equations of motion for a system of interacting particles, where forces between the particles and their potential energies are often calculated using interatomic potentials or molecular mechanical force fields. MD simulations are widely applied in chemical physics, materials science, and biophysics.
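As a compact statement of the equations referred to above, the following writes Newton's second law for each particle with the force taken as the negative gradient of a potential energy function. The symbols (N particles, masses m_i, positions r_i, potential U) are notation assumed here for illustration, not notation fixed by the article.

```latex
m_i \frac{d^2 \mathbf{r}_i}{dt^2} = \mathbf{F}_i
  = -\nabla_{\mathbf{r}_i} U(\mathbf{r}_1, \dots, \mathbf{r}_N),
  \qquad i = 1, \dots, N
```

An MD code advances these coupled equations in small discrete time steps, re-evaluating the forces from the interatomic potential or force field at each step.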
Because molecular systems typically consist of a vast number of particles, it is impossible to determine the properties of such complex systems analytically; MD simulation circumvents this problem by using numerical methods. However, long MD simulations are mathematically ill-conditioned, generating cumulative errors in numerical integration that can be minimized with proper selection of algorithms and parameters, but not eliminated.
For systems that obey the ergodic hypothesis, the evolution of one molecular dynamics simulation may be used to determine the macroscopic thermodynamic properties of the system: the time averages of an ergodic system correspond to microcanonical ensemble averages. MD has also been termed "statistical mechanics by numbers" and "Laplace's vision of Newtonian mechanics" of predicting the future by animating nature's forces and allowing insight into molecular motion on an atomic scale.
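The correspondence between time averages and ensemble averages can be written schematically as follows. Here A is any observable of positions r and momenta p, T is the trajectory length, and the t_k are M sampled frames; these symbols are assumed for illustration.

```latex
\langle A \rangle_{\text{ensemble}}
  = \lim_{T \to \infty} \frac{1}{T} \int_0^T A\bigl(\mathbf{r}(t), \mathbf{p}(t)\bigr)\, dt
  \approx \frac{1}{M} \sum_{k=1}^{M} A\bigl(\mathbf{r}(t_k), \mathbf{p}(t_k)\bigr)
```

In practice only the finite sum on the right is available, so the estimate is reliable only if the trajectory is long enough to sample the relevant states.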

History

MD was originally developed in the early 1950s, following earlier successes with Monte Carlo simulations, which themselves date back to the eighteenth century (in the Buffon's needle problem, for example) but were popularized for statistical mechanics at Los Alamos National Laboratory by Marshall Rosenbluth and Nicholas Metropolis in what is known today as the Metropolis–Hastings algorithm. Interest in the time evolution of N-body systems dates much earlier to the seventeenth century, beginning with Isaac Newton, and continued into the following century largely with a focus on celestial mechanics and issues such as the stability of the Solar System. Many of the numerical methods used today were developed during this time period, which predates the use of computers; for example, the most common integration algorithm used today, the Verlet integration algorithm, was used as early as 1791 by Jean Baptiste Joseph Delambre. Numerical calculations with these algorithms can be considered to be MD done "by hand".
As early as 1941, integration of the many-body equations of motion was carried out with analog computers. Some undertook the labor-intensive work of modeling atomic motion by constructing physical models, e.g., using macroscopic spheres. The aim was to arrange them in such a way as to replicate the structure of a liquid and use this to examine its behavior. J.D. Bernal described this process in 1962, writing:
... I took a number of rubber balls and stuck them together with rods of a selection of different lengths ranging from 2.75 to 4 inches. I tried to do this in the first place as casually as possible, working in my own office, being interrupted every five minutes or so and not remembering what I had done before the interruption.
Following the discovery of microscopic particles and the development of computers, interest expanded beyond the proving ground of gravitational systems to the statistical properties of matter. In an attempt to understand the origin of irreversibility, Enrico Fermi proposed in 1953, and published in 1955, the use of the early computer MANIAC I, also at Los Alamos National Laboratory, to solve the time evolution of the equations of motion for a many-body system subject to several choices of force laws. Today, this seminal work is known as the Fermi–Pasta–Ulam–Tsingou problem. The time evolution of the energy from the original work is shown in the figure to the right.
In 1957, Berni Alder and Thomas Wainwright used an IBM 704 computer to simulate perfectly elastic collisions between hard spheres. In 1960, in perhaps the first realistic simulation of matter, J.B. Gibson et al. simulated radiation damage of solid copper by using a Born–Mayer type of repulsive interaction along with a cohesive surface force. In 1964, Aneesur Rahman published simulations of liquid argon that used a Lennard-Jones potential; calculations of system properties, such as the coefficient of self-diffusion, compared well with experimental data. Today, the Lennard-Jones potential is still one of the most frequently used intermolecular potentials. It is used for describing simple substances for conceptual and model studies and as a building block in many force fields of real substances.

Areas of application and limits

First used in theoretical physics, the molecular dynamics method gained popularity in materials science soon afterward, and since the 1970s it has also been commonly used in biochemistry and biophysics. MD is frequently used to refine 3-dimensional structures of proteins and other macromolecules based on experimental constraints from X-ray crystallography or NMR spectroscopy. In physics, MD is used to examine the dynamics of atomic-level phenomena that cannot be observed directly, such as thin film growth and ion subplantation, and to examine the physical properties of nanotechnological devices that have not been, or cannot yet be, created. In biophysics and structural biology, the method is frequently applied to study the motions of macromolecules such as proteins and nucleic acids, which can be useful for interpreting the results of certain biophysical experiments and for modeling interactions with other molecules, as in ligand docking. In principle, MD can be used for ab initio prediction of protein structure by simulating folding of the polypeptide chain from a random coil. MD can also be used to compute other thermodynamic properties, such as drug solubilities and free energies of solvation, including in polymers.
The results of MD simulations can be tested through comparison to experiments that measure molecular dynamics, of which a popular method is NMR spectroscopy. MD-derived structure predictions can be tested through community-wide experiments in Critical Assessment of Protein Structure Prediction, although the method has historically had limited success in this area. Michael Levitt, who shared the Nobel Prize partly for the application of MD to proteins, wrote in 1999 that CASP participants usually did not use the method due to "... a central embarrassment of molecular mechanics, namely that energy minimization or molecular dynamics generally leads to a model that is less like the experimental structure". Improvements in computational resources permitting more and longer MD trajectories, combined with modern improvements in the quality of force field parameters, have yielded some improvements in both structure prediction and homology model refinement, without reaching the point of practical utility in these areas; many identify force field parameters as a key area for further development.
MD simulation has been reported for pharmacophore development and drug design. For example, Pinto et al. implemented MD simulations of Bcl-xL complexes to calculate average positions of critical amino acids involved in ligand binding. Carlson et al. implemented molecular dynamics simulations to identify compounds that complement a receptor while causing minimal disruption to the conformation and flexibility of the active site. Snapshots of the protein at constant time intervals during the simulation were overlaid to identify conserved binding regions for pharmacophore development. Spyrakis et al. relied on a workflow of MD simulations, fingerprints for ligands and proteins and linear discriminant analysis to identify the best ligand-protein conformations to act as pharmacophore templates based on retrospective ROC analysis of the resulting pharmacophores. In an attempt to ameliorate structure-based drug discovery modeling, vis-à-vis the need for many modeled compounds, Hatmal et al. proposed a combination of MD simulation and ligand-receptor intermolecular contacts analysis to discern critical intermolecular contacts from redundant ones in a single ligand–protein complex. Critical contacts can then be converted into pharmacophore models that can be used for virtual screening.
An important limiting factor is the treatment of intramolecular hydrogen bonds, which are not explicitly included in modern force fields but are described as Coulomb interactions of atomic point charges. This is a crude approximation because hydrogen bonds have a partially quantum mechanical and chemical nature. Furthermore, electrostatic interactions are usually calculated using the dielectric constant of a vacuum, even though the surrounding aqueous solution has a much higher dielectric constant. Using the macroscopic dielectric constant at short interatomic distances, however, is itself questionable. Finally, van der Waals interactions in MD are usually described by Lennard-Jones potentials based on the Fritz London theory that is only applicable in a vacuum. However, all types of van der Waals forces are ultimately of electrostatic origin and therefore depend on dielectric properties of the environment. The direct measurement of attraction forces between different materials shows that "the interaction between hydrocarbons across water is about 10% of that across vacuum". The environment-dependence of van der Waals forces is neglected in standard simulations, but can be included by developing polarizable force fields.

Design constraints

The design of a molecular dynamics simulation should account for the available computational power. Simulation size, timestep, and total time duration must be selected so that the calculation can finish within a reasonable time period. However, the simulations should be long enough to be relevant to the time scales of the natural processes being studied. To make statistically valid conclusions from the simulations, the time span simulated should match the kinetics of the natural process. Otherwise, it is analogous to making conclusions about how a human walks when only looking at less than one footstep. Most scientific publications about the dynamics of proteins and DNA use data from simulations spanning nanoseconds to microseconds. To obtain these simulations, several CPU-days to CPU-years are needed. Parallel algorithms allow the load to be distributed among CPUs; an example is the spatial or force decomposition algorithm.
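To give a rough sense of the cost, the sketch below counts the integration steps needed to cover a given simulated time. It assumes a 1 femtosecond timestep, a typical order of magnitude for classical MD; the function name `steps_required` is hypothetical and chosen only for this illustration.

```python
def steps_required(simulated_time_s: float, timestep_s: float) -> int:
    """Return the number of integration steps needed to cover simulated_time_s."""
    return round(simulated_time_s / timestep_s)


FEMTOSECOND = 1e-15  # assumed timestep, in seconds
NANOSECOND = 1e-9
MICROSECOND = 1e-6

# Steps needed for a nanosecond-scale and a microsecond-scale trajectory.
steps_ns = steps_required(NANOSECOND, FEMTOSECOND)
steps_us = steps_required(MICROSECOND, FEMTOSECOND)

print(f"1 ns at 1 fs per step: {steps_ns:,} steps")
print(f"1 us at 1 fs per step: {steps_us:,} steps")
```

Each of these steps requires a full force evaluation over the system, which is why microsecond-scale trajectories can demand CPU-years.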
During a classical MD simulation, the most CPU intensive task is the evaluation of the potential as a function of the particles' internal coordinates. Within that energy evaluation, the most expensive part is the non-bonded or non-covalent part. In big O notation, common molecular dynamics simulations scale as O(n^2) if all pair-wise electrostatic and van der Waals interactions must be accounted for explicitly. This computational cost can be reduced by employing electrostatics methods such as particle mesh Ewald summation, particle-particle-particle mesh, or good spherical cutoff methods.
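The quadratic growth of the non-bonded cost can be illustrated with a minimal sketch that sums a Lennard-Jones energy over all atom pairs, optionally skipping pairs beyond a spherical cutoff. Reduced units (epsilon = sigma = 1), the test lattice, and the function names are assumptions made for this example. The naive loop still visits every pair to measure its distance; production codes typically use neighbor or cell lists so that distant pairs are never visited at all.

```python
import itertools
import math


def lj_pair_energy(r, epsilon=1.0, sigma=1.0):
    """Lennard-Jones energy of one pair at separation r (reduced units assumed)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_energy(positions, cutoff=None):
    """Sum the pair energy over all n(n-1)/2 pairs, skipping pairs beyond cutoff.

    Returns (energy, number_of_pair_energies_evaluated).
    """
    energy = 0.0
    pairs_evaluated = 0
    for a, b in itertools.combinations(positions, 2):
        r = math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
        if cutoff is not None and r > cutoff:
            continue  # outside the spherical cutoff: contribution neglected
        energy += lj_pair_energy(r)
        pairs_evaluated += 1
    return energy, pairs_evaluated


# Hypothetical test system: a 3 x 3 x 3 cubic lattice with spacing 1.5 (27 atoms).
positions = [(1.5 * i, 1.5 * j, 1.5 * k)
             for i in range(3) for j in range(3) for k in range(3)]

e_all, pairs_all = total_energy(positions)
e_cut, pairs_cut = total_energy(positions, cutoff=2.0)
print(f"all pairs: {pairs_all} evaluations, E = {e_all:.4f}")
print(f"cutoff 2.0: {pairs_cut} evaluations, E = {e_cut:.4f}")
```

Without a cutoff the number of pair evaluations grows as n(n-1)/2, whereas with a cutoff each atom interacts only with a bounded number of neighbors.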
Another factor that impacts total CPU time needed by a simulation is the size of the integration timestep. This is the time length between evaluations of the potential. The timestep must be chosen small enough to avoid discretization errors. Typical timesteps for classical MD are on the order of 1 femtosecond. This value may be extended by using algorithms such as the SHAKE constraint algorithm, which fix the vibrations of the fastest atoms into place. Multiple time scale methods have also been developed, which allow extended times between updates of slower long-range forces.
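The effect of the timestep on discretization error can be seen in a minimal sketch that integrates a one-dimensional harmonic oscillator with the velocity Verlet scheme and reports the largest relative deviation of the total energy. The oscillator, its parameters (unit mass and force constant), and the function names are assumptions chosen for illustration; a real MD code applies the same update to every atom using force-field forces.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one degree of freedom with velocity Verlet; return [(x, v), ...]."""
    f = force(x)
    trajectory = []
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # one force evaluation per step
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def max_energy_error(dt, total_time=20.0, k=1.0, mass=1.0):
    """Largest relative deviation of the total energy from its initial value."""
    def force(x):
        return -k * x

    def energy(x, v):
        return 0.5 * mass * v * v + 0.5 * k * x * x

    x0, v0 = 1.0, 0.0
    e0 = energy(x0, v0)
    n_steps = int(round(total_time / dt))
    trajectory = velocity_verlet(x0, v0, force, mass, dt, n_steps)
    return max(abs(energy(x, v) - e0) / e0 for x, v in trajectory)


err_large = max_energy_error(dt=0.1)
err_small = max_energy_error(dt=0.01)
print(f"dt = 0.1:  max relative energy error = {err_large:.2e}")
print(f"dt = 0.01: max relative energy error = {err_small:.2e}")
```

Reducing the timestep lowers the energy error but increases the number of force evaluations needed to cover the same simulated time, which is the trade-off described above.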
For simulating molecules in a solvent, a choice should be made between an explicit and an implicit solvent. In an explicit solvent, the interactions of every solvent particle must be evaluated by the force field, while implicit solvents use a mean-field approach. Using an explicit solvent is computationally expensive, requiring inclusion of roughly ten times more particles in the simulation. However, the granularity and viscosity of an explicit solvent are essential to reproduce certain properties of the solute molecules. This is especially important for reproducing chemical kinetics.
In all kinds of molecular dynamics simulations, the simulation box size must be large enough to avoid boundary condition artifacts. Boundary conditions are often treated by choosing fixed values at the edges, or by employing periodic boundary conditions in which one side of the simulation loops back to the opposite side, mimicking a bulk phase.
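Periodic boundary conditions are commonly implemented by wrapping coordinates back into the box and measuring separations to the nearest periodic image (the minimum image convention). The sketch below assumes a cubic box of side `box_length`; the function names are hypothetical and non-cubic cells would need a more general treatment.

```python
import math


def wrap_position(x, box_length):
    """Map a coordinate back into the primary box [0, box_length)."""
    return x % box_length


def minimum_image(dx, box_length):
    """Shift a coordinate difference to the nearest periodic image."""
    return dx - box_length * round(dx / box_length)


def periodic_distance(a, b, box_length):
    """Distance between points a and b in a cubic periodic box."""
    return math.sqrt(sum(minimum_image(ai - bi, box_length) ** 2
                         for ai, bi in zip(a, b)))


box = 10.0
# A particle leaving through one face re-enters through the opposite face.
print(wrap_position(10.5, box))
# Two particles near opposite faces are close neighbors across the boundary.
print(periodic_distance((0.5, 0.0, 0.0), (9.5, 0.0, 0.0), box))
```

The minimum image convention is only valid when the interaction cutoff is no larger than half the box length, which is one reason the box must be chosen large enough.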