Authors: Magnus Lundborg, Rossen Apostolov, Daniel Spångberg, Anders Gärdenäs, David van der Spoel, and Erik Lindahl
Molecular dynamics simulation is an important application in theoretical chemistry, and with the large high-performance computing resources available today the programs also generate huge amounts of output data. In the life sciences in particular, with complex biomolecules such as proteins, simulation projects regularly deal with several terabytes of data. Apart from the need for more cost-efficient storage, it is increasingly important to be able to archive data, secure its integrity against disk or file transfer errors, provide rapid access, and facilitate exchange of data through open interfaces. A whole range of different formats is already in use, but few if any of them (including our previous ones) fulfill all these goals. To address these shortcomings, we present “Trajectory Next Generation” (TNG), a flexible but highly optimized and efficient file format designed with interoperability in mind. TNG provides both state-of-the-art multiframe compression and a container framework that will make it possible to extend it with new compression algorithms without modifications in programs using it. TNG will be the new file format in the next major release of the GROMACS package, but it has been implemented as a separate library and API with liberal licensing to enable wide adoption in both academic and commercial codes.
Journal of Computational Chemistry 2013, DOI: 10.1002/jcc.23495
The ymol program has been updated to version 0.8.144.
Authors: Emma Ahlstrand, Daniel Spångberg, Kersti Hermansson, Ran Friedman
Interactions between the group XII metals Zn²⁺ and Cd²⁺ and amino acid residues play an important role in biology due to the prevalence of the first and the toxicity of the second. Estimates of the interaction energies between the ions and relevant residues in proteins are, however, difficult to obtain. This study reports on calculated interaction energy curves for small complexes of Zn²⁺ or Cd²⁺ and amino acid mimics (acetate, methanethiolate, and imidazole) or water. Given that many applications and models (e.g., force fields and solvation models) begin with and rely on an accurate description of gas-phase interaction energies, this is where our focus lies in this study. Four density functional theory (DFT) functionals and MP2 were used to calculate the interaction energies not only at the respective equilibrium distances but also over a relevant range of ion–ligand separation distances. The calculated values were compared with those obtained by CCSD(T). All DFT methods are found to overestimate the magnitude of the interaction energy compared to the CCSD(T) reference values. The deviation was analyzed in terms of energy components from the localized molecular orbital energy decomposition analysis scheme and is mostly attributed to overestimation of the polarization energy. MP2 shows good agreement with CCSD(T) [root mean square error (RMSE) = 1.2 kcal/mol] for the eight studied complexes at equilibrium distance. Dispersion energy differences at longer separation give rise to increased deviations between MP2 and CCSD(T) (RMSE = 6.4 kcal/mol at 3.0 Å). Overall, the results call for caution in applying DFT methods to metalloprotein model complexes even with closed-shell metal ions such as Zn²⁺ and Cd²⁺, in particular at ion–ligand separations that are longer than the equilibrium distances.
International Journal of Quantum Chemistry Volume 113, Issue 23, pages 2554–2562, 5 December 2013
Authors: Daniel Spångberg, Elvira Guàrdia, Marco Masia
Recently, many research groups have devoted considerable effort to developing a realistic classical force field for ions in water. The parametrization techniques used fall into two classes: (i) fitting the ab initio potential energy surface for clusters in the gas phase, and (ii) fitting experimental properties. For both classes of force fields, a high level of accuracy has been achieved, which has led to important improvements in the modeling of ion–water systems. In this paper a new, complementary approach is proposed to overcome the limitations and to gain deeper insight into the atomistic description of ion–water interactions. We use the recently developed force matching method to parametrize classical halide–water force fields for three different water models. Here we discuss both methodological issues and the level of agreement between the results obtained using this method and Car–Parrinello simulation results.
Computational and Theoretical Chemistry, 982: 58-65
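As an illustration of the force matching idea mentioned in the abstract above, the following is a minimal sketch, not the procedure used in the paper. It assumes a single pair interaction of Lennard-Jones form, F(r) = 12A/r^13 - 6B/r^7, which is linear in the parameters A and B, so the least-squares fit to reference forces reduces to a 2x2 linear system. The function names and the synthetic reference data are hypothetical; in the paper the reference comes from Car–Parrinello simulations.

```python
def lj_force_basis(r):
    """Basis functions g1, g2 such that F(r) = A * g1 + B * g2 (assumed LJ form)."""
    return 12.0 / r**13, -6.0 / r**7


def fit_force_parameters(distances, reference_forces):
    """Least-squares fit of A and B by solving the 2x2 normal equations."""
    s11 = s12 = s22 = b1 = b2 = 0.0
    for r, f in zip(distances, reference_forces):
        g1, g2 = lj_force_basis(r)
        s11 += g1 * g1
        s12 += g1 * g2
        s22 += g2 * g2
        b1 += g1 * f
        b2 += g2 * f
    det = s11 * s22 - s12 * s12
    a = (b1 * s22 - b2 * s12) / det
    b = (s11 * b2 - s12 * b1) / det
    return a, b


# Synthetic "reference" forces generated from known parameters (illustration only).
true_a, true_b = 4.0, 4.0
distances = [0.9 + 0.05 * i for i in range(23)]
reference_forces = []
for r in distances:
    g1, g2 = lj_force_basis(r)
    reference_forces.append(true_a * g1 + true_b * g2)

fitted_a, fitted_b = fit_force_parameters(distances, reference_forces)
print(fitted_a, fitted_b)
```

Because the synthetic forces are generated from the same functional form, the fit recovers the input parameters; with real reference forces the residual measures how well the chosen force field form can reproduce them.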
Authors: Jonàs Sala, Elvira Guàrdia, Jordi Martí, Daniel Spångberg, and Marco Masia
In the quest towards coarse-grained potentials and new water models, we present an extension of the force matching technique to parameterize an all-atom force field for rigid water. The methodology presented here improves the matching procedure by first optimizing the weighting exponents present in the objective function. A new gauge for unambiguously evaluating the quality of the fit has been introduced; it is based on the root mean square difference of the distributions of target properties between reference data and fitted potentials. Four rigid water models have been parameterized; the matching procedure has been used to assess the role of the ghost atom in TIP4P-like models and of electrostatic damping. In the former case, burying the negative charge inside the molecule allows a better fit of the torques. In the latter, since short-range interactions are damped, a better fit of the forces is obtained. Overall, the best performing model is the one with a ghost atom and with electrostatic damping. The approach shown in this paper is of general validity and could be applied to any matching algorithm and to any level of coarse graining, also for non-rigid molecules.
J. Chem. Phys. 136, 054103 (2012)
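The quality gauge described in the abstract above, a root mean square difference between distributions of a target property, can be sketched as follows. This is one possible reading under stated assumptions, not the paper's definition: both samples are binned on a common grid, each histogram is normalized to fractions, and the RMS is taken over the bins. The function names and the default bin count are hypothetical.

```python
import math


def histogram(values, lo, hi, n_bins):
    """Normalized histogram (bin fractions summing to 1) over [lo, hi]."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        # The upper edge hi is assigned to the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def rms_distribution_difference(reference, fitted, n_bins=20):
    """RMS difference between the binned distributions of two samples."""
    lo = min(min(reference), min(fitted))
    hi = max(max(reference), max(fitted))
    if hi == lo:
        # All values are identical, so the distributions coincide.
        return 0.0
    p = histogram(reference, lo, hi, n_bins)
    q = histogram(fitted, lo, hi, n_bins)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)) / n_bins)


# Toy data standing in for a property sampled from reference and fitted models.
reference_sample = [0.0, 0.0, 1.0, 1.0]
fitted_sample = [0.0, 1.0, 1.0, 1.0]
print(rms_distribution_difference(reference_sample, fitted_sample, n_bins=2))
```

A value of zero means the two binned distributions are identical; larger values indicate a poorer match between the fitted potential and the reference.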
Authors: Daniel Spångberg, Daniel S. D. Larsson, David van der Spoel
We present general algorithms for the compression of molecular dynamics trajectories. The standard ways to store MD trajectories as text or as raw binary floating point numbers result in very large files when efficient simulation programs are used on supercomputers. Our algorithms are based on the observation that differences in atomic coordinates/velocities, in either time or space, are generally smaller than the absolute values of the coordinates/velocities. Also, it is often possible to store values at a lower precision. We apply several compression schemes to compress the resulting differences further. The most efficient algorithms developed here use a block sorting algorithm in combination with Huffman coding. Depending on the frequency of storage of frames in the trajectory, either space, time, or combinations of space and time differences are usually the most efficient. We compare the efficiency of our algorithms with each other and with other algorithms present in the literature for various systems: liquid argon, water, a virus capsid solvated in 15 mM aqueous NaCl, and solid magnesium oxide. We perform tests to determine how much precision is necessary to obtain accurate structural and dynamic properties, as well as benchmark a parallelized implementation of the algorithms. We obtain compression ratios (compared to single precision floating point) of 1:3.3–1:35 depending on the frequency of storage of frames and the system studied.
Journal of Molecular Modeling October 2011, Volume 17, Issue 10, pp 2669-2685
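The two observations in the abstract above (coordinates can often be stored at lower precision, and differences between frames are smaller than absolute values) can be illustrated with a minimal sketch. It is not the published algorithm: the paper combines block sorting with Huffman coding, whereas this example uses Python's zlib as a stand-in entropy coder, and the toy random-walk trajectory, the precision, and all function names are assumptions made for illustration.

```python
import random
import struct
import zlib


def quantize(coords, precision):
    """Convert floating-point coordinates to integers at a fixed precision."""
    return [round(x / precision) for x in coords]


def time_deltas(frames):
    """Keep the first frame and store each later frame as a difference in time."""
    out = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        out.append([c - p for p, c in zip(prev, cur)])
    return out


def undo_time_deltas(deltas):
    """Rebuild the quantized frames by accumulating the differences."""
    frames = [list(deltas[0])]
    for d in deltas[1:]:
        frames.append([p + x for p, x in zip(frames[-1], d)])
    return frames


def pack(frames):
    """Serialize integer frames as little-endian 32-bit values."""
    flat = [v for frame in frames for v in frame]
    return struct.pack(f"<{len(flat)}i", *flat)


# Toy trajectory: 300 coordinate values that drift slightly between 50 frames.
rng = random.Random(0)
precision = 0.001
frame = [rng.uniform(0.0, 5.0) for _ in range(300)]
trajectory = []
for _ in range(50):
    frame = [x + rng.gauss(0.0, 0.003) for x in frame]
    trajectory.append(quantize(frame, precision))

deltas = time_deltas(trajectory)
raw_size = len(zlib.compress(pack(trajectory)))
delta_size = len(zlib.compress(pack(deltas)))
print(raw_size, delta_size)
```

The delta step is lossless with respect to the quantized integers; the only loss of information comes from the chosen precision. How much the differences help depends on how often frames are stored, which is the trade-off between time and space differences discussed in the abstract.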