Crash testing machine learning force fields for molecules, materials, and interfaces: molecular dynamics in the TEA challenge 2023
Poltavsky I., Puleva M., Charkin-Gorbulin A., Fonseca G., Batatia I., Browning N.J., Chmiela S., Cui M., Frank J.T., Heinen S., Huang B., Käser S., Kabylda A., Khan D., Müller C., Price A.J.A., Riedmiller K., Töpfer K., Ko T.W., Meuwly M., Rupp M., Csányi G., von Lilienfeld O.A., Margraf J.T., Müller K.R., Tkatchenko A.
Chemical Science, vol. 16, no. 8, pp. 3738-3754, 2025
We present the second part of the rigorous evaluation of modern machine learning force fields (MLFFs) within the TEA Challenge 2023. This study provides an in-depth analysis of the performance of MACE, SO3krates, sGDML, SOAP/GAP, and FCHL19* in modeling molecules, molecule-surface interfaces, and periodic materials. We compare observables obtained from molecular dynamics (MD) simulations using different MLFFs under identical conditions. Where applicable, density-functional theory (DFT) or experiment serves as a reference to reliably assess the performance of the ML models. In the absence of DFT benchmarks, we conduct a comparative analysis based on results from various MLFF architectures. Our findings indicate that, at the current stage of MLFF development, the choice of ML model is in the hands of the practitioner. When a problem falls within the scope of a given MLFF architecture, the resulting simulations exhibit weak dependency on the specific architecture used. Instead, emphasis should be placed on developing complete, reliable, and representative training datasets. Nonetheless, long-range noncovalent interactions remain challenging for all MLFF models, necessitating special caution in simulations of physical systems where such interactions are prominent, such as molecule-surface interfaces. The findings presented here reflect the state of MLFF models as of October 2023.