Ensemble learning of coarse-grained molecular dynamics force fields with a kernel approach
Title: | Ensemble learning of coarse-grained molecular dynamics force fields with a kernel approach |
---|---|
Authors: | Cecilia Clementi, Klaus-Robert Müller, Stefan Chmiela, Frank Noé, Jiang Wang |
Source: | Journal of Chemical Physics |
Publication details: | AIP Publishing, 2020. |
Publication year: | 2020 |
Subject terms: | FOS: Computer and information sciences, Imagination, Computer science, media_common.quotation_subject, FOS: Physical sciences, General Physics and Astronomy, Machine Learning (stat.ML), Molecular dynamics, 010402 general chemistry, 01 natural sciences, Force field (chemistry), Search engine, Statistics - Machine Learning, Physics - Chemical Physics, Machine learning, 0103 physical sciences, Coarse-grain model, Coarse-grained force fields, Physical and Theoretical Chemistry, media_common, Chemical Physics (physics.chem-ph), 000 Informatik, Informationswissenschaft, allgemeine Werke::000 Informatik, Wissen, Systeme::000 Informatik, Informationswissenschaft, allgemeine Werke, Training set, Artificial neural networks, 010304 chemical physics, Artificial neural network, 500 Naturwissenschaften und Mathematik::530 Physik::530 Physik, Energy landscape, Computational Physics (physics.comp-ph), Computer simulation, Ensemble learning, 0104 chemical sciences, Free energy landscapes, Peptides, 500 Naturwissenschaften und Mathematik::540 Chemie::540 Chemie und zugeordnete Wissenschaften, Physics - Computational Physics, Algorithm |
Description: | Gradient-domain machine learning (GDML) is an accurate and efficient approach to learn a molecular potential and associated force field based on the kernel ridge regression algorithm. Here, we demonstrate its application to learn an effective coarse-grained (CG) model from all-atom simulation data in a sample-efficient manner. The coarse-grained force field is learned by following the thermodynamic consistency principle, here by minimizing the error between the predicted coarse-grained force and the all-atom mean force in the coarse-grained coordinates. Solving this problem by GDML directly is impossible because coarse-graining requires averaging over many training data points, resulting in impractical memory requirements for storing the kernel matrices. In this work, we propose a data-efficient and memory-saving alternative. Using ensemble learning and stratified sampling, we propose a two-layer training scheme that enables GDML to learn an effective coarse-grained model. We illustrate our method on a simple biomolecular system, alanine dipeptide, by reconstructing the free energy landscape of a coarse-grained variant of this molecule. Our novel GDML training scheme yields a smaller free energy error than neural networks when the training set is small, and a comparably high accuracy when the training set is sufficiently large. 14 pages, 6 figures |
ISSN: | 1089-7690 0021-9606 |
DOI: | 10.1063/5.0007276 |
Access URL: | https://explore.openaire.eu/search/publication?articleId=doi_dedup___::bfbe4fe929d94571e7337e7da35b5416 https://doi.org/10.1063/5.0007276 |
Rights: | OPEN |
Accession number: | edsair.doi.dedup.....bfbe4fe929d94571e7337e7da35b5416 |
Database: | OpenAIRE |
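
The description above states that a single kernel model cannot hold the full training set in memory, and that the proposed remedy is to train an ensemble on small batches. The sketch below illustrates that general idea only, on a one-dimensional toy problem. It is not the paper's method: it uses plain Gaussian kernel ridge regression fitted directly to forces rather than GDML, draws uniform random batches rather than stratified samples, and implements a single averaging layer rather than the two-layer scheme. The double-well potential, the noise level, and all function names and hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel matrix between two sets of 1D points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / sigma) ** 2)

def fit_krr(x, f, sigma, lam):
    """Kernel ridge regression weights: solve (K + lam * I) alpha = f."""
    K = gaussian_kernel(x, x, sigma)
    return np.linalg.solve(K + lam * np.eye(len(x)), f)

def predict_krr(x_train, alpha, x_new, sigma):
    """Evaluate a fitted model at new points."""
    return gaussian_kernel(x_new, x_train, sigma) @ alpha

def ensemble_predict(x, f, x_new, n_models, batch_size, sigma, lam, rng):
    """Average the predictions of models trained on small random batches.

    Each model only ever builds a batch_size x batch_size kernel matrix,
    so memory use is independent of the total number of samples.
    """
    preds = []
    for _ in range(n_models):
        idx = rng.choice(len(x), size=batch_size, replace=False)
        alpha = fit_krr(x[idx], f[idx], sigma, lam)
        preds.append(predict_krr(x[idx], alpha, x_new, sigma))
    return np.mean(preds, axis=0)

# Toy data (assumed): noisy instantaneous forces whose conditional mean is
# the force of the double-well potential U(x) = (x^2 - 1)^2.
rng = np.random.default_rng(0)
x = rng.uniform(-1.5, 1.5, size=5000)
f = -4.0 * x * (x**2 - 1.0) + rng.normal(scale=2.0, size=x.shape)

# Compare the ensemble prediction with the noise-free mean force on a grid.
x_grid = np.linspace(-1.2, 1.2, 25)
f_pred = ensemble_predict(x, f, x_grid, n_models=50, batch_size=200,
                          sigma=0.3, lam=1.0, rng=rng)
f_true = -4.0 * x_grid * (x_grid**2 - 1.0)
rmse = float(np.sqrt(np.mean((f_pred - f_true) ** 2)))
print(f"RMSE against the mean force: {rmse:.3f}")
```

Because each sample carries a noisy force, minimizing the squared error against these samples targets their conditional mean, which is the toy analogue of the mean force named in the description. Averaging over many small models reduces the noise of any single fit while keeping every kernel matrix small.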