Split Optimization for Protein/Ligand Binding Models

Bibliographic Details
Title: Split Optimization for Protein/Ligand Binding Models
Authors: Davis, Brian; Mcloughlin, Kevin; Allen, Jonathan; Ellingson, Sally
Publication Year: 2020
Collection: Quantitative Biology
Subject Terms: Quantitative Biology - Biomolecules
Description: In this paper, we investigate potential biases in datasets used to make drug binding predictions with machine learning. We examine a recently published metric, the Asymmetric Validation Embedding (AVE) bias, which is used to quantify this bias and detect overfitting. We compare it to a slightly revised version and introduce a new weighted metric. We find that the new metrics make it possible to quantify overfitting without overly limiting the training data, and that they produce models with greater predictive value.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2001.03207
Accession Number: edsarx.2001.03207
Database: arXiv