Bibliographic Details
Title: |
Split Optimization for Protein/Ligand Binding Models |
Authors: |
Davis, Brian; Mcloughlin, Kevin; Allen, Jonathan; Ellingson, Sally |
Publication Year: |
2020 |
Collection: |
Quantitative Biology |
Subject Terms: |
Quantitative Biology - Biomolecules |
Description: |
In this paper, we investigate potential biases in datasets used to make drug binding predictions with machine learning. We examine a recently published metric, the Asymmetric Validation Embedding (AVE) bias, which is used to quantify this bias and detect overfitting. We compare it to a slightly revised version and introduce a new weighted metric. We find that the new metrics make it possible to quantify overfitting without overly limiting training data, and that they produce models with greater predictive value. |
Document Type: |
Working Paper |
Access URL: |
http://arxiv.org/abs/2001.03207 |
Accession Number: |
edsarx.2001.03207 |
Database: |
arXiv |