Synthqa - Hierarchical Machine Learning-Based Protein Quality Assessment

Bibliographic details
Title: Synthqa - Hierarchical Machine Learning-Based Protein Quality Assessment
Authors: Kiyomi Kishaba, Jie Hou, Renzhi Cao, Kyle Hippe, Dong Si, Mikhail Korovnik
Publication information: Cold Spring Harbor Laboratory, 2021.
Publication year: 2021
Subject terms: chemistry.chemical_classification, business.industry, Computer science, media_common.quotation_subject, Protein structure prediction, Machine learning, computer.software_genre, Field (computer science), Amino acid, Protein structure, chemistry, Quality (business), Artificial intelligence, CASP, business, Native structure, Protein quality, computer, Energy (signal processing), media_common
Description: Motivation: It has been a challenge for biologists to determine the 3D shapes of proteins from a linear chain of amino acids and to understand how proteins carry out life’s tasks. Experimental techniques, such as X-ray crystallography or Nuclear Magnetic Resonance, are time-consuming. This highlights the importance of computational methods for protein structure prediction. In the field of protein structure prediction, ranking the predicted protein decoys and selecting the one closest to the native structure is known as the protein model quality assessment (QA) problem, or the accuracy estimation problem. Traditional QA methods don’t consider different types of features from the protein decoy, lack various features for training machine learning models, and don’t consider the relationship between features. In this research, we used multi-scale features, from energy score to topology of the protein structure, and proposed a hierarchical architecture for training machine learning models to tackle the QA problem. Results: We introduce a new single-model QA method that incorporates multi-scale features from protein structures, utilizes the hierarchical architecture of training machine learning models, and predicts the quality of any protein decoy. Based on our experiment, the new hierarchical architecture is more accurate than traditional machine learning-based methods. It also considers the relationship between features and generates additional features so that machine learning models can be trained more accurately. We trained our new tool, SynthQA, on the CASP dataset (CASP10 to CASP12), and validated our method on 33 targets from the latest CASP14 dataset. The results show that our method is comparable to other state-of-the-art single-model QA methods, and consistently outperforms each of the 14 features used. Availability: https://github.com/Cao-Labs/SynthQA.git Contact: caora@plu.edu
DOI: 10.1101/2021.01.28.428710
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::d1a8b808618e88203c58a682f5945731
https://doi.org/10.1101/2021.01.28.428710
Rights: OPEN
Accession number: edsair.doi...........d1a8b808618e88203c58a682f5945731
Database: OpenAIRE