Academic Journal

Sachem: a chemical cartridge for high-performance substructure search

Bibliographic details
Title: Sachem: a chemical cartridge for high-performance substructure search
Authors: Miroslav Kratochvíl, Jiří Vondrášek, Jakub Galgonek
Source: Journal of Cheminformatics, Vol 10, Iss 1, Pp 1-11 (2018)
Publication info: BMC, 2018.
Publication year: 2018
Collection: LCC:Information technology
LCC:Chemistry
Subject terms: Substructure search, Small molecule databases, Molecule cartridges, Inverted indices, Information technology, T58.5-58.64, Chemistry, QD1-999
Description: Abstract. Background: Structure search is one of the valuable capabilities of small-molecule databases. Fingerprint-based screening methods are usually employed to enhance the search performance by reducing the number of calls to the verification procedure. In substructure search, fingerprints are designed to capture important structural aspects of the molecule to aid the decision about whether the molecule contains a given substructure. Currently available cartridges typically provide acceptable search performance for processing user queries, but do not scale satisfactorily with dataset size. Results: We present Sachem, a new open-source chemical cartridge that implements two substructure search methods: the first is a performance-oriented reimplementation of substructure indexing based on the OrChem fingerprint, and the second is a novel method that employs newly designed fingerprints stored in inverted indices. We assessed the performance of both methods on small, medium, and large datasets containing 1, 10, and 94 million compounds, respectively. Comparison of Sachem with other freely available cartridges revealed improvements in overall performance, scaling potential and screen-out efficiency. Conclusions: The Sachem cartridge allows efficient substructure searches in databases of all sizes. The sublinear performance scaling of the second method and the ability to efficiently query large amounts of pre-extracted information may together open the door to new applications for substructure searches.
Document type: article
File description: electronic resource
Language: English
ISSN: 1758-2946
Relation: http://link.springer.com/article/10.1186/s13321-018-0282-y; https://doaj.org/toc/1758-2946
DOI: 10.1186/s13321-018-0282-y
Access URL: https://doaj.org/article/2c13d10fab504fe8a2109a2151431ae8
Accession number: edsdoj.2c13d10fab504fe8a2109a2151431ae8
Database: Directory of Open Access Journals
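
Illustrative note: the abstract describes fingerprint-based screening, in which an index over fingerprints narrows the set of molecules that must go through the expensive verification step. The sketch below is a toy illustration of that general idea using an inverted index. It is not Sachem's implementation: the feature strings, the molecule ids, and the `verify` callback (standing in for a real subgraph-isomorphism check) are all hypothetical.

```python
from collections import defaultdict


def build_inverted_index(fingerprints):
    """Map each fingerprint feature to the set of molecule ids that contain it."""
    index = defaultdict(set)
    for mol_id, features in fingerprints.items():
        for feature in features:
            index[feature].add(mol_id)
    return index


def screen(index, query_features, all_ids):
    """Return the ids whose fingerprints contain every query feature."""
    if not query_features:
        return set(all_ids)  # nothing to screen on: every molecule is a candidate
    # Intersect the shortest posting lists first to keep intermediate sets small.
    postings = sorted((index.get(f, set()) for f in query_features), key=len)
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= posting
        if not candidates:
            break
    return candidates


def substructure_search(index, fingerprints, query_features, verify):
    """Screen with the index, then call verify only on the surviving candidates."""
    candidates = screen(index, query_features, fingerprints.keys())
    return sorted(mol_id for mol_id in candidates if verify(mol_id))


# Hypothetical toy data: each molecule is reduced to a set of feature strings.
fingerprints = {
    "mol_a": {"C-C", "C=O", "C-O"},
    "mol_b": {"C-C", "C-N"},
    "mol_c": {"C-C", "C=O"},
}
index = build_inverted_index(fingerprints)
```

In this toy setup, a query with features `{"C-C", "C=O"}` screens out `mol_b` before verification, so `verify` is called on two molecules rather than three. Screening can only rule molecules out; a candidate that passes still needs verification, because sharing all query features does not guarantee that the query substructure is present.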