Automated fragment formula annotation for electron ionisation, high resolution mass spectrometry: application to atmospheric measurements of halocarbons
Title: | Automated fragment formula annotation for electron ionisation, high resolution mass spectrometry: application to atmospheric measurements of halocarbons |
---|---|
Authors: | Guillevic, Myriam, Guillevic, Aurore, Vollmer, Martin K., Schlauri, Paul, Hill, Matthias, Emmenegger, Lukas, Reimann, Stefan |
Contributors: | Swiss Federal Laboratories for Materials Science and Technology [Dübendorf] (EMPA); Cryptology, arithmetic : algebraic methods for better algorithms (CARAMBA); Inria Nancy - Grand Est; Institut National de Recherche en Informatique et en Automatique (Inria); Department of Algorithms, Computation, Image and Geometry (LORIA - ALGO); Laboratoire Lorrain de Recherche en Informatique et ses Applications (LORIA); Université de Lorraine (UL); Centre National de la Recherche Scientifique (CNRS) |
Source: | Journal of Cheminformatics, Chemistry Central Ltd. and BioMed Central, 2021, 13 (1), article 78, pp. 1-27. ⟨10.1186/s13321-021-00544-w⟩ |
Publication info: | Springer International Publishing, 2021. |
Publication year: | 2021 |
Subject terms: | Automated compound identification, Information technology, T58.5-58.64, Electron ionisation, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], Chemistry, Combinatorics, Time of flight mass spectrometry, [MATH.MATH-CO]Mathematics [math]/Combinatorics [math.CO], Machine learning, Non-target screening, Atmospheric trace gases, QD1-999, [CHIM.CHEM]Chemical Sciences/Cheminformatics, Research Article |
Description: | Background: Non-target screening consists of searching a sample for all substances present, suspected or unknown, with very little prior knowledge about the sample. This approach was introduced more than a decade ago in the field of water analysis, together with dedicated compound identification tools, but is still rarely used for indoor and atmospheric trace gas measurements, despite the clear need for a better understanding of atmospheric trace gas composition. For the systematic detection of emerging trace gases in the atmosphere, a new and powerful analytical method is gas chromatography (GC) of preconcentrated samples, followed by electron ionisation, high resolution mass spectrometry (EI-HRMS). In this work, we present data analysis tools that enable automated fragment formula annotation for unknown compounds measured by GC-EI-HRMS. Results: Based on co-eluting mass/charge fragments, we developed an innovative data analysis method to reliably reconstruct the chemical formulae of the fragments, using efficient combinatorics and graph theory. The method does not require the presence of the molecular ion, which is absent in ∼40% of EI spectra. Our method has been trained and validated on >50 halocarbons and hydrocarbons, with 3–20 atoms and molar masses of 30–330 g mol⁻¹, measured with a mass resolution of approx. 3500. For >90% of the compounds, more than 90% of the annotated fragment formulae are correct. Cases of wrong identification can be attributed to the scarcity of detected fragments per compound or to the lack of isotopic constraint (no minor isotopocule detected). Conclusions: Our method reconstructs the most probable chemical formulae independently of spectral databases. It therefore demonstrates the suitability of EI-HRMS data for non-target analysis and paves the way for the identification of substances for which no EI mass spectrum is registered in databases. We illustrate the performance of our method for atmospheric trace gases and suggest that it may be well suited to many other types of samples. The LGPL-licensed Python code is released under the name ALPINAC, for ALgorithmic Process for Identification of Non-targeted Atmospheric Compounds. Supplementary Information: The online version contains supplementary material available at 10.1186/s13321-021-00544-w. |
Language: | English |
ISSN: | 1758-2946 |
DOI: | 10.1186/s13321-021-00544-w |
Access URL: | https://explore.openaire.eu/search/publication?articleId=pmid_dedup__::35c71eb840fc0e15e3db017968eb731a http://europepmc.org/articles/PMC8491408 |
Rights: | OPEN |
Accession number: | edsair.pmid.dedup....35c71eb840fc0e15e3db017968eb731a |
Database: | OpenAIRE |
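
The description above mentions reconstructing fragment formulae with combinatorics. As a rough illustration of the underlying idea only, and not the ALPINAC implementation, the Python sketch below enumerates every atom combination whose monoisotopic mass falls within a tolerance of a measured mass. The function name `candidate_formulae`, the four-element subset, the tolerance values and the neglect of the electron mass are all assumptions made for this example.

```python
from typing import Dict, List

# Monoisotopic masses in Da of the most abundant isotopes.
# Illustrative subset only (assumption): a real tool would cover more elements.
MONO_MASS: Dict[str, float] = {
    "C": 12.0,
    "H": 1.00782503,
    "F": 18.99840316,
    "Cl": 34.96885268,
}


def candidate_formulae(target: float, tol: float) -> List[Dict[str, int]]:
    """Return all atom-count dicts whose summed mass is within tol of target.

    Hypothetical helper: a bounded, knapsack-style enumeration that tries the
    heaviest element first so that overshooting branches are pruned early.
    The electron mass of the ion is ignored for simplicity.
    """
    elements = sorted(MONO_MASS.items(), key=lambda kv: -kv[1])
    results: List[Dict[str, int]] = []

    def extend(idx: int, remaining: float, counts: Dict[str, int]) -> None:
        if idx == len(elements):
            # All elements assigned: keep non-empty formulae inside the window.
            if counts and abs(remaining) <= tol:
                results.append(dict(counts))
            return
        symbol, mass = elements[idx]
        # Largest atom count that does not overshoot the upper mass bound.
        max_n = int((remaining + tol) // mass)
        for n in range(max_n + 1):
            if n:
                counts[symbol] = n
            extend(idx + 1, remaining - n * mass, counts)
            counts.pop(symbol, None)

    extend(0, target, {})
    return results
```

For a measured mass of 68.9952 Da, `candidate_formulae(68.9952, 0.002)` leaves only CF3, whereas widening the window to 0.01 Da also admits CH3ClF. Ambiguity of this kind is why extra constraints, such as the isotopic patterns and co-eluting fragments mentioned in the description, are needed to rank candidates.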