Bibliographic details
Title: |
Additional file 1 of GUNC: detection of chimerism and contamination in prokaryotic genomes |
Authors: |
Askarbek Orakov (10965119), Anthony Fullam (9315533), Luis Pedro Coelho (3256701), Supriya Khedkar (395108), Damian Szklarczyk (2118244), Daniel R. Mende (7446758), Thomas S. B. Schmidt (8736825), Peer Bork (438) |
Publication year: |
2021 |
Collection: |
Smithsonian Institution: Digital Repository |
Subject terms: |
Genetics, Genome quality, Genome contamination, Metagenomics, Metagenome-assembled genomes, Bioinformatics |
Description: |
Additional file 1: Figure S1. a Percent stacked bar chart of CheckM inferred marker lineage levels (colors) for type 3a simulated chimeric genomes (see Methods & Fig. 2a) across different: 1) divergence levels of source genomes (x-axis); 2) simulated portions of contamination (columns); and 3) scenarios of contamination (‘added’ vs ‘replaced’, rows; see Methods). In a and b, the first column (“0”) shows clean (non-chimeric) genomes for comparison. b Average inferred CheckM marker lineage depth (y-axis) of simulated chimeric genomes under different contamination scenarios (‘added’ in dark blue; ‘replaced’ in light blue). The true taxonomic depth of divergence between source genomes is indicated in green. c Equivalent to a, but using chimeric genomes simulated from multiple sources (type 3b in Fig. 2a). Columns indicate the number of equally contributing source genomes (n_sources); rows indicate simulation setups (‘0.5’ if 50% of each source genome was used; ‘1/n_sources’ for equal source parts; see Methods). In c and d, the first column (“1”) shows clean (non-chimeric) genomes and the second column (“2”) shows type 3a genomes as in a and b, both for comparison. d Average inferred CheckM marker lineage depths (y-axis) with different portions of contamination, equivalent to panel b. Figure S2. Comparison of median GUNC and CheckM scores for simulated type 3a and 3b genomes in which source genomes make equal contributions summing to 1 in total (e.g. 0.2 from each of 5 sources or 0.25 from each of 4 sources). This shows that the trend from Fig. 2b persists when multiple source genomes are mixed in a simulated chimeric genome. Figure S3. F-scores for distinguishing clean from chimeric genomes across all divergence levels of source genomes for different simulation scenarios. MIMAG medium is CheckM contamination <10% and CheckM completeness ≥50%. MIMAG high is CheckM contamination <5% and CheckM completeness >90% and, due to irrelevance to our simulations, we decided that additional criteria of ... |
Document type: |
article in journal/newspaper |
Language: |
unknown |
Relation: |
https://figshare.com/articles/journal_contribution/Additional_file_1_of_GUNC_detection_of_chimerism_and_contamination_in_prokaryotic_genomes/14776610 |
DOI: |
10.6084/m9.figshare.14776610.v1 |
Availability: |
https://doi.org/10.6084/m9.figshare.14776610.v1 |
Rights: |
CC BY + CC0 |
Accession number: |
edsbas.CF8758D2 |
Database: |
BASE |