Academic Journal

Intrinsic bias estimation for improved analysis of bulk and single-cell chromatin accessibility profiles using SELMA

Bibliographic Details
Title: Intrinsic bias estimation for improved analysis of bulk and single-cell chromatin accessibility profiles using SELMA
Authors: Hu, Shengen Shawn; Liu, Lin; Li, Qi; Ma, Wenjing; Guertin, Michael J.; Meyer, Clifford A.; Deng, Ke; Zhang, Tingting; Zang, Chongzhi
Contributors: U.S. Department of Health & Human Services | NIH | National Cancer Institute; U.S. Department of Health & Human Services | NIH | National Institute of General Medical Sciences; National Science Foundation
Source: Nature Communications; volume 13, issue 1; ISSN 2041-1723
Publication Info: Springer Science and Business Media LLC
Publication Year: 2022
Description: Genome-wide profiling of chromatin accessibility by DNase-seq or ATAC-seq has been widely used to identify regulatory DNA elements and transcription factor binding sites. However, enzymatic DNA cleavage exhibits intrinsic sequence biases that confound chromatin accessibility profiling data analysis. Existing computational tools are limited in their ability to account for such intrinsic biases and not designed for analyzing single-cell data. Here, we present Simplex Encoded Linear Model for Accessible Chromatin (SELMA), a computational method for systematic estimation of intrinsic cleavage biases from genomic chromatin accessibility profiling data. We demonstrate that SELMA yields accurate and robust bias estimation from both bulk and single-cell DNase-seq and ATAC-seq data. SELMA can utilize internal mitochondrial DNA data to improve bias estimation. We show that transcription factor binding inference from DNase footprints can be improved by incorporating estimated biases using SELMA. Furthermore, we show strong effects of intrinsic biases in single-cell ATAC-seq data, and develop the first single-cell ATAC-seq intrinsic bias correction model to improve cell clustering. SELMA can enhance the performance of existing bioinformatics tools and improve the analysis of both bulk and single-cell chromatin accessibility sequencing data.
Document Type: article in journal/newspaper
Language: English
DOI: 10.1038/s41467-022-33194-z
Availability: http://dx.doi.org/10.1038/s41467-022-33194-z
https://www.nature.com/articles/s41467-022-33194-z.pdf
https://www.nature.com/articles/s41467-022-33194-z
Rights: https://creativecommons.org/licenses/by/4.0
Accession Number: edsbas.FFE378C9
Database: BASE