Table_1_stLFRsv: A Germline Structural Variant Analysis Pipeline Using Co-barcoded Reads.XLSX

Bibliographic Details
Title: Table_1_stLFRsv: A Germline Structural Variant Analysis Pipeline Using Co-barcoded Reads.XLSX
Authors: Junfu Guo (10321943), Chang Shi (515022), Xi Chen (35903), Ou Wang (306605), Ping Liu (156158), Huanming Yang (5905), Xun Xu (276108), Wenwei Zhang (165134), Hongmei Zhu (105489)
Publication Year: 2021
Collection: Smithsonian Institution: Digital Repository
Subject Terms: Genetics, Genetic Engineering, Biomarkers, Developmental Genetics (incl. Sex Determination), Epigenetics (incl. Genome Methylation and Epigenomics), Gene Expression (incl. Microarray and other genome-wide approaches), Genome Structure and Regulation, Genomics, Genetically Modified Animals, Livestock Cloning, Gene and Molecular Therapy, human genome, co-barcoded reads, structural variation, complex variants, breakpoints
Description: Co-barcoded reads originating from long DNA fragments (mean length >30 kbp) maintain both single base level accuracy and long-range genomic information. We propose a pipeline, stLFRsv, to detect structural variation using co-barcoded reads. stLFRsv identifies abnormal large gaps between co-barcoded reads to detect potential breakpoints and reconstruct complex structural variants (SVs). Haplotype phasing by co-barcoded reads increases the signal to noise ratio, and barcode sharing profiles are used to filter out false positives. We integrate the short read SV caller smoove for smaller variants with stLFRsv. The integrated pipeline was evaluated on the well-characterized genome HG002/NA24385, and 74.5% precision and a 22.4% recall rate were obtained for deletions. stLFRsv revealed some large variants not included in the benchmark set that were verified by long reads or assembly. For the HG001/NA12878 genome, stLFRsv also achieved the best performance for both resource usage and the detection of large variants. Our work indicates that co-barcoded read technology has the potential to improve genome completeness.
Document Type: dataset
Language: unknown
Relation: https://figshare.com/articles/dataset/Table_1_stLFRsv_A_Germline_Structural_Variant_Analysis_Pipeline_Using_Co-barcoded_Reads_XLSX/14235206
DOI: 10.3389/fgene.2021.636239.s005
Availability: https://doi.org/10.3389/fgene.2021.636239.s005
Rights: CC BY 4.0
Accession Number: edsbas.8F0EFCE7
Database: BASE