Academic Journal

Bibliographic Details
Title: [Untitled]
Authors: Mark A Depristo, Eric Banks, Ryan Poplin, Kiran V Garimella, Jared R Maguire, Christopher Hartl, Anthony A Philippakis, Guillermo Del Angel, Manuel A Rivas, Matt Hanna, Aaron Mckenna, Tim J Fennell, Andrew M Kernytsky, Andrey Y Sivachenko, Kristian Cibulskis, Stacey B Gabriel, David Altshuler, Mark J Daly
Contributors: The Pennsylvania State University CiteSeerX Archives
Source: http://bioinformatics.sph.harvard.edu/ngs-workshops/documents/Variant%20Discovery/Nat%20Genet%202011%20Depristo.pdf.
Collection: CiteSeerX
Description: Recent advances in sequencing technology make it possible to comprehensively catalog genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious, and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (i) initial read mapping; (ii) local realignment around indels; (iii) base quality score recalibration; (iv) SNP discovery and genotyping to find all potential variants; and (v) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We here discuss the application of these tools, instantiated in the Genome Analysis Toolkit, to deep whole-genome, whole-exome capture and multi-sample low-pass (~4×) 1000 Genomes Project datasets. Recent advances in next-generation sequencing (NGS) technology now provide the first cost-effective approach to large-scale resequencing of human samples for medical and population genetics. Projects such as the 1000 Genomes Project [1] (1KG), The Cancer Genome Atlas and numerous large medically focused exome sequencing projects [2] are underway in an attempt to elucidate the full spectrum of human genetic diversity [1] and the complete genetic architecture of human disease. The ability to examine the entire genome in an unbiased way will make possible comprehensive searches for standing variation in common disease and mutations underlying linkages in Mendelian disease [3], as well as spontaneously arising variation for which no gene-mapping shortcuts are available (for example, somatic mutations in cancer [4-6] and de novo mutations [7] (Conrad, D.F. et al. unpublished data) in autism and schizophrenia). Many ...
Document Type: text
File Description: application/pdf
Language: English
Relation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1048.9138; http://bioinformatics.sph.harvard.edu/ngs-workshops/documents/Variant%20Discovery/Nat%20Genet%202011%20Depristo.pdf
Availability: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.1048.9138
http://bioinformatics.sph.harvard.edu/ngs-workshops/documents/Variant%20Discovery/Nat%20Genet%202011%20Depristo.pdf
Rights: Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Accession Number: edsbas.D4ADAFD
Database: BASE