DOT: Gene-set analysis by combining decorrelated association statistics

Bibliographic details
Title: DOT: Gene-set analysis by combining decorrelated association statistics
Authors: Olga A Vsevolozhskaya, Min Shi, Fengjiao Hu, Dmitri V Zaykin
Source: PLoS Computational Biology, Vol 16, Iss 4, p e1007819 (2020)
Publication info: Cold Spring Harbor Laboratory, 2019.
Publication year: 2019
Subject terms: FOS: Computer and information sciences, 0301 basic medicine, Genomics Statistics, Computer science, Test Statistics, Cleft Lip and Palate, Linkage Disequilibrium, Mathematical and Statistical Techniques, 0302 clinical medicine, Breast Tumors, Statistics, Medicine and Health Sciences, Morphogenesis, Gene set analysis, Biology (General), Statistic, Ecology, Simulation and Modeling, Genomics, Oncology, Computational Theory and Mathematics, Modeling and Simulation, Physical Sciences, Probability distribution, Female, Research Article, Statistical Distributions, Genetic Markers, QH301-705.5, Orthogonal transformation, Cleft Lip, Breast Neoplasms, Correlation and dependence, Research and Analysis Methods, Statistics - Applications, Polymorphism, Single Nucleotide, 03 medical and health sciences, Cellular and Molecular Neuroscience, Breast Cancer, Congenital Disorders, Genome-Wide Association Studies, Genetics, Humans, Quantitative Biology - Genomics, Applications (stat.AP), Genetic Predisposition to Disease, Birth Defects, Statistical Methods, Molecular Biology, Decorrelation, Ecology, Evolution, Behavior and Systematics, Statistical hypothesis testing, Genomics (q-bio.GN), Models, Statistical, Cancers and Neoplasms, Biology and Life Sciences, Computational Biology, Human Genetics, Probability Theory, Genome Analysis, Summary statistics, Algebra, 030104 developmental biology, Otorhinolaryngology, Linear Algebra, FOS: Biological sciences, Pairwise comparison, Eigenvectors, Null hypothesis, Mathematics, 030217 neurology & neurosurgery, Developmental Biology, Genome-Wide Association Study
Description: Historically, the majority of statistical association methods have been designed assuming availability of SNP-level information. However, modern genetic and sequencing data present new challenges to access and sharing of genotype-phenotype datasets, including cost of management, difficulties in consolidation of records across research groups, etc. These issues make methods based on SNP-level summary statistics particularly appealing. The most common form of combining statistics is a sum of SNP-level squared scores, possibly weighted, as in burden tests for rare variants. The overall significance of the resulting statistic is evaluated using its distribution under the null hypothesis. Here, we demonstrate that this basic approach can be substantially improved by decorrelating scores prior to their addition, resulting in remarkable power gains in situations that are most commonly encountered in practice; namely, under heterogeneity of effect sizes and diversity among pairwise LD values. In these situations, the power of the traditional test, based on the added squared scores, quickly reaches a ceiling as the number of variants increases. Thus, the traditional approach does not benefit from information potentially contained in any additional SNPs, while our decorrelation by orthogonal transformation (DOT) method yields a steady gain in power. We present theoretical and computational analyses of both approaches, and reveal the causes behind the sometimes dramatic differences in their respective powers. We showcase DOT by analyzing breast cancer and cleft lip data, in which our method strengthened levels of previously reported associations and implied the possibility of multiple new alleles that jointly confer disease risk.
Author summary: Joint analysis of association between the outcome and a group of SNPs within a genetic region is increasingly recognized to complement single-SNP analysis and shed light on the underlying molecular mechanisms. However, the correlation among GWAS association results calls for specifically tailored statistical methods. Here we propose the DOT (Decorrelation by Orthogonal Transformation) method, which can efficiently combine evidence of association over different SNPs and genes within a pathway without access to the original genotypic data. DOT is fast, does not rely on a permutation algorithm, and is often dramatically more powerful than other popular methods, such as VEGAS and the recently proposed ACAT. We believe that DOT will become a useful addition to the toolbox of summary-statistic-based methods for the GWAS community.
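Illustrative sketch (not part of the catalog record or the authors' software): the description contrasts summing squared SNP-level scores directly with summing them after decorrelation. The Python fragment below shows one way this could look, assuming the scores are z-statistics whose null correlation equals a known LD matrix, and using the symmetric inverse square root of that matrix as the decorrelating transform. The function names, the toy z-scores, and the toy LD matrix are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def decorrelation_matrix(ld):
    """Symmetric inverse square root of an LD (correlation) matrix.

    Assumption: ld is symmetric and positive definite.
    """
    ld = np.asarray(ld, dtype=float)
    eigvals, eigvecs = np.linalg.eigh(ld)  # eigendecomposition of a symmetric matrix
    if eigvals.min() <= 0:
        raise ValueError("LD matrix must be positive definite")
    # H = E diag(1/sqrt(lambda)) E^T, so that H @ ld @ H.T is the identity.
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T

def decorrelated_sum_of_squares(z, ld):
    """Return the sum of squared decorrelated scores and the scores themselves."""
    z = np.asarray(z, dtype=float)
    x = decorrelation_matrix(ld) @ z  # decorrelated scores
    return float(x @ x), x

# Hypothetical toy input: four SNP-level z-scores and their LD matrix.
z = np.array([2.1, 1.8, -0.4, 2.5])
ld = np.array([
    [1.0, 0.5, 0.2, 0.1],
    [0.5, 1.0, 0.2, 0.1],
    [0.2, 0.2, 1.0, 0.4],
    [0.1, 0.1, 0.4, 1.0],
])

stat, x = decorrelated_sum_of_squares(z, ld)
naive = float(z @ z)  # traditional sum of squared (correlated) scores

print("decorrelated scores:", x)
print("sum of squares after decorrelation:", stat)
print("sum of squares without decorrelation:", naive)
```

Under these assumptions the decorrelated scores are independent standard normal variables under the null hypothesis, so the first statistic can be referred to a chi-square distribution with as many degrees of freedom as there are SNPs (for example via `scipy.stats.chi2.sf`), whereas the undecorrelated sum has a null distribution that depends on the LD matrix.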
DOI: 10.1101/665133
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::36f748f998de35ed2e724fd342bd4224
https://doi.org/10.1101/665133
Rights: OPEN
Accession number: edsair.doi.dedup.....36f748f998de35ed2e724fd342bd4224
Database: OpenAIRE