Bibliographic Details
Title: |
Dissecting Gene Expression Heterogeneity: Generalized Pearson Correlation Squares and the K-Lines Clustering Algorithm. |
Authors: |
Li, Jingyi Jessica (AUTHOR) jli@stat.ucla.edu, Zhou, Heather J. (AUTHOR), Bickel, Peter J. (AUTHOR), Tong, Xin (AUTHOR) |
Source: |
Journal of the American Statistical Association. Dec 2024, Vol. 119, Issue 548, pp. 2450-2463. 14 pp. |
Subject Terms: |
PEARSON correlation (Statistics), CLUSTERING algorithms, ASYMPTOTIC distribution, INFERENTIAL statistics, GENE expression |
Abstract: |
Motivated by the pressing needs for dissecting heterogeneous relationships in gene expression data, here we generalize the squared Pearson correlation to capture a mixture of linear dependences between two real-valued variables, with or without an index variable that specifies the line memberships. We construct the generalized Pearson correlation squares by focusing on three aspects: variable exchangeability, no parametric model assumptions, and inference of population-level parameters. To compute the generalized Pearson correlation square from a sample without a line-membership specification, we develop a K-lines clustering algorithm to find K clusters that exhibit distinct linear dependences, where K can be chosen in a data-adaptive way. To infer the population-level generalized Pearson correlation squares, we derive the asymptotic distributions of the sample-level statistics to enable efficient statistical inference. Simulation studies verify the theoretical results and show the power advantage of the generalized Pearson correlation squares in capturing mixtures of linear dependences. Gene expression data analyses demonstrate the effectiveness of the generalized Pearson correlation squares and the K-lines clustering algorithm in dissecting complex but interpretable relationships. The estimation and inference procedures are implemented in the R package gR2. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work. [ABSTRACT FROM AUTHOR] |
|
Copyright of Journal of the American Statistical Association is the property of Taylor & Francis Ltd and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.) |
Database: |
Business Source Index |