Academic Journal

Correcting classifiers for sample selection bias in two-phase case-control studies.

Bibliographic details
Title: Correcting classifiers for sample selection bias in two-phase case-control studies.
Authors: Krautenbacher, N., Theis, F.J., Fuchs, C.
Source: Comput. Math. Methods Med. 2017:7847531 (2017)
Publication information: Hindawi Ltd
Publication year: 2017
Collection: PuSH - Publikationsserver des Helmholtz Zentrums München
Subject terms: selection probabilities, stratified sample, bias correction, inverse-probability weights, complex survey design, random forest, logistic regression, decision tree, parametric bootstrap, bagging, machine learning
Description: Epidemiological studies often utilize stratified data in which rare outcomes or exposures are artificially enriched. This design can increase precision in association tests but distorts predictions when applying classifiers on non-stratified data. Several methods correct for this so-called sample selection bias, but their performance remains unclear especially for machine learning classifiers. With an emphasis on two-phase case-control studies, we aim to assess which corrections to perform in which setting and to obtain methods suitable for machine learning techniques, especially the random forest. We propose two new resampling-based methods to resemble the original data and covariance structure: stochastic inverse-probability oversampling and parametric inverse-probability bagging. We compare all techniques for the random forest and other classifiers, both theoretically and on simulated and real data. Empirical results show that the random forest profits from only the parametric inverse-probability bagging proposed by us. For other classifiers, correction is mostly advantageous, and methods perform uniformly. We discuss consequences of inappropriate distribution assumptions and reason for different .
Document type: article in journal/newspaper
File description: application/pdf
Language: English
ISSN: 1748-670X
1748-6718
Relation: info:eu-repo/semantics/altIdentifier/wos/WOS:000412535700001; info:eu-repo/semantics/altIdentifier/isbn/1748-670X; info:eu-repo/semantics/altIdentifier/pissn/1748-670X; info:eu-repo/semantics/alt; https://push-zb.helmholtz-muenchen.de/frontdoor.php?source_opus=51843; urn:isbn:1748-670X; urn:issn:1748-670X; urn:issn:1748-6718
DOI: 10.1155/2017/7847531
Availability: https://push-zb.helmholtz-muenchen.de/frontdoor.php?source_opus=51843
https://doi.org/10.1155/2017/7847531
Rights: info:eu-repo/semantics/openAccess
Accession number: edsbas.78871B20
Database: BASE