Academic Journal

On the scalability of feature selection methods on high-dimensional data.

Bibliographic Details
Title: On the scalability of feature selection methods on high-dimensional data.
Authors: Bolón-Canedo, V., Rego-Fernández, D., Peteiro-Barral, D., Alonso-Betanzos, A., Guijarro-Berdiñas, B., Sánchez-Maroño, N.
Source: Knowledge & Information Systems; Aug2018, Vol. 56 Issue 2, p395-442, 48p
Subject Terms: FEATURE selection, SCALABILITY, EMBEDDED computer systems, REDUNDANCY in engineering, MICROARRAY technology
Abstract: Lately, driven by the explosion of high dimensionality, researchers in machine learning have become interested not only in accuracy but also in scalability. Although the scalability of learning methods is a trending issue, the scalability of feature selection methods has not received the same amount of attention. This research analyzes the scalability of state-of-the-art feature selection methods belonging to the filter, embedded and wrapper approaches. For this purpose, several new measures are presented, based not only on accuracy but also on execution time and stability. The results on seven classical artificial datasets are presented and discussed, as well as two case studies analyzing the particularities of microarray data and the effect of redundancy. To check whether the results can be generalized, we included some experiments with two real datasets. As expected, filters are the most scalable feature selection approach, with INTERACT, ReliefF and mRMR being the most accurate methods. [ABSTRACT FROM AUTHOR]
Copyright of Knowledge & Information Systems is the property of Springer Nature and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Complementary Index
Description
ISSN: 0219-1377
DOI: 10.1007/s10115-017-1140-3