Scalable Information Gain Variant on Spark Cluster for Rapid Quantification of Microarray

Bibliographic Details
Title: Scalable Information Gain Variant on Spark Cluster for Rapid Quantification of Microarray
Authors: Anand Tirkey, Mukesh Kumar, Santanu Kumar Rath, Ransingh Biswajit Ray
Source: Procedia Computer Science. 93:292-298
Publisher Information: Elsevier BV, 2016.
Publication Year: 2016
Subject Terms: Microarray, Computer science, Big data, Feature selection, 02 engineering and technology, computer.software_genre, 03 medical and health sciences, Naive Bayes classifier, sf-MIFS, 0302 clinical medicine, Spark (mathematics), 0202 electrical engineering, electronic engineering, information engineering, sf-LoR, General Environmental Science, Spark, Resilient Distributed Dataset, business.industry, ComputingMethodologies_PATTERNRECOGNITION, Hadoop, sf-NB, 030220 oncology & carcinogenesis, Scalability, Gene chip analysis, General Earth and Planetary Sciences, 020201 artificial intelligence & image processing, Data mining, business, computer, Classifier (UML)
Description: Microarray technology is an emerging technology in genetic research that many researchers use to monitor the expression levels of genes in a given organism. Microarray experiments have a wide range of applications in the health care sector. The colossal amount of raw gene expression data often leads to computational and analytical challenges, including feature selection and classification of the dataset into the correct group or class. In this paper, a mutual information feature selection method based on the Spark framework (sf-MIFS) is proposed to determine the pertinent features. After feature selection, the classifiers Logistic Regression (sf-LoR) and Naive Bayes (sf-NB), both based on the Spark framework, are applied to classify the microarray datasets. A detailed comparative analysis of execution time and accuracy is presented for the proposed feature selection and classification methods on the Spark framework and on a conventional system.
ISSN: 1877-0509
DOI: 10.1016/j.procs.2016.07.213
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::c596ee170ca8511935bbd3d36c0f27a4
Rights: OPEN
Accession Number: edsair.doi.dedup.....c596ee170ca8511935bbd3d36c0f27a4
Database: OpenAIRE
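
The description mentions selecting features by mutual information before classification. As a rough illustration of that general idea only, the sketch below scores each discretized feature by its mutual information with the class label and keeps the top k. It is a single-machine Python sketch, not the paper's sf-MIFS algorithm or its Spark implementation; the function names, the "lo"/"hi" discretization, and the toy data are all assumptions made for the example.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits for two equal-length discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed (x, y) pairs of p(x,y) * log2(p(x,y) / (p(x) * p(y))).
    return sum(
        (c / n) * log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def rank_features(columns, labels, k):
    """Return the indices of the k feature columns with the highest MI score."""
    scores = [(mutual_information(col, labels), j) for j, col in enumerate(columns)]
    # Sort by descending score; ties are broken by the lower column index.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Hypothetical toy data: three discretized "genes" measured on four samples.
labels = [0, 0, 1, 1]
genes = [
    ["lo", "lo", "hi", "hi"],  # tracks the label exactly
    ["lo", "hi", "lo", "hi"],  # independent of the label
    ["lo", "lo", "lo", "hi"],  # partially informative
]

print(rank_features(genes, labels, k=2))  # → [0, 2]
```

Because each feature is scored independently of the others in this simple ranking, the scoring step is the kind of work that can be spread across a cluster; how the paper actually partitions it on Spark is not stated in this record.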