Transductive active learning – A new semi-supervised learning approach based on iteratively refined generative models to capture structure in data

Bibliographic details
Title: Transductive active learning – A new semi-supervised learning approach based on iteratively refined generative models to capture structure in data
Authors: Tobias Reitmaier, Bernhard Sick, Adrian Calma
Source: Information Sciences. 293:275-298
Publication info: Elsevier BV, 2015.
Publication year: 2015
Subject terms: Transduction (machine learning), Information Systems and Management, Computer science, Active learning (machine learning), Stability (learning theory), Semi-supervised learning, Machine learning, computer.software_genre, Theoretical Computer Science, Artificial Intelligence, Instance-based learning, Cluster analysis, Probabilistic classification, Learning classifier system, business.industry, Probabilistic logic, Online machine learning, Generalization error, Computer Science Applications, Generative model, ComputingMethodologies_PATTERNRECOGNITION, Control and Systems Engineering, Active learning, Unsupervised learning, Artificial intelligence, business, computer, Software
Description: Pool-based active learning is a paradigm where users (e.g., domain experts) are iteratively asked to label initially unlabeled data, e.g., to train a classifier from these data. An appropriate selection strategy has to choose unlabeled data for such user queries in an efficient and effective way (in principle, high classification performance at low labeling costs). In our transductive active learning approach we provide a completely labeled data pool (samples are labeled either by the experts or in a semi-supervised way) in each active learning cycle. Thereby, a key aspect is to explore and exploit information about structure in data. Structure in data can be detected and modeled by means of clustering algorithms or probabilistic, generative modeling techniques, for instance. Usually, this is done at the beginning of the active learning process when the data are still unlabeled. In our approach we show how a probabilistic generative model, initially parametrized with unlabeled data, can be iteratively refined and improved as more and more labels become available during the active learning process. In each cycle of the active learning process we use this generative model to label all samples not yet labeled by an expert in order to train the kind of classifier we want to train with the active learning process. Thus, this transductive learning process can be combined with any selection strategy and any kind of classifier. Here, we combine it with the 4DS selection strategy and the CMM probabilistic classifier described in previous work. For 20 publicly available benchmark data sets, we show that this new transductive learning process helps to improve pool-based active learning noticeably.
ISSN: 0020-0255
DOI: 10.1016/j.ins.2014.09.009
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::57e78ae9b48d0f5085ad4267c0871efa
https://doi.org/10.1016/j.ins.2014.09.009
Rights: CLOSED
Accession number: edsair.doi...........57e78ae9b48d0f5085ad4267c0871efa
Database: OpenAIRE
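
Illustrative note: the cycle outlined in the description (fit a generative model on unlabeled data, refine it with expert labels, use it to label the rest of the pool, train a classifier on the fully labeled pool, then query the next sample) can be sketched roughly as follows. This is a minimal sketch under loud assumptions, not the authors' implementation: plain 1-D k-means stands in for the probabilistic generative model, a nearest-class-mean rule stands in for the CMM classifier, and a simple margin-based query stands in for the 4DS selection strategy. All function names, the toy data, and the parameters `k`, `budget`, and `seed_index` are hypothetical.

```python
from collections import Counter

def nearest(x, centres):
    """Index of the component centre closest to x."""
    return min(range(len(centres)), key=lambda j: abs(x - centres[j]))

def kmeans_1d(xs, k, iters=20):
    """Unsupervised initialisation (stand-in for the generative model)."""
    srt = sorted(xs)
    # Deterministic start: spread the initial centres over the sorted data.
    centres = [srt[(2 * i + 1) * len(srt) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in xs:
            groups[nearest(x, centres)].append(x)
        centres = [sum(g) / len(g) if g else c for g, c in zip(groups, centres)]
    return centres

def refine(centres, xs, expert):
    """Refinement (simplified): give each component the majority expert label."""
    votes = [Counter() for _ in centres]
    for i, y in expert.items():
        votes[nearest(xs[i], centres)][y] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]

def label_pool(centres, comp_class, xs, expert):
    """Transductive step: label every sample the expert has not labeled."""
    labelled = [j for j, c in enumerate(comp_class) if c is not None]
    pool = []
    for i, x in enumerate(xs):
        if i in expert:
            pool.append(expert[i])
        else:
            j = min(labelled, key=lambda j: abs(x - centres[j]))
            pool.append(comp_class[j])
    return pool

def train_classifier(xs, ys):
    """Stand-in classifier: one mean per class."""
    means = {}
    for c in sorted(set(ys)):
        pts = [x for x, y in zip(xs, ys) if y == c]
        means[c] = sum(pts) / len(pts)
    return means

def predict(means, x):
    return min(means, key=lambda c: abs(x - means[c]))

def select(means, xs, expert):
    """Stand-in selection: smallest margin; explore far away if one class known."""
    def margin(i):
        d = sorted(abs(xs[i] - m) for m in means.values())
        return d[1] - d[0] if len(d) > 1 else -d[0]
    candidates = [i for i in range(len(xs)) if i not in expert]
    return min(candidates, key=margin)

def transductive_active_learning(xs, oracle, k=2, budget=3, seed_index=0):
    centres = kmeans_1d(xs, k)                  # fitted on unlabeled data only
    expert = {seed_index: oracle(seed_index)}   # one initial expert label
    for _ in range(budget):
        comp_class = refine(centres, xs, expert)
        pool = label_pool(centres, comp_class, xs, expert)  # fully labeled pool
        clf = train_classifier(xs, pool)
        query = select(clf, xs, expert)
        expert[query] = oracle(query)           # ask the expert
    comp_class = refine(centres, xs, expert)
    pool = label_pool(centres, comp_class, xs, expert)
    return train_classifier(xs, pool), expert

# Toy pool: two well-separated groups; the oracle plays the domain expert.
xs = [0.0, 0.2, 0.4, 0.6, 5.0, 5.2, 5.4, 5.6]
truth = [0, 0, 0, 0, 1, 1, 1, 1]
clf, expert = transductive_active_learning(xs, lambda i: truth[i])
```

In this sketch the classifier and the selection function are passed nothing model-specific, which mirrors the claim in the description that the transductive step can be combined with any selection strategy and any kind of classifier. The refinement here only updates the class attached to each component; a fuller treatment would also re-estimate the component parameters as labels arrive.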