Academic Journal

TLATR: Automatic Topic Labeling Using Automatic (Domain-Specific) Term Recognition

Bibliographic Details
Title: TLATR: Automatic Topic Labeling Using Automatic (Domain-Specific) Term Recognition
Authors: Ciprian-Octavian Truica, Elena-Simona Apostol
Source: IEEE Access, Vol 9, Pp 76624-76641 (2021)
Publisher Information: IEEE, 2021.
Publication Year: 2021
Collection: LCC:Electrical engineering. Electronics. Nuclear engineering
Subject Terms: Automatic term recognition, automatic topic labeling evaluation, topic labeling, topic modeling, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
Description: Topic modeling uses probabilistic graphical models to discover latent topics in text corpora, representing each topic as a multinomial distribution over words. Topic labeling assigns meaningful labels to the discovered topics. In this paper, we present TLATR (Topic Labeling using Automatic Term Recognition), a new topic labeling method that uses automatic term recognition to discover and assign relevant labels for each topic. TLATR uses domain-specific multi-terms that appear in the set of documents belonging to a topic. The multi-term with the highest score, as determined by the automatic term recognition algorithm, is chosen as the label for that topic. To evaluate TLATR, we use two real, publicly available datasets that contain abstracts of scientific articles. The topic labels are evaluated both automatically and by human annotators. For the automatic evaluation, we use Pointwise Mutual Information, Normalized Pointwise Mutual Information, and document similarity. For the human evaluation, we employ the average rating method. We also evaluate the quality of the topic models using the Adjusted Rand Index. To show that our method extracts relevant topic labels, we compare TLATR with two state-of-the-art methods, one supervised and one unsupervised, provided by the NETL Automatic Topic Labelling system. The experimental results show that our method outperforms or performs comparably to both NETL’s supervised and unsupervised approaches.
Document Type: article
File Description: electronic resource
Language: English
ISSN: 2169-3536
Relation: https://ieeexplore.ieee.org/document/9439519/; https://doaj.org/toc/2169-3536
DOI: 10.1109/ACCESS.2021.3083000
Access URL: https://doaj.org/article/12716b8994514c1bb13eec7f43da9e90
Accession Number: edsdoj.12716b8994514c1bb13eec7f43da9e90
Database: Directory of Open Access Journals
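
Note on the evaluation measures: the description names Pointwise Mutual Information (PMI) and Normalized Pointwise Mutual Information (NPMI) as automatic evaluation measures. The Python sketch below shows one common way to compute both for a pair of terms from document-level co-occurrence. It is an illustration only, not the authors' implementation: the function name `pmi_scores`, the whitespace tokenization, and the handling of the edge cases are assumptions made for this example.

```python
import math


def pmi_scores(docs, w1, w2):
    """Return (PMI, NPMI) for two terms, using document-level co-occurrence.

    Hypothetical helper for illustration; probabilities are estimated as
    the fraction of documents containing the term(s).
    """
    n = len(docs)
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    token_sets = [set(d.lower().split()) for d in docs]

    p1 = sum(w1 in s for s in token_sets) / n
    p2 = sum(w2 in s for s in token_sets) / n
    p12 = sum((w1 in s) and (w2 in s) for s in token_sets) / n

    if p12 == 0:
        # The terms never co-occur: PMI tends to -inf, NPMI to its minimum.
        return float("-inf"), -1.0

    pmi = math.log(p12 / (p1 * p2))
    if p12 == 1.0:
        # Both terms occur in every document; NPMI is 0/0 here, and this
        # sketch adopts the convention of returning 1.0.
        return pmi, 1.0

    # NPMI rescales PMI into [-1, 1] by dividing by -log p(w1, w2).
    npmi = pmi / -math.log(p12)
    return pmi, npmi


if __name__ == "__main__":
    sample = ["topic model", "topic model", "neural network", "graph theory"]
    print(pmi_scores(sample, "topic", "model"))
    print(pmi_scores(sample, "topic", "neural"))
```

NPMI is often easier to compare across term pairs than raw PMI because it is bounded: values near 1 indicate terms that almost always appear together, values near 0 indicate independence, and -1 indicates terms that never co-occur.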