Academic Journal

Part-of-speech tagging via deep neural networks for Northern-Ethiopic languages

Bibliographic Details
Title: Part-of-speech tagging via deep neural networks for Northern-Ethiopic languages
Authors: Tesfagergish, Senait Gebremichael; Kapočiūtė-Dzikienė, Jurgita
Publication Year: 2020
Collection: Vytautas Magnus University e-Publication Repository (VMU ePub) / Vytauto Didžiojo universitetas: e. publikacijų talpykla (VDU ePub)
Subject Terms: Deep Learning, Word2vec embeddings, Part-of-speech tagging, Natural language processing, Computational linguistics, Tigrinya language, Straipsnis Clarivate Analytics Web of Science / Article in Clarivate Analytics Web of Science (S1), Informatika / Informatics (N009)
Subject Geographic: LT
Description: Deep Neural Networks (DNNs) have proven especially successful in the area of Natural Language Processing (NLP) and in Part-Of-Speech (POS) tagging, the process of mapping words to their corresponding POS labels depending on context. Despite recent developments in language technologies, low-resourced languages (such as the East African language Tigrinya) have received too little attention. We investigate the effectiveness of Deep Learning (DL) solutions for the low-resourced Tigrinya language of the Northern-Ethiopic branch. We selected Tigrinya as the testbed and tested state-of-the-art DL approaches, seeking to build the most accurate POS tagger. We evaluated DNN classifiers (Feed Forward Neural Network – FFNN, Long Short-Term Memory – LSTM, Bidirectional LSTM – BiLSTM, and Convolutional Neural Network – CNN) on top of neural word2vec word embeddings with a small training corpus known as the Nagaoka Tigrinya Corpus [19]. To determine the best DNN classifier type, its architecture, and its hyper-parameter set, both manual and automatic hyper-parameter tuning were performed. The BiLSTM method proved to be the most suitable for this task: it achieved the highest accuracy, ~92%, which is ~65% above the random baseline. Kauno technologijos universitetas; Taikomosios informatikos katedra; Vytauto Didžiojo universitetas
Document Type: article in journal/newspaper
File Description: p. 482-494
Language: English
ISSN: 1392124X
Relation: Informacinės technologijos ir valdymas = Information technology and control. Kaunas : Technologija, 2020, t. 49, nr. 4; Science Citation Index Expanded (Web of Science); INSPEC; VINITI; Scopus; VDU02-000065257; https://doi.org/10.5755/j01.itc.49.4.26808; WOS:000709572200003
DOI: 10.5755/j01.itc.49.4.26808
Availability: https://doi.org/10.5755/j01.itc.49.4.26808
Accession Number: edsbas.3F5F6AC4
Database: BASE