Academic Journal

Part-of-speech tagging via deep neural networks for Northern-Ethiopic languages

Bibliographic Details
Title: Part-of-speech tagging via deep neural networks for Northern-Ethiopic languages
Authors: Tesfagergish, Senait Gebremichael, Kapočiūtė-Dzikienė, Jurgita
Publication Year: 2020
Collection: Vytautas Magnus University e-Publication Repository (VMU ePub) / Vytauto Didžiojo universitetas: e. publikacijų talpykla (VDU ePub)
Subject Terms: Deep Learning, Word2vec embeddings, Part-of-speech tagging, Natural language processing, Computational linguistics, Tigrinya language, Straipsnis Clarivate Analytics Web of Science / Article in Clarivate Analytics Web of Science (S1), Informatika / Informatics (N009)
Subject Geographic: LT
Description: Deep Neural Networks (DNNs) have proven especially successful in Natural Language Processing (NLP), including Part-Of-Speech (POS) tagging, the process of mapping words to their corresponding POS labels depending on context. Despite recent developments in language technologies, low-resourced languages (such as the East African language Tigrinya) have received too little attention. We investigate the effectiveness of Deep Learning (DL) solutions for the low-resourced Tigrinya language of the Northern-Ethiopic branch. We selected Tigrinya as the testbed and tested state-of-the-art DL approaches, seeking to build the most accurate POS tagger. We evaluated DNN classifiers (Feed Forward Neural Network, FFNN; Long Short-Term Memory, LSTM; Bidirectional LSTM, BiLSTM; and Convolutional Neural Network, CNN) on top of neural word2vec word embeddings, using a small training corpus known as the Nagaoka Tigrinya Corpus [19]. To determine the best DNN classifier type, architecture, and hyper-parameter set, both manual and automatic hyper-parameter tuning were performed. The BiLSTM method proved the most suitable for this task: it achieved the highest accuracy, ~92%, which is ~65% above the random baseline ; Kauno technologijos universitetas ; Taikomosios informatikos katedra ; Vytauto Didžiojo universitetas
Document Type: article in journal/newspaper
File Description: p. 482-494
Language: English
ISSN: 1392124X
Relation: Informacinės technologijos ir valdymas = Information technology and control. Kaunas : Technologija, 2020, t. 49, nr. 4; Science Citation Index Expanded (Web of Science); INSPEC; VINITI; Scopus; VDU02-000065257; https://doi.org/10.5755/j01.itc.49.4.26808; WOS:000709572200003
DOI: 10.5755/j01.itc.49.4.26808
Availability: https://doi.org/10.5755/j01.itc.49.4.26808
Accession Number: edsbas.3F5F6AC4
Database: BASE