Academic Journal
Part-of-speech tagging via deep neural networks for Northern-Ethiopic languages
Title: | Part-of-speech tagging via deep neural networks for Northern-Ethiopic languages |
---|---|
Authors: | Tesfagergish, Senait Gebremichael; Kapočiūtė-Dzikienė, Jurgita |
Publication Year: | 2020 |
Collection: | Vytautas Magnus University e-Publication Repository (VMU ePub) / Vytauto Didžiojo universitetas: e. publikacijų talpykla (VDU ePub) |
Subject Terms: | Deep Learning, Word2vec embeddings, Part-of-speech tagging, Natural language processing, Computational linguistics, Tigrinya language, Straipsnis Clarivate Analytics Web of Science / Article in Clarivate Analytics Web of Science (S1), Informatika / Informatics (N009) |
Subject Geographic: | LT |
Description: | Deep Neural Networks (DNNs) have proven especially successful in Natural Language Processing (NLP) and in Part-Of-Speech (POS) tagging, the process of mapping words to their corresponding POS labels depending on the context. Despite recent developments in language technologies, low-resourced languages (such as Tigrinya, an East African language) have received too little attention. We investigate the effectiveness of Deep Learning (DL) solutions for the low-resourced Tigrinya language of the Northern-Ethiopic branch. We selected Tigrinya as the testbed and tested state-of-the-art DL approaches, seeking to build the most accurate POS tagger. We evaluated DNN classifiers (Feed Forward Neural Network, FFNN; Long Short-Term Memory, LSTM; Bidirectional LSTM, BiLSTM; and Convolutional Neural Network, CNN) on top of neural word2vec word embeddings with a small training corpus known as the Nagaoka Tigrinya Corpus [19]. To determine the best DNN classifier type, its architecture, and its hyper-parameter set, both manual and automatic hyper-parameter tuning were performed. The BiLSTM method proved the most suitable for our task: it achieved the highest accuracy, ~92%, which is ~65% above the random baseline ; Kauno technologijos universitetas ; Taikomosios informatikos katedra ; Vytauto Didžiojo universitetas |
Document Type: | article in journal/newspaper |
File Description: | p. 482-494 |
Language: | English |
ISSN: | 1392124X |
Relation: | Informacinės technologijos ir valdymas = Information technology and control. Kaunas : Technologija, 2020, t. 49, nr. 4; Science Citation Index Expanded (Web of Science); INSPEC; VINITI; Scopus; VDU02-000065257; https://doi.org/10.5755/j01.itc.49.4.26808; WOS:000709572200003 |
DOI: | 10.5755/j01.itc.49.4.26808 |
Availability: | https://doi.org/10.5755/j01.itc.49.4.26808 |
Accession Number: | edsbas.3F5F6AC4 |
Database: | BASE |