Bibliographic details
Title: |
RuGECToR: Rule-Based Neural Network Model for Russian Language Grammatical Error Correction. |
Authors: |
Khabutdinov, I. A.1,2 (AUTHOR) khabutdinov@ap-team.ru, Chashchin, A. V.2 (AUTHOR) chashchin@ap-team.ru, Grabovoy, A. V.1,2 (AUTHOR) grabovoy@ap-team.ru, Kildyakov, A. S.2 (AUTHOR) kildyakov@ap-team.ru, Chekhovich, U. V.2,3 (AUTHOR) chehovich@ap-team.ru |
Source: |
Programming & Computer Software. Aug2024, Vol. 50 Issue 4, p315-321. 7p. |
Subject terms: |
*ARTIFICIAL neural networks, *RUSSIAN language |
Abstract: |
Grammatical error correction is one of the core natural language processing tasks. Presently, the open-source state-of-the-art sequence tagging for English is the GECToR model. For Russian, this problem does not have equally effective solutions due to the lack of annotated datasets, which motivated the current research. In this paper, we describe the process of creating a synthetic dataset and training the model on it. The GECToR architecture is adapted for the Russian language, and it is called RuGECToR. This architecture is chosen because, unlike the sequence-to-sequence approach, it is easy to interpret and does not require a lot of training data. The aim is to train the model in such a way that it generalizes the morphological properties of the language rather than adapts to a specific training sample. The presented model achieves the quality of 82.5 in the metric on synthetic data and 22.2 on the RULEC dataset, which was not used at the training stage. [ABSTRACT FROM AUTHOR] |
Database: |
Academic Search Index |