Pragmatic annotation of a domain-restricted English-Spanish comparable corpus

Bibliographic details
Title: Pragmatic annotation of a domain-restricted English-Spanish comparable corpus
Authors: Noelia Ramón, Rosa Rabadán, Hugo Sanjurjo-González
Source: Bergen Language and Linguistics Studies. 11:209-223
Publication info: University of Bergen Library, 2021.
Publication year: 2021
Subject terms: Annotation, Computer science, A domain, General Medicine, Artificial intelligence, Natural language processing
Description: This paper explores the multi-layer annotation of a written domain-restricted English-Spanish comparable corpus (CLANES – Controlled LANguage English Spanish), focusing on pragmatic annotation. The annotation scheme draws on part-of-speech tagging and a semantic annotation scheme, i.e. the UCREL Semantic Analysis System, with some added categories to fit the food-and-drink domain represented in CLANES. These are used to build significant (pragmatic) metapatterns. Seven different pragmatic functions have been identified in our corpus. Computer scripts translate this linguistic information into regular expressions to be used in unsupervised annotation. Partial results indicate that applying lexical restrictors boosts the success rate considerably. However, metadata is preferred because of increased replicability and generality. Replicability issues and limitations encountered during testing are also addressed.
ISSN: 1892-2449
DOI: 10.15845/bells.v11i1.3445
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::a73168562f0ba43699ce50735a55b01a
https://doi.org/10.15845/bells.v11i1.3445
Rights: OPEN
Accession number: edsair.doi...........a73168562f0ba43699ce50735a55b01a
Database: OpenAIRE
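
Illustrative note: the description mentions scripts that translate part-of-speech and semantic tags into regular expressions for unsupervised annotation. The following is a minimal sketch of that general idea, not the authors' implementation. Everything in it is an assumption made for illustration: the `word/POS/SEM` token format, the helper names `token_regex`, `compile_metapattern` and `annotate`, the sample sentences and their tags, and the placeholder label `INSTRUCTION`, which is not taken from the paper's set of pragmatic functions.

```python
import re

# Assumed token format (hypothetical): word/POS/SEM, tokens separated by spaces.
ANY = r"[^/\s]+"  # matches any single field of a token


def token_regex(word=None, pos=None, sem=None):
    """Build a regex for one token; unspecified fields match anything."""
    word_part = re.escape(word) if word else ANY
    pos_part = re.escape(pos) if pos else ANY
    sem_part = re.escape(sem) if sem else ANY
    return rf"{word_part}/{pos_part}/{sem_part}"


def compile_metapattern(slots):
    """Join per-token regexes into one pattern aligned on token boundaries."""
    body = r"\s+".join(token_regex(**slot) for slot in slots)
    return re.compile(rf"(?<!\S){body}(?!\S)")


def annotate(pattern, label, tagged_text):
    """Return (matched span, label) pairs for every match in the text."""
    return [(m.group(0), label) for m in pattern.finditer(tagged_text)]


# Metadata-only metapattern: base-form verb + article + noun tagged as a drink.
meta_only = compile_metapattern(
    [{"pos": "VV0"}, {"pos": "AT"}, {"pos": "NN1", "sem": "F2"}]
)

# Same metapattern with a lexical restrictor on the first slot.
restricted = compile_metapattern(
    [{"word": "Serve", "pos": "VV0"}, {"pos": "AT"}, {"pos": "NN1", "sem": "F2"}]
)

# Hypothetical tagged sentences (tags chosen for illustration only).
sentence_a = "Serve/VV0/F1 the/AT/Z5 wine/NN1/F2 chilled/JJ/O4.6-"
sentence_b = "Pour/VV0/M2 the/AT/Z5 juice/NN1/F2"

print(annotate(meta_only, "INSTRUCTION", sentence_a))
print(annotate(restricted, "INSTRUCTION", sentence_b))
```

In this sketch the metadata-only pattern matches both sample sentences, while the lexically restricted one matches only the sentence that begins with "Serve". This mirrors the trade-off the description points to: a lexical restrictor narrows what is matched, whereas a pattern built only from tags is more general and easier to reuse on other data.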