A Local Grammar - Dictionary-Graph approach for the extraction of complex text segments

Bibliographic details
Title: A Local Grammar - Dictionary-Graph approach for the extraction of complex text segments
Authors: Martineau, Claude; Martinez, Cristian
Contributors: Laboratoire d'Informatique Gaspard-Monge (LIGM), Université Paris-Est Marne-la-Vallée (UPEM)-École nationale des ponts et chaussées (ENPC)-ESIEE Paris-Fédération de Recherche Bézout (BEZOUT), Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS)-Centre National de la Recherche Scientifique (CNRS), Labex Bézout
Source: Different perspectives on computer text analysis and classification ; https://hal.science/hal-01613709 ; Different perspectives on computer text analysis and classification, Labex Bézout, Oct 2015, Paris, France
Publication info: CCSD
Publication year: 2015
Subject terms: complex text segment, local grammar, dictionary-graph, event extraction, unitex/gramlab, [SCCO.LING]Cognitive science/Linguistics, [INFO]Computer Science [cs]
Subject geography: Paris, France
Description: International audience ; In this talk we will describe a Local Grammar and Dictionary-Graph approach to developing resources for the extraction of complex text segments. A complex text segment extends the notion of multi-word units (MWUs) to allow a broad description of more complex and syntactically more flexible linguistic patterns. First, we will present some basics of Unitex/GramLab, an open-source corpus processing suite. Then we will show how to describe complex language constructions with graphs and how to produce electronic dictionary entries on the fly through graph transductions. As an example, we will illustrate a way to combine dictionaries, local grammars, and dictionary-graphs to identify some complex text segments as part of an event extraction task. Finally, we will discuss some advantages and drawbacks of our approach and highlight potential directions for further research and applications.
Document type: conference object
Language: English
Availability: https://hal.science/hal-01613709
Accession number: edsbas.DA2995A1
Database: BASE