PatentEval: Understanding Errors in Patent Generation

Bibliographic details
Title: PatentEval: Understanding Errors in Patent Generation
Authors: Zuo, You; Gerdes, Kim; Villemonte de La Clergerie, Eric; Sagot, Benoît
Contributors: Automatic Language Modelling and ANAlysis & Computational Humanities (ALMAnaCH), Inria de Paris, Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Laboratoire Interdisciplinaire des Sciences du Numérique (LISN), Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS)
Source: NAACL2024 - 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jun 2024, Mexico City, Mexico ; https://hal.science/hal-04595013
Publisher information: HAL CCSD
Publication year: 2024
Subject terms: Patent, Large language modelling, Text generation, Evaluation, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
Subject geography: Mexico City, Mexico
Description: International audience ; In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. We have also developed a benchmark, PatentEval, for systematically assessing language models in this context. Our study includes a comparative analysis, annotated by humans, of various models. These range from those specifically adapted during training for tasks within the patent domain to the latest general-purpose large language models (LLMs). Furthermore, we explored and evaluated some metrics to approximate human judgments in patent text evaluation, analyzing the extent to which these metrics align with expert assessments. These approaches provide valuable insights into the capabilities and limitations of current language models in the specialized field of patent text generation.
Document type: conference object
Language: English
Relation: hal-04595013; https://hal.science/hal-04595013; https://hal.science/hal-04595013v2/document; https://hal.science/hal-04595013v2/file/acl_latex.pdf
Availability: https://hal.science/hal-04595013
https://hal.science/hal-04595013v2/document
https://hal.science/hal-04595013v2/file/acl_latex.pdf
Rights: info:eu-repo/semantics/OpenAccess
Accession number: edsbas.765F19C1
Database: BASE