Conference
PatentEval: Understanding Errors in Patent Generation
Title: | PatentEval: Understanding Errors in Patent Generation |
---|---|
Authors: | Zuo, You; Gerdes, Kim; Villemonte de La Clergerie, Eric; Sagot, Benoît |
Contributors: | Automatic Language Modelling and ANAlysis & Computational Humanities (ALMAnaCH); Inria de Paris; Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria); Laboratoire Interdisciplinaire des Sciences du Numérique (LISN); Institut National de Recherche en Informatique et en Automatique (Inria)-CentraleSupélec-Université Paris-Saclay-Centre National de la Recherche Scientifique (CNRS) |
Source: | NAACL2024 - 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics, Jun 2024, Mexico City, Mexico ; https://hal.science/hal-04595013 |
Publisher Information: | HAL CCSD |
Publication Year: | 2024 |
Subject Terms: | Patent; Large language modelling; Text generation; Evaluation; [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL] |
Subject Geographic: | Mexico City; Mexico |
Description: | International audience. In this work, we introduce a comprehensive error typology specifically designed for evaluating two distinct tasks in machine-generated patent texts: claims-to-abstract generation, and the generation of the next claim given previous ones. We have also developed a benchmark, PatentEval, for systematically assessing language models in this context. Our study includes a comparative analysis, annotated by humans, of various models. These range from those specifically adapted during training for tasks within the patent domain to the latest general-purpose large language models (LLMs). Furthermore, we explored and evaluated some metrics to approximate human judgments in patent text evaluation, analyzing the extent to which these metrics align with expert assessments. These approaches provide valuable insights into the capabilities and limitations of current language models in the specialized field of patent text generation. |
Document Type: | conference object |
Language: | English |
Relation: | hal-04595013; https://hal.science/hal-04595013; https://hal.science/hal-04595013v2/document; https://hal.science/hal-04595013v2/file/acl_latex.pdf |
Availability: | https://hal.science/hal-04595013; https://hal.science/hal-04595013v2/document; https://hal.science/hal-04595013v2/file/acl_latex.pdf |
Rights: | info:eu-repo/semantics/OpenAccess |
Accession Number: | edsbas.765F19C1 |
Database: | BASE |