A Closer Look at Logical Reasoning with LLMs: The Choice of Tool Matters

Bibliographic Details
Title:	A Closer Look at Logical Reasoning with LLMs: The Choice of Tool Matters
Authors:	Lam, Long Hei Matthew, Thatikonda, Ramya Keerthy, Shareghi, Ehsan
Publication Year:	2024
Collection:	Computer Science
Subject Terms:	Computer Science - Computation and Language
Description:	The emergence of Large Language Models (LLMs) has demonstrated promising progress in solving logical reasoning tasks effectively. Several recent approaches have proposed changing the role of the LLM from reasoner to translator between natural language statements and symbolic representations, which are then sent to external symbolic solvers to resolve. This paradigm has established the current state-of-the-art result in logical reasoning (i.e., deductive reasoning). However, it remains unclear whether the variance in performance of these approaches stems from the methodologies employed or from the specific symbolic solvers used. There is a lack of consistent comparison between symbolic solvers and of how they influence the overall reported performance. This is important, as each symbolic solver has its own input symbolic language, presenting varying degrees of challenge in the translation process. To address this gap, we perform experiments on 3 deductive reasoning benchmarks with LLMs augmented with widely used symbolic solvers: Z3, Pyke, and Prover9. The tool-executable rates of symbolic translations generated by different LLMs exhibit a nearly 50% performance variation. This highlights a significant difference in performance rooted in very basic choices of tools. The almost linear correlation between the executable rate of translations and the accuracy of the outcomes from Prover9 highlights a strong alignment between LLMs' ability to translate into the Prover9 symbolic language and the correctness of those translations. Comment: Code and data are publicly available at: https://github.com/Mattylam/Logic_Symbolic_Solvers_Experiment
Document Type:	Working Paper
Access URL:	http://arxiv.org/abs/2406.00284
Accession Number:	edsarx.2406.00284
Database:	arXiv
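
Note: The description above relates two quantities, the tool-executable rate of the translations and the accuracy of the solver outcomes. The following is a minimal sketch of how such quantities could be computed and correlated, not the authors' evaluation code. The function names, the model labels, and the toy outcomes are all hypothetical, and counting unexecutable translations as wrong answers is an assumption rather than something the record states.

```python
from math import sqrt

def executable_rate(outcomes):
    """Fraction of translations the solver could run (None marks a failed run)."""
    return sum(o is not None for o in outcomes) / len(outcomes)

def accuracy(outcomes, gold):
    """Fraction of problems answered correctly.

    Assumption: a translation that did not execute counts as a wrong answer.
    """
    return sum(o == g for o, g in zip(outcomes, gold)) / len(gold)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    norm_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    norm_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (norm_x * norm_y)

# Toy, made-up data for illustration only (not results from the paper).
gold = ["True", "False", "Unknown", "True"]
runs = {
    "model_a": ["True", None, "Unknown", "True"],
    "model_b": [None, None, "Unknown", "False"],
    "model_c": ["True", "False", "Unknown", "True"],
}

rates = [executable_rate(outcomes) for outcomes in runs.values()]
accs = [accuracy(outcomes, gold) for outcomes in runs.values()]
r = pearson(rates, accs)
```

A value of r close to 1 on real data would correspond to the near-linear relationship the description reports for Prover9.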


Publication Date:	31 May 2024