SQLfuse: Enhancing Text-to-SQL Performance through Comprehensive LLM Synergy

Bibliographic Details
Title: SQLfuse: Enhancing Text-to-SQL Performance through Comprehensive LLM Synergy
Authors: Zhang, Tingkai; Chen, Chaoyu; Liao, Cong; Wang, Jun; Zhao, Xudong; Yu, Hang; Wang, Jianchao; Li, Jianguo; Shi, Wenhui
Publication Year: 2024
Collection: Computer Science
Subject Terms: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Databases
Description: Text-to-SQL conversion is a critical innovation, simplifying the transition from complex SQL to intuitive natural language queries, especially significant given SQL's prevalence in the job market across various roles. The rise of Large Language Models (LLMs) like GPT-3.5 and GPT-4 has greatly advanced this field, offering improved natural language understanding and the ability to generate nuanced SQL statements. However, the potential of open-source LLMs in Text-to-SQL applications remains underexplored, with many frameworks failing to leverage their full capabilities, particularly in handling complex database queries and incorporating feedback for iterative refinement. Addressing these limitations, this paper introduces SQLfuse, a robust system integrating open-source LLMs with a suite of tools to enhance Text-to-SQL translation's accuracy and usability. SQLfuse features four modules: schema mining, schema linking, SQL generation, and a SQL critic module, to not only generate but also continuously enhance SQL query quality. Demonstrated by its leading performance on the Spider Leaderboard and deployment by Ant Group, SQLfuse showcases the practical merits of open-source LLMs in diverse business contexts.
Document Type: Working Paper
Access URL: http://arxiv.org/abs/2407.14568
Accession Number: edsarx.2407.14568
Database: arXiv
Publication Date: 2024-07-19