Advancing Structured Query Processing in Retrieval-Augmented Generation with Generative Semantic Integration

Authors

  • Yihe Yang
  • Xiaoming Li
  • Hongwei Jin
  • Kun Huang

DOI:

https://doi.org/10.54097/z309gx59

Keywords:

Retrieval-Augmented Generation, Structured Query Processing, Semantic Integration

Abstract

Retrieval-Augmented Generation (RAG) has become a pivotal approach to enhancing language models by incorporating external knowledge during text generation. However, traditional RAG systems often struggle to process structured queries, which leads to suboptimal integration of the retrieved information. In this paper, we introduce Generative Semantic Integration (GSI), a novel method that advances structured query processing within RAG frameworks. GSI leverages generative models to semantically integrate structured queries with retrieved data, enabling more coherent and contextually relevant responses. Our experiments on benchmark datasets demonstrate that GSI significantly improves the performance of RAG systems in structured query understanding and response generation, outperforming existing baseline models.

Published

27-09-2024

Section

Articles

How to Cite

Yang, Y., Li, X., Jin, H., & Huang, K. (2024). Advancing Structured Query Processing in Retrieval-Augmented Generation with Generative Semantic Integration. Frontiers in Computing and Intelligent Systems, 9(3), 64-71. https://doi.org/10.54097/z309gx59