A Review of Challenges and Solutions in Retrieval-Augmented Generation (RAG) Systems
DOI: https://doi.org/10.54097/364hex16

Keywords: Retrieval augmented generation, natural language processing, information retrieval, knowledge base.

Abstract
Retrieval-Augmented Generation (RAG) systems represent a significant innovation in Natural Language Processing (NLP), integrating Large Language Models (LLMs) with dynamic retrieval from external knowledge sources. This combination not only improves the models' grounding in real-world knowledge but also addresses the limitations of conventional generative models in the speed of knowledge updates and in factual accuracy. This review examines the challenges faced by RAG systems and their solutions. It delves into the core architecture of RAG systems, encompassing retrieval components, generative components, and knowledge bases, with a particular focus on recent advances that have expanded the boundaries of performance and functionality. The study critically analyzes major challenges such as retrieval efficiency and dynamic knowledge management, evaluates advanced solutions proposed in the recent literature, compares their efficacy, and discusses the trade-offs involved. Ultimately, this paper aims to provide researchers, developers, and users of RAG systems with a comprehensive perspective, fostering ongoing innovation and the expansion of applications in this domain.
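To make the three components named in the abstract concrete, the following is a minimal, self-contained Python sketch of a RAG loop: a toy knowledge base of passages, a bag-of-words retrieval component, and a stub generative component that merely assembles the augmented prompt. All function names, passages, and the similarity measure are illustrative assumptions, not the implementation of any system surveyed in the paper; a production system would use a learned dense retriever and an actual LLM.

import math
from collections import Counter

# Knowledge base: a small corpus of passages (hypothetical content).
KNOWLEDGE_BASE = [
    "Retrieval-augmented generation pairs a language model with an external document store.",
    "Dense passage retrieval encodes queries and passages into vectors and ranks them by similarity.",
    "Keeping the knowledge base up to date lets the system answer questions about recent events.",
]

def bag_of_words(text: str) -> Counter:
    """Tokenize very naively into lowercase word counts."""
    return Counter(text.lower().split())

def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query: str, k: int = 2) -> list[str]:
    """Retrieval component: return the k passages most similar to the query."""
    q_vec = bag_of_words(query)
    ranked = sorted(
        KNOWLEDGE_BASE,
        key=lambda passage: cosine_similarity(q_vec, bag_of_words(passage)),
        reverse=True,
    )
    return ranked[:k]

def generate(query: str, passages: list[str]) -> str:
    """Generative component (stub): a real system would condition an LLM on
    the retrieved passages; here we only assemble the augmented prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

if __name__ == "__main__":
    question = "How does retrieval keep a language model's answers current?"
    print(generate(question, retrieve(question)))

Running the script prints the augmented prompt built from the two passages judged most relevant to the question, which is the point at which the challenges discussed in the paper (retrieval efficiency, keeping the knowledge base current) enter the pipeline.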
License
Copyright (c) 2024 Highlights in Science, Engineering and Technology

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.







