Why use RAG in the Era of Long-Context Language Models?

Angelina Yang
5 min read · Sep 30, 2024

In recent years, the field of natural language processing has seen remarkable advances, particularly in the development of large language models (LLMs) with increasingly expansive context windows. Models like GPT-4o (OpenAI, 2023), Claude 3.5 (Anthropic, 2024), Llama 3.1 (Meta, 2024), Phi-3 (Abdin et al., 2024), and Mistral Large 2 (Mistral AI, 2024) can all process 128,000 tokens or more in a single context, and Gemini 1.5 Pro even supports a 1M-token context window.

This raises the question:

Is there still a place for Retrieval Augmented Generation (RAG) in this new era of long-context LLMs?

Before we dive in, we’re excited to share that our RAG course is launching soon, and there’s still time to fill out the course survey to share your preferences!👇

📝 Course survey: https://maven.com/forms/e48159

Thanks, and we’re looking forward to seeing you there!

The Rise of Long-Context LLMs

Long-context LLMs have made significant strides in understanding and processing extensive inputs. These models can now directly engage with large amounts of text, potentially eliminating the need for complex…
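
To make the contrast concrete, here is a minimal sketch of the two prompting strategies. It uses a toy word-overlap scorer in place of a real embedding-based retriever, and the corpus, function names, and prompt format are illustrative assumptions only, not any specific library's API:

```python
# Toy contrast between RAG-style prompting and long-context "stuff everything"
# prompting. The scorer below is a deliberately simple stand-in: production
# RAG systems typically use embedding similarity, not word overlap.

def score(query: str, chunk: str) -> int:
    """Toy relevance score: number of words shared between query and chunk."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def rag_prompt(query: str, corpus: list[str], k: int = 2) -> str:
    """RAG: retrieve only the top-k most relevant chunks, then build a prompt."""
    top_k = sorted(corpus, key=lambda c: score(query, c), reverse=True)[:k]
    return "Context:\n" + "\n".join(top_k) + f"\n\nQuestion: {query}"

def long_context_prompt(query: str, corpus: list[str]) -> str:
    """Long-context: place the entire corpus in the window, no retrieval step."""
    return "Context:\n" + "\n".join(corpus) + f"\n\nQuestion: {query}"

corpus = [
    "RAG retrieves relevant passages before generation.",
    "Gemini 1.5 Pro supports a 1M-token context window.",
    "An unrelated note about office supply budgets.",
]
print(rag_prompt("What does RAG retrieve?", corpus))
```

The trade-off the article explores follows directly from this sketch: the RAG prompt stays short and focused regardless of corpus size, while the long-context prompt grows with the corpus and relies on the model to find the relevant passages itself.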
