
What’s Order-Preserve RAG (OP-RAG)?

5 min read · Oct 13, 2024

Why use RAG in the Era of Long-Context Language Models? Part 2

In recent years, natural language processing has advanced rapidly, particularly in large language models (LLMs) with ever-larger context windows. Last week we introduced the research from Li et al. (2024) comparing RAG with long-context (LC) LLMs.

Today, we’ll delve into another intriguing paper, this one by Nvidia researchers: “In Defense of RAG in the Era of Long-Context Language Models.” Interestingly, it reaches a contrasting conclusion: with the mechanism it proposes, RAG delivers higher-quality answers than approaches that rely solely on long-context LLMs.
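As the name in the title suggests, the mechanism is order preservation: the top-k chunks are still chosen by similarity to the query, but they are passed to the model in the order they appear in the source document rather than sorted by relevance score. Here is a minimal sketch of that idea. The bag-of-words `embed` function is a toy stand-in for a real embedding model, and the function names are our own, not the paper's code.

```python
import math
from collections import Counter


def embed(text):
    # Toy stand-in for an embedding model (assumption): bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def op_rag_select(chunks, query, k):
    # Score every chunk against the query.
    q = embed(query)
    scores = [cosine(embed(c), q) for c in chunks]
    # Pick the indices of the k most similar chunks...
    top = sorted(range(len(chunks)), key=lambda i: scores[i], reverse=True)[:k]
    # ...then restore their original document order before building the prompt.
    return [chunks[i] for i in sorted(top)]


chunks = ["the cat sat", "dogs bark loudly", "a cat and a dog", "weather is nice"]
print(op_rag_select(chunks, "cat dog", k=2))
```

A relevance-ordered retriever would put "a cat and a dog" first because it scores highest; the order-preserving variant keeps "the cat sat" ahead of it because that chunk comes earlier in the document.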

The issue

Despite the impressive capabilities of long-context LLMs, there are inherent limitations to processing extremely large amounts of text in a single pass. As the context window expands, the model’s ability to focus on relevant information can diminish, potentially leading to a degradation in answer quality. This phenomenon highlights a critical trade-off between the breadth of available information and the precision of the model’s output.

P.S. We have also covered this in a previous post.


Written by Angelina Yang
