Improving Retrieval Augmented Generation with CRAG

Angelina Yang
3 min read · May 22, 2024

Retrieval Augmented Generation (RAG) is a technique that integrates external knowledge sources into large language models (LLMs) to improve the responses they generate. However, vanilla RAG systems have a weak point: if the initially retrieved documents are inaccurate or irrelevant, the LLM still conditions on them, and the final response suffers.

To address this, researchers have proposed a novel method called Corrective Retrieval Augmented Generation (CRAG). The core idea behind CRAG is to introduce a separate “evaluator” component to assess the quality and relevance of the initially retrieved documents before passing them to the LLM for response generation.

How CRAG Works

  1. The system retrieves potentially relevant documents from a vector database based on the user’s query.
  2. These retrieved documents are passed through a lightweight evaluator model (e.g., T5) that classifies each document into one of three categories: correct, ambiguous, or incorrect.
  3. Correct documents are kept as-is, while ambiguous ones are supplemented with additional web search results. Incorrect documents are discarded and replaced with web search results.
  4. The retained documents then go through a “decompose and recompose” step, where they are broken into smaller text strips, filtered again for relevance, and only the most pertinent strips are passed to the LLM.
  5. Finally, the LLM uses these highly relevant text…
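Steps 1 through 4 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the researchers' implementation: the word-overlap scorer is a toy stand-in for the evaluator model, the `UPPER` and `LOWER` thresholds are made-up values, and `web_search` is a hypothetical callable supplied by the caller. All function names are invented for this example. The sketch stops before generation, returning the strips that would be handed to the LLM.

```python
from typing import Callable, List

# Hypothetical thresholds; the article does not specify any values.
UPPER, LOWER = 0.6, 0.2


def overlap_score(query: str, text: str) -> float:
    """Toy stand-in for the evaluator: fraction of query words found in text."""
    q = set(query.lower().split())
    t = set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0


def classify(score: float) -> str:
    """Map a relevance score to one of the three categories."""
    if score >= UPPER:
        return "correct"
    if score <= LOWER:
        return "incorrect"
    return "ambiguous"


def to_strips(doc: str) -> List[str]:
    """Decompose: split a document into rough sentence-level strips."""
    return [s.strip() for s in doc.split(".") if s.strip()]


def corrective_retrieve(
    query: str,
    docs: List[str],
    web_search: Callable[[str], List[str]],  # hypothetical search function
    top_k: int = 3,
) -> List[str]:
    kept: List[str] = []
    need_web = False
    for doc in docs:
        label = classify(overlap_score(query, doc))
        if label != "incorrect":
            kept.append(doc)   # correct and ambiguous documents are kept
        if label != "correct":
            need_web = True    # ambiguous or incorrect triggers web search
    if need_web:
        kept.extend(web_search(query))

    # Decompose into strips, score each one again, keep the best (recompose).
    strips = [s for d in kept for s in to_strips(d)]
    ranked = sorted(strips, key=lambda s: overlap_score(query, s), reverse=True)
    return [s for s in ranked[:top_k] if overlap_score(query, s) > LOWER]
```

Calling `corrective_retrieve("what is crag", docs, my_search)` with a list of retrieved passages and any search function returns at most `top_k` strips. In a real system the overlap scorer would be replaced by a trained evaluator, and the strip splitter by a proper sentence segmenter.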
