Everything You Need to Know about Basic RAG

Angelina Yang
3 min read · May 24, 2024

Demystifying the Basics of Retrieval Augmented Generation (RAG)

Retrieval Augmented Generation (RAG) is a technique that integrates external knowledge sources into the workflow of large language models (LLMs) to enhance their response generation capabilities. Rather than ingesting a knowledge base into the model itself, RAG retrieves relevant passages at query time and supplies them to the LLM as context, letting the model draw on information beyond its training data and produce more accurate, informative responses.
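Concretely, "supplying retrieved passages as context" just means pasting them into the prompt before calling the LLM. A minimal sketch of that augmentation step (the function name and prompt template are illustrative, not from any particular framework):

```python
def build_prompt(query: str, retrieved_chunks: list[str]) -> str:
    """Augment the user query with retrieved context before calling the LLM."""
    context = "\n\n".join(retrieved_chunks)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )
```

The resulting string is what actually gets sent to the model; everything else in a RAG pipeline exists to decide which chunks end up in that context block.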

In this blog post, we’ll walk through the fundamental components of a RAG system and how to implement a basic RAG pipeline from scratch. We’ll also contrast this approach with using popular frameworks like LangChain and LlamaIndex.

The RAG Architecture

A typical RAG system consists of two main components: the retrieval module and the generation module.

  1. The Retrieval Module:
  • Data Ingestion: This involves reading documents (PDFs, web pages, etc.), splitting them into smaller text chunks, computing a vector embedding for each chunk, and storing those embeddings in a vector database for efficient retrieval.
  • Retrieval: When a user query is received, the retrieval module searches the vector database for the most relevant text chunks based on semantic similarity.
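The two steps above can be sketched from scratch in a few lines. The embedding function here is a deliberately crude hashed bag-of-words stand-in (a real system would use a sentence-embedding model), and the "vector database" is just an in-memory list:

```python
import math
import zlib

def embed(text: str, dim: int = 64) -> list[float]:
    # Toy stand-in for an embedding model: hashed bag-of-words,
    # L2-normalized so a plain dot product equals cosine similarity.
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[zlib.crc32(token.encode()) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def chunk(text: str, size: int = 40) -> list[str]:
    # Data ingestion: split a document into fixed-size word chunks.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

# "Vector database": an in-memory list of (chunk, embedding) pairs.
documents = [
    "RAG retrieves relevant chunks from a vector store.",
    "Bananas are rich in potassium.",
]
index = [(c, embed(c)) for doc in documents for c in chunk(doc)]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Retrieval: rank stored chunks by similarity to the query embedding.
    q = embed(query)
    scored = sorted(index,
                    key=lambda pair: sum(x * y for x, y in zip(q, pair[1])),
                    reverse=True)
    return [c for c, _ in scored[:k]]
```

Swapping `embed` for a real embedding model and `index` for an actual vector database is exactly the pattern that frameworks like LangChain and LlamaIndex wrap for you.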