Retrieval-Augmented Generation
Connecting LLMs to your own data, from basic pipelines to advanced retrieval architectures.
What is RAG and Why
LLMs know a lot, but they don't know your data. Retrieval-Augmented Generation is the pattern that fixes this: not by training the model on your data, but by finding the relevant pieces at query time and handing them directly to the model.
Embeddings and Vector Search
Semantic search, finding text by meaning rather than keywords, is the engine inside most RAG systems. Understanding how embeddings work and how vector databases store and query them is the foundation you need to build reliable retrieval.
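As a rough illustration of the idea, the sketch below ranks a few documents by cosine similarity to a query vector. The three-dimensional vectors and document names are hand-written placeholders, not output from a real embedding model, and the in-memory dictionary stands in for a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "embeddings"; a real system would get these from a model.
index = {
    "refund-policy": [0.9, 0.1, 0.0],
    "shipping-times": [0.1, 0.9, 0.1],
    "api-rate-limits": [0.0, 0.1, 0.9],
}
query = [0.8, 0.2, 0.0]  # assumed to be the embedding of a refund question

results = top_k(query, index, k=2)
```

A production vector database typically replaces this exhaustive scan with an approximate nearest-neighbour index, but the ranking principle is the same.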
Chunking and Indexing
You rarely embed a whole document as a single vector: you split it into chunks first. How you split determines what you can retrieve. The wrong chunking strategy is one of the most common reasons RAG systems fail to find the right answer even when the information clearly exists.
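One simple strategy, shown here only as a sketch, is fixed-size chunking with overlap, so that a sentence cut at a boundary still appears whole in a neighbouring chunk. The function name and the word-based sizing are assumptions made for illustration; real pipelines often count tokens and respect headings or paragraphs instead.

```python
def chunk_words(text, chunk_size=50, overlap=10):
    """Split text into chunks of up to chunk_size words, sharing overlap words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this window already reached the end of the text
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
chunks = chunk_words(sample, chunk_size=4, overlap=1)
```

With these toy settings each chunk starts on the last word of the previous one, which is the overlap doing its job.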
Retrieval Quality: Dense, Sparse, and Hybrid
Semantic search is powerful but not always the best retrieval method. Keyword search finds exact matches that embeddings miss. Re-ranking re-scores candidates with a slower but more accurate model. Understanding when to use each, and how to combine them, is what separates reliable RAG from fragile RAG.
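One common way to combine a dense ranking with a keyword ranking is reciprocal rank fusion (RRF), sketched below. RRF is one option among several and is not necessarily the method this track uses; the document ids are made up, and k=60 is a conventional default rather than a tuned value.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc ids into one list, best first.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked well by more than one retriever rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


# Hypothetical results from two retrievers for the same query.
dense_hits = ["doc_a", "doc_b", "doc_c"]    # embedding search
sparse_hits = ["doc_c", "doc_a", "doc_d"]   # keyword search

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

The fused list can then be cut to a shortlist and passed to a re-ranker, which is usually too slow to run over the whole corpus.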
Prompting for RAG
Retrieved chunks are only as useful as the instructions you give the model for using them. The grounding instruction, context format, citation pattern, and no-answer path are what turn a retrieval result into a reliable, trustworthy answer.
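The four elements named above can be sketched as a single prompt template. The wording of the instructions, the chunk dictionary keys ("source", "text"), and the function name are all illustrative assumptions, not a prescribed format.

```python
NO_ANSWER = "I don't know based on the provided documents."


def build_rag_prompt(question, chunks):
    """Assemble a grounded prompt from retrieved chunks.

    chunks: list of dicts with hypothetical keys "source" and "text".
    """
    # Context format: number each chunk so the model can cite it.
    context = "\n\n".join(
        f"[{i}] (source: {chunk['source']})\n{chunk['text']}"
        for i, chunk in enumerate(chunks, start=1)
    )
    return (
        # Grounding instruction
        "Answer the question using only the context below.\n"
        # Citation pattern
        "Cite the chunk numbers you relied on, for example [1].\n"
        # No-answer path
        f"If the context does not contain the answer, reply exactly: {NO_ANSWER}\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )


prompt = build_rag_prompt(
    "How long do refunds take?",
    [
        {"source": "faq.md", "text": "Refunds are processed within 5 business days."},
        {"source": "terms.md", "text": "Refund requests must be made within 30 days."},
    ],
)
```

Keeping the no-answer sentence as a fixed string makes it easy to detect downstream when the model declined to answer.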
Evaluating RAG Systems
A fluent, well-formatted answer based on the wrong chunk is a failure, but it reads like a success. RAG evaluation requires two independent measurement tracks: retrieval quality and generation quality. Conflating them hides the real failure mode.
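For the retrieval track, two widely used measures are recall@k and reciprocal rank. The sketch below covers only that track; generation quality needs its own separate checks. It assumes you have a hand-labelled set of relevant chunk ids per question, and the ids shown are invented.

```python
def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant ids that appear in the top k retrieved ids."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant id, or 0.0 if none was retrieved."""
    relevant = set(relevant)
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical retriever output and gold labels for one question.
retrieved = ["c3", "c7", "c1", "c9"]
relevant = {"c1", "c4"}

recall = recall_at_k(retrieved, relevant, k=3)
rr = reciprocal_rank(retrieved, relevant)
```

Averaging these over a labelled question set gives a retrieval score that stays meaningful regardless of how fluent the generated answers sound.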
Advanced RAG Patterns
Basic RAG fails when queries are vague, answers span multiple documents, or context evolves across a conversation. Four patterns (multi-query retrieval, HyDE, contextual retrieval, and small-to-big) each fix a specific retrieval failure mode. Know which failure you have before reaching for a pattern.
Production RAG Checklist
A RAG prototype that works on your test documents is not a production system. This capstone synthesises the full RAG track into a checklist: the gaps that consistently cause RAG failures after launch, and the order to address them.