What is Retrieval-Augmented Generation (RAG)?

Retrieval-Augmented Generation (RAG)

RAG is the pattern of fetching relevant documents at query time and feeding them to a model alongside the question, so the answer is grounded in real sources instead of the model's memory. It's how you put private or current data in front of a model without retraining it.

Also known as: RAG

Jun 16, 2026 · Chain of Thought

A bare language model only knows what it learned in training — not your documents, not last week’s data. Retrieval-augmented generation closes that gap: at query time the system retrieves the most relevant chunks from a store and hands them to the model with the question, so the model answers from real sources rather than from memory.
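The retrieve-then-prompt flow can be sketched in a few lines. This is a minimal illustration, not a production design: the `STORE` contents, the function names, and the prompt wording are all hypothetical, and the bag-of-words cosine score stands in for whatever ranking a real system would use (embedding-based search is a common choice).

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question (toy ranking)."""
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine(q, Counter(tokenize(chunk))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, chunks, k=2):
    """Place the retrieved chunks in front of the question."""
    sources = "\n".join(
        f"[{i + 1}] {chunk}" for i, chunk in enumerate(retrieve(question, chunks, k))
    )
    return (
        "Answer using only the sources below.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}"
    )


# Hypothetical document store: in practice these would be chunks of your own data.
STORE = [
    "The refund window is 30 days from the delivery date.",
    "Support is available on weekdays from 9am to 5pm.",
    "Shipping to Canada takes 5 to 7 business days.",
]

prompt = build_prompt("How long is the refund window?", STORE, k=1)
print(prompt)  # this string is what would be sent to the model
```

The model call itself is left out on purpose: the only RAG-specific work is choosing the chunks and assembling the prompt. Updating what the system knows means editing `STORE`, not the model.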

It’s the default way to give a model private, current, or fast-changing knowledge, because you update the store instead of retraining the model. The catch is that RAG is only as good as its retrieval — fetch the wrong chunks and the model answers confidently from bad context. That’s why retrieval quality, not just the model, decides whether a RAG system works.
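One way to check retrieval on its own, before the model is involved, is to measure how often the expected chunk appears in the top k results for a set of labeled questions. The sketch below assumes a hypothetical store, hypothetical labeled questions, and a deliberately naive word-overlap retriever; the names and data are illustrative only.

```python
import re


def tokens(text):
    """Set of lowercase alphanumeric words in the text."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, store, k):
    """Rank chunk ids by word overlap with the question (toy retriever)."""
    q = tokens(question)
    ranked = sorted(
        store,
        key=lambda chunk_id: len(q & tokens(store[chunk_id])),
        reverse=True,
    )
    return ranked[:k]


def hit_rate(labeled, store, k):
    """Fraction of questions whose expected chunk is in the top k results."""
    hits = sum(
        1
        for question, expected_id in labeled
        if expected_id in retrieve(question, store, k)
    )
    return hits / len(labeled)


# Hypothetical store and labeled questions, for illustration only.
STORE = {
    "hours": "Support is available on weekdays from 9am to 5pm.",
    "shipping": "Shipping to Canada takes 5 to 7 business days.",
    "refunds": "The refund window is 30 days from the delivery date.",
}
LABELED = [
    ("How long is the refund window?", "refunds"),
    ("When is support available?", "hours"),
    ("Can I get my money back?", "refunds"),  # paraphrase with no shared words
]

for k in (1, 3):
    print(f"hit rate at k={k}: {hit_rate(LABELED, STORE, k):.2f}")
```

The paraphrased question shares no words with the refund chunk, so this retriever misses it at k=1, and a model given that context would answer from the wrong source. A low hit rate here points at the retriever rather than the model.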
