Pure RAG: knowledge Q&A
Who uses it: Team building a help-center or docs chatbot
Single LLM call per question
Retrieval narrows the prompt to relevant chunks
Predictable cost and latency
No tool use, no action taken on the world
Failure mode: irrelevant retrieval, not bad reasoning
Why this works: Pure RAG fits when the job is 'find this fact and explain it'. The diagram has one LLM call and no loop, which keeps cost and latency predictable but means RAG can't perform actions or chain multi-step reasoning.
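A minimal sketch of the single-call shape described above. The word-overlap retriever and the `llm` callable are assumptions made for illustration (a real system would typically use an embedding index and a model client); the point is only that retrieval narrows the prompt and the model is called exactly once.

```python
import re
from typing import Callable, List


def tokens(text: str) -> set:
    """Lowercased word set, punctuation stripped."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, chunks: List[str], k: int = 2) -> List[str]:
    """Toy retriever (assumption): rank chunks by word overlap with the question."""
    q = tokens(question)
    ranked = sorted(chunks, key=lambda c: len(q & tokens(c)), reverse=True)
    return ranked[:k]


def answer(question: str, chunks: List[str],
           llm: Callable[[str], str], k: int = 2) -> str:
    """Pure RAG: retrieve, build one prompt, make one LLM call. No loop, no tools."""
    context = "\n".join(retrieve(question, chunks, k))
    prompt = (
        "Answer using only this context:\n"
        f"{context}\n\n"
        f"Question: {question}"
    )
    return llm(prompt)  # the only model call in the whole pipeline
```

Because the call count is fixed at one, cost and latency per question are bounded by prompt size alone. If the retriever returns the wrong chunks, the model has nothing useful to work with, which matches the failure mode listed above.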
Pure agent: multi-step task
Who uses it: Team automating a workflow that touches several systems
Agent reasons, calls a tool, observes, repeats
Tools: search, code exec, external APIs
Memory holds the trace of the task so far
Cost is unpredictable (depends on steps taken)
Failure mode: bad reasoning, infinite loops
Why this works: A pure agent fits when the task requires multiple decisions and actions. The diagram's loop is the whole point, but it brings unpredictable cost, longer latency, and harder debugging than a single LLM call.
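The reason, act, observe loop can be sketched as follows. Everything here is illustrative: `policy` is a hypothetical stand-in for the LLM's decision step, `tools` is a plain dictionary of callables, and `max_steps` is one simple guard against the infinite-loop failure mode, not the only possible one.

```python
from typing import Callable, Dict, List, Tuple

# An action is (tool name, argument); the name "finish" ends the task.
Action = Tuple[str, str]


def run_agent(task: str,
              policy: Callable[[List[str]], Action],
              tools: Dict[str, Callable[[str], str]],
              max_steps: int = 5) -> List[str]:
    """Run the loop and return the memory trace of everything that happened."""
    memory = [f"task: {task}"]
    for _ in range(max_steps):
        name, arg = policy(memory)        # reason: pick the next action
        if name == "finish":
            memory.append(f"answer: {arg}")
            return memory
        observation = tools[name](arg)    # act: call the chosen tool
        memory.append(f"{name}({arg}) -> {observation}")  # observe
    # Step budget hit: stop instead of looping forever.
    memory.append("stopped: step budget exhausted")
    return memory
```

The number of loop iterations, and therefore the cost, depends on what `policy` decides at run time. The returned trace is what you would inspect when debugging a bad run.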
Hybrid: agentic RAG
Who uses it: Team whose assistant must decide for itself when to retrieve
Agent loop runs as in pure agent
One of the agent's tools is 'retrieve from knowledge base'
Agent decides per turn whether to retrieve
Can re-retrieve with a rewritten query if first attempt is poor
Falls back to other tools when retrieval doesn't help
Why this works: Agentic RAG puts retrieval inside the agent loop. The diagram shows retrieval as one of several tools, so the agent spends retrieval calls only when needed and can route around poor retrieval to other tools.
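A rough sketch of the retrieve, re-retrieve, fall back behaviour. In a real agent the LLM would make these choices turn by turn; here a fixed rule stands in for that decision, and `retrieve`, `rewrite`, `fallback`, `min_hits`, and `max_retrievals` are all hypothetical names chosen for the example.

```python
from typing import Callable, List, Tuple


def agentic_answer(question: str,
                   retrieve: Callable[[str], List[str]],
                   rewrite: Callable[[str], str],
                   fallback: Callable[[str], str],
                   min_hits: int = 1,
                   max_retrievals: int = 2) -> Tuple[List[str], List[str]]:
    """Return (evidence, trace). Retrieval is treated as one tool among several."""
    trace: List[str] = []
    query = question
    for _ in range(max_retrievals):
        hits = retrieve(query)
        trace.append(f"retrieve({query!r}) -> {len(hits)} hits")
        if len(hits) >= min_hits:
            return hits, trace            # retrieval was good enough
        query = rewrite(query)            # poor result: try a rewritten query
    # Retrieval did not help: route to another tool instead.
    trace.append("fallback")
    return [fallback(question)], trace
```

The trace makes visible how many retrieval calls were spent before the agent either succeeded or routed elsewhere.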
Decision: which one for your use case
Who uses it: Engineer choosing an architecture
Single-turn factual Q&A → RAG
Multi-step task with actions → Agent
Knowledge work with occasional tool use → Agentic RAG
Strict cost cap → start with RAG; add agent only if needed
Audit / compliance needs → RAG (easier to trace)
Why this works: The choice is rarely about which is 'better'; it's about whether the task fits in one LLM call. The diagram helps you walk a stakeholder through the deciding question: 'does this need a loop?'
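The rules of thumb above can be written down as a small function. This is only one possible encoding: the field names are invented for the example, and the precedence (cost cap and audit needs checked first) is an assumption, since the list itself does not rank the rules.

```python
from dataclasses import dataclass


@dataclass
class UseCase:
    multi_step_actions: bool = False   # task needs several decisions and actions
    occasional_tool_use: bool = False  # knowledge work that sometimes needs a tool
    strict_cost_cap: bool = False
    audit_required: bool = False


def choose_architecture(uc: UseCase) -> str:
    """Map a use case to a starting architecture (assumed precedence, see above)."""
    if uc.strict_cost_cap or uc.audit_required:
        return "RAG"            # start simple; add an agent only if needed
    if uc.multi_step_actions:
        return "Agent"          # the task needs a loop
    if uc.occasional_tool_use:
        return "Agentic RAG"    # retrieval plus a few other tools
    return "RAG"                # single-turn factual Q&A fits in one call
```

For example, `choose_architecture(UseCase(multi_step_actions=True))` returns `"Agent"`, while adding `strict_cost_cap=True` to the same use case returns `"RAG"` as the starting point.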