We sometimes think RAG breaks because the model isn’t good enough.
But the failures are almost always systemic.
Here’s the uncomfortable bit:
RAG collapses because the preprocessing pipeline is unmonitored, not because the LLM lacks intelligence.
Run this checklist before you change anything downstream:
- Ingestion drift
Your extractor doesn’t produce the same structure week to week.
One collapsed heading = cascading retrieval failure.
- Chunking drift
Everyone treats chunking as a trivial step.
It is the single most fragile stage in the entire pipeline.
- Metadata drift
If doc IDs or hierarchy shift, the retriever becomes unpredictable.
- Embedding drift
Mixed model versions are more common than people admit.
- Retrieval config
Default top-k is a footgun.
- Eval sanity
Without a ground-truth eval set, you’re debugging noise.
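A few of these checks are cheap enough to script. Below is a minimal sketch of pre-deploy monitors for four of the items above (ingestion shape, chunk sizes, embedding versions, recall against ground truth). Everything here is illustrative: the block/record field names (`type`, `level`, `embedding_model`) and thresholds are assumptions, not any specific framework's API.

```python
import hashlib

def structure_fingerprint(blocks):
    """Ingestion drift: hash the extractor's output shape (block types and
    heading levels), ignoring text, so a collapsed heading changes the hash."""
    shape = "|".join(f"{b['type']}:{b.get('level', 0)}" for b in blocks)
    return hashlib.sha256(shape.encode()).hexdigest()[:12]

def chunk_size_drift(chunks, baseline_mean, tolerance=0.25):
    """Chunking drift: flag when mean chunk length moves past tolerance
    relative to last week's baseline."""
    mean_len = sum(len(c) for c in chunks) / len(chunks)
    return abs(mean_len - baseline_mean) / baseline_mean > tolerance

def check_embedding_versions(records):
    """Embedding drift: refuse an index that mixes embedding model versions."""
    versions = {r["embedding_model"] for r in records}
    if len(versions) > 1:
        raise ValueError(f"mixed embedding models in index: {sorted(versions)}")

def recall_at_k(retrieved_ids, relevant_ids, k):
    """Eval sanity: fraction of ground-truth chunks found in the top-k."""
    return len(set(retrieved_ids[:k]) & set(relevant_ids)) / len(relevant_ids)

# Example run against last week's baseline.
old_blocks = [{"type": "heading", "level": 1}, {"type": "para"}]
new_blocks = [{"type": "para"}, {"type": "para"}]  # heading collapsed by extractor
print(structure_fingerprint(old_blocks) == structure_fingerprint(new_blocks))  # False -> drift

print(chunk_size_drift(["a" * 100, "a" * 120], baseline_mean=110))  # False, within 25%

check_embedding_versions([{"embedding_model": "embed-v2"},
                          {"embedding_model": "embed-v2"}])  # passes; mixed would raise

print(recall_at_k(["c3", "c7", "c1"], relevant_ids=["c1", "c9"], k=3))  # 0.5
```

None of this needs an eval framework; the point is that each failure mode above has a one-function monitor you can run before touching the retriever or the model.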
Most RAG failures aren’t AI failures; they’re software engineering failures.