We sometimes think RAG breaks because the model isn’t good enough.
But the failures are almost always systemic.
Here’s the uncomfortable bit:
RAG collapses because the preprocessing pipeline is unmonitored, not because the LLM lacks intelligence.
Run this checklist before you change anything downstream:
- Ingestion drift
Your extractor doesn’t produce the same structure week to week.
One collapsed heading = cascading retrieval failure.
- Chunking drift
Everyone treats chunking as a trivial step.
It is the single most fragile stage in the entire pipeline.
- Metadata drift
If doc IDs or hierarchy shift, the retriever becomes unpredictable.
- Embedding drift
Mixed model versions are more common than people admit.
- Retrieval config
Default top-k is a footgun.
- Eval sanity
Without a ground-truth eval set, you’re debugging noise.
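A few of these checks are cheap enough to script. Below is a minimal sketch of pre-deploy monitors for four of the items above (ingestion shape, chunk sizes, embedding versions, recall against ground truth). Everything here is illustrative: the block/record field names (`type`, `level`, `embedding_model`) and thresholds are assumptions, not any specific framework's API.

```python
import hashlib

def structure_fingerprint(blocks):
    """Ingestion drift: hash the extractor's output shape (block types and
    heading levels), ignoring text, so a collapsed heading changes the hash."""
    shape = "|".join(f"{b['type']}:{b.get('level', 0)}" for b in blocks)
    return hashlib.sha256(shape.encode()).hexdigest()[:12]

def chunk_size_drift(chunks, baseline_mean, tolerance=0.25):
    """Chunking drift: flag when mean chunk length moves past tolerance
    relative to last week's baseline."""
    mean_len = sum(len(c) for c in chunks) / len(chunks)
    return abs(mean_len - baseline_mean) / baseline_mean > tolerance

def check_embedding_versions(records):
    """Embedding drift: refuse an index that mixes embedding model versions."""
    versions = {r["embedding_model"] for r in records}
    if len(versions) > 1:
        raise ValueError(f"mixed embedding models in index: {sorted(versions)}")

def recall_at_k(retrieved_ids, relevant_ids, k):
    """Eval sanity: fraction of ground-truth chunks found in the top-k."""
    return len(set(retrieved_ids[:k]) & set(relevant_ids)) / len(relevant_ids)

# Example run against last week's baseline.
old_blocks = [{"type": "heading", "level": 1}, {"type": "para"}]
new_blocks = [{"type": "para"}, {"type": "para"}]  # heading collapsed by extractor
print(structure_fingerprint(old_blocks) == structure_fingerprint(new_blocks))  # False -> drift

print(chunk_size_drift(["a" * 100, "a" * 120], baseline_mean=110))  # False, within 25%

check_embedding_versions([{"embedding_model": "embed-v2"},
                          {"embedding_model": "embed-v2"}])  # passes; mixed would raise

print(recall_at_k(["c3", "c7", "c1"], relevant_ids=["c1", "c9"], k=3))  # 0.5
```

None of this needs an eval framework; the point is that each failure mode above has a one-function monitor you can run before touching the retriever or the model.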
Most RAG failures aren’t AI failures; they’re software engineering failures.