How much longer will RAG be useful in light of Gemini’s 1M and 10M token context window and 99% accuracy?

The Gemini release was really interesting in that they sort of buried the lede by not mentioning the 99% retrieval accuracy across the context window.

OpenAI's 128k context window degrades pretty quickly and is really only 32k-64k if you care about your context actually being used.

Ideally you would just fit all your data into the 10M token context window, but as I understand it that's going to cost about $5 per request.

That's going to get expensive quickly for a lot of applications.
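
To put rough numbers on that, here's a back-of-the-envelope sketch in Python. The only figure taken from above is the roughly $5 for a full 10M-token prompt. Everything else is an assumption for illustration: price scaling linearly with prompt tokens, a hypothetical workload of 1,000 requests per day, and a RAG prompt of about 4,000 retrieved tokens. It also ignores output tokens and whatever the retrieval side (embeddings, vector store) costs.

```python
# Assumption from the post: a full 10M-token prompt costs about $5.
COST_PER_FULL_CONTEXT_USD = 5.0
FULL_CONTEXT_TOKENS = 10_000_000


def cost_per_request(prompt_tokens: int) -> float:
    """Input cost of one request, assuming price scales linearly with tokens."""
    price_per_token = COST_PER_FULL_CONTEXT_USD / FULL_CONTEXT_TOKENS
    return prompt_tokens * price_per_token


def monthly_cost(prompt_tokens: int, requests_per_day: int, days: int = 30) -> float:
    """Total input cost over `days` days at a fixed request rate."""
    return cost_per_request(prompt_tokens) * requests_per_day * days


# Hypothetical workload: 1,000 requests/day.
# Hypothetical RAG prompt: ~4,000 retrieved tokens instead of the full corpus.
full = monthly_cost(FULL_CONTEXT_TOKENS, 1_000)
rag = monthly_cost(4_000, 1_000)

print(f"full context: ${full:,.0f}/month")
print(f"RAG:          ${rag:,.2f}/month")
```

Under those assumptions the full-context approach lands around $150,000 a month versus about $60 for RAG, so the gap is a few orders of magnitude unless per-token prices drop a lot.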

The question is how long this will be the case. If RAG is only about cost savings, I can see it fading out over the next 1-2 years, with most people just pushing everything into the context window.

submitted by /u/brainhack3r