If you remember the post I wrote a few weeks ago, I made the passionate argument that you can't, or at least shouldn't, conflate context windows with information retrieval mechanisms (RAG, ETL, DTOs, and so on). So why am I bringing this up again? Well, Anthropic just did a surprise release of Claude 3, and I immediately thought, cool, let me go check out the specs. To my surprise, their context window specs were exactly as they were before. Hmmmm, very interesting. They improved the model another iteration, version 2 -> 3, but no context window enhancement. Almost. So close. Anthropic does say the models will start with the old 200K context window at launch, but that all three are capable of exceeding 1 million tokens and that *you can inquire about it now*. As an example, their mid-tier model, Sonnet, advertises its potential uses like this:

[screenshot of Sonnet's advertised use cases]
I'm just beating the drum over here by myself, with no support: you can't be calling this context. It's RAG. PERIOD. Just say it. Be transparent. That way, if I want to use your RAG-and-instant-embedding setup, I can willingly do that. But the cost is different; it's not the same thing; it's NOT inference. If it's not inference and it's just legacy compute, then be transparent about that.

Also, the main reason I wish people would stop doing this is, how can I put it, it's like they're not fighting on the same battlefield. One group is so far ahead it's going to become unreachable in a couple of iterations here, and you guys aren't seeing it. Point is: in the DeepMind paper Google released, there was one thing I missed in that paper that further proves my point.

[screenshot of the relevant passage from the paper]

Ahhh YEAH. ^^^ That thing right there. Very well said by Google, actually, but also, to me, an admission that this is not true, pure context. It's just RAG. And I can do my own RAG (a minimal sketch of what I mean is at the end of this post). Google is admitting here that this is adjacent to the model rather than something *of* the model. For the life of me, I don't know why more people aren't questioning this.

What I want is not you (Google / Anthropic / anyone) generating my RAG. I want you to give me and my robot memory. I want real memories that are long-term to me and my robot. This is, and has to be, the next step. Think of all the things you could do with persisted memory. You could have a personality: funny, curt, nice, mean, joyful. You could have a purpose or desire. You could have a task that runs every day at 9:00. You could tell it things and it would just recall them, because the memory is in a retrievable state like a human's is. Which memories the model should pull is a very important aspect of how this would all work (the second sketch below gestures at it).

What is not a model, and is not helpful, is vectorizing data and then NLP'ing a query against it, returning NIHS. RAG IS NOT CONTEXT; RAG IS NOT COHERENCE; RAG IS NOT MEMORY. I hope and pray OAI doesn't release something like this and call it a win. It would be very disappointing. I have a feeling we are going to find out soon.
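Since I keep saying "I can do my own RAG," here's the whole trick in miniature. This is my own hedged sketch, not any vendor's pipeline: the `embed` function is a toy bag-of-words stand-in for a real embedding model, and every name here is made up. The point it illustrates is that retrieval lives entirely *outside* the model; the model only ever sees whatever chunks the retriever pastes into the prompt.

```python
# A minimal, illustrative RAG loop. Toy embedding and names are
# assumptions for demonstration, not any real API.
from collections import Counter
import math

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words term counts. A real pipeline would
    # call an embedding model here.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class TinyRAG:
    def __init__(self):
        self.store = []  # list of (embedding, chunk) pairs

    def add(self, chunk: str):
        self.store.append((embed(chunk), chunk))

    def retrieve(self, query: str, k: int = 2):
        q = embed(query)
        ranked = sorted(self.store, key=lambda e: cosine(q, e[0]), reverse=True)
        return [chunk for _, chunk in ranked[:k]]

rag = TinyRAG()
rag.add("Claude 3 launches with a 200K context window.")
rag.add("The models are capable of exceeding 1 million tokens.")
rag.add("Sonnet is the mid-tier model.")

query = "What is the launch context window?"
retrieved = rag.retrieve(query)
# The retrieved chunks get pasted into the prompt. The model never
# "remembers" anything; it only attends over what retrieval returned.
prompt = "Context:\n" + "\n".join(retrieved) + f"\n\nQuestion: {query}"
print(prompt)
```

That prompt-stuffing step is the whole game, and it happens before inference, which is exactly why I say it shouldn't be billed or advertised as context.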
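And for contrast, a hypothetical sketch of the direction I actually want: memories that carry a timestamp and an importance weight, with recall that scores relevance, recency, and importance together instead of just nearest-neighbor similarity. Every name, weight, and formula here is an assumption of mine, not anything Google, Anthropic, or OAI has shipped.

```python
# A speculative sketch of persistent memory, NOT plain vector RAG:
# each memory has a timestamp and an importance weight, and recall
# blends relevance, recency, and importance. All values are made up.
import time
from dataclasses import dataclass, field

@dataclass
class Memory:
    text: str
    importance: float                          # 0..1, how much this mattered
    created: float = field(default_factory=time.time)

class MemoryStore:
    def __init__(self, half_life_s: float = 86_400.0):
        self.memories: list[Memory] = []
        self.half_life_s = half_life_s         # recency decays by half each day

    def remember(self, text: str, importance: float = 0.5):
        self.memories.append(Memory(text, importance))

    def _relevance(self, memory: Memory, cue: str) -> float:
        # Stand-in for semantic similarity: plain word overlap.
        m, c = set(memory.text.lower().split()), set(cue.lower().split())
        return len(m & c) / len(c) if c else 0.0

    def recall(self, cue: str, k: int = 3) -> list[Memory]:
        now = time.time()
        def score(mem: Memory) -> float:
            recency = 0.5 ** ((now - mem.created) / self.half_life_s)
            return self._relevance(mem, cue) + recency + mem.importance
        return sorted(self.memories, key=score, reverse=True)[:k]

store = MemoryStore()
store.remember("Robot runs a status check every day at 9:00.", importance=0.9)
store.remember("User prefers a curt, funny personality.", importance=0.7)
for mem in store.recall("what should happen at 9:00?"):
    print(mem.text)
```

The difference from the RAG sketch above is what gets scored: not just "which chunk looks like the query," but "which memory matters to *this* agent right now." That's the part I want the labs building into the model's world, not bolted on beside it.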