Hello World! ;)
First, a disclaimer: I'm not an AI developer - just a systems engineer who watches maybe a bit too much AI content on YouTube🤣, and my knowledge of how AI (LLMs in particular) works is rather rudimentary. But I've been puzzling over why current LLMs behave so unlike actual intelligence - and, more to the point, why they're prone to "hallucinations".
Had some drinks and a long conversation with Claude about this, so here are the thoughts that came out of it:
Missing Piece #1: Confidence Monitoring

We humans know what we don't know. Current LLMs are forced to always output something, even when they're basically guessing.
My idea: Instead of just generating tokens, output (token, confidence_score) tuples. Users see the detokenized text; the system tracks confidence curves across the sequence using confidence_score. When confidence drops below a threshold → roll back to a checkpoint where confidence was *high* and regenerate from there. A simple retry limit (2-3 attempts) prevents loops: if confidence is still low after retry_count retries, just say "I don't know" or "I'm not sure".
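To make the loop concrete, here's a minimal Python sketch of the idea. Everything here is hypothetical: `step` stands in for a model call that returns a (token, confidence_score) tuple, and the `attempt` argument stands in for resampling differently on a retry. A real system would track a full confidence curve and might roll back several tokens at once; this toy version only rolls back the single low-confidence token.

```python
THRESHOLD = 0.5   # confidence floor; below it we treat the model as guessing
MAX_RETRIES = 3   # after this many rollbacks, admit uncertainty

def generate(step, n_tokens, threshold=THRESHOLD, max_retries=MAX_RETRIES):
    """step(position, attempt) -> (token, confidence_score).
    Commit high-confidence tokens (each commit is a checkpoint);
    on low confidence, retry the position; give up after max_retries."""
    tokens, attempt = [], 0
    while len(tokens) < n_tokens:
        token, conf = step(len(tokens), attempt)
        if conf >= threshold:
            tokens.append(token)      # commit: this becomes the checkpoint
        else:
            attempt += 1              # roll back and resample this position
            if attempt > max_retries:
                return "I'm not sure."
    return " ".join(tokens)
```

The point of the sketch is just the control flow: confidence gates every commit, and the "I don't know" path is a first-class outcome rather than a failure mode.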
Bonus idea: Use temperature probing - generate multiple high-temperature samples and measure the density/variance of the responses as a confidence signal.
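A tiny sketch of the probing idea, under the assumption that you can cheaply draw several high-temperature samples: if they cluster on the same answer, confidence is high; if they scatter, the model is guessing. `sample_fn` is a hypothetical stand-in for a high-temperature model call.

```python
from collections import Counter

def probe_confidence(sample_fn, k=8):
    """Draw k high-temperature samples and use agreement as a confidence
    proxy: the fraction of samples matching the most common answer."""
    samples = [sample_fn() for _ in range(k)]
    _, top_count = Counter(samples).most_common(1)[0]
    return top_count / k
```

In practice you'd compare normalized answers (or embeddings) rather than raw strings, but exact-match agreement is enough to show the signal.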
Missing Piece #2: Rumination

We replay past conversations and reconsider our responses. AI should do this systematically.
My idea: A scheduled background process (cronjob?) reviews past responses, cross-references claims, does online research, and identifies potential errors. It then updates model weights and/or makes notes for future reference (fine-tuning?)
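As a systems-engineering sketch, the rumination pass could look like the following. `fact_check` is a hypothetical placeholder for real cross-referencing or online research; the output is a list of correction notes that a later fine-tuning or memory step could consume.

```python
import datetime

def ruminate(past_responses, fact_check):
    """Nightly 'rumination' pass (e.g. run from cron): re-examine past
    responses, flag the ones that fail a fact check, and emit notes."""
    notes = []
    for resp in past_responses:
        ok, correction = fact_check(resp)
        if not ok:
            notes.append({
                "original": resp,
                "correction": correction,
                "reviewed": datetime.date.today().isoformat(),
            })
    return notes
```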
Missing Piece #3: Hierarchical Memory System

LLMs can't learn from interactions outside the current context window. ChatGPT does have "memory", but it's just a database of specific facts or queries about/from the user.
My idea:
- Short-term memory: Temporary buffer for general knowledge candidates (could be produced during "rumination" phase), gets consolidated into base model during "sleep cycles," then wiped (fine-tuning?)
- Long-term user memory: Persistent user-specific context (preferences, ongoing projects, communication style) - it doesn't get fine-tuned on, only "remembered" as part of the context of each chat.
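The two tiers above can be sketched as a toy data structure. Everything here is a hypothetical stand-in: `base_model` plays the role of model weights, `sleep_cycle` plays the role of a fine-tuning run, and `build_context` shows user memory being prepended to the prompt rather than trained on.

```python
class Memory:
    """Toy model of the two memory tiers described above."""

    def __init__(self):
        self.short_term = []     # general-knowledge candidates (buffer)
        self.base_model = set()  # stand-in for model weights
        self.user_memory = {}    # user_id -> persistent user facts

    def note_candidate(self, fact):
        self.short_term.append(fact)          # e.g. from the rumination phase

    def sleep_cycle(self):
        self.base_model.update(self.short_term)  # "fine-tune" on the buffer
        self.short_term.clear()                  # then wipe it

    def remember_user(self, user_id, fact):
        self.user_memory.setdefault(user_id, []).append(fact)

    def build_context(self, user_id, prompt):
        facts = "; ".join(self.user_memory.get(user_id, []))
        return f"[user memory: {facts}]\n{prompt}"   # remembered, not trained on
```

The key design point is the asymmetry: the short-term buffer flows into the weights and is destroyed, while user memory never touches the weights at all.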
I asked Claude to help me consolidate my ideas and put them into a Reddit-post-friendly format. The above is the result. The actual conversation I had with Claude is much... MUCH longer! 😂 And some of the technical details went much deeper during the conversation. I asked Claude to help me boil it down to the essence of my ideas. Just want to be absolutely clear and transparent! ;)
Now that that's been said... What do you, AI gurus, think about these ideas? Again: I'm not an AI developer, I'm just a systems engineer... with a systems engineer's way of thinking about problems ;)
Kind regards - D.