My thoughts on AI vs human intelligence, and how to maybe make them closer.

Hello World! ;)

First, a disclaimer: I'm not an AI developer - just a systems engineer who watches maybe a bit too much AI content on YouTube🤣, and my knowledge of how AI (LLMs in particular) works is rather rudimentary. But I've been puzzling over why current LLMs behave so unlike actual intelligence - and more to the point, why they're prone to "hallucinations".

I had some drinks and a long conversation with Claude about this, so here are the thoughts that came out of it:

Missing Piece #1: Confidence Monitoring

We humans know what we don't know. Current LLMs are forced to always output something, even when they're basically guessing.

My idea: Instead of just generating tokens, output (token, confidence_score) tuples. Users see the detokenized text, while the system tracks confidence curves across the sequence using confidence_score. When confidence drops below a threshold → roll back to a checkpoint where confidence was *high* and try re-generating the rest. A simple retry limit (2-3 attempts) prevents loops: after retry_count retries, if confidence remains low, just say "I don't know" or "I'm not sure".
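To make the rollback-and-retry idea concrete, here's a minimal Python sketch. Everything model-related is a placeholder: `generate_step` is a hypothetical stand-in for one LLM decoding step (a real one might use the max softmax probability or the distribution's entropy as the confidence signal), and the thresholds are made up for illustration.

```python
import random

def generate_step(context):
    """Hypothetical stand-in for one LLM decoding step.
    Returns a (token, confidence_score) tuple, as proposed above."""
    token = random.choice(["foo", "bar", "baz"])
    confidence = random.uniform(0.0, 1.0)
    return token, confidence

def generate_with_rollback(prompt, max_tokens=50, threshold=0.4, max_retries=3):
    """Generate (token, confidence) pairs; when confidence drops below the
    threshold, roll back to the last high-confidence checkpoint and retry.
    After max_retries failed attempts, give up and admit uncertainty."""
    tokens = []      # list of (token, confidence) tuples
    checkpoint = 0   # index of the last position where confidence was high
    retries = 0
    while len(tokens) < max_tokens:
        context = prompt + " " + " ".join(t for t, _ in tokens)
        token, conf = generate_step(context)
        if conf < threshold:
            if retries >= max_retries:
                # Retry budget exhausted: keep the high-confidence prefix
                # and say so, instead of guessing.
                return prompt, tokens[:checkpoint], "I'm not sure about the rest."
            retries += 1
            tokens = tokens[:checkpoint]  # roll back and re-generate
            continue
        tokens.append((token, conf))
        if conf > 0.8:  # strong step: advance the checkpoint, reset retries
            checkpoint = len(tokens)
            retries = 0
    return prompt, tokens, None
```

The key design choice is that the checkpoint only advances on *high*-confidence tokens, so a rollback never lands in the middle of a shaky stretch.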

Bonus idea: Use temperature probing - generate multiple high-temperature samples and measure the density/variance of the responses as a confidence signal.
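A sketch of what that probing could look like, using agreement between samples as the simplest possible "density" measure. The two toy "models" are stand-ins for calling the real model several times at high temperature:

```python
import random
from collections import Counter

def probe_confidence(sample_fn, n_samples=8):
    """Draw n high-temperature samples and measure how tightly they cluster:
    here, the fraction that agree with the most common answer.
    1.0 = total agreement; near 1/n = the model is all over the place."""
    samples = [sample_fn() for _ in range(n_samples)]
    answer, count = Counter(samples).most_common(1)[0]
    return answer, count / n_samples

# Toy stand-ins for "sample one completion at high temperature":
confident_model = lambda: "Paris"                        # always the same answer
uncertain_model = lambda: random.choice(["42", "7", "13", "99"])  # scattered

probe_confidence(confident_model)   # -> ("Paris", 1.0)
probe_confidence(uncertain_model)   # most common answer, with a low agreement fraction
```

Exact string agreement is crude, of course - a real version would compare samples semantically (e.g. via embeddings) - but the shape of the signal is the same.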

Missing Piece #2: Rumination

We replay past conversations and reconsider our responses. AI should do this systematically.

My idea: A scheduled background process (a cronjob?) reviews past responses, cross-references claims, does online research, and identifies potential errors. It then updates model weights and/or makes notes for future reference (fine-tuning?)
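A rough sketch of one such "rumination" pass. Both `verify_claim` and the JSONL notes file are hypothetical placeholders - a real system would plug in an actual search/knowledge-base lookup, and the notes would feed a later fine-tuning step:

```python
import json
import time

def verify_claim(claim):
    """Placeholder fact-checker; a real one would query the web or a
    knowledge base and return confirmed / refuted / unverified."""
    return {"claim": claim, "verdict": "unverified"}

def ruminate(conversation_log, notes_path="rumination_notes.jsonl"):
    """One rumination pass: re-read past responses, check each claim,
    and append notes that a later fine-tuning step could consume."""
    notes = []
    for entry in conversation_log:
        for claim in entry.get("claims", []):
            result = verify_claim(claim)
            if result["verdict"] != "confirmed":
                notes.append({"response_id": entry["id"],
                              "claim": claim,
                              "verdict": result["verdict"],
                              "checked_at": time.time()})
    # Append-only log, so repeated cron runs accumulate notes over time.
    with open(notes_path, "a") as f:
        for note in notes:
            f.write(json.dumps(note) + "\n")
    return notes
```

Run nightly from cron, this is basically a "review what I said today and flag anything I can't back up" loop.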

Missing Piece #3: Hierarchical Memory System

LLMs can't learn from interactions outside the current context window. ChatGPT does have "memory", but it's just a database of specific data or queries about/from the user.

My idea:

  • Short-term memory: a temporary buffer for general-knowledge candidates (could be produced during the "rumination" phase); gets consolidated into the base model during "sleep cycles", then wiped (fine-tuning?)
  • Long-term user memory: persistent user-specific context (preferences, ongoing projects, communication style) - doesn't get "fine-tuned" on, only "remembered", as part of the context of each "chat".
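The two tiers above can be sketched as a small class. The `fine_tune_fn` passed to the sleep cycle is a hypothetical hook - in reality it would kick off a fine-tuning job on the buffered facts:

```python
class HierarchicalMemory:
    """Two-tier memory sketch: a short-term buffer of general-knowledge
    candidates (consolidated into the base model during a 'sleep cycle',
    then wiped) and persistent per-user long-term memory (never trained
    on, only injected into each chat's context)."""

    def __init__(self):
        self.short_term = []   # general-knowledge candidates from rumination
        self.long_term = {}    # user_id -> list of facts/preferences

    def note_candidate(self, fact):
        self.short_term.append(fact)

    def remember_user(self, user_id, fact):
        self.long_term.setdefault(user_id, []).append(fact)

    def sleep_cycle(self, fine_tune_fn):
        """Consolidate the buffer into the model (via the hypothetical
        fine_tune_fn), then wipe the buffer."""
        fine_tune_fn(self.short_term)
        self.short_term = []

    def build_context(self, user_id, prompt):
        """Prepend long-term user memory to the chat context -
        remembered, not trained on."""
        facts = self.long_term.get(user_id, [])
        return "\n".join(f"[memory] {f}" for f in facts) + "\n" + prompt
```

The asymmetry is the point: general knowledge flows *into the weights* and out of the buffer, while user-specific context stays *outside the weights* forever.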

I asked Claude to help me consolidate my ideas and put them into a Reddit-post-friendly format. The above is the result. The actual conversation I had with Claude is much... MUCH longer! 😂 And some technical details went much deeper during the conversation. I asked Claude to help me boil it down to the essence of my ideas. Just want to be absolutely clear and transparent! ;)

Now that that's been said... what do you, AI gurus, think about these ideas? Again: I'm not an AI developer, I'm just a systems engineer... with a systems engineer's way of thinking about problems ;)

Kind regards - D.

submitted by /u/Tall_Space2261