<span class="vcard">/u/Future_AGI</span>

AI models are getting smarter but we’re getting dumber about how we deploy them

flash models. quantized variants. distilled twins. not breakthroughs, patches. because the real problem isn’t model capability, it’s infra stupidity. everyone’s racing to scale training runs, but inference is where things break:
– token bottlenecks kil…