/u/LahmacunBear

Unifying Probabilistic Learning in Transformers

NEW PAPER: Unifying Probabilistic Learning in Transformers. What if attention, diffusion, reasoning, and training were all the same thing? Our paper proposes a novel, unified way of understanding AI, and it looks a lot like quantum mechanics. Intellig…

Cheaper, Faster, Better Transformers. ELiTA: Linear-Time Attention Done Right

Yes, it's another Transformer architecture that aims to be cheaper and faster, but no, this is not the same. All of the improvements come from equations and architectural changes, with no hardware or code tricks. The performance is very good, testing on…
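
The preview cuts off before any of ELiTA's actual equations, so the sketch below is not the paper's method: it is a minimal, generic kernelized linear-time attention (in the style of Katharopoulos et al.), included only to illustrate the class of technique the title refers to. The elu-plus-one feature map and the normalization are assumptions, not anything stated in the post.

```python
import torch
import torch.nn.functional as F

def linear_attention(q, k, v, eps=1e-6):
    """Generic (non-causal) linear-time attention sketch.

    q, k, v: (batch, seq, dim). Cost is O(seq * dim^2) rather than the
    O(seq^2 * dim) of standard softmax attention.
    """
    # Positive feature map standing in for softmax's exp(); elu(x) + 1
    # is a common choice in the linear-attention literature (an assumption
    # here, not ELiTA's stated feature map).
    q = F.elu(q) + 1
    k = F.elu(k) + 1
    # Associativity trick: form K^T V once -- a (dim, dim) summary --
    # so the (seq, seq) attention matrix is never materialized.
    kv = torch.einsum("bsd,bse->bde", k, v)
    # Per-query normalizer q_i . sum_j k_j, replacing the softmax denominator.
    z = 1.0 / (torch.einsum("bsd,bd->bs", q, k.sum(dim=1)) + eps)
    return torch.einsum("bsd,bde,bs->bse", q, kv, z)

if __name__ == "__main__":
    q = k = v = torch.randn(2, 128, 64)
    print(linear_attention(q, k, v).shape)  # torch.Size([2, 128, 64])
```

The key design point this illustrates: because the feature-mapped scores factor, the sum over keys can be computed before the query is applied, which is what turns quadratic attention linear in sequence length.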