/u/Defiant_Confection15

What if attention didn’t need matrix multiplication?

I built a cognitive architecture where all computation reduces to three bit operations: XOR, MAJ, POPCNT. No GEMM. No GPU. No floating-point weights. The core idea: transformer attention is a similarity computation. Float32 cosine computes it with 24,5…
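To make the idea concrete, here is a minimal sketch (mine, not the author's code) of replacing a float dot-product similarity with pure bit operations: binarize the vectors, then compute similarity with XOR + POPCNT, and combine vectors with MAJ. The 64-bit width, sign-based binarization, and all function names are my assumptions for illustration.

```python
# Sketch: attention-style similarity from XOR, MAJ, POPCNT only.
# Requires Python 3.10+ for int.bit_count().
import random

WIDTH = 64  # assumed bit width of a binarized head dimension

def binarize(vec):
    """Pack the signs of a float vector into one integer (1 bit per dimension)."""
    bits = 0
    for i, x in enumerate(vec):
        if x >= 0.0:
            bits |= 1 << i
    return bits

def hamming_similarity(a_bits, b_bits):
    """Similarity via XOR + POPCNT: matching bits minus mismatching bits."""
    mismatches = (a_bits ^ b_bits).bit_count()  # POPCNT of the XOR
    return WIDTH - 2 * mismatches               # ranges over [-WIDTH, +WIDTH], like a dot product

def majority(bit_vectors):
    """MAJ: each output bit is the per-position majority vote across the inputs."""
    out = 0
    for pos in range(WIDTH):
        votes = sum((v >> pos) & 1 for v in bit_vectors)
        if votes * 2 > len(bit_vectors):
            out |= 1 << pos
    return out

if __name__ == "__main__":
    q = [random.uniform(-1, 1) for _ in range(WIDTH)]
    k = [random.uniform(-1, 1) for _ in range(WIDTH)]
    qb, kb = binarize(q), binarize(k)
    print("Hamming similarity:", hamming_similarity(qb, kb))
    print("MAJ of {q, k, q}:", bin(majority([qb, kb, qb])))
```

On modern CPUs the XOR and POPCNT steps map directly to single instructions, which is the reason no GEMM or GPU is needed for this kind of similarity score.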

A 135M model achieves coherent output on a laptop CPU. Scaling is σ compensation, not intelligence.

SmolLM2 135M. Lenovo T14 CPU. No GPU. No RLHF. No BPE. Coherent, non-sycophantic, contextually appropriate output on the first message, with no prior context window. The same base model under the standard pipeline: garbage. What changed:

• BPE replaced with geometric has…