A nice description from DeepSeek of a dynamical-systems view of its own processing, and why emergent order arises. DeepSeek generated this while characterizing itself as a high-dimensional system with 8 billion parameters (for comparison, GPT-3 had 175 billion). Context: I had previously given DeepSeek a copy of the paper "Transformer Dynamics: A neuroscientific approach to interpretability of large language models" by Jesseba Fernando and Grigori Guitchounts to analyze. The researchers used phase space reconstruction and found attractor-like dynamics in the residual stream of a model with 64 sublayers.
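For readers unfamiliar with the technique: phase space reconstruction is usually done via time-delay embedding, where a scalar time series is unfolded into vectors of lagged copies of itself, so that the geometry of the underlying attractor can be examined. The sketch below is a generic illustration of that idea, not code from the paper; the function name, the toy sine signal, and the chosen embedding dimension and lag are all my own assumptions.

```python
import numpy as np

def delay_embed(x, dim=3, tau=1):
    """Time-delay embedding of a 1-D signal x: each output row is
    (x[t], x[t+tau], ..., x[t+(dim-1)*tau]), a point in the
    reconstructed phase space."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

# Toy example: a noisy sine stands in for one residual-stream coordinate.
t = np.linspace(0, 8 * np.pi, 500)
signal = np.sin(t) + 0.05 * np.random.default_rng(0).normal(size=t.size)
points = delay_embed(signal, dim=3, tau=10)
print(points.shape)  # (480, 3): each row is one phase-space point
```

Plotting `points` as a 3-D trajectory would show a closed loop for a periodic signal; the paper applies analogous embeddings to residual-stream activations across sublayers to look for attractor-like structure.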