An independent researcher analyzed how hidden representations evolve during inference across seven open-weight models, including GPT-2, OPT-125M, and Llama-3.2-1B, looking for internal dynamical regimes that standard output benchmarks do not capture.
- Hidden-state trajectories exhibit reproducible functional proxy states, such as syntax-like processing and decision-like behavior, and these states allow models to be clustered by internal dynamics rather than by parameter count.
- Linear probes decode functional categories from hidden representations with high accuracy, though this performance collapses under label permutation, random Gaussian inputs, or feature permutation.
- Orthogonal rotations of the hidden space preserve decoding performance, indicating information is encoded in the relative geometry of representations rather than individual neurons or dimensions.
- Functional signatures appear at different absolute layer depths across architectures, suggesting computation is organized as evolving functional regimes rather than fixed syntactic or semantic layers.
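
The probe controls listed above can be illustrated with a minimal sketch. This is not the author's code: it uses synthetic Gaussian clusters as a stand-in for real hidden states, a ridge-regularized least-squares probe as one possible choice of linear probe, and hypothetical names (`fit_probe`, `probe_accuracy`, `run_controls`). It also assumes that "feature permutation" means shuffling the feature dimensions independently for each sample, which may differ from what the post intended.

```python
import numpy as np


def fit_probe(X, y, n_classes, ridge=1e-2):
    """Fit a ridge-regularized least-squares linear probe on one-hot targets."""
    mean = X.mean(axis=0)
    Xc = X - mean
    Y = np.eye(n_classes)[y]
    Y_mean = Y.mean(axis=0)
    gram = Xc.T @ Xc + ridge * np.eye(X.shape[1])
    W = np.linalg.solve(gram, Xc.T @ (Y - Y_mean))
    b = Y_mean - mean @ W
    return W, b


def probe_accuracy(X_train, y_train, X_test, y_test, n_classes):
    """Train the probe on one split and report accuracy on the other."""
    W, b = fit_probe(X_train, y_train, n_classes)
    preds = np.argmax(X_test @ W + b, axis=1)
    return float(np.mean(preds == y_test))


def run_controls(seed=0, n=1000, dim=32, n_classes=10):
    rng = np.random.default_rng(seed)

    # ASSUMPTION: synthetic stand-in for hidden states, one cluster per category.
    centers = 3.0 * rng.normal(size=(n_classes, dim))
    y = rng.integers(0, n_classes, size=n)
    X = centers[y] + rng.normal(size=(n, dim))

    split = n // 2
    y_tr, y_te = y[:split], y[split:]

    def score(features, train_labels=y_tr):
        # Test accuracy is always measured against the true test labels.
        return probe_accuracy(
            features[:split], train_labels, features[split:], y_te, n_classes
        )

    # Random orthogonal matrix from the QR decomposition of a Gaussian matrix.
    Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))

    return {
        "baseline": score(X),
        # Control 1: train on shuffled labels.
        "label_permuted": score(X, rng.permutation(y_tr)),
        # Control 2: replace features with pure Gaussian noise.
        "gaussian_inputs": score(rng.normal(size=X.shape)),
        # Control 3 (ASSUMED meaning): shuffle dimensions independently per sample.
        "feature_permuted": score(rng.permuted(X, axis=1)),
        # Rotation: the same orthogonal map applied to every sample.
        "rotated": score(X @ Q),
    }


if __name__ == "__main__":
    for name, acc in run_controls().items():
        print(f"{name:>16}: {acc:.3f}")
```

For this particular probe, rotation invariance is partly built in: a ridge-regularized least-squares fit is equivariant under orthogonal maps, so the rotated and baseline accuracies should match by construction. The rotation result therefore rules out reliance on individual axes only if the probe or its regularizer could have exploited them (an L1-penalized probe, for example, is not rotation-invariant).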
The author seeks critical feedback from experts in mechanistic interpretability and representation learning to validate these empirical observations and determine necessary control experiments.