The authors propose H-Res (Hierarchical Residual Steering), a mechanism that adapts large Transformer models by modulating their effective energy landscape without altering global equilibrium or expanding sequence length. This approach formulates adaptation as a control problem on the activation manifold to steer token trajectories into task-specific basins of attraction.
- H-Res learns a state-dependent vector field to guide retrieval dynamics, avoiding both the catastrophic interference caused by weight modification and the capacity degradation caused by static prompts.
- The method formally preserves the attention entropy of the foundation model and facilitates Neural Collapse.
- Empirical results show that H-Res outperforms global weight modification by 26% on associative retrieval tasks while eliminating the computational overhead that prompt-based methods incur by lengthening the input sequence.
This technique enables efficient adaptation to new tasks in structured domains, offering a scalable alternative to existing parameter-efficient fine-tuning strategies.
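
To make the idea of a state-dependent vector field on the residual stream concrete, here is a minimal sketch in plain Python. It is not the authors' parameterization: the low-rank form `v(h) = B tanh(A h)`, the zero initialization of `B`, the toy `frozen_block`, and all names and dimensions are assumptions chosen for illustration. It shows only the three properties stated above: the base weights are never modified, the correction depends on the current activation, and the sequence length is unchanged.

```python
import math
import random


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xj for m, xj in zip(row, x)) for row in M]


def frozen_block(h, W):
    """Toy stand-in for a frozen Transformer sub-layer: h + tanh(W h)."""
    return [hi + math.tanh(ui) for hi, ui in zip(h, matvec(W, h))]


def steering_field(h, A, B):
    """Hypothetical low-rank, state-dependent field v(h) = B tanh(A h)."""
    z = [math.tanh(u) for u in matvec(A, h)]
    return matvec(B, z)


def steered_block(h, W, A, B):
    """Frozen block output plus the steering correction; W is only read."""
    base = frozen_block(h, W)
    v = steering_field(h, A, B)
    return [b + s for b, s in zip(base, v)]


rng = random.Random(0)
d, r, seq_len = 8, 2, 5  # hidden size, steering rank, tokens (all illustrative)

W = [[rng.gauss(0, 0.3) for _ in range(d)] for _ in range(d)]  # frozen weights
A = [[rng.gauss(0, 0.3) for _ in range(d)] for _ in range(r)]  # trainable, r x d
B = [[0.0] * r for _ in range(d)]  # trainable, d x r, zero-initialized (assumption)

tokens = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(seq_len)]
W_before = [row[:] for row in W]

# With B = 0 the steered model reproduces the frozen model exactly.
base_out = [frozen_block(h, W) for h in tokens]
steered_out = [steered_block(h, W, A, B) for h in tokens]

# A nonzero B (standing in for a trained one) shifts each token's activation
# by an amount that depends on that token's own state.
B_trained = [[rng.gauss(0, 0.3) for _ in range(r)] for _ in range(d)]
trained_out = [steered_block(h, W, A, B_trained) for h in tokens]
max_shift = max(
    abs(s - b)
    for s_vec, b_vec in zip(trained_out, base_out)
    for s, b in zip(s_vec, b_vec)
)

print("identical at init:", steered_out == base_out)
print("tokens in / out:", len(tokens), len(trained_out))
print("base weights untouched:", W == W_before)
print("largest per-coordinate shift:", max_shift)
```

In this sketch only `A` and `B` would be trained, which adds `2 * d * r` parameters per steered layer and no extra tokens. Whether H-Res uses a low-rank form, a nonlinearity, or a zero initialization is not stated in the summary above.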