Carles Marin has released an open-source, bilingual (English and Spanish) guide that connects the mathematical foundations of Transformer architectures to their practical implementation. The resource focuses on low-level mechanics and pairs its explanations with reproducible code and interactive elements. It covers the following areas:
- Attention Dynamics: Covers from-scratch implementations of attention and an analysis of attention collapse.
- Context & Memory: Explores KV-cache compression techniques and challenges related to long-context windows.
- Advanced Concepts: Includes explanations of grokking, optimization strategies, and structural analysis.
- Interactive Tools: Features the TAF Agent framework for browser-based LLM testing alongside theoretical explanations.
The guide aims to serve as a comprehensive educational resource for understanding Transformer internals. The author invites community feedback on attention state visualization and optimization techniques.
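For readers unfamiliar with what a from-scratch attention implementation involves, the following is a minimal sketch of standard scaled dot-product attention in plain Python. It is not taken from the guide: the function names `softmax`, `attention`, and `entropy` are hypothetical, and using the entropy of the attention weights as a diagnostic is an assumption made here for illustration, since the summary does not say how the guide defines or measures attention collapse.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention over lists of vectors.

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k
    values:  list of vectors (one per key)
    Returns (outputs, weights), one row per query.
    """
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Dot product of the query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        w = softmax(scores)
        # Weighted sum of the value vectors.
        out = [
            sum(wj * v[i] for wj, v in zip(w, values))
            for i in range(len(values[0]))
        ]
        weights.append(w)
        outputs.append(out)
    return outputs, weights


def entropy(weight_row):
    """Shannon entropy (in nats) of one row of attention weights.

    Illustrative diagnostic only (an assumption, not the guide's method):
    values near 0 mean a single key receives almost all the weight.
    """
    return -sum(p * math.log(p) for p in weight_row if p > 0)


if __name__ == "__main__":
    K = [[1.0, 0.0], [0.0, 1.0]]
    V = [[1.0, 2.0], [3.0, 4.0]]
    outputs, weights = attention([[4.0, 0.0]], K, V)
    print(weights[0], entropy(weights[0]))
```

A query that is orthogonal to every key (for example the zero vector) produces uniform weights, so the output is the mean of the value vectors and the entropy equals the logarithm of the number of keys. A query strongly aligned with one key pushes the weights toward that key and the entropy toward zero.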