Carles Marin has released an open-source, bilingual (English and Spanish) guide that connects the mathematical foundations of Transformer architectures to their practical implementation. It focuses on low-level mechanics and uses reproducible code and interactive elements to explain the topics below.

  • Attention Dynamics: Covers from-scratch implementations of attention and an analysis of attention collapse.
  • Context & Memory: Explores KV-cache compression techniques and challenges related to long-context windows.
  • Advanced Concepts: Includes explanations of grokking, optimization strategies, and structural analysis.
  • Interactive Tools: Features the TAF Agent framework for browser-based LLM testing alongside theoretical explanations.
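
To give a sense of what a from-scratch treatment of attention can look like, here is a minimal sketch of scaled dot-product attention in plain Python. It is not taken from the guide. The function names (`softmax`, `attention`, `row_entropy`) and the toy matrices are hypothetical, and "attention collapse" is assumed here to mean attention rows concentrating almost all of their weight on a single token, measured by low row entropy. The guide may define and analyze it differently.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V):
    """Scaled dot-product attention over lists of row vectors.

    Returns (outputs, weights), where weights[i][j] is how much
    query i attends to key j.
    """
    d_k = len(K[0])
    weights = []
    for q in Q:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in K
        ]
        weights.append(softmax(scores))
    d_v = len(V[0])
    outputs = [
        [sum(w * v[j] for w, v in zip(row, V)) for j in range(d_v)]
        for row in weights
    ]
    return outputs, weights

def row_entropy(row):
    # Shannon entropy (in nats) of one attention row.
    # Assumption: low entropy is used as a simple proxy for collapse.
    return -sum(p * math.log(p) for p in row if p > 0.0)

# Toy inputs (hypothetical): 2 queries, 3 keys/values, dimension 2.
Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
V = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

out, weights = attention(Q, K, V)
entropies = [row_entropy(r) for r in weights]

# Scaling the queries inflates the logits, so each row approaches one-hot.
Q_sharp = [[50.0 * x for x in q] for q in Q]
sharp_out, sharp_weights = attention(Q_sharp, K, V)
sharp_entropies = [row_entropy(r) for r in sharp_weights]

print("entropy per query:", entropies)
print("entropy with scaled queries:", sharp_entropies)
```

Comparing the two entropy lists shows the effect of large logits: the scaled queries give rows that are nearly one-hot, so their entropy drops toward zero, while the original rows stay closer to the uniform maximum of log(3).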

The guide aims to serve as a comprehensive educational resource for understanding Transformer internals. The author invites community feedback on attention state visualization and optimization techniques.