First-Token Broadcasters in Transformers: Language Identity and Robustness
LIHA identifies a small set of first-token broadcaster heads in GPT-2: attention heads that persistently attend to the initial prompt token and drive language switches. Instruction tuning reorganizes these circuits and concentrates language identity in the early layers, as seen in Qwen2.5-1.5B-Instruct and confirmed for Chinese and Russian language handling at layer 0.
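To make "persistently attend to the initial prompt token" concrete, the sketch below scores each head by the average attention mass it places on key position 0 and flags heads above a cutoff. It is an illustration only and is not taken from LIHA: the function names, the nested-list layout [layer][head][query][key], and the 0.8 threshold are all assumptions, and the toy weights are made up for the example.

```python
from typing import List, Tuple

# Assumed layout: attn[layer][head][query][key], each row summing to 1.
Attention = List[List[List[List[float]]]]


def first_token_scores(attn: Attention) -> List[List[float]]:
    """Mean attention mass each head places on key position 0.

    Query position 0 is skipped: under causal masking it can only attend
    to itself, so it would inflate every head's score equally.
    """
    scores = []
    for layer in attn:
        layer_scores = []
        for head in layer:
            rows = head[1:]  # queries 1..T-1
            if not rows:
                layer_scores.append(0.0)
                continue
            layer_scores.append(sum(row[0] for row in rows) / len(rows))
        scores.append(layer_scores)
    return scores


def find_broadcasters(attn: Attention, threshold: float = 0.8) -> List[Tuple[int, int]]:
    """Return (layer, head) pairs whose first-token score meets the threshold.

    The threshold is a hypothetical cutoff chosen for illustration.
    """
    return [
        (layer_idx, head_idx)
        for layer_idx, layer_scores in enumerate(first_token_scores(attn))
        for head_idx, score in enumerate(layer_scores)
        if score >= threshold
    ]


# Toy input: 1 layer, 2 heads, 3 tokens, causal (upper triangle is zero).
toy_attention: Attention = [[
    [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.85, 0.05, 0.10]],  # head 0: fixates on token 0
    [[1.0, 0.0, 0.0], [0.2, 0.8, 0.0], [0.10, 0.30, 0.60]],  # head 1: mostly local
]]

broadcasters = find_broadcasters(toy_attention)
```

On the toy input only head 0 of layer 0 is flagged. With a real model, the same scoring would be averaged over many prompts before calling a head persistent; a single prompt is not enough to support that claim.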