A new method uses program synthesis to generate Python programs that reproduce attention patterns in transformer models. Fewer than 1,000 such programs achieve over 75% intersection-over-union similarity on TinyStories, and replacing 25% of attention heads with these programs increases perplexity by only 16% while preserving performance on question-answering tasks.
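To make the intersection-over-union figure concrete, here is a minimal sketch of how such a score could be computed for a single attention head. The summary does not say how attention weights are binarized or how the programs are structured, so everything here is an assumption for illustration: the `threshold` value, the set-of-pairs representation, and the helper names `attention_iou` and `previous_token_program`.

```python
def attention_iou(attn, predicted, threshold=0.1):
    """IoU between a thresholded attention matrix and a predicted set of
    (query, key) index pairs. The threshold is an illustrative assumption."""
    observed = {
        (i, j)
        for i, row in enumerate(attn)
        for j, weight in enumerate(row)
        if weight > threshold
    }
    union = observed | predicted
    if not union:
        # Both patterns are empty, so treat them as identical.
        return 1.0
    return len(observed & predicted) / len(union)


def previous_token_program(seq_len):
    """Hypothetical synthesized program: each position attends to the
    token immediately before it."""
    return {(i, i - 1) for i in range(1, seq_len)}


# Toy causal attention matrix for a 4-token sequence (rows are queries).
attn = [
    [1.00, 0.00, 0.00, 0.00],
    [0.95, 0.05, 0.00, 0.00],
    [0.05, 0.90, 0.05, 0.00],
    [0.00, 0.30, 0.65, 0.05],
]

score = attention_iou(attn, previous_token_program(4))
print(f"IoU = {score:.2f}")
```

In this toy case the program captures the three previous-token links but misses two other cells that pass the threshold, so it scores below the 75% level quoted above.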
Reverse-Engineering Transformer Attention with Executable Programs