The article introduces Program-as-Weights (PAW), a paradigm that compiles natural-language specifications into compact, locally executable neural artifacts that replace calls to large language model APIs. The approach aims to improve locality and reproducibility and to lower cost by treating foundation models as tool builders rather than per-input problem solvers.

  • PAW uses a 4B-parameter compiler trained on FuzzyBench, a newly released dataset of 10M examples.
  • The compiler emits parameter-efficient adapters for a frozen, lightweight interpreter, a 0.6B-parameter Qwen3 model.
  • Reported performance matches direct prompting of Qwen3-32B while using roughly one-fiftieth of the inference memory.
  • On a MacBook M3, the interpreter runs at 30 tokens/s, so calls after compilation are cheap and work offline.
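
The "one-fiftieth" memory figure is roughly what a plain parameter-count comparison gives. The sketch below is a back-of-the-envelope check, not the article's measurement: it assumes both models store weights at the same precision (2 bytes per parameter is an illustrative assumption) and ignores adapter weights, activations, and the KV cache. The function name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params: float, bytes_per_param: float = 2.0) -> float:
    """Approximate weight-only memory in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Parameter counts taken from the summary; the precision is an assumption.
interpreter_gb = weight_memory_gb(0.6e9)  # frozen 0.6B Qwen3 interpreter
baseline_gb = weight_memory_gb(32e9)      # Qwen3-32B prompted directly

# With equal precision the ratio depends only on the parameter counts.
ratio = baseline_gb / interpreter_gb

print(f"interpreter weights: {interpreter_gb:.1f} GB")
print(f"baseline weights:    {baseline_gb:.1f} GB")
print(f"baseline / interpreter: {ratio:.1f}x")
```

Because the precision cancels out, the ratio stays near 53 whatever byte width is assumed, which is consistent with "roughly one-fiftieth".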

In practice, the foundation model is needed only at compile time: it produces a small, reusable artifact once, and later inputs are handled by that artifact, which reduces the need for expensive API calls.
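
The cost argument can be made concrete with a simple break-even model. This is a sketch under stated assumptions, not something the article provides: it assumes a one-time compile cost, a fixed per-call API cost, and a fixed (possibly zero) per-call local cost, all in the same arbitrary units. The function `break_even_calls` and the values passed to it are hypothetical placeholders.

```python
import math


def break_even_calls(compile_cost: float,
                     api_cost_per_call: float,
                     local_cost_per_call: float = 0.0) -> int:
    """Smallest number of calls at which compile-once is no more expensive
    than calling an API for every input."""
    saving_per_call = api_cost_per_call - local_cost_per_call
    if saving_per_call <= 0:
        raise ValueError("local calls must be cheaper than API calls")
    # Solve compile_cost + n * local <= n * api for the smallest integer n.
    return math.ceil(compile_cost / saving_per_call)


# Hypothetical units: compiling costs 1.0, each API call costs 0.25.
calls_needed = break_even_calls(compile_cost=1.0, api_cost_per_call=0.25)
print(f"compile-once pays off after {calls_needed} calls")
```

Under this model the one-time compilation is amortized over every later call, so the more often a compiled artifact is reused, the larger the saving.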