Simon Willison shares tips from a Fireside Chat with the Claude Code team to optimize token usage by allowing models like Fable and Opus to exercise their own judgement rather than following rigid instructions.

  • Instead of dictating testing rules, users should instruct Fable to decide when to write tests based on its own assessment.
  • To conserve tokens before price increases, prompt the model to delegate smaller tasks to lower-power models via subagents.
  • A memory file was created that tells the model to delegate coding tasks to subagents: Sonnet for substantive implementation, Haiku for trivial edits.

This approach preserves the Fable allowance without losing efficiency, because the top-tier model is reserved for judgement-heavy tasks like review and synthesis.
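
The delegation rule summarized here can be sketched as a small routing table. This is a minimal Python illustration, not the contents of the actual memory file: `TaskKind`, `ROUTING`, `pick_model`, and `runs_in_subagent` are hypothetical names, and the strings are tier labels rather than real API model identifiers.

```python
from enum import Enum


class TaskKind(Enum):
    """Hypothetical task categories, taken from the summary's wording."""
    TRIVIAL_EDIT = "trivial_edit"
    IMPLEMENTATION = "implementation"
    REVIEW = "review"
    SYNTHESIS = "synthesis"


# Assumption: tier labels only, not real API model identifiers.
ROUTING = {
    TaskKind.TRIVIAL_EDIT: "haiku",     # trivial edits go to the cheapest tier
    TaskKind.IMPLEMENTATION: "sonnet",  # substantive implementation
    TaskKind.REVIEW: "fable",           # judgement-heavy work stays top-tier
    TaskKind.SYNTHESIS: "fable",
}


def pick_model(kind: TaskKind) -> str:
    """Return the tier label assigned to a task category."""
    return ROUTING[kind]


def runs_in_subagent(kind: TaskKind) -> bool:
    """Anything not handled by the top-tier model is delegated to a subagent."""
    return pick_model(kind) != "fable"
```

Under these assumptions, `pick_model(TaskKind.IMPLEMENTATION)` returns `"sonnet"` and `runs_in_subagent(TaskKind.REVIEW)` returns `False`. In practice the rule lives in natural-language instructions that the model interprets with its own judgement, so a table like this only illustrates the intended split.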