Tmax: A Simple RL Recipe for Terminal Agents

Tmax is the strongest open RL recipe for terminal agents: a model with only 9B parameters achieves 27% on Terminal-Bench 2.0. It uses a novel data taxonomy to generate over 2.5x more terminal environments than prior datasets, which enables efficient training with a simple, outcome-only recipe. The dataset, models, and code are open-sourced at https://github.com/hamishivi/tmax.

Benchmarks