Benchmark · agentic

Windows Agent Arena

1 result · 1 model
Chart (score axis 0 to 14): proposed RL fine-tuning framework · 12.6 · 2026-06-29
Timeline
  1. 2026-06-29 · proposed RL fine-tuning framework · 12.6 pts · Reinforcement Learning for Computer-Use Agents with Autonomous Evaluation