SGCD (Skill-Guided Continuation Distillation) is an iterative framework that improves GUI agents by addressing supervision gaps in off-trajectory states. It extracts skills from both successful and failed rollouts and uses them to guide policy continuations, which are then mixed with expert trajectories. On OSWorld-Verified, SGCD raises the success rates of three base models from the low 30% range to over 50%.
arXiv cs.AI · 7d ago · research
Skill-Guided Continuation Distillation for GUI Agents
Importance 3/3
New feature vs. leaders
New harness with differentiators
Mistral AI
Google DeepMind
OpenAI
AI agents
Evaluation & benchmarks
Reasoning models
Benchmarks
| Benchmark | Model | Success rate |
|---|---|---|
| OSWorld-Verified | Three base models with SGCD | >50% (up from the low 30% range) |
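
The summary describes one round of SGCD as: collect rollouts, extract skills from successes and failures, use those skills to guide continuations, and mix the continuations with expert trajectories. The Python sketch below is only a toy illustration of that data flow, not the paper's implementation. Every name in it (`Rollout`, `extract_skills`, `guided_continuations`, `sgcd_round`, `toy_policy`), the text format of a skill, and the choice to resume from the step just before a failure are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List

Trajectory = List[str]  # a trajectory is just an ordered list of action names here


@dataclass
class Rollout:
    steps: Trajectory  # actions the policy took, in order
    success: bool      # whether the task was completed


def extract_skills(rollouts: List[Rollout]) -> List[str]:
    """Turn rollouts into short textual hints (hypothetical skill format)."""
    skills = []
    for r in rollouts:
        if not r.steps:
            continue
        if r.success:
            # A success contributes its whole action sequence as a positive hint.
            skills.append("do: " + " -> ".join(r.steps))
        else:
            # Assumption: the last action of a failed rollout is the one to avoid.
            skills.append("avoid: " + r.steps[-1])
    return skills


def guided_continuations(
    rollouts: List[Rollout],
    skills: List[str],
    policy: Callable[[Trajectory, List[str]], Trajectory],
) -> List[Trajectory]:
    """Resume each failed rollout from just before its last action, with skills as guidance."""
    continuations = []
    for r in rollouts:
        if r.success or not r.steps:
            continue
        prefix = r.steps[:-1]  # stand-in for an off-trajectory state
        continuations.append(policy(prefix, skills))
    return continuations


def sgcd_round(
    rollouts: List[Rollout],
    expert: List[Trajectory],
    policy: Callable[[Trajectory, List[str]], Trajectory],
) -> List[Trajectory]:
    """One round: skills -> guided continuations -> mix with expert trajectories."""
    skills = extract_skills(rollouts)
    continuations = guided_continuations(rollouts, skills, policy)
    return list(expert) + continuations


def toy_policy(prefix: Trajectory, skills: List[str]) -> Trajectory:
    """Toy stand-in for a model: pick the first candidate action no skill says to avoid."""
    avoided = {s[len("avoid: "):] for s in skills if s.startswith("avoid: ")}
    candidates = ["click_cancel", "click_save"]
    next_action = next(a for a in candidates if a not in avoided)
    return prefix + [next_action]
```

Calling `sgcd_round` with one successful and one failed rollout plus a single expert trajectory returns the expert trajectory followed by one guided continuation. In an iterative setting, the policy would presumably be retrained on that mixed set before the next round; the sketch leaves that step out.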