korshunov.ai
Benchmark · math
PutnamBench
2 results
2 models
[Chart: PutnamBench score by date, y-axis 0 to 78. Data points: RobustCoTAgent · 0 · 2026-06-17; our framework · 72.5 · 2026-06-17]
Timeline
2026-06-17 · RobustCoTAgent · 0.0% · Automated Prompt Optimization for LLM Game Agents
2026-06-17 · our framework · 72.5% · Automated Prompt Optimization for LLM Game Agents