Benchmark · agentic
GAIA
General AI Assistants benchmark from Meta and Hugging Face.
2 results
1 model
Qwen3-4B
Timeline
-
2026-06-18
Qwen3-4B
4.8pts
Data Recipe Boosts Long-Context Reasoning in LLMs
-
2026-06-18
Qwen3-4B
4.8pts
Data Recipe Boosts Long-Context Reasoning in LLMs