Benchmark · agentic

τ-bench

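Conversational tool-use benchmark from Sierra: an agent converses with a simulated user while calling domain-specific API tools and following policy rules.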

0 results · 0 models

No verified scores reported yet for this benchmark.