SABER-Math: Automated Benchmark for Information Retrieval Evaluation in Mathematics
Researchers introduce SABER-Math, the first fully automated benchmark for evaluating mathematical information retrieval without expert annotation. It addresses the difficulty of isolating the retriever's effect on downstream performance.