ReM-MoA: Reasoning Memory Sustains Mixture-of-Agents Scaling

The authors propose ReM-MoA, a memory-augmented Mixture-of-Agents (MoA) framework designed to sustain performance gains as the number of agent layers (depth) increases, addressing the degradation and saturation seen in existing MoA variants. The system combines a Ranked Reasoning Memory with a Curated Diversified Memory Routing scheme to preserve exploration diversity while propagating high-quality reasoning traces across layers.

ReM-MoA employs a Ranked Reasoning Memory that persistently stores and ranks reasoning traces from all layers using a comparative Reviewer Agent.
A Curated Diversified Memory Routing scheme exposes different agents to distinct combinations of successful and failed traces to maintain exploration diversity.
An optional multi-domain Reviewer distillation pipeline improves ranking quality through frontier-model supervision.
The framework consistently outperforms prior MoA variants across five reasoning benchmarks spanning math, formal logic, code, knowledge, and commonsense.
Performance advantages widen with increased depth, establishing structured cross-layer reasoning memory as a key mechanism for scalable multi-agent inference.

The authors consider this result important because it identifies structured cross-layer reasoning memory as a critical missing component of scalable multi-agent inference: with such a memory, performance improves rather than degrades as systems become deeper.
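
To make the two core mechanisms summarized above more concrete, here is a minimal Python sketch of a ranked memory and a diversified routing step. It is an illustration under stated assumptions, not the authors' implementation: the names `Trace`, `RankedReasoningMemory`, `route`, `toy_reviewer`, and `TOY_QUALITY`, and the parameters `n_good`, `n_bad`, and `pool`, are all hypothetical. The Reviewer Agent is assumed to behave like a pairwise comparator and is replaced here by a lookup table, and the routing rule (cycling through combinations of top-ranked and bottom-ranked traces) is one simple way to give agents distinct mixes, not necessarily the scheme used in the paper.

```python
from dataclasses import dataclass
from functools import cmp_to_key
from itertools import combinations
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Trace:
    """One reasoning trace produced by one agent at one layer."""
    layer: int
    agent: str
    text: str


# Assumed reviewer interface: negative if `a` is better, positive if `b` is
# better, zero for a tie. In the paper this role is played by a Reviewer Agent.
Reviewer = Callable[[Trace, Trace], int]


class RankedReasoningMemory:
    """Keeps traces from every layer and ranks them with a comparative reviewer."""

    def __init__(self, reviewer: Reviewer) -> None:
        self._reviewer = reviewer
        self._traces: List[Trace] = []

    def add(self, trace: Trace) -> None:
        self._traces.append(trace)

    def ranked(self) -> List[Trace]:
        # Best trace first, according to pairwise comparisons.
        return sorted(self._traces, key=cmp_to_key(self._reviewer))


def route(memory: RankedReasoningMemory, agents: List[str],
          n_good: int = 2, n_bad: int = 1, pool: int = 3) -> Dict[str, List[Trace]]:
    """Give each agent a mix of high-ranked and low-ranked traces.

    Hypothetical rule: agent i receives the i-th combination of top traces
    and the i-th combination of bottom traces, cycling when agents outnumber
    the available combinations.
    """
    ranked = memory.ranked()
    good_pool = ranked[:pool]
    bad_pool = ranked[pool:][-pool:]  # worst traces that are not in good_pool
    good_sets = list(combinations(good_pool, min(n_good, len(good_pool))))
    bad_sets = list(combinations(bad_pool, min(n_bad, len(bad_pool))))
    return {
        agent: list(good_sets[i % len(good_sets)] + bad_sets[i % len(bad_sets)])
        for i, agent in enumerate(agents)
    }


# Stand-in for a real reviewer: a fixed quality table keyed by trace text.
TOY_QUALITY = {"t0": 0.9, "t1": 0.1, "t2": 0.7, "t3": 0.3, "t4": 0.8, "t5": 0.2}


def toy_reviewer(a: Trace, b: Trace) -> int:
    qa, qb = TOY_QUALITY[a.text], TOY_QUALITY[b.text]
    return (qb > qa) - (qa > qb)


if __name__ == "__main__":
    memory = RankedReasoningMemory(toy_reviewer)
    for i in range(6):
        memory.add(Trace(layer=1 + i // 3, agent=f"agent{i % 3}", text=f"t{i}"))
    for agent, traces in route(memory, ["a", "b", "c"]).items():
        print(agent, [t.text for t in traces])
```

In this toy setup each of the three agents sees a different pair of strong traces plus a different weak trace, which is the intuition behind routing distinct combinations of successful and failed reasoning. A real system would replace `toy_reviewer` with model-based comparisons and would likely need to limit the number of pairwise calls that sorting triggers.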