This paper introduces Marginal Advantage Accumulation (MAA), a post-processing architecture that addresses cross-batch inconsistency in memory-driven agent self-evolution. MAA formalizes alignment and comparability as structural conditions, uses differential signals and exponential moving average to accumulate signed evidence per operation, and ensures traceability via semantic identity merging. It outperforms batch-level baselines in 14 out of 16 settings and reduces token consumption by about 75%.
Marginal Advantage Accumulation for Memory-Driven Agent Self-Evolution
from English