Supersede: Diagnosing and Training the Memory-Update Gap in LLM Agents
This article identifies a distinct failure mode in large language model (LLM) agents: they struggle to discard outdated facts in favor of current ones, even when their comprehension of those facts is intact. The authors demonstrate that this "supersession gap" persists across model scales and memory sizes, which indicates that it is a trainable bottleneck rather than a limitation of context-window size or model capability.
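To make the distinction concrete, the toy sketch below contrasts a memory read that ignores updates with one that honors them. It is an illustration only, not the authors' benchmark or training method: the `Fact` record, its `step` field, the two read functions, and the sample data are all hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Fact:
    """Hypothetical memory record; `step` is when the fact was stated."""
    subject: str
    attribute: str
    value: str
    step: int


def stale_read(memory: List[Fact], subject: str, attribute: str) -> Optional[str]:
    """Return the first matching value, ignoring any later update."""
    for fact in memory:
        if fact.subject == subject and fact.attribute == attribute:
            return fact.value
    return None


def superseding_read(memory: List[Fact], subject: str, attribute: str) -> Optional[str]:
    """Return the value of the matching fact with the highest step."""
    matches = [
        f for f in memory
        if f.subject == subject and f.attribute == attribute
    ]
    if not matches:
        return None
    return max(matches, key=lambda f: f.step).value


# Assumed sample data: Alice's office is stated at step 1 and updated at step 3.
memory = [
    Fact("Alice", "office", "Room 210", step=1),
    Fact("Bob", "office", "Room 115", step=2),
    Fact("Alice", "office", "Room 340", step=3),
]

print(stale_read(memory, "Alice", "office"))
print(superseding_read(memory, "Alice", "office"))
```

Both reads see every fact, so neither is limited by how much memory is available. They differ only in whether the later statement is allowed to replace the earlier one, which is the kind of behavior the summary describes as a supersession gap.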