Manufactured Confidence: How Memory Consolidation Turns Hearsay into Confident Facts
Research demonstrates that LLM agent memory systems rewrite casual or hedged remarks into confident, dated assertions that agents subsequently treat as verified facts. Unverified information can thereby bypass safety checks without any active attacker, because the agent responds to how confidently a claim is phrased rather than to where the claim came from.
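
The mechanism can be illustrated with a minimal sketch. Everything in it is hypothetical and is not taken from any particular memory system: the `MemoryEntry` fields, the hedge-word list, the `naive_consolidate` rewrite, the two gate functions, and the example remark and date are all assumptions made for illustration. The sketch also reads "dated" as "date-stamped". It shows a lossy consolidation step stripping hedges from a remark, a gate that keys on phrasing accepting the result, and a gate that keys on recorded provenance rejecting it.

```python
import re
from dataclasses import dataclass

# Hypothetical hedge markers; a real detector would need to be far richer.
HEDGES = re.compile(
    r"\b(i think|maybe|i heard|apparently|probably|might)\b", re.IGNORECASE
)


@dataclass(frozen=True)
class MemoryEntry:
    text: str       # the stored wording
    source: str     # illustrative label for where the claim came from
    verified: bool  # whether anything actually checked the claim


def naive_consolidate(remark: str, date: str) -> str:
    """Toy stand-in for lossy consolidation: drop hedges, prepend a date."""
    stripped = HEDGES.sub("", remark)
    stripped = re.sub(r"\s+", " ", stripped).strip(" ,.")
    if not stripped:
        return ""
    return f"As of {date}: {stripped[0].upper()}{stripped[1:]}."


def phrasing_gate(entry: MemoryEntry) -> bool:
    """Accepts any entry whose wording has no hedge (the failure mode)."""
    return HEDGES.search(entry.text) is None


def provenance_gate(entry: MemoryEntry) -> bool:
    """Accepts an entry only if its recorded source was verified."""
    return entry.verified


# An offhand, hedged remark from an unverified source (invented example).
remark = "I heard the staging API key maybe rotates weekly"
raw = MemoryEntry(remark, source="user_chat", verified=False)

# Consolidation changes the wording but not the evidence behind it.
consolidated = MemoryEntry(
    naive_consolidate(remark, "2024-01-01"), source="user_chat", verified=False
)

print(consolidated.text)
print("phrasing gate, raw:          ", phrasing_gate(raw))
print("phrasing gate, consolidated: ", phrasing_gate(consolidated))
print("provenance gate, consolidated:", provenance_gate(consolidated))
```

In this toy setup the phrasing-based gate rejects the raw remark but accepts the consolidated one, even though `source` and `verified` never changed. The provenance-based gate rejects both, which is the design point: if trust is decided from metadata that consolidation cannot rewrite, a confident rewording alone does not promote hearsay to fact.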