Self-Evolving World Models for LLM Agent Planning

The paper introduces WorldEvolver, a framework that gives long-horizon LLM agents reliable foresight by revising their deployment-time context rather than modifying model parameters. It targets a specific failure mode, unreliable world-model predictions that degrade decision-making, with a self-evolving approach that improves both predictive fidelity and planning performance.

Episodic Memory retrieves real action transitions for simulation.
Semantic Memory extracts persistent heuristic rules from prediction-observation mismatches.
Selective Foresight filters low-confidence predictions before integration.
Evaluated on ALFWorld and ScienceWorld, WorldEvolver achieves the highest prediction accuracy across three backbones.
It also leads the other baselines on downstream agent success rate, measured on AgentBoard.
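
The three components above can be pictured with a small sketch. This is an illustration only, not the paper's implementation: the class and function names (Transition, WorldModelContext, selective_foresight), the string-similarity retrieval, the templated rule text, and the 0.7 confidence threshold are all assumptions made for the example. In particular, the summary does not say how rules are extracted or how confidence is scored, so a fixed template and a caller-supplied number stand in for those steps.

```python
from dataclasses import dataclass, field
from difflib import SequenceMatcher


@dataclass
class Transition:
    """One observed (state, action, next_state) step from the environment."""
    state: str
    action: str
    next_state: str


@dataclass
class WorldModelContext:
    """Hypothetical test-time context; model parameters are never touched."""
    episodes: list = field(default_factory=list)  # episodic memory: real transitions
    rules: list = field(default_factory=list)     # semantic memory: heuristic rules

    def retrieve(self, state, action, k=2):
        """Return the k stored transitions most similar to the query.

        Toy string similarity is an assumption; a real system would likely
        use a learned retriever.
        """
        query = f"{state}|{action}"

        def score(t):
            return SequenceMatcher(None, f"{t.state}|{t.action}", query).ratio()

        return sorted(self.episodes, key=score, reverse=True)[:k]

    def update(self, state, action, predicted, observed):
        """Store the real transition; on a mismatch, record a rule.

        The templated rule below is a placeholder for whatever rule
        extraction the framework actually performs.
        """
        self.episodes.append(Transition(state, action, observed))
        if predicted != observed:
            rule = f"After '{action}' in '{state}', expect '{observed}', not '{predicted}'."
            if rule not in self.rules:
                self.rules.append(rule)


def selective_foresight(prediction, confidence, threshold=0.7):
    """Pass a prediction to the planner only if its confidence meets the threshold."""
    return prediction if confidence >= threshold else None


if __name__ == "__main__":
    ctx = WorldModelContext()
    # A wrong prediction: stored as an episode and turned into a rule.
    ctx.update("drawer closed", "open drawer",
               predicted="drawer open, empty",
               observed="drawer open, contains key")
    # A correct prediction: stored as an episode only.
    ctx.update("at fridge", "open fridge",
               predicted="fridge open", observed="fridge open")

    print(ctx.retrieve("drawer closed", "open drawer", k=1))
    print(ctx.rules)
    print(selective_foresight("drawer open, contains key", confidence=0.4))
```

The point of the sketch is that all adaptation lives in the context object: retrieved transitions and accumulated rules would be placed in the world model's prompt, while low-confidence predictions are dropped before they reach the planner.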

These results demonstrate that test-time memory revision significantly improves both the accuracy of world model predictions and the overall success rate of agent planning tasks.