Researchers propose Mandol, an agglomerative memory system that consolidates fragmented memory representations into a single architecture for long-term conversational agents. It targets the high latency and noise of existing systems that split memory across heterogeneous vector and graph databases.

  • Mandol utilizes a hierarchical memory model with basic and abstract layers represented as structured semantic graphs.
  • The system employs an agglomerative semantic data structure that natively fuses key-value, vector, and graph structures.
  • Its quantitative query mechanism combines adaptive routing with token-constrained context generation and does not call an LLM during retrieval.
  • Experiments on LoCoMo and LongMemEval benchmarks show Mandol achieves the best overall accuracy among representative agent memory systems.
  • The system demonstrates a 5.4x retrieval speedup and a 4.8x insertion speedup under 10 QPS concurrent load while maintaining low latency on consumer-grade hardware.
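
To make the idea of natively fusing key-value, vector, and graph structures concrete, here is a minimal Python sketch. It is not Mandol's actual data structure: `MemoryNode`, `FusedMemoryStore`, and all method names are hypothetical, and a brute-force cosine scan stands in for whatever index the real system uses. The only point it illustrates is that one in-process record can serve key lookup, similarity search, and graph expansion without a round trip between separate databases.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class MemoryNode:
    key: str                    # key-value handle
    text: str                   # stored memory content
    embedding: List[float]      # vector view of the same record
    neighbors: Set[str] = field(default_factory=set)  # graph view: keys of linked nodes


class FusedMemoryStore:
    """Toy store (hypothetical): one record backs all three access paths."""

    def __init__(self) -> None:
        self.nodes: Dict[str, MemoryNode] = {}

    def insert(self, key: str, text: str, embedding: Iterable[float],
               links: Iterable[str] = ()) -> None:
        self.nodes[key] = MemoryNode(key, text, list(embedding))
        for other in links:
            if other in self.nodes:
                # Undirected edge between the new node and an existing one.
                self.nodes[key].neighbors.add(other)
                self.nodes[other].neighbors.add(key)

    def get(self, key: str) -> MemoryNode:
        # Key-value access path.
        return self.nodes[key]

    def nearest(self, query: List[float], k: int = 1) -> List[str]:
        # Vector access path: brute-force cosine similarity (assumption).
        def cosine(a: List[float], b: List[float]) -> float:
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        ranked = sorted(self.nodes.values(),
                        key=lambda n: cosine(query, n.embedding),
                        reverse=True)
        return [n.key for n in ranked[:k]]

    def expand(self, keys: Iterable[str], hops: int = 1) -> Set[str]:
        # Graph access path: breadth-first expansion from the seed keys.
        seen = set(keys)
        frontier = set(seen)
        for _ in range(hops):
            frontier = {nb for k in frontier for nb in self.nodes[k].neighbors} - seen
            seen |= frontier
        return seen
```

A retrieval pass in this sketch would call `nearest` to find seed memories and then `expand` to pull in their linked neighbors, all against the same in-memory dictionary.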

Mandol improves LLM accuracy and efficiency by eliminating cross-database I/O and giving precise control over the token budget during retrieval.
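
Token-budget control without an LLM in the loop can be illustrated with a short greedy packer. This is a sketch under stated assumptions, not the paper's algorithm: `count_tokens` and `build_context` are hypothetical names, the relevance scores are taken as given, and a whitespace word count stands in for a real tokenizer.

```python
from typing import List, Tuple


def count_tokens(text: str) -> int:
    # Assumption: whitespace word count approximates a real tokenizer.
    return len(text.split())


def build_context(ranked: List[Tuple[float, str]], budget: int) -> Tuple[List[str], int]:
    """Greedily pack the highest-scoring memories into at most `budget` tokens.

    Items that would overflow the budget are skipped so that smaller,
    lower-ranked items can still fill the remaining space.
    """
    selected: List[str] = []
    used = 0
    for _score, text in sorted(ranked, key=lambda pair: pair[0], reverse=True):
        cost = count_tokens(text)
        if used + cost <= budget:
            selected.append(text)
            used += cost
    return selected, used


candidates = [
    (0.9, "Alice moved to Berlin in 2021"),
    (0.7, "Alice adopted a dog named Max last spring"),
    (0.5, "Alice likes tea"),
]
context, used = build_context(candidates, budget=9)
# The second item (8 tokens) no longer fits after the first (6 tokens),
# so the third (3 tokens) fills the budget exactly: used == 9.
```

Because selection is purely arithmetic, the context size is known before any prompt is sent, which is the property the summary attributes to Mandol's retrieval path.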