arXiv cs.CL · 8h ago

Nemotron-TwoTower: Diffusion Language Modeling with Pretrained Autoregressive Context

NVIDIA introduces Nemotron-TwoTower, a diffusion language model that assigns context representation and iterative denoising to two separate networks to overcome capacity limitations in existing approaches. Built on the open-weight Nemotron-3-Nano-30B-A3B model and trained on 2.1T tokens, it retains 98.7% of the autoregressive baseline's quality while achieving 2.42x higher wall-clock generation throughput.

arXiv cs.CL · 8h ago

MemStrata: Eliminating Stale-Fact Errors in RAG Agents via Temporal Validity

MemStrata is a retrieval memory system designed to eliminate stale-fact errors in AI agents by tracking the temporal validity of accumulated knowledge. Standard Retrieval-Augmented Generation (RAG) struggles to tell duplicated facts from contradicted ones because embedding similarity alone does not separate them; MemStrata instead uses a deterministic supersession rule to retire outdated information.

arXiv cs.CL · 10h ago

SocialPersona: Benchmarking Personalized Profiling and Response with Multimodal Social-Media Context

The authors introduce SocialPersona, a benchmark that evaluates whether multimodal large language models (MLLMs) can recover revealed preferences from longitudinal social-media timelines and use them in dialogue. Current evaluations focus only on explicit memory; SocialPersona instead tests a model's ability to infer interests from natural multimodal traces.