arXiv cs.CL · 6h ago

MultiHashFormer: Hash-based Generative Language Models

The paper introduces MultiHashFormer, a framework enabling hash-based autoregression in causal language models by representing tokens as unique signatures of discrete hash IDs. This approach allows the model to compress token information into latent vectors for Transformer processing while mapping them back to text, effectively addressing the many-to-one collision issues that previously prevented hashing in generative contexts.
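
As a rough illustration of the signature idea, the sketch below assigns each token one bucket ID per hash function and checks that the full tuple of IDs is unique across a toy vocabulary, so a signature can be mapped back to its token. This is not the paper's implementation: the function names, the use of seeded SHA-256, and the `num_hashes` and `num_buckets` values are all assumptions made for the example.

```python
import hashlib


def hash_signature(token: str, num_hashes: int = 3, num_buckets: int = 1024) -> tuple:
    """Return one bucket ID per (hypothetical) seeded hash function.

    A single ID may be shared by many tokens; the tuple of IDs is the signature.
    """
    ids = []
    for seed in range(num_hashes):
        digest = hashlib.sha256(f"{seed}:{token}".encode("utf-8")).digest()
        ids.append(int.from_bytes(digest[:8], "big") % num_buckets)
    return tuple(ids)


def build_signature_table(vocab, num_hashes: int = 3, num_buckets: int = 1024) -> dict:
    """Map each signature back to its token; fail loudly if two tokens share one."""
    table = {}
    for token in vocab:
        sig = hash_signature(token, num_hashes, num_buckets)
        if sig in table:
            raise ValueError(f"signature collision: {table[sig]!r} vs {token!r}")
        table[sig] = token
    return table


# Toy vocabulary, assumed for illustration only.
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran", "far"]
table = build_signature_table(vocab)

# Round trip: token -> signature -> token.
decoded = [table[hash_signature(tok)] for tok in vocab]
print(decoded == vocab)
```

In a real model the appeal would be that a few small embedding tables (one per hash function) stand in for one table with a row per token; the toy vocabulary here is too small to show that saving and only demonstrates the round trip.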

arXiv cs.CL · 6h ago

Vision-Default, Prior-Override: Causal Mechanisms of Perception-Knowledge Conflict in Vision-Language Models

This study investigates how vision-language models resolve conflicts between visual evidence and memorized world knowledge by combining activation patching with mechanistic analysis across three model families. The research identifies a sparse causal circuit where visual grounding is the default, while overriding it with prior knowledge requires specific attention heads.
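
For readers unfamiliar with activation patching, here is a minimal sketch on a hand-built toy network, not on any of the vision-language models studied. The weights, inputs, and the `forward` helper are invented for the example. The idea is to run a "clean" and a "corrupted" input, copy one hidden activation from the clean run into the corrupted run, and measure how much of the clean output is restored.

```python
def forward(x, patch=None):
    """Tiny two-layer ReLU network with fixed, made-up weights.

    `patch` optionally maps a hidden-unit index to a value that overwrites
    that unit's activation before the output layer.
    """
    w1 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 hidden units, 2 inputs
    w2 = [2.0, 0.0, 1.0]                       # output weights
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in w1]
    if patch:
        for idx, value in patch.items():
            hidden[idx] = value
    out = sum(w * h for w, h in zip(w2, hidden))
    return out, hidden


clean_x = [1.0, 0.0]    # hypothetical "clean" input
corrupt_x = [0.0, 1.0]  # hypothetical "corrupted" input

clean_out, clean_hidden = forward(clean_x)
corrupt_out, _ = forward(corrupt_x)

# Patch each hidden unit in turn and record the fraction of the
# clean-vs-corrupted output gap that the patch recovers.
effects = {}
for i in range(len(clean_hidden)):
    patched_out, _ = forward(corrupt_x, patch={i: clean_hidden[i]})
    effects[i] = (patched_out - corrupt_out) / (clean_out - corrupt_out)

print(effects)  # {0: 1.0, 1: 0.0, 2: 0.0}
```

In this toy case only hidden unit 0 carries the difference between the two runs, which is the kind of sparse causal structure that patching is meant to expose.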

arXiv cs.CL · 6h ago

Google Introduces Paper Assistant Tool for Automated Scientific Review

To address the scalability challenges of traditional peer review in the era of AI-assisted science, researchers propose a taxonomy of AI-human collaboration and introduce the Paper Assistant Tool (PAT). PAT is an agentic AI framework designed to ingest full scientific manuscripts and produce comprehensive evaluations by checking theoretical results, validating experiments, and identifying potential flaws.