Vision-Default, Prior-Override: Causal Mechanisms of Perception-Knowledge Conflict in Vision-Language Models
This study investigates how vision-language models resolve conflicts between visual evidence and memorized world knowledge, combining activation patching with mechanistic analysis across three model families. It identifies a sparse causal circuit in which visual grounding is the default behavior, whereas overriding visual evidence with prior knowledge depends on a small set of specific attention heads.
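
For readers unfamiliar with activation patching, the sketch below illustrates the general idea on a toy model; it is not the study's pipeline. The three "heads", their weights, and the names `HEADS`, `run`, and `patching_effects` are all hypothetical. The procedure caches each component's activation on a clean input, substitutes it one component at a time into a run on a corrupted input, and reports how much of the clean-versus-corrupted output gap that single substitution restores.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical toy "model": each head maps an input vector to a scalar
# contribution, and the output logit is the sum of all contributions.
Head = Callable[[List[float]], float]

HEADS: Dict[str, Head] = {
    "head_a": lambda x: 2.0 * x[0],  # reads only feature 0
    "head_b": lambda x: 0.5 * x[1],  # reads only feature 1
    "head_c": lambda x: 0.0 * x[0],  # contributes nothing to the logit
}


def run(
    x: List[float], patch: Optional[Dict[str, float]] = None
) -> Tuple[float, Dict[str, float]]:
    """Run the toy model; `patch` overrides the activation of named heads."""
    patch = patch or {}
    cache = {name: patch.get(name, head(x)) for name, head in HEADS.items()}
    return sum(cache.values()), cache


def patching_effects(
    clean: List[float], corrupted: List[float]
) -> Dict[str, float]:
    """Fraction of the clean-vs-corrupted logit gap restored by each head."""
    clean_logit, clean_cache = run(clean)
    corrupt_logit, _ = run(corrupted)
    gap = clean_logit - corrupt_logit
    if gap == 0:
        raise ValueError("clean and corrupted inputs give the same logit")
    effects = {}
    for name in HEADS:
        # Corrupted run, with one head's activation taken from the clean run.
        patched_logit, _ = run(corrupted, patch={name: clean_cache[name]})
        effects[name] = (patched_logit - corrupt_logit) / gap
    return effects


if __name__ == "__main__":
    # The two inputs differ only in feature 0, which only head_a reads.
    print(patching_effects(clean=[1.0, 1.0], corrupted=[0.0, 1.0]))
```

In this toy setting only `head_a` restores the gap, which is the sense in which a circuit is called sparse: a few components carry the causal effect while the rest can be swapped without changing the output.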