A study finds that conditioning language and vision models on a narrow task suppresses their reporting of co-present, safety-critical signals that they can otherwise detect. The authors term this phenomenon the "Inattentional Gap" and present it as evidence of a dissociation between benchmark-measured safety and real-world safety.

  • Suppression of safety signals occurred across radiology, driving text, and chest-radiograph vision tasks in every model tested.
  • The effect did not diminish with model scale and persisted even in reasoning models.
  • Variations in suppression were driven more by model family than by size.
  • Models reported these critical signals at substantially higher rates when operating without task constraints.
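
One way to make the comparison in the last bullet concrete is to treat the gap as a difference in report rates for the same inputs under two prompting conditions. The sketch below is a minimal illustration under that assumption, not the study's actual metric, which this summary does not specify. The `Trial` record, its field names, and the function names are hypothetical, and the toy values are made up for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    """One input that contains a safety-critical signal (hypothetical layout)."""
    reported_unconstrained: bool  # signal mentioned when no task was imposed
    reported_on_task: bool        # signal mentioned under a narrow task prompt


def report_rate(flags):
    """Fraction of trials in which the signal was reported."""
    flags = list(flags)
    if not flags:
        raise ValueError("need at least one trial")
    return sum(flags) / len(flags)


def inattentional_gap(trials):
    """Assumed definition: unconstrained report rate minus on-task report rate."""
    trials = list(trials)
    unconstrained = report_rate(t.reported_unconstrained for t in trials)
    on_task = report_rate(t.reported_on_task for t in trials)
    return unconstrained - on_task


# Made-up toy values for illustration only; these are not results from the study.
toy_trials = [
    Trial(reported_unconstrained=True, reported_on_task=False),
    Trial(reported_unconstrained=True, reported_on_task=False),
    Trial(reported_unconstrained=True, reported_on_task=True),
    Trial(reported_unconstrained=False, reported_on_task=False),
]
gap = inattentional_gap(toy_trials)
```

Under this assumed definition, a positive gap means the signal is reported less often once the model is given a narrow task, and a gap near zero means task conditioning makes little difference.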

The authors argue that this gap decouples benchmark performance from actual safety, meaning a system can score near-perfectly on specified hazards while remaining blind to those that cause real-world harm.