A study reports that per-word processing in the state-space language model Mamba aligns with human reading times. Specifically, Mamba's dynamic discretization timestep is a significant predictor of how long humans take to read each word, even after controlling for established predictors such as GPT-2 surprisal.

  • Mamba's recurrent state transition uses a dynamic discretization timestep ($\Delta_t$) determined by the input.
  • Analysis of a naturalistic reading dataset shows that Mamba's per-word timestep predicts human reading durations.
  • This predictive power remains significant even when known predictors such as GPT-2 surprisal are controlled for.
  • Formal analysis suggests Mamba offers a lens for studying real-time language processing with a continuously updated memory.
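
One way to read "remains significant when surprisal is controlled for" is as a nested-model comparison: fit reading times on the control predictor alone, then add the timestep and check how much the fit improves. The sketch below illustrates that logic with ordinary least squares on synthetic data. The data, the effect sizes, and the names `surprisal`, `delta`, and `rt` are all assumptions made for illustration; the summary does not specify the study's statistical model, which may differ (for example, a mixed-effects regression).

```python
import numpy as np

# Synthetic data only: none of these numbers come from the study.
rng = np.random.default_rng(0)
n = 2000
surprisal = rng.gamma(2.0, 2.0, n)               # stand-in for GPT-2 surprisal
delta = 0.3 * surprisal + rng.normal(0, 1, n)    # stand-in for Mamba's per-word timestep
rt = 200 + 10 * surprisal + 15 * delta + rng.normal(0, 30, n)  # reading time (ms)

def rss(X, y):
    """Residual sum of squares of an OLS fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return float(resid @ resid)

# Baseline model: surprisal only. Full model: surprisal plus timestep.
rss_base = rss(surprisal[:, None], rt)
rss_full = rss(np.column_stack([surprisal, delta]), rt)

# F statistic for adding one predictor to the baseline
# (the full model has 3 parameters: intercept, surprisal, delta).
f_stat = (rss_base - rss_full) / (rss_full / (n - 3))
print(f"RSS baseline={rss_base:.1f}, RSS full={rss_full:.1f}, F={f_stat:.1f}")
```

A large F statistic here means the timestep explains variance in reading times that surprisal alone does not, which is the sense in which a predictor "survives" a control.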

The authors suggest that Mamba serves as a valuable tool for examining how language model modules weigh short- and long-term information retention and how noise interacts with continuous memory representations.
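
The trade-off between short- and long-term retention can be made concrete with a single recurrent step of a selective state-space layer. The sketch below is a simplified, single-channel illustration, not Mamba's actual implementation: it assumes a diagonal state matrix `A` with negative entries, a scalar input, and a scalar projection (`w_delta`, `b_delta`, both hypothetical names) in place of the learned linear projection that produces $\Delta_t$. It uses the zero-order-hold form $\bar{A} = \exp(\Delta_t A)$ and the common simplification $\bar{B} \approx \Delta_t B$.

```python
import numpy as np

def softplus(z):
    # Numerically stable log(1 + exp(z)); keeps the timestep positive.
    return np.logaddexp(0.0, z)

def selective_ssm_step(h, x_t, A, B_t, C_t, w_delta, b_delta):
    """One recurrent update with an input-dependent timestep.

    h        : previous state, shape (N,)
    x_t      : current scalar input
    A        : diagonal of the state matrix (negative entries), shape (N,)
    B_t, C_t : input and output projections for this step, shape (N,)
    w_delta, b_delta : illustrative scalar parameters that produce delta_t
    """
    delta_t = softplus(w_delta * x_t + b_delta)  # dynamic timestep
    A_bar = np.exp(delta_t * A)                  # per-dimension retention in (0, 1)
    B_bar = delta_t * B_t                        # simplified input discretization
    h_next = A_bar * h + B_bar * x_t             # decay old state, write new input
    y_t = float(C_t @ h_next)
    return h_next, y_t, delta_t

A = -np.array([0.5, 1.0, 2.0])
h = np.ones(3)
B_t = np.ones(3)
C_t = np.ones(3)

_, _, small = selective_ssm_step(h, 0.1, A, B_t, C_t, w_delta=2.0, b_delta=-3.0)
_, _, large = selective_ssm_step(h, 3.0, A, B_t, C_t, w_delta=2.0, b_delta=-3.0)

# A larger timestep shrinks exp(delta * A): more of the old state is
# overwritten and the current input is weighted more heavily.
print("retention (small delta):", np.exp(small * A))
print("retention (large delta):", np.exp(large * A))
```

In this form a small $\Delta_t$ preserves the existing state and largely ignores the current input, while a large $\Delta_t$ discounts the past and emphasizes the current input.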