The authors introduce Masked Language Flow Models (MLFMs), which combine masked diffusion with continuous flows to enable efficient, multi-step reasoning in language generation. Because pretrained masked diffusion models can be adapted into MLFMs, the approach narrows the gap between the efficiency of parallel generation and performance on complex tasks.

  • MLFMs use a continuous stochastic interpolant to bridge partially masked and clean sequences, enabling conditional generation via continuous flows.
  • The framework allows pretrained Masked Diffusion Models (MDMs) to be converted into MLFMs through a simple, lightweight adaptation.
  • A novel sampler is proposed that alternates continuous denoising with the discrete unmasking of confident tokens to support multi-step reasoning.
  • Evaluations on GSM8K and MT-Bench indicate that flow-based language models can scale to downstream reasoning and instruction-following tasks.
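
The alternating sampler described in the list above can be illustrated with a toy sketch. This is not the authors' implementation: the velocity field `toy_velocity`, the `MASK` sentinel, the distance-based confidence score, the `threshold` value, and the rule that a position becomes predictable once its left neighbour is clean are all assumptions made for illustration. The sketch only shows the control flow, in which a continuous flow is integrated over the masked positions and the confident ones are then committed as discrete tokens.

```python
import numpy as np

MASK = -1  # hypothetical sentinel for a still-masked position


def toy_velocity(x, t, tokens, target_ids, vocab_emb):
    """Stand-in for a learned velocity field (an assumption, not the paper's model).

    A masked position is only "predictable" once its left neighbour is clean;
    otherwise the field points at the uninformative mean embedding.
    """
    v = np.zeros_like(x)
    mean_emb = vocab_emb.mean(axis=0)
    for i in np.flatnonzero(tokens == MASK):
        known = i == 0 or tokens[i - 1] != MASK
        goal = vocab_emb[target_ids[i]] if known else mean_emb
        v[i] = (goal - x[i]) / (1.0 - t)
    return v


def alternating_sample(target_ids, vocab_emb, n_flow_steps=8, threshold=0.9, seed=0):
    rng = np.random.default_rng(seed)
    target_ids = np.asarray(target_ids)
    tokens = np.full(len(target_ids), MASK)
    rounds = 0
    while np.any(tokens == MASK):
        masked = tokens == MASK
        # Clean positions keep their token embedding; masked ones start from noise.
        noise = rng.standard_normal((len(tokens), vocab_emb.shape[1]))
        clean = vocab_emb[np.where(masked, 0, tokens)]
        x = np.where(masked[:, None], noise, clean)
        # Continuous phase: Euler integration of the flow from t=0 towards t=1.
        for k in range(n_flow_steps):
            t = k / n_flow_steps
            x = x + toy_velocity(x, t, tokens, target_ids, vocab_emb) / n_flow_steps
        # Discrete phase: score each position against the vocabulary embeddings.
        logits = -((x[:, None, :] - vocab_emb[None, :, :]) ** 2).sum(-1)
        probs = np.exp(logits - logits.max(axis=1, keepdims=True))
        probs /= probs.sum(axis=1, keepdims=True)
        conf = np.where(masked, probs.max(axis=1), -np.inf)
        commit = masked & (conf >= threshold)
        if not commit.any():
            commit[np.argmax(conf)] = True  # always commit at least one token
        tokens = np.where(commit, probs.argmax(axis=1), tokens)
        rounds += 1
    return tokens, rounds


# Usage: a 5-token vocabulary with well-separated (scaled one-hot) embeddings.
vocab_emb = 4.0 * np.eye(5)
tokens, rounds = alternating_sample([3, 1, 4, 1], vocab_emb)
print(tokens, rounds)
```

In this toy setup each round can commit only the position whose left neighbour is already clean, so the sequence is resolved over several rounds rather than in one parallel pass, which loosely mirrors the multi-step behaviour the sampler is meant to support.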

This work addresses a limitation of Flow Language Models, which decode every token, and shows for the first time that flow-based models are viable for complex reasoning and instruction-following applications.