This paper introduces a continuous decoding framework for masked diffusion language models (MDLMs) that reinterprets mask prediction as clean-state prediction to induce a continuous flow in input embedding space. By allowing tokens to accumulate partial progress and remain revisable, the method addresses the premature commitments inherent in standard binary unmasking regimes.
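One way to picture the continuous flow described above is as a per-position interpolation between the mask embedding and the model's current clean-state estimate. The notation here is an assumption made for illustration and is not taken from the paper: $e_{\text{mask}}$ is the mask token's input embedding, $E_v$ is the input embedding of vocabulary item $v$, $p_\theta^{(i)}$ is the model's predictive distribution at position $i$, and $t_i$ is a hypothetical progress variable for that position.

```latex
% Illustrative notation only; the paper's exact formulation may differ.
x^{(i)} = (1 - t_i)\, e_{\text{mask}} + t_i\, \hat{x}^{(i)},
\qquad
\hat{x}^{(i)} = \sum_{v \in \mathcal{V}} p_\theta^{(i)}\!\left(v \mid x\right) E_v,
\qquad
t_i \in [0, 1]
```

Under this reading, standard binary unmasking corresponds to restricting $t_i$ to $\{0, 1\}$, whereas intermediate values of $t_i$ let a position hold partial progress. Because $\hat{x}^{(i)}$ is recomputed from the current context at each step, a position's content can still change after it has started moving away from the mask.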

  • The framework replaces globally synchronous schedules with confidence-based asynchronous updates to handle uneven contextual constraints across positions.
  • A lightweight policy network, trained via reinforcement learning, controls the decoding process.
  • Applied to the pretrained LLaDA model, the continuous decoder retains 97% of LLaDA's original performance on the HumanEval dataset while using only 25% of the decoding budget.
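The confidence-based asynchronous update in the first bullet can be illustrated with a minimal NumPy sketch. Everything in it is an assumption for illustration rather than the paper's implementation: the function name `async_step`, the per-position `progress` vector, the `max_step` cap, and the use of the maximum softmax probability as the confidence score are all hypothetical, and the learned policy network is replaced here by a fixed rule.

```python
import numpy as np


def softmax(logits, axis=-1):
    """Numerically stable softmax."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def async_step(progress, logits, embed, mask_vec, max_step=0.25):
    """One hypothetical asynchronous decoding step.

    progress: (L,)   per-position progress in [0, 1] (0 = fully masked)
    logits:   (L, V) model outputs for each position
    embed:    (V, d) input embedding table
    mask_vec: (d,)   embedding of the mask token
    Returns the new input embeddings (L, d) and the updated progress (L,).
    """
    probs = softmax(logits)
    # Assumed confidence score: the largest predicted probability per position.
    confidence = probs.max(axis=-1)
    # Expected clean embedding under the current prediction.
    x_clean = probs @ embed
    # Confident positions advance faster; no global schedule is imposed.
    new_progress = np.minimum(progress + max_step * confidence, 1.0)
    # Interpolate between the mask embedding and the clean-state estimate.
    # x_clean is recomputed every step, so positions stay revisable.
    weight = new_progress[:, None]
    x = (1.0 - weight) * mask_vec + weight * x_clean
    return x, new_progress


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seq_len, vocab, dim = 6, 10, 4
    embed = rng.normal(size=(vocab, dim))
    mask_vec = np.zeros(dim)
    progress = np.zeros(seq_len)
    # Random logits stand in for a real model's forward pass.
    logits = rng.normal(scale=3.0, size=(seq_len, vocab))
    x, progress = async_step(progress, logits, embed, mask_vec)
    print(progress)
```

In this sketch each position moves at its own rate, so a strongly constrained position can finish early while an ambiguous one keeps most of its mask component. A real decoder would rerun the model on the updated embeddings before each call.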

By letting tokens carry partial beliefs during generation, the approach retains most of the base model's performance at a substantially lower decoding cost.