Masked Diffusion Decoding as x-Prediction Flow

This paper introduces a continuous decoding framework for masked diffusion language models (MDLMs) that reinterprets mask prediction as clean-state prediction to induce a continuous flow in input embedding space. By allowing tokens to accumulate partial progress and remain revisable, the method addresses the premature commitments inherent in standard binary unmasking regimes.
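
As an illustration of clean-state prediction driving a flow in embedding space, the following is a minimal sketch and not the paper's implementation. It assumes a straight-line path from the current state to the predicted clean state, integrated with Euler steps. This summary does not specify the paper's exact parameterization. The names `expected_embedding` and `flow_step`, the toy embedding table, and the probabilities are all hypothetical.

```python
def expected_embedding(probs, embedding_table):
    """Clean-state estimate: probability-weighted average of token embeddings.

    Assumption: the model's mask prediction (a distribution over the
    vocabulary) is read as a prediction of the clean embedding.
    """
    dim = len(embedding_table[0])
    return [
        sum(p * row[d] for p, row in zip(probs, embedding_table))
        for d in range(dim)
    ]


def flow_step(x, x_hat, t, dt):
    """One Euler step from x toward x_hat along a straight path.

    Uses the common x-prediction velocity (x_hat - x) / (1 - t), so the
    step covers a fraction dt / (1 - t) of the remaining distance.
    """
    if t >= 1.0:
        return list(x_hat)
    rate = min(1.0, dt / (1.0 - t))
    return [xi + rate * (hi - xi) for xi, hi in zip(x, x_hat)]


# Toy setup: two-token vocabulary with 2-d embeddings (hypothetical values).
embedding_table = [[1.0, 0.0], [0.0, 1.0]]
mask_embedding = [0.0, 0.0]
probs = [0.75, 0.25]  # stand-in for the model's prediction at one position

x_hat = expected_embedding(probs, embedding_table)
x_half = flow_step(mask_embedding, x_hat, t=0.0, dt=0.5)
x_full = flow_step(x_half, x_hat, t=0.5, dt=0.5)
print(x_half)  # [0.375, 0.125]
print(x_full)  # [0.75, 0.25]
```

After the first step the position sits halfway between the mask embedding and the predicted clean state, so it carries partial progress but can still move if a later prediction changes `x_hat`.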

The framework replaces globally synchronous schedules with confidence-based asynchronous updates to handle uneven contextual constraints across positions.
A lightweight policy network, trained with reinforcement learning, manages the decoding process.
Applied to the pretrained LLaDA model, the continuous decoder retains 97% of the original model's performance on HumanEval while using only 25% of the decoding budget.
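
The confidence-based asynchronous updates can be pictured with a small sketch. It assumes each position tracks a scalar progress value in [0, 1] and advances in proportion to the model's top-token probability. This fixed rule is only a stand-in for the learned policy network, whose inputs and outputs are not described in this summary. The function name `async_progress_update` and the parameter `base_step` are hypothetical.

```python
def async_progress_update(progress, token_probs, base_step=0.5):
    """Advance each position by an amount scaled by its confidence.

    progress    : per-position progress values in [0, 1]
    token_probs : per-position probability distributions over the vocabulary
    base_step   : largest step any position may take in one update (assumed)
    """
    updated = []
    for current, probs in zip(progress, token_probs):
        confidence = max(probs)  # top-token probability as confidence proxy
        updated.append(min(1.0, current + base_step * confidence))
    return updated


# Position 0 is strongly constrained by context, position 1 is not.
progress = [0.0, 0.0]
token_probs = [[0.9, 0.1], [0.5, 0.5]]
progress = async_progress_update(progress, token_probs)
print(progress)
```

Under this rule the well-constrained position moves ahead faster than the ambiguous one, in contrast to a globally synchronous schedule that would advance both by the same amount.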

Because tokens can hold partial beliefs during generation instead of committing outright, the decoder preserves most of the model's accuracy at a fraction of the decoding cost.

Benchmarks