IceFog72 has implemented an experimental "scatter" sampler for llama.cpp that locally smooths the next-token probability distribution among top candidates. This approach aims to reduce generation rigidity without introducing noise from the deep tail of the distribution.
- The sampler uses a local diffusion step over token rank, allowing nearby ranks to exchange probability mass while preserving the filtered candidate set.
- It is positioned in the default sampler chain between "xtc" and "temperature" but is disabled by default.
- Features include a scattering strength that is either fixed or adapted through entropy feedback, optional repeated-token absorption, and collision gating.
- The implementation includes native API functions and invariant tests within the llama.cpp framework.
The sampler provides a more localized alternative to raising temperature, offering finer control over text generation diversity while avoiding incoherent jumps caused by weak tail tokens.
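The idea of diffusing probability over token rank can be illustrated with a small Python sketch. This is not the llama.cpp implementation: the nearest-neighbour exchange rule, the entropy-based strength rule, and every name and default here (`scatter_step`, `adaptive_strength`, `max_strength`, `target`) are assumptions chosen only to show how mass might move between adjacent ranks while the candidate set and the total mass stay unchanged.

```python
import math


def scatter_step(probs, strength):
    """One nearest-neighbour diffusion step over rank (illustrative only).

    probs: candidate probabilities ordered by rank, highest first.
    strength: fraction of each neighbouring gap that is exchanged, in [0, 0.5].
    """
    if not 0.0 <= strength <= 0.5:
        raise ValueError("strength must be in [0, 0.5]")
    out = list(probs)
    for i in range(len(probs) - 1):
        # Mass flows from the larger neighbour to the smaller one,
        # so the total is conserved and no new candidates appear.
        flow = strength * (probs[i] - probs[i + 1])
        out[i] -= flow
        out[i + 1] += flow
    return out


def normalized_entropy(probs):
    """Shannon entropy divided by its maximum, log(len(probs))."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(len(probs))


def adaptive_strength(probs, max_strength=0.25, target=0.8):
    """Hypothetical feedback rule: scatter more when entropy is below target."""
    deficit = max(0.0, target - normalized_entropy(probs))
    return max_strength * min(1.0, deficit / target)


# A peaked distribution over four already-filtered candidates.
top = [0.70, 0.20, 0.07, 0.03]
smoothed = scatter_step(top, adaptive_strength(top))
```

In this sketch a peaked distribution gives up a little of its leading mass to the next ranks, while a uniform one gets a strength of zero and is left untouched. Raising temperature, by contrast, rescales every candidate at once rather than only neighbouring ranks.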