The Top-N-Sigma sampler currently ends with an unconditional softmax and sort, which is wasted work when the next sampler in the chain is Dist. This PR removes that step. On an M3 Max MacBook Pro with the google_gemma-4-E4B-it-Q8_0 model, throughput improves by 50% and per-token time drops by 10 ms. The change may affect sampler chains that relied on the candidates leaving Top-N-Sigma sorted or normalized, and it has not yet been verified on all backends and models.
Top-N-Sigma: Remove unconditional softmax+sort
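
To illustrate why the trailing softmax and sort are not needed for the filtering itself, here is a minimal Python sketch. It is not the code touched by this PR. It assumes the usual Top-N-Sigma rule (keep every token whose logit is at least `max - n * sigma`), and the names `top_n_sigma_filter` and `sample_dist` are hypothetical, chosen for this example only. The filter needs just a max, a mean, and a standard deviation over the raw logits, and a Dist-style sampler can normalize the survivors on its own.

```python
import math
import random


def top_n_sigma_filter(logits, n=1.0):
    """Keep (token_id, logit) pairs with logit >= max - n * std.

    Survivors keep their original order and raw logit values:
    no sort and no softmax are performed here.
    """
    # Assumption of this sketch: n <= 0 disables the filter.
    if n <= 0.0 or len(logits) <= 1:
        return list(enumerate(logits))

    # Ignore tokens already masked out with -inf when computing statistics.
    finite = [x for x in logits if x != -math.inf]
    if not finite:
        return []

    max_logit = max(finite)
    mean = sum(finite) / len(finite)
    variance = sum((x - mean) ** 2 for x in finite) / len(finite)
    threshold = max_logit - n * math.sqrt(variance)

    return [(i, x) for i, x in enumerate(logits) if x >= threshold]


def sample_dist(candidates, rng):
    """Stand-in for a Dist-style sampler: it normalizes the logits itself."""
    max_logit = max(x for _, x in candidates)
    # Subtract the max before exponentiating for numerical stability.
    weights = [math.exp(x - max_logit) for _, x in candidates]
    token_ids = [i for i, _ in candidates]
    return rng.choices(token_ids, weights=weights, k=1)[0]


if __name__ == "__main__":
    kept = top_n_sigma_filter([2.0, 0.0, 1.9, -5.0], n=0.5)
    print("kept candidates:", kept)
    print("sampled token id:", sample_dist(kept, random.Random(0)))
```

Because `sample_dist` computes its own probabilities from unsorted logits, any softmax or sort done by the filter beforehand would be repeated or discarded. A sampler placed after the filter that does expect sorted or normalized input would have to do that work itself.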