A measurement study finds that 26 semantic post-hoc operators do not improve held-out accuracy over Best-of-N in frozen small code models. While two operators—expression-layer recovery and adaptive consensus early-stop—offer benefits in compute efficiency or program recovery, none outperform BoN in accuracy. The results highlight systemic limitations in error detection and coverage, suggesting that model harnesses and error coverage must be improved before post-hoc reasoning is considered.