Code generation
media r/LocalLLaMA · 9d ago

AeroLLM: Fast, open-source AI app for Apple Silicon

AeroLLM is a fast, open-source chat application for Apple Silicon devices, built on the MLX backend. It runs text-to-speech, speech-to-text, and large language models locally, with models downloaded directly from Hugging Face according to available RAM. The app is not notarised because the developer lacks an Apple Developer membership, but users can follow the provided steps to run it as a signed macOS app.

arxiv arXiv cs.CL · 9d ago

Post-Hoc Operators Fail to Improve Accuracy in Small Code Models

A measurement study finds that 26 semantic post-hoc operators do not improve held-out accuracy over Best-of-N (BoN) in frozen small code models. Two operators, expression-layer recovery and adaptive consensus early-stop, offer benefits in program recovery or compute efficiency, but none outperforms BoN in accuracy. The results highlight systemic limitations in error detection and coverage, suggesting that model harnesses and error coverage must be improved before post-hoc reasoning is considered.

arxiv arXiv cs.LG · 9d ago

Post-Hoc Falsification Operators Fail to Improve Accuracy in Small Code Models

A measurement study finds that 26 semantic post-hoc operators do not improve held-out accuracy over Best-of-N (BoN) in frozen small code models. Some operators reduce compute usage or recover correct programs, but none outperforms BoN in accuracy because of systemic limitations such as coverage walls and consensus traps. One operator, expression-layer recovery (M1), improves performance on HumanEval+ by 12 tasks, with no harm or leakage, and shows consistent results across model cells.