The llama.cpp project has released version b9835, which fixes the stop and reasoning-skip controls in the user interface when running in single-model mode. Prebuilt binaries are available for the following platforms:
- macOS: Binaries provided for Apple Silicon (arm64) and Intel (x64), with KleidiAI disabled on Apple Silicon; iOS XCFramework included.
- Linux: Builds available for Ubuntu x64 and arm64 (CPU, Vulkan, ROCm 7.2, OpenVINO, SYCL FP32/FP16).
- Android: CPU binary provided for arm64 architecture.
- Windows: Binaries for x64 and arm64 CPUs, plus GPU support via CUDA 12/13, Vulkan, OpenCL Adreno, OpenVINO, SYCL, and HIP.
- openEuler: Builds for x86 (310p, 910b ACL Graph) and aarch64 (310p, 910b ACL Graph), with standard support disabled.
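For readers who want to check the platform list programmatically, here is a minimal sketch that transcribes it into a lookup table. The dictionary layout and the names `RELEASE_BUILDS` and `backends_for` are hypothetical and are not part of llama.cpp. The table records backends per operating system only, because the list does not say which backend is built for which architecture, so treat a result as "listed for this OS", not as a guarantee for a specific architecture.

```python
# Platform table transcribed from the b9835 release list above.
# Layout and names are illustrative only; they are not llama.cpp identifiers.
RELEASE_BUILDS = {
    "macos": {"archs": {"arm64", "x64"}, "backends": ["CPU"]},
    "linux": {
        "archs": {"x64", "arm64"},
        "backends": ["CPU", "Vulkan", "ROCm 7.2", "OpenVINO", "SYCL FP32", "SYCL FP16"],
    },
    "android": {"archs": {"arm64"}, "backends": ["CPU"]},
    "windows": {
        "archs": {"x64", "arm64"},
        "backends": ["CPU", "CUDA 12", "CUDA 13", "Vulkan", "OpenCL Adreno",
                     "OpenVINO", "SYCL", "HIP"],
    },
    "openeuler": {"archs": {"x86", "aarch64"}, "backends": ["310p", "910b ACL Graph"]},
}


def backends_for(os_name: str, arch: str) -> list[str]:
    """Return the backends listed for an OS, or [] if the OS/arch pair is not listed.

    Architecture names must match the spelling used in the release list
    (for example "x64", not "x86_64"); no alias handling is attempted.
    """
    entry = RELEASE_BUILDS.get(os_name.lower())
    if entry is None or arch.lower() not in entry["archs"]:
        return []
    return list(entry["backends"])


if __name__ == "__main__":
    print(backends_for("Windows", "arm64"))
    print(backends_for("Linux", "riscv64"))  # not in the list, so empty
```

Returning an empty list for an unlisted pair keeps the caller's logic simple: an empty result means no prebuilt binary is mentioned for that combination in this release summary.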
Alongside the single-model UI fix, the release ships builds for a wide range of operating systems and accelerators.