llama.cpp release b9829 reduces logging output in the server, the common components, and the speculative decoding modules. It also standardizes naming by replacing the CMN_ prefix with COM_.

  • Server log output has been reduced, so logs are less noisy.
  • macOS Apple Silicon builds are available, but KleidiAI support is disabled.
  • Linux binaries cover Ubuntu x64, arm64, s390x, Vulkan, ROCm 7.2, OpenVINO, and SYCL FP32/FP16.
  • Windows releases include CPU, OpenCL Adreno, CUDA 12.4/13.3, Vulkan, OpenVINO, SYCL, and HIP variants.
  • Android arm64 (CPU) and iOS XCFramework binaries are provided.
  • openEuler support is disabled for x86 but available for aarch64 with ACL Graph.

The release ships prebuilt binaries for the operating systems and hardware accelerators listed above, including Windows builds for CUDA 12.4 and 13.3, and reduces log noise in server deployments.
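
For downstream code that still references identifiers with the old CMN_ prefix, the rename could be applied mechanically. The following is a minimal sketch, not the project's own migration tooling: the function name `rename_prefix` and the sample identifier `CMN_EXAMPLE` are hypothetical, and it assumes the affected identifiers simply start with CMN_ and keep the rest of their name unchanged.

```python
import re

# Assumption: affected identifiers begin with "CMN_" and only the prefix changes.
OLD_PREFIX = "CMN_"
NEW_PREFIX = "COM_"

# \b ensures we only match at the start of an identifier, so names that merely
# contain "CMN_" in the middle (e.g. MY_CMN_THING) are left alone.
_PATTERN = re.compile(r"\b" + re.escape(OLD_PREFIX) + r"(?=\w)")


def rename_prefix(source: str) -> str:
    """Return `source` with the old identifier prefix replaced by the new one."""
    return _PATTERN.sub(NEW_PREFIX, source)


if __name__ == "__main__":
    # CMN_EXAMPLE is a made-up identifier used only for illustration.
    sample = "CMN_EXAMPLE(x); MY_CMN_EXAMPLE(y);"
    print(rename_prefix(sample))
```

Matching on a word boundary rather than doing a plain string replace avoids rewriting unrelated names that happen to contain the same characters. Reviewing the resulting diff by hand is still advisable, since the exact set of renamed identifiers is not listed here.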