The llama.cpp project has released version b9816, which syncs with the ggml library. The release provides pre-built binaries for macOS, iOS, Linux, Windows, Android, and openEuler.

  • macOS builds are available for Apple Silicon (arm64) and Intel (x64); KleidiAI support is disabled.
  • Linux binaries cover Ubuntu x64 and arm64 CPU, Vulkan, ROCm 7.2, OpenVINO, and SYCL FP32/FP16 variants.
  • Windows releases include CPU, CUDA 12.4/13.3, Vulkan, OpenVINO, SYCL, HIP, and OpenCL Adreno options.
  • Android arm64 (CPU) and iOS XCFramework binaries are provided for mobile deployment.
  • openEuler support includes x86 and aarch64 builds with ACL Graph, though standard openEuler is disabled.

With these pre-built binaries, users can run llama.cpp b9816 on the operating systems, CPU architectures, and GPU backends listed above without compiling from source, using the synchronized ggml code.
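
To see which of the variants listed above might apply to a given machine, a short script can normalize the host's OS and architecture and list the matching backend labels. The sketch below is illustrative only: the labels are copied from this summary, the names `BACKENDS_BY_OS`, `normalize_arch`, and `candidate_variants` are hypothetical, and the output is not a list of actual release asset file names. Not every backend is necessarily published for every architecture.

```python
import platform

# Backend labels copied from the summary above, grouped by OS name as
# reported by platform.system(). Illustrative only: these are not asset
# file names, and the summary names no backend variants for macOS.
BACKENDS_BY_OS = {
    "Linux": ["CPU", "Vulkan", "ROCm 7.2", "OpenVINO", "SYCL FP32", "SYCL FP16"],
    "Windows": ["CPU", "CUDA 12.4", "CUDA 13.3", "Vulkan", "OpenVINO",
                "SYCL", "HIP", "OpenCL Adreno"],
}

# platform.machine() spells the same architecture differently per OS.
ARCH_ALIASES = {
    "x86_64": "x64",
    "amd64": "x64",
    "arm64": "arm64",
    "aarch64": "arm64",
}


def normalize_arch(machine: str) -> str:
    """Map a platform.machine() value to the labels used in the summary."""
    key = machine.lower()
    return ARCH_ALIASES.get(key, key)


def candidate_variants(system: str, machine: str) -> list[str]:
    """Return human-readable variant labels worth checking for this host."""
    arch = normalize_arch(machine)
    return [f"{system} {arch} ({backend})"
            for backend in BACKENDS_BY_OS.get(system, [])]


if __name__ == "__main__":
    for label in candidate_variants(platform.system(), platform.machine()):
        print(label)
```

The result is a shortlist to compare against the release page, not a guarantee that a given combination exists.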