The llama.cpp project has published the b9843 release, providing pre-built binaries for macOS, Linux, Android, Windows, and openEuler across various hardware architectures.

  • Reverts PR #20793 to reintroduce fewer synchronizations during split compute.
  • Disables KleidiAI support for macOS Apple Silicon builds.
  • Provides CPU, Vulkan, ROCm, OpenVINO, SYCL, CUDA, HIP, and OpenCL variants for Linux and Windows.
  • Includes iOS XCFramework, Android arm64 (CPU), and UI binaries.

The pre-built binaries let users run llama.cpp on a wide range of devices and accelerators without compiling from source.
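
As a rough illustration of choosing one of these pre-built variants, the sketch below lists the files attached to the release through the GitHub REST API and filters them by keyword. It assumes the project lives at ggml-org/llama.cpp on GitHub and that the release tag is the build name b9843. The helper names fetch_asset_names and filter_assets are hypothetical, and this summary does not state the actual asset file names, so inspect the returned list before relying on any keyword.

```python
import json
import urllib.request

# Assumption: repository path and tag are inferred from the release name;
# adjust them if the project is hosted elsewhere.
RELEASE_URL = "https://api.github.com/repos/ggml-org/llama.cpp/releases/tags/b9843"


def fetch_asset_names(url=RELEASE_URL):
    """Return the file names of all assets attached to the release."""
    request = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        release = json.load(response)
    return [asset["name"] for asset in release.get("assets", [])]


def filter_assets(names, *keywords):
    """Keep the names that contain every keyword, ignoring case."""
    wanted = [keyword.lower() for keyword in keywords]
    return [name for name in names if all(k in name.lower() for k in wanted)]


# Example (requires network access):
#   print(filter_assets(fetch_asset_names(), "vulkan"))
```

Filtering on substrings keeps the sketch independent of any particular naming scheme: pass whatever keywords match the platform and backend you need, such as an operating system name together with a backend name.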