The llama.cpp project has published the b9802 release, offering pre-built binaries across multiple operating systems and hardware architectures. The builds target CPUs, GPUs, and specialized AI accelerators on macOS, Linux, Windows, Android, and openEuler.

  • macOS builds are available for Apple Silicon (arm64) and Intel (x64), alongside an iOS XCFramework.
  • Linux binaries cover Ubuntu x64 and arm64 CPUs, Vulkan, ROCm 7.2, OpenVINO, and SYCL FP32/FP16 variants.
  • Windows releases include CPU, CUDA 12.4 and 13.3, Vulkan, OpenVINO, SYCL, HIP, and OpenCL Adreno options.
  • Android support is provided for arm64 CPUs, while openEuler builds are available for x86 and aarch64 architectures.

With these binaries, users can run llama.cpp on a wide variety of devices without compiling it from source.
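As a rough illustration of choosing among the CPU builds listed above, the sketch below maps the values reported by Python's `platform` module to the matching build family. The function name `suggest_cpu_build`, the lookup table, and the labels are hypothetical: they are descriptive strings, not actual release asset filenames. The Windows entry assumes an x64 machine reporting `AMD64`, which the release summary does not state. The sketch does not distinguish Linux distributions (openEuler has its own builds), does not cover Android, and ignores GPU and accelerator variants.

```python
import platform

# CPU build families from the release summary, keyed by (OS, architecture)
# as reported by platform.system() and platform.machine().
# Labels are descriptive only; they are NOT real asset filenames.
CPU_BUILDS = {
    ("Darwin", "arm64"): "macOS Apple Silicon (arm64)",
    ("Darwin", "x86_64"): "macOS Intel (x64)",
    ("Linux", "x86_64"): "Ubuntu x64 CPU",
    ("Linux", "aarch64"): "Ubuntu arm64 CPU",
    # Assumption: the Windows CPU build targets x64, reported as "AMD64".
    ("Windows", "AMD64"): "Windows CPU",
}


def suggest_cpu_build(system=None, machine=None):
    """Return a build-family label for the given (or current) host, or None."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return CPU_BUILDS.get((system, machine))
```

Calling `suggest_cpu_build()` with no arguments inspects the current machine; a result of `None` means the host is not in this table, so the build has to be chosen by hand from the list above.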