The llama.cpp project has released version b9814, which optimizes the `mul_mat_vecq` operation in the Vulkan backend for the AMD MI50 GPU. The release ships with pre-built binaries for multiple operating systems and hardware architectures:

  • macOS Apple Silicon (arm64) and Intel (x64) builds are available, with KleidiAI support disabled for Apple Silicon.
  • Linux binaries cover Ubuntu x64 and arm64 CPU, Vulkan, ROCm 7.2, OpenVINO, and SYCL FP32/FP16 variants.
  • Windows releases include CPU, OpenCL Adreno, CUDA 12.4 and 13.3, Vulkan, OpenVINO, SYCL, and HIP backends.
  • Android arm64 (CPU) builds and openEuler x86/aarch64 builds for 310p and 910b chips are provided; the standard openEuler x86 build is disabled.

Together, these builds let users run llama.cpp on a wide range of hardware configurations, and the Vulkan change specifically targets AMD MI50 GPUs.