The llama.cpp b9817 release updates the OpenVINO backend to OpenVINO 2026.2.1 and makes its release packages self-contained. It also includes several operator changes in that backend, such as removing hardcoded compute_op_type sets and enabling softmax with sink input.

  • Update to OpenVINO 2026.2.1 with self-contained release packages.
  • Remove hardcoded compute_op_type sets in the OpenVINO backend.
  • Enable softmax with sink input.
  • Optimize the mul_mat_id conversion for large sizes.
  • Modify add_id to support 2D/4D inputs.
  • Add glu_swiglu_oai operator support.
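
As a rough illustration of the "softmax with sink input" item, here is a minimal sketch. It assumes the item refers to the attention-sink variant of softmax, in which one extra logit (the sink) takes part in the normalization but produces no output entry. The function name softmax_with_sink is hypothetical, and this is plain Python for illustration only, not the OpenVINO backend's conversion code.

```python
import math

def softmax_with_sink(logits, sink):
    """Softmax over `logits` with an extra `sink` logit in the normalization.

    Hypothetical helper for illustration; not part of llama.cpp or OpenVINO.
    """
    # Shift by the largest value (sink included) for numerical stability.
    m = max(max(logits), sink)
    exps = [math.exp(x - m) for x in logits]
    # The sink contributes to the denominator only.
    denom = sum(exps) + math.exp(sink - m)
    # No output entry is produced for the sink itself.
    return [e / denom for e in exps]

probs = softmax_with_sink([1.0, 2.0, 3.0], sink=0.5)
print(probs, sum(probs))
```

Under this assumption the returned probabilities sum to less than 1, because the sink absorbs the remaining mass. As the sink logit becomes very negative, the result approaches an ordinary softmax.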

The release provides pre-built binaries for macOS (Apple Silicon and Intel), iOS, Linux (Ubuntu on x64, arm64, and s390x), Android, Windows (CPU, CUDA 12/13, Vulkan, OpenVINO, SYCL, and HIP builds), and openEuler, covering several CPU architectures.