The llama.cpp b9817 release updates the OpenVINO backend to OpenVINO 2026.2.1 and makes its release packages self-contained. It also brings several operator improvements to the OpenVINO backend, including the removal of hardcoded compute_op_type sets and support for softmax with a sink input.
- Update to OpenVINO 2026.2.1 with self-contained release packages.
- Remove hardcoded compute_op_type sets in the OpenVINO backend.
- Enable softmax with a sink input.
- Optimize the mul_mat_id conversion for large sizes.
- Modify add_id to support 2D/4D inputs.
- Add glu_swiglu_oai operator support.
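To make two of the operator changes above more concrete, here is a minimal pure-Python sketch of what "softmax with sink input" and a swiglu_oai-style gated activation are generally understood to compute. It is not the OpenVINO backend's conversion code, which the release notes do not show. The function names, the scalar and list interface, and the default values `alpha=1.702` and `limit=7.0` are assumptions chosen for illustration, following the formulation commonly used for gpt-oss-style models.

```python
import math


def softmax_with_sink(logits, sink):
    """Softmax over `logits` with one extra "sink" logit in the denominator.

    The sink absorbs probability mass but produces no output entry, so the
    returned values sum to less than 1 whenever the sink is finite.
    (Illustrative helper; not an API of llama.cpp or OpenVINO.)
    """
    m = max(max(logits), sink)              # shift by the max for stability
    exps = [math.exp(x - m) for x in logits]
    denom = sum(exps) + math.exp(sink - m)  # sink contributes only here
    return [e / denom for e in exps]


def _sigmoid(z):
    # Numerically stable logistic function for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def swiglu_oai(gate, up, alpha=1.702, limit=7.0):
    """Clamped SwiGLU variant (assumed formulation, hypothetical defaults).

    The gate is clamped from above, the linear branch is clamped on both
    sides, and the linear branch gets a +1 offset before the product.
    """
    x = min(gate, limit)
    y = max(-limit, min(up, limit))
    return x * _sigmoid(alpha * x) * (y + 1.0)


if __name__ == "__main__":
    print(softmax_with_sink([1.0, 2.0, 3.0], sink=0.5))
    print(swiglu_oai(1.0, 0.0))
```

Passing `float("-inf")` as the sink makes its term vanish, so the function reduces to an ordinary softmax. That is a convenient sanity check when comparing a backend's output against a reference.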
The release provides pre-built binaries for macOS (Apple Silicon and Intel), iOS, Linux (Ubuntu x64, arm64, s390x), Android, Windows (CPU, CUDA 12/13, Vulkan, OpenVINO, SYCL, HIP), and openEuler across various CPU architectures.