llama.cpp release b9767 improves MTP inference by using matrix-vector (mat-vec) paths for small batches, and it includes updated GPU support. The release provides binaries for macOS, Linux, Android, Windows, and openEuler across multiple architectures and compute backends, including Vulkan, CUDA, OpenVINO, and SYCL.
llama.cpp release b9767 adds GPU and multi-platform support