llama.cpp now supports the granite-speech-4.1-2b-plus and LFM2.5-ColBERT/Embedding-350M models. The Vulkan backend gains support for 3D convolutions, aligned operations, and GET_ROWS_BACK, along with improved numerical stability in feedforward layers. Other changes include UI enhancements and improved backend test coverage.
llama.cpp updates: Granite-Speech, LFM2.5-ColBERT models, Vulkan backend enhancements