llama.cpp release b9670 includes fixes for NVFP4 edge cases in llama-graph, such as moving post-GEMM MUL operations and restricting build_ffn to supported combinations. The release provides binaries for macOS, Linux, Android, Windows, and openEuler across multiple architectures and backend options, including CUDA, Vulkan, SYCL, and OpenVINO.
llama.cpp Release b9670: Fixes and New Builds
from English