llama.cpp b9851 release fixes CUDA integer truncation and provides binaries

The llama.cpp project has released version b9851. The release fixes integer truncation and overflow in the CUDA flash_attn_mask_to_KV_max kernel, specifically in how that kernel handles KQ mask strides.
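To illustrate the general class of bug, here is a minimal host-side C++ sketch of how an offset computed from a row index and a stride can be truncated when it is kept in 32 bits. This is not the llama.cpp kernel code: the function names `offset_wide` and `offset_truncated` and the example sizes are hypothetical, chosen only to show the arithmetic.

```cpp
#include <cstdint>

// Hypothetical helper: element offset of row `row` in a buffer whose rows
// are `stride` elements apart, computed entirely in 64-bit arithmetic.
constexpr int64_t offset_wide(int64_t row, int64_t stride) {
    return row * stride;
}

// Same computation, but the result is narrowed to 32 bits. Converting to
// uint32_t is well-defined in C++ (the value wraps modulo 2^32), so this
// safely simulates what is lost when an offset does not fit in 32 bits.
constexpr uint32_t offset_truncated(int64_t row, int64_t stride) {
    return static_cast<uint32_t>(row * stride);
}

// Assumed example sizes: 70000 * 70000 = 4.9e9, which exceeds 2^32.
static_assert(offset_wide(70000, 70000) == 4900000000LL,
              "64-bit offset keeps the full value");
static_assert(offset_truncated(70000, 70000) == 605032704u,
              "32-bit offset wraps and points at the wrong element");
```

With these assumed sizes the truncated offset lands about 4.3 billion elements short of the intended position, which is why index arithmetic over large strides is usually widened to 64 bits before multiplying.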

macOS binaries are available for Apple Silicon (arm64); KleidiAI support is disabled.
Linux builds cover Ubuntu x64 and arm64 for CPU, Vulkan, ROCm 7.2, OpenVINO, and SYCL FP32/FP16.
Android arm64 (CPU) binaries are provided for mobile devices.
Windows releases include CPU, OpenCL Adreno, CUDA 12/13, Vulkan, OpenVINO, SYCL, and HIP variants.
openEuler builds for x86 and aarch64 architectures are listed, with some configurations disabled.
A standalone UI binary is also included in the release assets.

For CUDA users, this release corrects integer truncation and overflow involving KQ mask strides in the affected kernel. It also provides pre-built binaries for the major operating systems and hardware accelerators listed above.