llama.cpp b9856 release with CUDA restrict + PDL for FA

The llama.cpp project has released version b9856, which introduces consistent use of the `restrict` keyword and Programmatic Dependent Launch (PDL) for Flash Attention (FA) in the CUDA backend. Pre-built binaries are available for macOS, Linux, Android, Windows, and openEuler across a range of hardware backends.

macOS Apple Silicon (arm64) builds are available, with KleidiAI support still disabled.
Linux binaries cover CPU (x64, arm64, s390x), Vulkan, ROCm 7.2, OpenVINO, and SYCL FP32/FP16.
Windows releases include CPU, OpenCL Adreno, CUDA 12.4/13.3, Vulkan, OpenVINO, SYCL, and HIP.
Android arm64 (CPU) and UI binaries are also provided for this release.
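
The release notes do not show the kernels that changed, so the following is only a rough illustration of what the `restrict` qualifier promises the compiler, written as a minimal C sketch. The function `axpy` is a hypothetical helper and is not taken from llama.cpp; CUDA C++ spells the same qualifier `__restrict__`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not taken from llama.cpp: computes y[i] += a * x[i].
 *
 * `restrict` tells the compiler that, for the duration of this call, the
 * memory written through y is not also accessed through x. With that
 * guarantee the compiler may keep loaded values in registers and vectorize
 * the loop instead of reloading x after every store to y. */
void axpy(size_t n, float a, const float *restrict x, float *restrict y) {
    /* Precondition: both pointers must be valid when there is work to do. */
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; ++i) {
        y[i] += a * x[i];
    }
}
```

Calling `axpy` with two overlapping buffers would break that promise and is undefined behavior, which is why the qualifier is only safe when applied consistently to pointers that are known not to alias.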