The llama.cpp project has released version b9860, introducing a new public C API function named `llama_ftype_name` to expose the model file type (quantization) name.

  • The `llama_ftype_name` function returns strings such as "Q8_0" or "Q4_K - Medium". The returned pointer stays valid for the model's lifetime, and the function returns nullptr for invalid input.
  • The implementation now prepends the "(guessed)" label instead of appending it, which removes a non-thread-safe static string and makes the function allocation-free.
  • Binaries are available for macOS (Apple Silicon and Intel), Linux (CPU, Vulkan, ROCm, OpenVINO, SYCL), Android, Windows (CPU, CUDA 12/13, Vulkan, OpenCL, OpenVINO, SYCL, HIP), and openEuler.

This update lets developers query the quantization format of a loaded model programmatically instead of relying on external metadata or guesswork.
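
The contract described above (a name returned as a constant string, no allocation, and a null pointer for invalid input) can be illustrated with a minimal, self-contained sketch. This is not llama.cpp's implementation, and it does not use the real `llama_ftype_name` signature, which is not given here. The names `demo_ftype`, `demo_ftype_name`, and `demo_ftype_label` are hypothetical and exist only for illustration.

```c
#include <stddef.h>

/* Hypothetical file-type enum for illustration only; these are NOT
 * llama.cpp's real llama_ftype values. */
enum demo_ftype {
    DEMO_FTYPE_Q8_0 = 0,
    DEMO_FTYPE_Q4_K_M = 1,
    DEMO_FTYPE_COUNT
};

/* Returns a pointer to a string literal, which has static storage
 * duration: nothing is allocated, and no mutable static buffer is
 * shared between threads. Returns NULL for values outside the table. */
const char * demo_ftype_name(int ftype) {
    static const char * const names[DEMO_FTYPE_COUNT] = {
        [DEMO_FTYPE_Q8_0]   = "Q8_0",
        [DEMO_FTYPE_Q4_K_M] = "Q4_K - Medium",
    };
    if (ftype < 0 || ftype >= DEMO_FTYPE_COUNT) {
        return NULL;
    }
    return names[ftype];
}

/* Caller-side pattern: check for NULL before using the name. */
const char * demo_ftype_label(int ftype) {
    const char * name = demo_ftype_name(ftype);
    return name != NULL ? name : "unknown";
}
```

Because the returned pointers refer to constant data, a caller never frees them; the only obligation is the null check shown in `demo_ftype_label`.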