The llama.cpp project has released version b9833, which adds a dedicated parser for the MiniCPM5 model along with bug fixes and refactoring. The update covers tool call parsing, grammar simplification, and a fix to the Jinja min/max API so that it matches Jinja2 behavior.

  • Implemented a dedicated MiniCPM5 PEG parser with XML tool call support and fixed streaming tool-argument placeholders.
  • Refactored the chat module to use an autoparser for MiniCPM5 while reverting shared mappers and history fallbacks.
  • Fixed the Jinja min/max API to match the Jinja2 specification and updated the template name to openbmb-MiniCPM5-1B.jinja.
  • Provided binaries for macOS (Apple Silicon, Intel), iOS, Linux (CPU, Vulkan, ROCm, OpenVINO, SYCL), Android, Windows (CPU, CUDA 12/13, Vulkan, OpenCL, HIP), and openEuler.
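
To give a rough idea of what XML tool call parsing involves, here is a minimal Python sketch. It is not the llama.cpp implementation: the release uses a PEG parser, whereas this sketch swaps in a plain regular expression, and the tag layout (`<tool_call>`, `<name>`, `<arguments>`) is a hypothetical format chosen for illustration, not the actual MiniCPM5 output format. The names `ToolCall` and `split_tool_calls` are likewise made up for this example.

```python
import json
import re
from dataclasses import dataclass

# ASSUMPTION: hypothetical tag layout for illustration only, e.g.
#   <tool_call><name>get_weather</name><arguments>{"city": "Paris"}</arguments></tool_call>
# The real MiniCPM5 format may differ.
TOOL_CALL_RE = re.compile(
    r"<tool_call>\s*<name>(?P<name>.*?)</name>\s*"
    r"<arguments>(?P<args>.*?)</arguments>\s*</tool_call>",
    re.DOTALL,
)


@dataclass
class ToolCall:
    name: str
    arguments: dict


def split_tool_calls(text: str) -> tuple[str, list[ToolCall]]:
    """Separate plain assistant text from embedded tool calls."""
    calls = [
        # Arguments are assumed to be a JSON object inside the tag.
        ToolCall(m.group("name").strip(), json.loads(m.group("args")))
        for m in TOOL_CALL_RE.finditer(text)
    ]
    # Whatever remains after removing the tool call blocks is ordinary content.
    content = TOOL_CALL_RE.sub("", text).strip()
    return content, calls


if __name__ == "__main__":
    reply = (
        "Checking. <tool_call><name>get_weather</name>"
        '<arguments>{"city": "Paris"}</arguments></tool_call>'
    )
    print(split_tool_calls(reply))
```

A regular expression like this only works on a complete reply. Handling partial output during streaming, which the release notes mention in connection with tool-argument placeholders, needs an incremental parser and is outside the scope of this sketch.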

With this release, users can run MiniCPM5 models with dedicated tool call parsing on any of the platforms listed above.