All articles — korshunov.ai

CrewAI 1.15.1a1 Release Notes

The CrewAI 1.15.1a1 update introduces new telemetry tracking, enforces explicit project definitions, and improves the CLI deployment workflow.

github vLLM · 13h ago

v0.24.0

The vLLM v0.24.0 release includes a continuous integration update that raises the GSM8K startup timeout for MoE Refactor Qwen3 NVFP4 configurations.

lab OpenAI News · 15h ago

OpenAI previews GPT-5.6 Sol, Terra, and Luna models

OpenAI has initiated a limited preview of the GPT-5.6 series, introducing three new models: Sol as the flagship, Terra for balanced everyday work, and Luna for fast, affordable tasks. The company plans to make these models generally available in the coming weeks following this initial phase with trusted partners.

github llama.cpp · 15h ago

llama.cpp b9821 Release: CLI Flags and Multi-Platform Binaries

The llama.cpp project has released version b9821, which introduces command-line interface updates allowing users to invoke --version, --licenses, and --help flags. This release provides a comprehensive set of pre-built binaries for macOS, Linux, Android, Windows, and openEuler across various hardware accelerators.

media Hugging Face Forums · 16h ago

Mission: Build RAG System for Endangered Spoken Language

A job posting seeks an experienced NLP or LLM engineer to develop the first Retrieval-Augmented Generation (RAG) localization engine for a low-resource language spoken in South America. The project utilizes a proprietary corpus of pedagogical content and dictionary data developed over four years.

lab Claude Code Releases · 16h ago

Claude Code v2.1.195 Release Notes

Claude Code version 2.1.195 introduces several fixes and improvements, including new environment variables for mouse control in fullscreen mode and corrections to hook matcher logic.

github llama.cpp · 19h ago

llama.cpp b9820 release: reduced CUDA syncs and new binaries

The llama.cpp b9820 release improves performance by reintroducing a change that reduces synchronizations during split compute, specifically targeting CUDA backends. This update also provides pre-built binaries for macOS, Linux, Windows, Android, and openEuler across CPU, GPU, and specialized hardware accelerators.

github llama.cpp · 20h ago

llama.cpp b9816 Release: Sync with ggml and New Binaries

The llama.cpp project has released version b9816, which includes a synchronization with the ggml library. This update provides pre-built binaries for macOS, iOS, Linux, Windows, Android, and openEuler platforms.

github llama.cpp · 20h ago

llama.cpp b9817 release: OpenVINO 2026.2.1 update and operator improvements

The llama.cpp b9817 release updates the OpenVINO backend to version 2026.2.1 and makes its release packages self-contained. This update includes several operator improvements within the OpenVINO backend, such as removing hardcoded compute_op_type sets and enabling softmax with sink input.

github llama.cpp · 22h ago

llama.cpp b9813 release adds Intel Xe-LPG Plus Vulkan support

The llama.cpp b9813 release introduces Vulkan support for Intel Xe-LPG Plus hardware by adding the INTEL_XE1 architecture enum and enabling coopmat1. This update addresses previous code comments, renames the architecture identifier, and includes a Windows driver check.

github llama.cpp · 22h ago

llama.cpp b9814 release with Vulkan optimization for mi50

The llama.cpp project has released version b9814, which includes a Vulkan optimization for the `mul_mat_vecq` operation, specifically targeting the AMD MI50 GPU. This update is accompanied by a comprehensive set of pre-built binaries across multiple operating systems and hardware architectures.

media Hugging Face Forums · 23h ago

How do you evaluate an LLM before deploying it in production?

This Hugging Face discussion thread addresses the methods and considerations for testing Large Language Models to ensure they are suitable for real-world applications.

media Hugging Face Forums · 23h ago

User reports paper indexed but missing from Daily Papers

A user on the Hugging Face forum reports that their arXiv paper, "Agent-as-a-Router: Agentic Model Routing for Coding Tasks," was successfully indexed and claimed but never appeared on the Daily Papers homepage. Despite receiving community upvotes and linking a corresponding dataset, the paper has not been featured after several days.

github MCP (GitHub org) · 1d ago

MCP Python SDK v2.0.0a3 Release Notes

The Model Context Protocol (MCP) Python SDK has released its third alpha version, v2.0.0a3, introducing significant protocol and architectural changes while maintaining backward compatibility for stable 1.x users.

github llama.cpp · 1d ago

llama.cpp b9811 release with Vulkan compiler workaround

The llama.cpp project has released version b9811, which includes a fix for a compiler bug affecting the conv2d coopmat2 path in Vulkan. This workaround is also applied to the CONV_3D implementation based on suggestions from NVIDIA engineer Jeff Bolz.

github llama.cpp · 1d ago

llama.cpp b9810 release adds cublasSgemmBatched mapping and new binaries

The llama.cpp project has released version b9810, introducing a CUDA mapping for `cublasSgemmBatched` in HIP/MUSA vendor headers. This update is accompanied by a comprehensive set of pre-built binaries for macOS, Linux, Windows, Android, and openEuler platforms.

github MCP (GitHub org) · 1d ago

Model Context Protocol Python SDK v1.28.1 Release

The Model Context Protocol Python SDK has released version 1.28.1, introducing updates to stream handling and transport security.

media Hugging Face Forums · 1d ago

Pendo Hiring Staff and Sr AI Engineers in NYC for Novus

Pendo is hiring onsite Staff and Senior AI Engineers in New York City to work on Novus, a production-grade product agent designed to autonomously read live codebases and detect real user pain.

media Hugging Face Forums · 1d ago

eBPF in Go: Observability for AI-Generated Services

This article presents a tutorial on using eBPF with Go to achieve kernel-level observability, addressing the lack of visibility when debugging production issues in AI-generated services.

github llama.cpp · 1d ago

llama.cpp b9804 release: Mamba2 fixes and new binaries

The llama.cpp b9804 release introduces a fix for the Mamba2 architecture by removing a hardcoded 2x expansion factor and an invalid parameter check, allowing support for any expand value. This change updates the `convert_hf_to_gguf.py` script to make the expand parameter optional with a default of 2.
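
As a rough illustration of what an optional expand parameter with a default of 2 means in practice, the sketch below resolves the Mamba2 inner dimension from a config dictionary. It is not the code from `convert_hf_to_gguf.py`: the function name `resolve_d_inner` and the config keys `hidden_size` and `expand` are assumptions chosen for the example.

```python
# Hypothetical sketch: the function name and config keys are illustrative
# assumptions, not taken from convert_hf_to_gguf.py.
def resolve_d_inner(hparams: dict) -> int:
    """Return the inner dimension for a Mamba2-style block."""
    d_model = hparams["hidden_size"]
    # The expand factor is optional; fall back to 2 when the config omits it.
    expand = hparams.get("expand", 2)
    # No hardcoded 2x here: any expand value present in the config is used.
    return expand * d_model


print(resolve_d_inner({"hidden_size": 1024}))
print(resolve_d_inner({"hidden_size": 1024, "expand": 3}))
```

With the default, a config that omits the expand value behaves as it did under the old hardcoded 2x factor, while a config that sets it explicitly is no longer rejected.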