All articles — korshunov.ai


llama.cpp b9820 release: reduced CUDA syncs and new binaries

The llama.cpp b9820 release improves performance by reintroducing a change that reduces the number of synchronizations during split compute, specifically targeting CUDA backends. This update also provides pre-built binaries for macOS, Linux, Windows, Android, and openEuler across CPU, GPU, and specialized hardware accelerators.

github llama.cpp · 6h ago

llama.cpp b9816 Release: Sync with ggml and New Binaries

The llama.cpp project has released version b9816, which includes a synchronization with the ggml library. This update provides pre-built binaries for macOS, iOS, Linux, Windows, Android, and openEuler platforms.

github llama.cpp · 7h ago

llama.cpp b9817 release: OpenVINO 2026.2.1 update and operator improvements

The llama.cpp b9817 release updates the OpenVINO backend to version 2026.2.1 and makes its release packages self-contained. This update includes several operator improvements within the OpenVINO backend, such as removing hardcoded compute_op_type sets and enabling softmax with sink input.

github llama.cpp · 8h ago

llama.cpp b9813 release adds Intel Xe-LPG Plus Vulkan support

The llama.cpp b9813 release introduces Vulkan support for Intel Xe-LPG Plus hardware by adding the INTEL_XE1 architecture enum and enabling coopmat1. This update addresses previous code comments, renames the architecture identifier, and includes a Windows driver check.

github llama.cpp · 8h ago

llama.cpp b9814 release with Vulkan optimization for MI50

The llama.cpp project has released version b9814, which includes a Vulkan optimization for the `mul_mat_vecq` operation, specifically targeting the AMD MI50 GPU. This update is accompanied by a comprehensive set of pre-built binaries across multiple operating systems and hardware architectures.

media Hugging Face Forums · 9h ago

How do you evaluate an LLM before deploying it in production?

This Hugging Face discussion thread addresses the methods and considerations for testing Large Language Models to ensure they are suitable for real-world applications.

media Hugging Face Forums · 9h ago

User reports paper indexed but missing from Daily Papers

A user on the Hugging Face forum reports that their arXiv paper, "Agent-as-a-Router: Agentic Model Routing for Coding Tasks," was successfully indexed and claimed but never appeared on the Daily Papers homepage. Despite receiving community upvotes and linking a corresponding dataset, the paper has not been featured after several days.

github MCP (GitHub org) · 10h ago

MCP Python SDK v2.0.0a3 Release Notes

The Model Context Protocol (MCP) Python SDK has released its third alpha version, v2.0.0a3, introducing significant protocol and architectural changes while maintaining backward compatibility for stable 1.x users.

github llama.cpp · 10h ago

llama.cpp b9811 release with Vulkan compiler workaround

The llama.cpp project has released version b9811, which includes a fix for a compiler bug affecting the conv2d coopmat2 path in Vulkan. This workaround is also applied to the CONV_3D implementation based on suggestions from NVIDIA engineer Jeff Bolz.

github llama.cpp · 12h ago

llama.cpp b9810 release adds cublasSgemmBatched mapping and new binaries

The llama.cpp project has released version b9810, introducing a CUDA mapping for `cublasSgemmBatched` in HIP/MUSA vendor headers. This update is accompanied by a comprehensive set of pre-built binaries for macOS, Linux, Windows, Android, and openEuler platforms.

github MCP (GitHub org) · 12h ago

Model Context Protocol Python SDK v1.28.1 Release

The Model Context Protocol Python SDK has released version 1.28.1, introducing updates to stream handling and transport security.

media Hugging Face Forums · 12h ago

Pendo Hiring Staff and Sr AI Engineers in NYC for Novus

Pendo is hiring onsite Staff and Senior AI Engineers in New York City to work on Novus, a production-grade product agent designed to autonomously read live codebases and detect real user pain.

media Hugging Face Forums · 12h ago

eBPF in Go: Observability for AI-Generated Services

This article presents a tutorial on using eBPF with Go to achieve kernel-level observability, addressing the lack of visibility when debugging production issues in AI-generated services.

github llama.cpp · 17h ago

llama.cpp b9804 release: Mamba2 fixes and new binaries

The llama.cpp b9804 release introduces a fix for the Mamba2 architecture by removing a hardcoded 2x expansion factor and an invalid parameter check, allowing support for any expand value. This change updates the `convert_hf_to_gguf.py` script to make the expand parameter optional with a default of 2.
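
The defaulting behavior described above can be sketched as follows. This is a minimal illustration, not the code from `convert_hf_to_gguf.py`: the helper name `resolve_d_inner` and the config keys `expand` and `hidden_size` are assumptions made for the example.

```python
from typing import Any, Mapping


def resolve_d_inner(hparams: Mapping[str, Any], default_expand: int = 2) -> int:
    """Return the inner width as expand * hidden_size.

    Hypothetical helper: the key names "expand" and "hidden_size" are
    assumptions for illustration, not taken from the llama.cpp source.
    """
    d_model = int(hparams["hidden_size"])
    # Treat the expand factor as optional; fall back to the default of 2
    # when the key is missing or explicitly set to None.
    expand = hparams.get("expand")
    if expand is None:
        expand = default_expand
    # Any expand value is accepted; nothing is hardcoded to 2x here.
    return int(expand * d_model)
```

With this sketch, a config that omits the expand factor resolves to twice the hidden size, while a config that sets it to another value is accepted rather than rejected.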

media Hugging Face Forums · 19h ago

JoeBro: a native macOS AI workspace with zero dependencies

JoeBro is a local-first, native macOS application designed to provide an AI workspace without requiring external dependencies like pip or Docker. It features a bundled Python backend and SQLite storage to ensure all data remains on the user's machine, eliminating telemetry and account requirements.

media Hugging Face Forums · 19h ago

How can I add someone to a Hugging Face dataset/database?

The original post was deleted by its author, so the thread contains no information about adding users to a Hugging Face dataset or database.

github CrewAI · 21h ago

crewAI 1.15.0 Release Notes

The crewAI 1.15.0 release introduces significant enhancements to Flow definitions, including unified declarative loading, inline crew support, and new composite actions like `each` and single agent actions.

github AutoGPT · 21h ago

AutoGPT Platform Beta v0.6.65 Release Notes

The AutoGPT platform has released version 0.6.65, introducing significant updates to the Copilot system, user interface navigation, and infrastructure reliability.

github llama.cpp · 21h ago

llama.cpp b9803 release with OpenCL profiling fix

The llama.cpp project has released version b9803, which includes an OpenCL fix that flushes incomplete profiling batches at shutdown. This update provides binaries for macOS, Linux, Windows, Android, and openEuler across various hardware backends.

github llama.cpp · 22h ago

llama.cpp b9802 release provides binaries for macOS, Linux, Windows, and Android

The llama.cpp project has published the b9802 release, offering pre-built binaries across multiple operating systems and hardware architectures. This update includes support for CPU, GPU, and specialized AI accelerators on platforms such as macOS, Linux, Windows, Android, and openEuler.