All articles — korshunov.ai

How do you evaluate an LLM before deploying it in production?

This Hugging Face forum thread discusses methods and considerations for testing large language models before they are deployed in real-world applications.

media Hugging Face Forums · 9h ago

User reports paper indexed but missing from Daily Papers

A user on the Hugging Face forum reports that their arXiv paper, "Agent-as-a-Router: Agentic Model Routing for Coding Tasks," was successfully indexed and claimed but never appeared on the Daily Papers homepage. Despite receiving community upvotes and linking a corresponding dataset, the paper has not been featured after several days.

github MCP (GitHub org) · 10h ago

MCP Python SDK v2.0.0a3 Release Notes

The Model Context Protocol (MCP) Python SDK has released its third alpha version, v2.0.0a3, introducing significant protocol and architectural changes while maintaining backward compatibility for stable 1.x users.

github llama.cpp · 10h ago

llama.cpp b9811 release with Vulkan compiler workaround

The llama.cpp project has released version b9811, which includes a fix for a compiler bug affecting the conv2d coopmat2 path in Vulkan. This workaround is also applied to the CONV_3D implementation based on suggestions from NVIDIA engineer Jeff Bolz.

github llama.cpp · 12h ago

llama.cpp b9810 release adds cublasSgemmBatched mapping and new binaries

The llama.cpp project has released version b9810, introducing a CUDA mapping for `cublasSgemmBatched` in HIP/MUSA vendor headers. This update is accompanied by a comprehensive set of pre-built binaries for macOS, Linux, Windows, Android, and openEuler platforms.

github MCP (GitHub org) · 12h ago

Model Context Protocol Python SDK v1.28.1 Release

The Model Context Protocol Python SDK has released version 1.28.1, introducing updates to stream handling and transport security.

media Hugging Face Forums · 12h ago

Pendo Hiring Staff and Sr AI Engineers in NYC for Novus

Pendo is hiring onsite Staff and Senior AI Engineers in New York City to work on Novus, a production-grade product agent designed to autonomously read live codebases and detect real user pain.

media Hugging Face Forums · 12h ago

eBPF in Go: Observability for AI-Generated Services

This article presents a tutorial on using eBPF with Go to achieve kernel-level observability, addressing the lack of visibility when debugging production issues in AI-generated services.

github llama.cpp · 17h ago

llama.cpp b9804 release: Mamba2 fixes and new binaries

The llama.cpp b9804 release introduces a fix for the Mamba2 architecture by removing a hardcoded 2x expansion factor and an invalid parameter check, allowing support for any expand value. This change updates the `convert_hf_to_gguf.py` script to make the expand parameter optional with a default of 2.

media Hugging Face Forums · 19h ago

JoeBro: a native macOS AI workspace with zero dependencies

JoeBro is a local-first, native macOS application designed to provide an AI workspace without requiring external dependencies like pip or Docker. It features a bundled Python backend and SQLite storage to ensure all data remains on the user's machine, eliminating telemetry and account requirements.

media Hugging Face Forums · 19h ago

How can I add someone to a Hugging Face dataset/database?

The original topic was deleted by its author, so no information about adding users to a Hugging Face dataset or database is available.

github CrewAI · 21h ago

crewAI 1.15.0 Release Notes

The crewAI 1.15.0 release introduces significant enhancements to Flow definitions, including unified declarative loading, inline crew support, and new composite actions like `each` and single agent actions.

github AutoGPT · 21h ago

AutoGPT Platform Beta v0.6.65 Release Notes

The AutoGPT platform has released version 0.6.65, introducing significant updates to the Copilot system, user interface navigation, and infrastructure reliability.

github llama.cpp · 21h ago

llama.cpp b9803 release with OpenCL profiling fix

The llama.cpp project has released version b9803, which fixes the OpenCL backend so that incomplete profiling batches are flushed at shutdown. This update provides binaries for macOS, Linux, Windows, Android, and openEuler across various hardware backends.

github llama.cpp · 22h ago

llama.cpp b9802 release provides binaries for macOS, Linux, Windows, and Android

The llama.cpp project has published the b9802 release, offering pre-built binaries across multiple operating systems and hardware architectures. This update includes support for CPU, GPU, and specialized AI accelerators on platforms such as macOS, Linux, Windows, Android, and openEuler.

github SGLang · 1d ago

v0.5.14

SGLang has released version 0.5.14.

lab Claude Code Releases · 1d ago

Claude Code v2.1.193 Release Notes

Claude Code version 2.1.193 introduces several enhancements to auto-mode classification, telemetry logging, and background agent management. This update also includes fixes for UI state issues, authentication handling in MCP servers, and various backgrounding bugs.

lab Cohere Blog · 1d ago

Automating fork maintenance with AI agents

This article describes a method for automating the maintenance of software forks using AI coding agents, applying it to Cohere's fork of vLLM. The approach compresses the time required to absorb upstream releases from weeks to days by replacing manual intervention with an automated feedback loop.

github Goose (Block) · 1d ago

v1.39.0

Goose v1.39.0 attempts to fix the Flatpak build.

lab Microsoft Research Blog · 1d ago

Understanding the brain with AI-driven explanations and experiments

Researchers have developed Generative Causal Testing (GCT), a framework that translates uninterpretable LLM-based brain-prediction models into concise, testable verbal hypotheses about cortical function. This method distills model parameters into short phrases describing what specific brain regions respond to, such as "food preparation," and then verifies these explanations through targeted fMRI experiments.