All articles — korshunov.ai


AutoGPT Platform Beta v0.6.65 Release Notes

The AutoGPT platform has released version 0.6.65, introducing significant updates to the Copilot system, user interface navigation, and infrastructure reliability.

github llama.cpp · 5h ago

llama.cpp b9803 release with OpenCL profiling fix

The llama.cpp project has released version b9803, which fixes OpenCL profiling so that incomplete batches are flushed at shutdown. The release provides binaries for macOS, Linux, Windows, Android, and openEuler across various hardware backends.

github llama.cpp · 6h ago

llama.cpp b9802 release provides binaries for macOS, Linux, Windows, and Android

The llama.cpp project has published the b9802 release, offering pre-built binaries across multiple operating systems and hardware architectures. This update includes support for CPU, GPU, and specialized AI accelerators on platforms such as macOS, Linux, Windows, Android, and openEuler.

github SGLang · 7h ago

v0.5.14

SGLang has released version 0.5.14.

lab Claude Code Releases · 9h ago

Claude Code v2.1.193 Release Notes

Claude Code version 2.1.193 introduces several enhancements to auto-mode classification, telemetry logging, and background agent management. This update also includes fixes for UI state issues, authentication handling in MCP servers, and various backgrounding bugs.

lab Cohere Blog · 10h ago

Automating fork maintenance with AI agents

This article describes a method for automating the maintenance of software forks using AI coding agents, applying it to Cohere's fork of vLLM. The approach compresses the time required to absorb upstream releases from weeks to days by replacing manual intervention with an automated feedback loop.

github Goose (Block) · 11h ago

v1.39.0

This release attempts to fix the Flatpak build.

lab Microsoft Research Blog · 11h ago

Understanding the brain with AI-driven explanations and experiments

Researchers have developed Generative Causal Testing (GCT), a framework that translates uninterpretable LLM-based brain-prediction models into concise, testable verbal hypotheses about cortical function. This method distills model parameters into short phrases describing what specific brain regions respond to, such as "food preparation," and then verifies these explanations through targeted fMRI experiments.

lab Google — The Keyword (AI) · 11h ago

Google Finance exits beta with new Android app

Google Finance is officially leaving its beta phase and launching a dedicated application for Android devices.

arxiv arXiv cs.LG · 11h ago

CoorDex: Coordinating Body and Hand Priors for Continuous Dexterous Humanoid Loco-Manipulation

The authors introduce CoorDex, a learning pipeline that enables high-degree-of-freedom dexterous loco-manipulation on moving humanoids by converting body and hand control into coordinated latent residual control. This approach allows the Unitree G1 humanoid to perform complex tasks like non-stop bottle grasping and fridge door opening while in motion.

lab Hugging Face Blog · 11h ago

Run a vLLM Server on HF Jobs in One Command

Hugging Face has introduced a new feature that allows users to deploy vLLM servers directly through the Hugging Face Jobs platform using a single command.

github vLLM · 11h ago

v0.24.0rc2: Fix P/D with DP Supervisor (#46628)

This release candidate fixes Prefill/Decode (P/D) functionality when used with the Data Parallelism (DP) Supervisor in vLLM.

arxiv arXiv cs.LG · 12h ago

AutoDex: An Automated Real-World System for Dexterous Grasping Data Collection

AutoDex is an automated system designed to close the loop of real-world dexterous grasping data collection by handling perception, execution, labeling, and reset without human intervention. It addresses the scalability issues of teleoperation and the lack of physical certification in simulation by generating candidate grasps and verifying them on real hardware.

arxiv arXiv cs.AI · 12h ago

Adaptive Hard-Soft Physics-Informed Neural Networks for Robust Boundary-Constrained PDE Solving

This study proposes a unified hard-soft physics-informed neural network (HSPINN) with adaptive loss weighting to address the slow convergence and inaccurate boundary enforcement of conventional PINNs. The framework enforces Dirichlet and periodic boundary conditions exactly through analytical lifting or masking, while treating PDE residuals and initial conditions as soft constraints balanced by an inverse-share softmax strategy.

arxiv arXiv cs.AI · 12h ago

Rethinking Molecular Graph Backdoors under Chemistry-aware Admission

The article introduces ChemGuard, an operational protocol that formalizes the overlooked admission stage of molecular learning pipelines by requiring sanitizable strings and consistent graph reconstruction. This framework reveals that many existing graph-based backdoors lose efficacy because their poisons are chemically invalid or representation-inconsistent.

arxiv arXiv cs.AI · 12h ago

Measuring & Mitigating Over-Alignment for LLMs in Multilingual Criminal Law Courts

This article addresses the challenge of over-alignment in large language models used within Swiss Federal Supreme Court criminal law contexts, where model guardrails frequently trigger refusals when processing sensitive case details. The authors introduce TF-RefusalBench, a multilingual benchmark derived from public rulings, to measure this phenomenon across French, German, Italian, and English.

arxiv arXiv cs.AI · 12h ago

Energy-Based Transformers as Predictors of Reading Difficulty

This study introduces energy-based transformers as a novel measure for predicting human reading difficulty, establishing a formal link between transformer models and the associative memory literature, including Hopfield networks.

arxiv arXiv cs.AI · 12h ago

Distribution-Aware Diffusion-LLM for Robust Ultra-Long-Term Time Series Forecasting

The authors propose Diffusion-LLM, a framework that integrates a conditional diffusion model into an LLM-based pipeline to address challenges in multimodal time series forecasting. This joint design enables the learning of future data distributions while improving semantic alignment within a shared latent space.

media r/LocalLLaMA · 12h ago

Fast medical RAG API to give your local LLMs access to facts

A developer has released a free, simple Retrieval-Augmented Generation (RAG) API powered by medical Wikipedia articles to provide local large language models with accurate factual information. The service aims for sub-second responses and currently runs on a single ARM VPS using approximately 2 GB of RAM.

media r/LocalLLaMA · 12h ago

DGX Spark OS lifetime?

A user on Reddit asks whether Nvidia has disclosed the support lifecycle for the operating system running on DGX Spark hardware. The inquiry specifically concerns the duration of OS support and whether users will be forced to upgrade to new products in the near future, such as by 2028.