All articles — korshunov.ai

v1.39.0

This release attempts to fix the Flatpak build.

Understanding the brain with AI-driven explanations and experiments

Researchers have developed Generative Causal Testing (GCT), a framework that translates uninterpretable LLM-based brain-prediction models into concise, testable verbal hypotheses about cortical function. This method distills model parameters into short phrases describing what specific brain regions respond to, such as "food preparation," and then verifies these explanations through targeted fMRI experiments.

lab Google — The Keyword (AI) · 4h ago

Google Finance exits beta with new Android app

Google Finance is officially leaving its beta phase and launching a dedicated application for Android devices.

arxiv arXiv cs.LG · 4h ago

CoorDex: Coordinating Body and Hand Priors for Continuous Dexterous Humanoid Loco-Manipulation

The authors introduce CoorDex, a learning pipeline that enables high-degree-of-freedom dexterous loco-manipulation on moving humanoids by converting body and hand control into coordinated latent residual control. This approach allows the Unitree G1 humanoid to perform complex tasks like non-stop bottle grasping and fridge door opening while in motion.

lab Hugging Face Blog · 4h ago

Run a vLLM Server on HF Jobs in One Command

Hugging Face has introduced a new feature that allows users to deploy vLLM servers directly through the Hugging Face Jobs platform using a single command.

github vLLM · 4h ago

v0.24.0rc2: Fix P/D with DP Supervisor (#46628)

This release candidate fixes Prefill/Decode (P/D) functionality when used with the Data Parallelism (DP) Supervisor in vLLM.

arxiv arXiv cs.LG · 5h ago

AutoDex: An Automated Real-World System for Dexterous Grasping Data Collection

AutoDex is an automated system designed to close the loop of real-world dexterous grasping data collection by handling perception, execution, labeling, and reset without human intervention. It addresses the scalability issues of teleoperation and the lack of physical certification in simulation by generating candidate grasps and verifying them on real hardware.

arxiv arXiv cs.AI · 5h ago

Adaptive Hard-Soft Physics-Informed Neural Networks for Robust Boundary-Constrained PDE Solving

This study proposes a unified hard-soft physics-informed neural network (HSPINN) with adaptive loss weighting to address the slow convergence and inaccurate boundary enforcement of conventional PINNs. The framework enforces Dirichlet and periodic boundary conditions exactly through analytical lifting or masking, while treating PDE residuals and initial conditions as soft constraints balanced by an inverse-share softmax strategy.
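To illustrate what exact enforcement through lifting and masking can look like, here is a minimal sketch for a 1D Dirichlet problem on [0, 1]. It is not the paper's implementation: the helper names (`lift`, `mask`, `hard_constrained`) and the toy stand-in for the network are assumptions made for illustration, and the adaptive loss weighting is not shown.

```python
import math

def lift(x, a, b):
    # Lifting function: any smooth function that matches the boundary data,
    # here the straight line with lift(0) = a and lift(1) = b.
    return a * (1.0 - x) + b * x

def mask(x):
    # Masking function: vanishes on the boundary points x = 0 and x = 1.
    return x * (1.0 - x)

def hard_constrained(net, a, b):
    # Wrap an arbitrary model so that u(0) = a and u(1) = b hold by construction,
    # whatever the model outputs in the interior.
    def u(x):
        return lift(x, a, b) + mask(x) * net(x)
    return u

def toy_net(x):
    # Hypothetical stand-in for a trained network.
    return math.sin(3.0 * x) + 2.0

u = hard_constrained(toy_net, a=1.0, b=-0.5)
print(u(0.0), u(0.5), u(1.0))
```

Because the mask is zero on the boundary, no boundary loss term is needed for these conditions; only the PDE residual and any initial conditions remain as soft terms.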

arxiv arXiv cs.AI · 5h ago

Rethinking Molecular Graph Backdoors under Chemistry-aware Admission

The article introduces ChemGuard, an operational protocol that formalizes the overlooked admission stage of molecular learning pipelines by requiring sanitizable strings and consistent graph reconstruction. This framework reveals that many existing graph-based backdoors lose efficacy because their poisons are chemically invalid or representation-inconsistent.

arxiv arXiv cs.AI · 5h ago

Measuring & Mitigating Over-Alignment for LLMs in Multilingual Criminal Law Courts

This article addresses the challenge of over-alignment in large language models used within Swiss Federal Supreme Court criminal law contexts, where model guardrails frequently trigger refusals when processing sensitive case details. The authors introduce TF-RefusalBench, a multilingual benchmark derived from public rulings, to measure this phenomenon across French, German, Italian, and English.

arxiv arXiv cs.AI · 5h ago

Energy-Based Transformers as Predictors of Reading Difficulty

This study introduces energy-based transformers as a novel measure for predicting human reading difficulty, establishing a formal link between transformer models and the associative memory literature, such as Hopfield networks.

arxiv arXiv cs.AI · 5h ago

Distribution-Aware Diffusion-LLM for Robust Ultra-Long-Term Time Series Forecasting

The authors propose Diffusion-LLM, a framework that integrates a conditional diffusion model into an LLM-based pipeline to address challenges in multimodal time series forecasting. This joint design enables the learning of future data distributions while improving semantic alignment within a shared latent space.

media r/LocalLLaMA · 5h ago

Fast medical RAG API to give your local LLMs access to facts

A developer has released a free, simple Retrieval-Augmented Generation (RAG) API powered by medical Wikipedia articles to provide local large language models with accurate factual information. The service aims for sub-second responses and currently runs on a single ARM VPS using approximately 2 GB of RAM.

media r/LocalLLaMA · 5h ago

DGX Spark OS lifetime?

A user on Reddit asks whether Nvidia has disclosed the support lifecycle for the operating system running on DGX Spark hardware. The inquiry specifically concerns the duration of OS support and whether users will be forced to upgrade to new products in the near future, such as by 2028.

arxiv arXiv cs.AI · 5h ago

Automated Semantic Fault Localization in SysML v2 Using Knowledge-Graph Augmented LLMs

This paper presents a human-in-the-loop framework for automatically identifying and repairing semantic errors in SysML v2 models that compilers cannot detect. The approach combines fine-tuned Small Language Models with a domain knowledge graph to ground repair suggestions in valid engineering constraints.

arxiv arXiv cs.AI · 5h ago

Litmus: Zero-Label, Code-Driven Metric Specification for Evaluating AI Systems

Litmus is a zero-label system that designs evaluation and monitoring metrics for AI pipelines by eliciting evaluation intent from source code and targeted interrogation. Instead of assuming the evaluation target is known, it identifies what must be measured and why to construct a justified metric portfolio.

arxiv arXiv cs.AI · 5h ago

ReasoningLens: Hierarchical Visualization and Diagnostic Auditing for Large Reasoning Models

The emergence of Large Reasoning Models has introduced exceptionally long Chain-of-Thought traces, creating a transparency burden where critical logic is often buried under massive procedural text. To address this, the authors present ReasoningLens, an open-source framework designed for the hierarchical visualization and diagnostic auditing of complex reasoning chains.

arxiv arXiv cs.AI · 6h ago

HyperQuant: A Rate-Distortion-Optimal Quantization Pipeline for Large Language and Diffusion Models

HyperQuant is a unified post-training quantization pipeline designed for the weights and KV cache of large language and diffusion transformers, combining Hadamard transforms with optimal lattice quantization. The method outperforms recent schemes like HIGGS, TurboQuant, and OCTOPUS across various bit rates while maintaining near-lossless quality.
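As a rough illustration of why a Hadamard transform is often applied before quantization, the sketch below rotates a weight vector with an orthonormal fast Walsh-Hadamard transform, which spreads a single outlier across all coordinates. It is not HyperQuant's pipeline: the uniform scalar quantizer is a simple stand-in for the lattice quantizer the summary mentions, and the function names and sample weights are assumptions.

```python
import math

def fwht(values):
    # Orthonormal fast Walsh-Hadamard transform (its own inverse).
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    out = list(values)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in out]

def quantize(values, bits):
    # Uniform symmetric scalar quantizer; a stand-in, NOT lattice quantization.
    levels = 2 ** (bits - 1) - 1
    peak = max(abs(v) for v in values) or 1.0
    step = peak / levels
    return [round(v / step) * step for v in values]

# Hypothetical weights with one large outlier.
weights = [0.1, -0.2, 0.05, 8.0, -0.15, 0.3, -0.1, 0.2]

rotated = fwht(weights)                    # outlier energy is spread out
restored = fwht(quantize(rotated, 4))      # quantize, then rotate back

print(max(abs(v) for v in weights), max(abs(v) for v in rotated))
```

After rotation the largest magnitude is much smaller than the original outlier, so a fixed number of quantization levels covers a narrower range.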

arxiv arXiv cs.AI · 6h ago

UnBias-Plus: Detect, Explain, and Rewrite Bias

UnBias-Plus is an open-source toolkit designed to address persistent bias in natural language by unifying detection, explanation, and neutral rewriting capabilities.

arxiv arXiv cs.AI · 6h ago

Detecting Malicious Agent Skills in the Wild using Attention

The authors present Locate-and-Judge, a two-stage detector designed to identify malicious skills in LLM agent marketplaces where traditional prompt-injection defenses fail.