All articles — korshunov.ai

ExpRL: Exploratory RL for LLM Mid-Training

ExpRL introduces a novel mid-training approach for LLMs using human-written question-answer data as reward scaffolds. Instead of imitating reference solutions, it constructs problem-specific grading rubrics to reward intermediate reasoning steps, enabling better initialization for sparse-reward RL and outperforming SFT, sparse-reward GRPO, and self-distillation on math reasoning tasks.

arxiv arXiv cs.LG · 9d ago

HAMON: Passive Optical Forecasting Core

HAMON uses passive optical diffraction to generate forecasts, outperforming digital baselines on ETTm2 at all horizons and ETTh2 at all but the longest horizon. It achieves up to 14% lower MSE and operates without trainable digital mixing, relying instead on physical optical propagation.

arxiv arXiv cs.LG · 9d ago

KVEraser: Efficient Localized Context Erasing in LLMs

KVEraser erases a localized span of context in large language models by replacing only that span's KV cache states with learned steering states. It reaches near-full-recomputation performance on in-domain tasks while adding only 24% latency, versus a 17.6x increase for full recomputation, and gives up to a 3-4x speedup on long-document QA tasks.

arxiv arXiv cs.LG · 9d ago

DP-FL Backdoor Attacks: RING Exploits Privacy for Malicious Signals

A new attack, RING, exploits differential privacy in federated learning to conceal backdoor signals while maximizing their impact. It achieves a 90.3% attack success rate against state-of-the-art defenses, up to 26.08x higher than baseline methods, and exposes a critical security gap in DP-FL: the privacy mechanism inherently masks malicious updates.

arxiv arXiv cs.LG · 9d ago

Phase in Neural Representations: An Internal Oppenheim-Lim Test

In image classifiers such as PRISM2D, GFNet, and ViT-B/16, the phase of hidden-layer representations, not their magnitude, drives predictions. ResNet-50 carries a latent sign code in its late blocks, suggesting that phase/sign identity exists across architectures but is expressed differently depending on activation and readout mechanisms.

arxiv arXiv cs.LG · 9d ago

HABC Improves RL Fine-Tuning of VLAs with Sparse Outcomes

Hierarchical Advantage-Weighted Behavior Cloning (HABC) improves online RL fine-tuning of vision-language-action models (VLAs) by using separate critic heads for viability and efficiency. It combines their outputs through a state-adaptive gate and applies per-transition weights, while intervention-aware credit assignment prevents supervision leakage. In real-robot experiments, HABC raises success rates to 92%, 88%, and 38% on three bimanual tasks, compared with SFT baselines of 36%, 44%, and 12%.

arxiv arXiv cs.LG · 9d ago

Geometric Action Model for Robot Policy Learning

The Geometric Action Model (GAM) enables robot policies to reason about 3D physical interactions by repurposing a pretrained geometric foundation model. GAM splits the GFM to serve as both an observation encoder and a causal future predictor, then routes predicted future geometry and actions through the same backbone, achieving accurate, robust, and efficient manipulation performance in simulation and real-robot benchmarks.

arxiv arXiv cs.LG · 9d ago

Exact Posterior Score Estimation for Linear Inverse Problems

The paper derives the exact posterior score in closed form for linear Gaussian inverse problems, enabling efficient posterior sampling via denoising. It introduces Exact Posterior Score (EPS), a training objective that preserves pretraining structure and achieves superior performance on fidelity, perceptual, and distributional metrics with fewer denoiser evaluations than gradient-based methods.

media r/LocalLLaMA · 9d ago

Qwable-v1 Released as Distillation of Claude Fable-5

Qwable-v1, an open-weight model distilled from Anthropic's Fable-5, is now publicly available on Hugging Face. It captures 4,659 cleartext agentic-coding traces from Fable-5's public corpus and emits properly formatted <tool_use> XML calls to Claude-flavored tools, reflecting the original tool surface in its weights.

media r/LocalLLaMA · 9d ago

vLLM releases new streaming parser for Qwen3+ in nightly

vLLM has introduced a new streaming parser for Qwen3+ available in its nightly build, addressing issues like mid-turn stopping and failed streaming tool calls due to chunk boundaries. The update reportedly resolves these problems in limited testing, improving reliability for agentic workflows.

media r/LocalLLaMA · 9d ago

HalBench Tests 29 Open Source Models on Sycophancy and Hallucination

HalBench evaluates 29 open-source LLMs on a custom benchmark for sycophancy and hallucination. Qwen 3.6 and Gemma 4 outperform larger models, with Qwen 3.6 achieving a 36.6% pushback rate, higher than GPT-5.4 and Gemini 3.1 Pro. Model size does not correlate with honest responses, which suggests that architecture and training data matter more than parameter count.

blog Simon Willison · 9d ago

Cloudflare CAPTCHA triggered only for searches with ampersand

Simon Willison configured Cloudflare's CAPTCHA to activate only for search queries containing at least one ampersand. The rule uses a custom filter: (http.request.uri.path wildcard r"/search/*" and http.request.uri.query contains "&"). This allows simple searches like /search/?q=lemur to pass without CAPTCHA.
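
The logic of that rule can be approximated offline to check which URLs it would catch. The following is a minimal sketch in Python, not Cloudflare's implementation: the function name would_challenge and the example.com URLs are hypothetical, and fnmatchcase is only a rough stand-in for Cloudflare's wildcard operator (whose case handling may differ).

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def would_challenge(url: str) -> bool:
    """Approximate the rule: path matches /search/* AND query contains '&'.

    Hypothetical helper for illustration only; it mirrors the two
    conditions in the rule expression, not Cloudflare's rule engine.
    """
    parts = urlsplit(url)
    # Stand-in for: http.request.uri.path wildcard r"/search/*"
    path_matches = fnmatchcase(parts.path, "/search/*")
    # Stand-in for: http.request.uri.query contains "&"
    # (urlsplit's query excludes the leading "?")
    query_has_ampersand = "&" in parts.query
    return path_matches and query_has_ampersand


if __name__ == "__main__":
    for url in (
        "https://example.com/search/?q=lemur",
        "https://example.com/search/?q=lemur&page=2",
        "https://example.com/about/?a=1&b=2",
    ):
        print(url, would_challenge(url))
```

Under these assumptions, a single-parameter search such as /search/?q=lemur does not match, while any search URL carrying a second parameter does.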

media r/LocalLLaMA · 9d ago

Gemma3 270M Model Released on Reddit

A user posted an image of the Gemma3 270M model on the r/LocalLLaMA subreddit. The post includes a link to the image and comments section, indicating community discussion around the model.

blog Simon Willison · 9d ago

datasette-agent 0.3a0 releases with user approval for write SQL operations

datasette-agent 0.3a0 introduces the execute_write_sql tool that prompts users before writing to databases, ensuring permission checks are respected. The update also enhances datasette agent chat with user approval support, new command options like --unsafe for auto-approval, and plain text tool outputs for CLI display.