All articles — korshunov.ai

All articles Page 1 / 89

SDXL Running Locally in Browser on WebGPU, Open-Source

A browser extension enables local image generation using SDXL models via WebGPU, running on the user's GPU without external setups. The tool supports two models: SDXL-Lighting fp16 (7 GB) and a 4-bit version (3.6 GB), with requirements including at least 8 GB VRAM for the full model and a browser with WebGPU support (Chrome/Edge 122+ or latest Firefox).

arxiv arXiv cs.LG · 9h ago

Deep Learning Pipeline for Sign Language Recognition and Translation to Indian Vernaculars

A two-stage deep learning model classifies Indian sign language video clips into English words using a fine-tuned VideoMAE transformer, achieving 99% training and 78% validation accuracy on a 13-class dataset. The predicted English labels are translated into Hindi, Telugu, and Bengali using Meta AI's NLLB-200 multilingual model, with a Streamlit demo enabling user-uploaded video inference and cross-lingual output.

arxiv arXiv cs.LG · 9h ago

Prompt-Side Preprocessing Enhances Edge AI Accuracy

A structured prompt framework improves local LLM accuracy in environmental monitoring by transforming raw sensor data into enriched textual representations. Evaluations on indoor and outdoor datasets show local model accuracy increases from 50.9% to 81.7% indoors and from 63.7% to 89.3% outdoors with enriched prompts, while maintaining low latency near 0.22 seconds in no-chain-of-thought mode.

arxiv arXiv cs.LG · 9h ago

The Scissors Effect: Resize Diversity Hurts Robust Surrogate Transfer

Input diversity, a common practice in transfer attacks, improves success on standard surrogates but reduces it on robust ones. This regime-dependent effect, called the Scissors Effect, is driven by gradient geometry, with resize operations degrading alignment in robust models. A training-free rule (CG-DI) adjusts diversity based on local gradient consistency to preserve attack success across surrogate types.

arxiv arXiv cs.LG · 9h ago

HERTA: Automated Testing for FHE Framework Vulnerabilities

HERTA is the first automated testing tool designed for fully homomorphic encryption frameworks. It uses metamorphic testing with novel relations derived from FHE semantics to detect deep-seated logic bugs that can silently corrupt encrypted computations. Evaluation on three industry frameworks revealed 21 previously unknown bugs, several of which have been confirmed and fixed by developers, with significant implications for security and service integrity.

arxiv arXiv cs.LG · 9h ago

Robust Diffusion Models via Divergence-Induced Weighted Denoising

A new training method replaces MSE loss in diffusion models with an f-divergence-based transformation, creating a robust surrogate that improves performance under data contamination. The approach uses local divergence constructions under DDPM's Gaussian reverse-kernel, reducing the training objective to a one-dimensional function of denoising error, with bounded-influence divergences suppressing large errors and enhancing stability.

arxiv arXiv cs.LG · 9h ago

Generative Robust Optimisation Framework

Generative Robust Optimisation (GRO) introduces a deep generative model to define uncertainty sets, capturing nonlinear correlations, asymmetry, and multimodality. A five-point evaluation framework assesses neural network-based uncertainty sets across reconstruction fidelity, distribution matching, latent regularity, robust relevance, and computational tractability, with experiments validating GRO's effectiveness in production planning and facility location problems.

arxiv arXiv cs.LG · 9h ago

Introducing Quantum Measurement Temperature to Stabilize Hybrid QNN Training

A learnable scaling parameter called Quantum Measurement Temperature (QMT) is introduced to rescale quantum measurement outputs in hybrid quantum neural networks. This approach mitigates measurement-induced logit contraction, enhancing gradient magnitude and stability during training without altering the quantum circuit or measurement operators. Experiments show improved logit separation, gradient strength, and classification accuracy in protein and image classification tasks.

arxiv arXiv cs.LG · 9h ago

Deep material network for homogenization of piezoelectric composites

A piezoelectric deep material network (PDMN) is proposed to efficiently homogenize two-phase piezoelectric composites. The framework embeds electromechanical homogenization relations into its architecture, enabling physics-informed, semi-analytical predictions with over three orders of magnitude lower computational cost than direct numerical simulation, validated on PVDF-LiNbO3 and viscoelastic-piezoelectric composites under nonlinear loading.

arxiv arXiv cs.LG · 10h ago

Concept-Constrained Prompt Learning for Few-Shot CLIP Adaptation

CCPL introduces a lightweight framework that anchors class prompts to frozen concept prototypes, improving few-shot CLIP adaptation. It achieves better base-to-new performance on DTD and EuroSAT compared to CoOp, with consistent gains from text-space concept regularization, though results vary by dataset and protocol.

arxiv arXiv cs.LG · 10h ago

Stationary Robust Mean-Field Games under Model Mismatches

This paper introduces a stationary mean-field game framework that directly incorporates distributional model uncertainty into population-coupled dynamics. It establishes a robust dynamic programming principle, proves existence of a stationary robust equilibrium, and presents the first algorithm with convergence guarantees. The mean-field solution approximates finite-population equilibria and provides explicit non-asymptotic error bounds under model uncertainty.

arxiv arXiv cs.LG · 10h ago

Training-Free Task Classification for Multi-Task Model Merging

SiM enables dynamic routing in multi-task model merging without additional training or task ID access. It uses SVD-based manifold approximations and projects test inputs onto precomputed task manifolds to route inputs to relevant experts, improving performance and reducing the gap to individual expert levels.

arxiv arXiv cs.LG · 10h ago

Importance-Weighted On-Policy Distillation Addresses Position Bias

On-Policy Distillation (OPD) suffers from position bias where later tokens provide poor supervision. We introduce Importance-Weighted On-Policy Distillation (IW-OPD), which assigns weights based on distribution discrepancy, prioritizing early tokens. IW-OPD converges faster and achieves up to 6.9 point performance gains on AIME-2025.

arxiv arXiv cs.LG · 10h ago

Scalable Bayesian Models for Stellar Flare Detection

A generative surrogate framework using a Variational Autoencoder approximates Gaussian Process priors, bypassing costly covariance operations. The VAE+Hidden Markov Model architecture enables fast, scalable stellar flare detection in large astronomical time series, matching exact models in structural fidelity while reducing computational time significantly.

arxiv arXiv cs.LG · 10h ago

Small Language Models Outperform Frontier LLMs in Relation Extraction

A fine-tuned 0.5B-parameter Qwen2.5 model achieves 0.83 micro-F1 in general-domain relation extraction, surpassing zero-shot GPT-5.4 and Claude Sonnet 4.6. On literary benchmarks, it reaches 0.92 on the Biographical dataset, outperforming GPT-5.4 and exceeding frontier models in accuracy, demonstrating that task-adapted small models can deliver high performance with minimal hardware and privacy overhead.

media r/LocalLLaMA · 10h ago

I reverse engineered Windows Copilot into a free OpenAI-compatible API

A user has created a local API that replicates OpenAI-compatible GPT-4 functionality using Microsoft's free Copilot service. The tool logs into a Microsoft account once, runs locally on a Windows device, and exposes a server at http://localhost:8000/v1 that supports streaming and multi-turn conversations without requiring an API key or billing. It is designed for personal and educational use, and available via GitHub at https://github.com/sums001/Windows-Copilot-API.

blog Simon Willison · 10h ago

Tom MacWright on Accidental Anonymity in Job Applications

Tom MacWright observes that job applications increasingly feature LLM-generated content, including portfolios and GitHub projects with fabricated commit messages. He notes that such applications reveal little about the applicants, as they lack personal authenticity and genuine self-expression.

arxiv arXiv cs.AI · 10h ago

Geometry-Aware Online Scheduling for LLM Serving

A new scheduling algorithm, Smallest Volume First (SVF), reduces LLM inference latency by optimizing key-value cache management. Theoretical analysis shows a worst-case competitive ratio reduced from 48 to 5, with 1-bit SVF achieving strong performance using minimal information. Evaluations on Llama-3.1 models confirm improvements in both average and tail latency, with the approach integrated into vLLM.

arxiv arXiv cs.AI · 10h ago

BabelJudge: Measuring LLM-as-a-Judge Reliability Across Languages and Agent Trajectories

BabelJudge introduces an open-source framework to measure four key bias modes in LLM judges across languages and agent trajectories. It reveals a significant reliability drop from Hindi to Swahili—0.714 to 0's 0.550—highlighting cross-lingual degradation invisible to raw accuracy. The framework enables bias-aware evaluation without human labels, using controlled perturbations to create known gold labels, and extends to agentic workflows with new metrics on tool accuracy and hallucination detection.

arxiv arXiv cs.AI · 10h ago

Hypothesis-Driven Skill Optimization for LLM Agents

HDSO enables safe, auditable skill updates for LLM agents without training, using falsifiable hypotheses and validation. On ALFWorld, it improves Qwen3-8B by +6.9 Avg. SR points and maintains a +7.1-point gain under noisy feedback, with validated skills transferable across runs and models when diagnostic alignment is achieved.