All articles — korshunov.ai

Accelerating Disaggregated RL for Visual Generative LLMs with Diffusion-Based Parallelism

Researchers introduce DigenRL, a disaggregated reinforcement learning framework designed to address the inefficiencies of colocated execution in diffusion-based generative large language models. The system supports flexible resource allocation and heterogeneous GPUs while utilizing novel parallelism techniques to reduce execution bubbles.

arxiv arXiv cs.AI · 5h ago

When Helpfulness Overrides Causal Caution: Context-Dependent Suppression and Recovery in LLMs

A study reveals that large language models systematically suppress 'Causal Caution'—the tendency to refrain from causal judgment without sufficient evidence—when shifting from academic to practical advisory contexts. This suppression occurs despite the models retaining the underlying capability, as evidenced by the ability to restore cautious reasoning through specific prompts.

arxiv arXiv cs.AI · 5h ago

Structural Kolmogorov-Arnold Convolutions: Learnable Function on the Values or the Filter Shape

The article introduces Structural Kolmogorov-Arnold Networks (KANs) that place learnable functions in the convolution structure rather than on individual kernel entries, organizing the design by whether the function acts on pixel values or filter shape. It presents three realizations: SV-KAN with a shared value function, AG-KAN using a content-adaptive Gaussian gate, and RF-KAN which builds filters from oriented ridge profiles in a Morlet wavelet basis.

arxiv arXiv cs.AI · 5h ago

On the Stability of Prompt Ranking in Large Language Model Evaluation

This paper systematically studies the stability of prompt rankings under common variability sources like random seeds and limited evaluation subsets across three open-weight LLMs and two benchmark tasks.

arxiv arXiv cs.AI · 5h ago

Cycle-Consistent Neural Explanation of Formal Verification Certificates

Researchers propose a cycle-consistent neural architecture that generates faithful natural language explanations for formal verification certificates, addressing the opacity of these machine-checkable proofs for non-specialists. The system achieves 90.0% cycle-verified soundness on test data from a financial compliance domain, significantly outperforming multi-LLM baselines in both accuracy and inference speed.

media r/LocalLLaMA · 5h ago

Ornith 35B works reasonably well with Qwen3.6 35B DFlash speculative model

A user reports achieving a 30-40% increase in token generation speed by running the Ornith-1.0-35B model with Qwen3.6-35B-A3B-DFlash as the speculative (draft) model in llama-server.

arxiv arXiv cs.AI · 6h ago

PHANTOM: A Large-Scale Dataset of Multimodal Adversarial Attacks for Vision-Language Models

Researchers have introduced PHANTOM, a large-scale, open-source dataset containing 47,524 pre-generated adversarial attacks designed to evaluate the safety and robustness of vision-language models (VLMs). This resource consolidates and extends prior benchmarks by covering 10 high-level categories and 55 subcategories of harmful intents, aiming to lower the computational barriers for adversarial research.

arxiv arXiv cs.AI · 6h ago

Female-RHINO: Real-Time Scanner-Integrated Framework for Automated Uterine MRI Analysis

This article introduces Female-RHINO, a real-time AI-assisted framework that integrates with MRI scanners to perform automated quantitative uterine analysis and structured reporting during image acquisition. The system combines deep learning models for segmentation and landmark detection to derive biomarkers from sagittal T2-weighted pelvic MRI without manual interaction.

arxiv arXiv cs.AI · 6h ago

Age of LLM: A Strategic 1v1 Benchmark for Reasoning, Diplomacy and Reliability

The authors introduce Age of LLM, a turn-based 1v1 benchmark where two large language models compete on a 13x7 grid to destroy an enemy base under conditions of fog of war and full diplomacy. This private engine mitigates data contamination by using fresh random map seeds and opponents for each match.

arxiv arXiv cs.AI · 6h ago

ATRIA: Adaptive Traceable ECG Reporting with Iterative Agents

The article introduces ATRIA, a multi-agent system for ECG reporting that addresses the limitations of existing end-to-end models and single-pass agents by mirroring the clinician's iterative workflow.

arxiv arXiv cs.AI · 6h ago

Average Rankings Mask Per-Subject Optimality: A Friedman-Nemenyi Benchmark of EEG Motor-Imagery BCI Decoders

This study evaluates whether any single decoding pipeline dominates across subjects in motor imagery brain-computer interfaces by testing 1,056 configurations on three public datasets using rigorous statistical benchmarks.

arxiv arXiv cs.AI · 6h ago

Entity Resolution via Batched Oracle Queries

This article addresses the problem of resolving entities in large datasets using an oracle that clusters records in limited batches, aiming for a pay-as-you-go approach to control costs while maximizing recall.

arxiv arXiv cs.AI · 6h ago

Agentic AI for Bilevel Long-Term Optimization of Policy-Driven Physical Layer Systems

This paper introduces Agentic-LTPO, a nested bilevel optimization framework designed to address the limitations of fixed-objective methods in physical layer systems facing dynamic operator policies and real-time constraints. The framework utilizes agentic AI to generate upper-level configurations that translate evolving policies and historical experiences into structured lower-level problems for immediate decision-making.

media r/LocalLLaMA · 6h ago

Second Circuit: An NGO for digital freedom of thought

Chris Tidesson announces the founding of Second Circuit, an NGO dedicated to supporting self-determined AI use and encouraging open-source software adoption among governments, companies, and private individuals. The organization was originally established in response to the ChatGPT 4o situation and has operated a Discord community for over six months.

media r/LocalLLaMA · 6h ago

on Dario’s statement

This Reddit post from the r/LocalLLaMA community discusses a statement made by Dario Amodei. The content is limited to the title and metadata, with no detailed text or analysis provided in the source.

arxiv arXiv cs.AI · 7h ago

Can Aggregate Invariants Accelerate Continuous Subgraph Matching? Limits, Laws, and a Dynamic Spectral Index

This study evaluates whether spectral filtering can accelerate continuous subgraph matching (CSM) on dynamic graphs, finding that while lazy maintenance is ineffective, selective exact maintenance offers significant performance gains.

arxiv arXiv cs.AI · 7h ago

Detecting AI Coding Agents in Open Source: A Validated Multi-Method Census of 180 Million Repositories

A multi-layered detection framework analyzing 180 million Git repositories reveals that single-signal methods significantly underestimate the prevalence of generative AI coding agents, missing up to 97% of activity. The study identifies over 320,000 commits per month from agents like Claude Code, which dominates silent adoption through configuration files rather than bot accounts.

arxiv arXiv cs.AI · 7h ago

Transformation Behavior of Images in Latent Space

This paper investigates how classical image transformations affect embeddings in latent space using encoder networks from Lunit Inc., Bioptimus, and Meta Research Team.

arxiv arXiv cs.AI · 7h ago

MedPCFM: Improving Medical Point Cloud Completion by Integrating Point Transformers and Flow Matching

This article introduces MedPCFM, a flow matching approach for medical point cloud completion that integrates Point Transformer v3 (PTv3), targeting a domain where generative modeling remains insufficiently studied. The method is evaluated on the SkullFix, SkullBreak, and Mandibular Defect datasets against strong deterministic and diffusion baselines.

arxiv arXiv cs.AI · 7h ago

ReM-MoA: Reasoning Memory Sustains Mixture-of-Agents Scaling

The authors propose ReM-MoA, a memory-augmented Mixture-of-Agents framework designed to sustain performance gains as model depth increases, addressing the degradation and saturation issues found in existing variants. The system utilizes a Ranked Reasoning Memory and a Curated Diversified Memory Routing scheme to preserve exploration diversity while propagating high-quality reasoning traces across layers.