Research paper — korshunov.ai

Sumi: Open Uniform Diffusion Language Model from Scratch

Sumi is a 7B-parameter uniform diffusion language model pretrained from scratch on 1.5T tokens. It competes with autoregressive models on knowledge, reasoning, and coding tasks but underperforms on commonsense benchmarks, likely due to its education-heavy data mixture. The model weights, checkpoints, and full training recipe are publicly released.

arxiv arXiv cs.LG · 8d ago

Moat: Lifecycle-Aware Dynamic Analysis for Secure ML Model Execution

Moat is a dynamic analysis approach that secures ML model execution by monitoring host system interactions during well-defined model lifecycle phases. Re-Moat, its reference implementation, detects all evaluated attack classes with a near-zero false-positive rate across 77,974 real-world models and multiple frameworks, outperforming existing static model-scanning solutions.

arxiv arXiv cs.LG · 8d ago

Geometric and Stochastic Analysis of Discontinuities in Sparse Mixture-of-Experts

This paper analyzes discontinuities in Sparse Mixture-of-Experts models, classifying them by order and showing that lower-order discontinuities dominate in volume. It proves that random input paths almost surely first hit an order-1 discontinuity with finite-time probability bounds and derives occupation-time bounds for each order. A simple smoothing mechanism is proposed that enhances model continuity and performance with minimal computational overhead.

arxiv arXiv cs.LG · 8d ago

Positive-Unlabeled Learning for LLM Evaluation Auditing

A new framework uses positive-unlabeled learning and Partial Optimal Transport to audit LLM evaluation biases. It aligns human-verified positive outputs with unlabeled model responses in embedding space, identifying consistent human preferences and correcting verbosity bias without retraining. Experiments show improved human alignment, robustness to presentation biases, and interpretable confidence estimates.

arxiv arXiv cs.LG · 8d ago

Context-Aware Follow-Up Optimization for Type 2 Diabetes

A study uses a Contextual Markov Decision Process to optimize follow-up intervals for Type 2 Diabetes patients based on EHR data from 22,154 patients. The model identifies two clinical contexts—low and high risk—and recommends adaptive intervals: 1 month for unmeasured lab values, up to 3 months for elevated values or hospitalizations, and 6–12 months for stable control, with shorter intervals for high-risk patients. The CMDP policies reduced expected cumulative costs by 34.8% in high-comorbidity and 6.4% in low-comorbidity contexts compared to a fixed interval policy.

arxiv arXiv cs.LG · 8d ago

XAI reveals key drivers in European electricity markets

A study using SHAP and SSHAP techniques analyzes electricity price drivers across 39 European bidding zones. It finds solar energy has a disproportionate impact on prices, gas remains a dominant factor, and interconnections highlight regional interdependence. The research also builds a synthetic EU-wide market to examine a fully integrated, single-price scenario.

arxiv arXiv cs.LG · 8d ago
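
SHAP attributions, as used in the electricity-market entry above, are built on Shapley values. The sketch below computes exact Shapley values for a toy price model by averaging marginal contributions over all feature orderings. The feature names, the `toy_price` function, and its numbers are illustrative assumptions, not data or code from the study; practical SHAP tooling approximates this quantity rather than enumerating orderings.

```python
from itertools import permutations

def shapley_values(features, value):
    """Exact Shapley values: average marginal contribution over all orderings.

    `value` maps a frozenset of present features to a model output.
    Cost grows factorially, so this is only usable for a handful of features.
    """
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        coalition = frozenset()
        for f in order:
            with_f = coalition | {f}
            totals[f] += value(with_f) - value(coalition)
            coalition = with_f
    return {f: t / len(orderings) for f, t in totals.items()}

# Hypothetical additive price effects (EUR/MWh), made up for illustration.
EFFECTS = {"gas": 30.0, "solar": -10.0, "wind": -5.0}

def toy_price(present):
    # Baseline price plus the effect of every feature that is "switched on".
    return 50.0 + sum(EFFECTS[f] for f in present)

phi = shapley_values(list(EFFECTS), toy_price)
```

For a purely additive model like this one, each Shapley value equals the feature's own effect, and the values sum to the difference between the full-model output and the baseline.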

Giskard: Confidential and Byzantine-Robust Aggregation Protocol

Giskard enables confidential and Byzantine-robust decentralized machine learning aggregation by organizing parties into tree-based committees of size O(log n). It uses BGW-style MPC and a committee-adapted binary search to compute an approximate median, reducing per-party communication complexity asymptotically while maintaining model utility under up to n/4 Byzantine parties.

arxiv arXiv cs.LG · 8d ago
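
The binary-search median mentioned in the Giskard entry above can be illustrated in the clear. This is a minimal, non-confidential sketch: it assumes scalar inputs with known bounds `lo` and `hi`, and the counting step that the protocol would run under MPC inside committees is replaced by a plain local count. The function name and the `tol` parameter are hypothetical.

```python
def approximate_median(values, lo, hi, tol=1e-3):
    """Approximate the (lower) median by bisecting the value range.

    Each step only needs one aggregate: how many inputs are <= mid.
    That count is what a secure protocol would compute without
    revealing individual inputs; here it is computed directly.
    """
    if not values:
        raise ValueError("values must be non-empty")
    target = (len(values) + 1) // 2  # rank of the lower median
    while hi - lo > tol:
        mid = (lo + hi) / 2
        count = sum(1 for v in values if v <= mid)
        if count >= target:
            hi = mid  # median is at or below mid
        else:
            lo = mid  # median is above mid
    return (lo + hi) / 2

# Three honest inputs near 1.0 and one extreme outlier.
estimate = approximate_median([0.9, 1.0, 1.1, 100.0], lo=0.0, hi=100.0, tol=1e-6)
```

Because the result depends only on rank counts, a minority of extreme inputs shifts the estimate far less than it would shift a mean, which is the usual motivation for median-based robust aggregation.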

OrthoReg: Orthogonal Regularization for Hybrid Symbolic-Neural Dynamical Systems

OrthoReg introduces orthogonal regularization to prevent neural components from relearning symbolic structures in hybrid dynamical systems. By directly penalizing overlap between symbolic and neural parts, it enables a complementary decomposition where symbolic models capture expressible physics and neural models handle remaining dynamics. On benchmarks with partial library mismatch, OrthoReg improves symbolic recovery and out-of-distribution performance.

arxiv arXiv cs.LG · 8d ago
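
One plausible way to penalize overlap between the symbolic and neural parts, as described in the OrthoReg entry above, is a squared normalized inner product between their outputs on a batch. This is an assumption made for illustration; the summary does not specify the paper's actual regularizer, and the function name and `eps` parameter are hypothetical.

```python
def overlap_penalty(symbolic_out, neural_out, eps=1e-12):
    """Squared cosine similarity between two output vectors.

    Returns a value near 0 when the outputs are orthogonal and near 1
    when one is a scalar multiple of the other. `eps` guards against
    division by zero when either vector is all zeros.
    """
    if len(symbolic_out) != len(neural_out):
        raise ValueError("outputs must have the same length")
    dot = sum(s * n for s, n in zip(symbolic_out, neural_out))
    sym_sq = sum(s * s for s in symbolic_out)
    neu_sq = sum(n * n for n in neural_out)
    return dot * dot / (sym_sq * neu_sq + eps)

orthogonal = overlap_penalty([1.0, 0.0], [0.0, 1.0])
redundant = overlap_penalty([1.0, 2.0], [2.0, 4.0])
```

Added to a training loss with some weight, a term of this shape would discourage the neural component from reproducing what the symbolic component already expresses.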

Local Population-Risk Certificates for Model Updates

The paper introduces local certificates that provide two-sided confidence bands for population-risk increments around a current model. The upper endpoint of this band defines a risk-controlled update rule: an update is accepted only if the certified upper endpoint is nonpositive; otherwise, the current model is retained.

arxiv arXiv cs.CL · 8d ago
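
The acceptance rule in the population-risk certificates entry above reduces to a single comparison. The sketch below assumes the certificate is already available as a `(lower, upper)` band on the risk increment of the candidate relative to the current model; how that band is computed is not covered by the summary, and the type and function names are hypothetical.

```python
from typing import NamedTuple, TypeVar

Model = TypeVar("Model")

class Certificate(NamedTuple):
    """Two-sided confidence band on risk(candidate) - risk(current)."""
    lower: float
    upper: float

def risk_controlled_update(current: Model, candidate: Model, cert: Certificate) -> Model:
    """Accept the candidate only if the certified upper endpoint is nonpositive."""
    if cert.lower > cert.upper:
        raise ValueError("certificate band is inverted")
    return candidate if cert.upper <= 0.0 else current

accepted = risk_controlled_update("model_v1", "model_v2", Certificate(-0.30, -0.05))
rejected = risk_controlled_update("model_v1", "model_v2", Certificate(-0.30, 0.02))
```

In the second call the band straddles zero, so a risk increase cannot be ruled out and the current model is kept.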

Morpheus: Neural Tokenizer and Embedder for Turkish

Morpheus is a morphology-aware neural tokenizer and word embedder for Turkish that preserves original text through lossless encoding and decoding. It achieves the lowest bits-per-character (1.425), improves morphological alignment (MorphScore macro-F1 0.61), and uses 19% less GPU memory than 64K-vocabulary subword tokenizers. Frozen Morpheus embeddings outperform BGE-M3 and BERTurk in lexical retrieval, with root-family MAP of 0.85 and ROC-AUC of 1.00.

arxiv arXiv cs.CL · 8d ago

LegalWorld: Life-Cycle Environment for Legal Agents

LegalWorld models Chinese civil litigation as a causally connected chain of five stages, based on 75,309 judgments. It includes reusable infrastructure to maintain consistency across stages and enables LongJud-Bench to evaluate agent performance across all phases, revealing significant capability gaps between models in different legal tasks.

arxiv arXiv cs.CL · 8d ago

Graph-ESBMC-PLC: Formal Verification of Graphical PLCopen LD Programs

Graph-ESBMC-PLC enables formal verification of graphical IEC 61131-3 Ladder Diagram programs by introducing a DFS-based resolver that converts graphical LD connections into valid GOTO intermediate representation. Validation on three real-world programs shows full IR generation and successful verification of safety properties at k=2 within 70 ms, with no regression on textual benchmarks.

arxiv arXiv cs.CL · 8d ago

Middle-to-Late Segments of Research Papers Reveal Key Methodological Information

This study finds that methodological information in research papers is unevenly distributed, with middle-to-late and final segments showing greater discriminative power. Combining these segments with bibliographic metadata improves the accuracy of automatic research method classification in library and information science.

arxiv arXiv cs.AI · 8d ago

Scaling AEB with Massive Unlabeled Data via Meta-Feedback SSL

A meta-feedback semi-supervised learning framework enables scaling of automatic emergency braking using massive unlabeled fleet data. The stabilized approach reduces pseudo-label errors through noise-aware decoupling and kinematics-gated pseudo-labeling, improving safety with a 100:1 positive-to-false activation ratio and 35% more accident-free driving mileage compared to rule-based systems.

arxiv arXiv cs.AI · 8d ago

Domain-Shift Aware Neural Networks for Unbalance Mass Estimation

A domain-shift aware neural network is proposed for estimating unbalance masses in rotating systems under varying conditions. The model uses maximum mean discrepancy to align feature representations across different operating domains, improving prediction accuracy when system behaviors differ from training conditions. Results show its effectiveness in structural health monitoring applications.

arxiv arXiv cs.AI · 8d ago
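
Maximum mean discrepancy, the alignment criterion named in the unbalance-mass entry above, can be estimated directly from two sets of feature vectors. The sketch below is the standard biased estimator of squared MMD with a Gaussian (RBF) kernel in plain Python. The kernel choice, the `bandwidth` parameter, and the function names are assumptions; the summary does not say which kernel or estimator the paper uses.

```python
import math

def rbf_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd_squared(source, target, bandwidth=1.0):
    """Biased estimate of squared MMD between two samples of feature vectors."""
    if not source or not target:
        raise ValueError("both samples must be non-empty")

    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, bandwidth) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (mean_kernel(source, source)
            + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))

# Features from two hypothetical operating conditions.
same = mmd_squared([[0.0], [1.0]], [[0.0], [1.0]])
shifted = mmd_squared([[0.0], [1.0]], [[5.0], [6.0]])
```

Used as a training penalty, a term like this is small when features from the two operating domains are distributed alike and grows as they drift apart.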

TransitNet Achieves 95.2% Accuracy in Low-SNR Transit Searches

TransitNet, a compact attention-augmented deep learning framework, achieves 95.2% accuracy in low-SNR transit blind searches, outperforming TLS and BLS in ROC-AUC and PR-AP values. It recovers 93.0% of injected Earth- and sub-Earth-size transits, with 97.4% of injected transits fully covered by estimated transit windows, and successfully recovers all 34 confirmed Kepler planets with a mean midpoint error of 1.24 hours.

arxiv arXiv cs.AI · 8d ago

Variability in AI-Generated Software: A New Product-Line Approach

An exploratory analysis of 10 vibe-coded C/C++ projects reveals near-zero in-artifact variability, with all decisions resolved at generation time. The paper proposes Variability by Regeneration (VbR), a product-line approach where an LLM acts as a derivation engine, generating tailored binaries from declarative specifications, with a variant dispatcher routing user requests to the correct binary. VbR shifts variability into specifications, not code, offering a new paradigm for SPL engineering.

arxiv arXiv cs.AI · 8d ago

Technical Taxonomy of LLM Agent Communication Protocols

A new taxonomy classifies LLM agent communication protocols across five dimensions: counterparty, payload, interaction state, discovery mechanism, and schema flexibility. Analysis shows hybrid payloads, session-state persistence, and runtime schema negotiation are common, with decentralized discovery remaining rare. The study predicts short-term convergence toward unified agent-to-agent and agent-to-context protocols, and long-term evolution toward a federated, layered protocol stack.

arxiv arXiv cs.AI · 8d ago
