Research paper
arXiv cs.LG · 8d ago

Geometric and Stochastic Analysis of Discontinuities in Sparse Mixture-of-Experts

This paper analyzes discontinuities in Sparse Mixture-of-Experts models, classifying them by order and showing that lower-order discontinuities dominate in volume. It proves that a random input path almost surely hits an order-1 discontinuity first, gives finite-time probability bounds for that event, and derives occupation-time bounds for each order. It also proposes a simple smoothing mechanism that improves model continuity and performance with minimal computational overhead.

arXiv cs.LG · 8d ago

Context-Aware Follow-Up Optimization for Type 2 Diabetes

A study uses a Contextual Markov Decision Process to optimize follow-up intervals for Type 2 Diabetes patients based on EHR data from 22,154 patients. The model identifies two clinical contexts—low and high risk—and recommends adaptive intervals: 1 month for unmeasured lab values, up to 3 months for elevated values or hospitalizations, and 6–12 months for stable control, with shorter intervals for high-risk patients. The CMDP policies reduced expected cumulative costs by 34.8% in high-comorbidity and 6.4% in low-comorbidity contexts compared to a fixed interval policy.

arXiv cs.LG · 8d ago

OrthoReg: Orthogonal Regularization for Hybrid Symbolic-Neural Dynamical Systems

OrthoReg introduces orthogonal regularization to prevent neural components from relearning symbolic structures in hybrid dynamical systems. By directly penalizing overlap between symbolic and neural parts, it enables a complementary decomposition where symbolic models capture expressible physics and neural models handle remaining dynamics. On benchmarks with partial library mismatch, OrthoReg improves symbolic recovery and out-of-distribution performance.

arXiv cs.CL · 8d ago

Morpheus: Neural Tokenizer and Embedder for Turkish

Morpheus is a morphology-aware neural tokenizer and word embedder for Turkish that preserves original text through lossless encoding and decoding. It achieves the lowest bits-per-character (1.425), improves morphological alignment (MorphScore macro-F1 0.61), and uses 19% less GPU memory than 64K-vocabulary subword tokenizers. Frozen Morpheus embeddings outperform BGE-M3 and BERTurk in lexical retrieval, with root-family MAP of 0.85 and ROC-AUC of 1.00.

arXiv cs.AI · 8d ago

TransitNet Achieves 95.2% Accuracy in Low-SNR Transit Searches

TransitNet, a compact attention-augmented deep learning framework, achieves 95.2% accuracy in low-SNR transit blind searches, outperforming TLS and BLS in ROC-AUC and PR-AP values. It recovers 93.0% of injected Earth- and sub-Earth-size transits, with 97.4% of injected transits fully covered by estimated transit windows, and successfully recovers all 34 confirmed Kepler planets with a mean midpoint error of 1.24 hours.

arXiv cs.AI · 8d ago

Variability in AI-Generated Software: A New Product-Line Approach

An exploratory analysis of 10 vibe-coded C/C++ projects reveals near-zero in-artifact variability, with all decisions resolved at generation time. The paper proposes Variability by Regeneration (VbR), a product-line approach where an LLM acts as a derivation engine, generating tailored binaries from declarative specifications, with a variant dispatcher routing user requests to the correct binary. VbR shifts variability into specifications, not code, offering a new paradigm for SPL engineering.

arXiv cs.AI · 8d ago

Technical Taxonomy of LLM Agent Communication Protocols

A new taxonomy classifies LLM agent communication protocols across five dimensions: counterparty, payload, interaction state, discovery mechanism, and schema flexibility. Analysis shows hybrid payloads, session-state persistence, and runtime schema negotiation are common, with decentralized discovery remaining rare. The study predicts short-term convergence toward unified agent-to-agent and agent-to-context protocols, and long-term evolution toward a federated, layered protocol stack.