Training methods
arXiv cs.LG · 20h ago

Stationary Robust Mean-Field Games under Model Mismatches

This paper introduces a stationary mean-field game framework that directly incorporates distributional model uncertainty into population-coupled dynamics. It establishes a robust dynamic programming principle, proves existence of a stationary robust equilibrium, and presents the first algorithm with convergence guarantees. The mean-field solution approximates finite-population equilibria and provides explicit non-asymptotic error bounds under model uncertainty.

arXiv cs.AI · 22h ago

Sparsity-Storage-Accuracy Tradeoff in Parsimoniously Activated Dictionary Learning

Parsimoniously activated dictionary learning (PADL) establishes a structured generative model with auxiliary latent variables, enabling maximum a posteriori estimation. This framework provides generalization guarantees and an analytical characterization of the tradeoff between sparsity, storage cost, and reconstruction accuracy, allowing data-driven hyperparameter estimation. The resulting algorithm achieves better reconstruction performance and accelerates inference in vision-language models.

arXiv cs.AI · 22h ago

HyperAdapter: Structured Hyperedge Adaptation for Vision Transformer Fine-Tuning

HyperAdapter introduces a hypergraph-based adapter that performs structured, group-aware adaptation in vision transformers by operating in hyperedge space rather than token space. It uses prototype-based assignments to build a soft hypergraph, aggregates token features into hyperedge representations, applies lightweight adaptation, and diffuses updates back via hypergraph structure, enabling explicit structural inductive bias while maintaining efficiency. Experiments show consistent performance gains over baseline PEFT methods, especially on tasks requiring structured reasoning.
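
The aggregate, adapt, diffuse pipeline described above can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the function name `hyperedge_adapter`, the softmax-over-prototypes assignment, the ReLU bottleneck, and all shapes are assumptions chosen only to make the data flow concrete.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hyperedge_adapter(tokens, prototypes, w_down, w_up, temperature=1.0):
    """Hypothetical hyperedge-space adapter (illustrative only).

    tokens:     (n, d) token features
    prototypes: (m, d) one prototype per hyperedge (assumed learnable)
    w_down:     (d, r) bottleneck down-projection, r << d
    w_up:       (r, d) bottleneck up-projection
    """
    # Soft incidence matrix H (n, m): each token spreads its membership
    # over the m hyperedges according to prototype similarity.
    H = softmax(tokens @ prototypes.T / temperature, axis=1)
    # Aggregate tokens into hyperedge features (membership-weighted mean).
    edge_mass = H.sum(axis=0, keepdims=True).T           # (m, 1)
    edges = (H.T @ tokens) / (edge_mass + 1e-8)          # (m, d)
    # Lightweight adaptation applied in hyperedge space, not token space.
    delta_edges = np.maximum(edges @ w_down, 0.0) @ w_up  # (m, d)
    # Diffuse the hyperedge updates back to tokens through H (residual add).
    return tokens + H @ delta_edges
```

With `w_up` initialized to zeros, as is common for adapters, the module starts as an identity map, so fine-tuning begins from the pretrained model's behavior.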

arXiv cs.AI · 22h ago

P4IR Framework Improves LLM-Based Code Compliance Accuracy

P4IR is a two-stage framework that combines supervised fine-tuning and Group Relative Policy Optimization (GRPO) to improve large language model-based automated code compliance checking. It reduces tree edit distance by up to 23.8% and token-level Levenshtein distance by up to 38.6%, outperforming leading LLMs such as Claude Opus, GPT-5.2, and GLM-4.7 evaluated with zero-shot and few-shot prompting, and it reduces false positives by a small but statistically significant margin.
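
To make the two named ingredients concrete, the sketch below computes a token-level Levenshtein distance and the group-relative advantages used in GRPO-style training. Using the negative edit distance as the reward is an assumption made for illustration; the summary does not state the reward P4IR actually uses, and the sample token sequences are invented.

```python
import statistics

def token_levenshtein(a, b):
    """Edit distance between two token sequences (insert/delete/substitute)."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, 1):
        cur = [i]
        for j, tok_b in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                      # delete tok_a
                           cur[j - 1] + 1,                   # insert tok_b
                           prev[j - 1] + (tok_a != tok_b)))  # substitute
        prev = cur
    return prev[-1]

def group_relative_advantages(rewards, eps=1e-8):
    """Normalize each reward against its own sampling group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Hypothetical example: three sampled outputs scored against one reference.
reference = "a b c".split()
candidates = ["a b c".split(), "a x c".split(), "x y z".split()]
# ASSUMPTION: reward = negative token-level edit distance to the reference.
rewards = [-token_levenshtein(c, reference) for c in candidates]
advantages = group_relative_advantages(rewards)
```

Because advantages are computed relative to the group mean, the candidate closest to the reference receives a positive advantage and the farthest a negative one, without a separate value network.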

arXiv cs.LG · 1d ago

BIPC Framework Accelerates Mixed-Integer Optimization with Machine Learning

The BIPC framework reduces solution time for large-scale mixed-integer programs by identifying a backdoor subset of variables that drive computational complexity. Using supervised learning, it predicts backdoor variable values and intervals, then solves a reduced problem with these predictions, achieving significant speedups with minimal quality loss. This enables rapid, high-quality solutions under parameter perturbations in real-world applications such as power systems and supply chains.
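
The effect of fixing predicted variables can be seen on a toy problem. The sketch below is not BIPC: it brute-forces a tiny 0-1 knapsack, and the "predicted" backdoor values are hard-coded stand-ins for the output of a learned model. The function name and the problem data are assumptions for illustration.

```python
from itertools import product

def solve_binary_program(values, weights, capacity, fixed=None):
    """Brute-force a 0-1 knapsack; `fixed` maps variable index -> forced value.

    Returns (best objective, best assignment, number of assignments tried).
    """
    fixed = fixed or {}
    n = len(values)
    free = [i for i in range(n) if i not in fixed]
    best_value, best_x, evaluated = float("-inf"), None, 0
    for bits in product((0, 1), repeat=len(free)):
        x = dict(fixed)
        x.update(zip(free, bits))
        evaluated += 1
        if sum(weights[i] * x[i] for i in range(n)) <= capacity:
            value = sum(values[i] * x[i] for i in range(n))
            if value > best_value:
                best_value, best_x = value, [x[i] for i in range(n)]
    return best_value, best_x, evaluated

values, weights, capacity = [10, 7, 4, 3], [5, 4, 3, 2], 9
full = solve_binary_program(values, weights, capacity)
# ASSUMPTION: a trained predictor suggested x0 = 1 and x1 = 1 for the backdoor set.
reduced = solve_binary_program(values, weights, capacity, fixed={0: 1, 1: 1})
```

Here the reduced problem searches 4 assignments instead of 16 and still reaches the same objective. A wrong prediction would shrink the search just as much but could cost solution quality or feasibility, which is the trade-off the summary describes as minimal quality loss.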

arXiv cs.LG · 1d ago

Muon Optimizer: Power, Limits, and a River-Valley Theory

A new trajectory-level theory shows that Muon accelerates early in optimization along the information-bearing river direction but converges slowly near the bottom, unlike gradient descent. With momentum, Muon's orthogonalized updates remove residual scale information, leading to overshooting and oscillation. The study advocates a two-stage approach (Muon early, then a switch to gradient descent-like optimizers) to improve LLM training performance.
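
The claim that orthogonalized updates discard scale information can be checked directly. The sketch below is a simplified Muon-like step, not the reference optimizer: it orthogonalizes with an exact SVD for clarity (Muon implementations typically approximate this with a Newton-Schulz iteration), and the learning rate and momentum values are placeholder assumptions.

```python
import numpy as np

def orthogonalize(m):
    """Orthogonal polar factor U @ V^T: every singular value is set to 1."""
    u, _, vt = np.linalg.svd(m, full_matrices=False)
    return u @ vt

def muon_like_step(w, grad, buf, lr=0.02, beta=0.95):
    """One simplified step: momentum accumulation, then orthogonalized update.

    ASSUMPTION: plain heavy-ball momentum and exact SVD; real implementations
    differ in details such as Nesterov momentum and shape-dependent scaling.
    """
    buf = beta * buf + grad
    return w - lr * orthogonalize(buf), buf

rng = np.random.default_rng(0)
g = rng.normal(size=(4, 3))
small = orthogonalize(1e-3 * g)   # tiny gradient, as near a minimum
large = orthogonalize(1e+3 * g)   # huge gradient, as early in training
same_update = np.allclose(small, large)
```

The update direction is identical whether the gradient is tiny or huge, so the step size does not shrink as the gradient vanishes. That is consistent with the overshooting and oscillation near the valley bottom that the summary reports.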

arXiv cs.LG · 1d ago

GOMA Achieves First Stochastic Convergence Guarantee for Variational Inequalities

The paper introduces GOMA, a family of first-order methods for monotone variational inequalities. In the stochastic setting with unbounded variance, a simplified variant of GOMA achieves an O(1/\sqrt{k}) last-iterate convergence rate on the squared gradient norm, without variance reduction or growing batches. This is the first such guarantee for unconstrained stochastic monotone Lipschitz variational inequalities.
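
For readers unfamiliar with the setting, the standard definitions behind this claim can be written out as follows. The notation ($F$, $z$, $L$, $z_k$) is assumed here rather than taken from the paper, and the "squared gradient norm" in the summary is read as the squared norm of the operator $F$ at the last iterate.

```latex
% Unconstrained variational inequality: find a zero of the operator F.
\text{find } z^\star \in \mathbb{R}^d \ \text{ such that } \ F(z^\star) = 0,

% Monotonicity and L-Lipschitz continuity of F, for all z, z':
\langle F(z) - F(z'),\, z - z' \rangle \ge 0,
\qquad
\| F(z) - F(z') \| \le L \, \| z - z' \|,

% Claimed last-iterate rate in the stochastic setting (k = iteration count):
\mathbb{E}\,\| F(z_k) \|^2 = O\!\left( \frac{1}{\sqrt{k}} \right).
```

A last-iterate guarantee bounds the quality of the final point $z_k$ itself, rather than an average of the iterates.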