Reasoning models
arXiv cs.LG · 9d ago

Adaptive Functional Gradient Descent with Convergence Guarantees

We propose a new functional gradient descent (FGD) algorithm that adapts its representation during optimization. Despite using finite-dimensional approximations, the method converges to a stationary point for smooth losses, and to a global minimizer when the loss is smooth and satisfies a Polyak-Łojasiewicz (PL) condition. It outperforms both fixed-approximation FGD and neural network baselines on regression, PDE solving, and computer vision tasks.
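For reference, the Polyak-Łojasiewicz condition is usually stated as below, here in its standard finite-dimensional form together with the classical linear rate for gradient descent with step size 1/L. The notation is an assumption of this note rather than the paper's: f* is the minimum value, μ > 0 the PL constant, and L the smoothness constant. The paper's functional-space statement and constants may differ.

```latex
\frac{1}{2}\,\lVert \nabla f(x) \rVert^{2} \;\ge\; \mu\,\bigl(f(x) - f^{*}\bigr)
\quad \text{for all } x
\qquad\Longrightarrow\qquad
f(x_{k}) - f^{*} \;\le\; \Bigl(1 - \frac{\mu}{L}\Bigr)^{k}\,\bigl(f(x_{0}) - f^{*}\bigr),
\qquad x_{k+1} = x_{k} - \frac{1}{L}\,\nabla f(x_{k}).
```

The condition does not require convexity, which is why it can yield global convergence guarantees for non-convex objectives.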

arXiv cs.LG · 9d ago

Unified Causal-Origin Taxonomy of Distributional Shifts in RL

This paper proposes a unified causal-origin taxonomy for distributional shifts in reinforcement learning, linking ID/OOD generalization to non-stationary settings. It decomposes the agent-environment interaction using a POMDP framework and distinguishes internal (agent-driven) shifts from external (environment-driven) ones, with explicit, implicit, and hybrid types defined by the shifted-time boundary. The work also introduces an evaluation framework that measures shift impact through performance degradation and recovery metrics, enabling systematic analysis of RL robustness.
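As an illustration of what degradation and recovery metrics might look like, the sketch below computes both from a list of per-episode returns. The definitions are assumptions made for this example, not the paper's: degradation is the pre-shift mean return minus the worst post-shift return, and recovery is the number of episodes after the shift until the return climbs back to a chosen fraction of the pre-shift mean. The function name `shift_impact` and its parameters are hypothetical.

```python
def shift_impact(returns, shift_step, window=5, recovery_fraction=0.9):
    """Hypothetical degradation/recovery metrics around a known shift step.

    Assumes returns are positive and that `shift_step` is the index of the
    first episode collected after the shift.
    """
    if shift_step < window or shift_step >= len(returns):
        raise ValueError("need `window` pre-shift episodes and at least one post-shift episode")

    pre = returns[shift_step - window:shift_step]
    post = returns[shift_step:]
    baseline = sum(pre) / window

    # Worst post-shift episode (the trough) defines the degradation.
    trough_offset = min(range(len(post)), key=post.__getitem__)
    degradation = baseline - post[trough_offset]

    # First episode at or after the trough that reaches the recovery threshold.
    threshold = recovery_fraction * baseline
    recovery_steps = None  # stays None if performance never recovers
    for offset in range(trough_offset, len(post)):
        if post[offset] >= threshold:
            recovery_steps = offset
            break

    return {"baseline": baseline, "degradation": degradation, "recovery_steps": recovery_steps}


# Toy return curve (made up for illustration): the shift happens at episode 5.
toy_returns = [10, 10, 10, 10, 10, 4, 5, 7, 9.5, 10]
impact = shift_impact(toy_returns, shift_step=5)
```

Measuring recovery from the trough rather than from the shift itself avoids counting a delayed drop as an instant recovery; other conventions are equally defensible.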

arXiv cs.LG · 9d ago

CrossMaps: Confidence-Aware Semantic Mapping for Rover Navigation

CrossMaps is a real-time, confidence-aware semantic mapping pipeline that uses RGB-D data to create language-queryable maps. It integrates multi-scale CLIP embeddings with a dual-memory architecture—Short-Term and Long-Term Memory—to aggregate visual observations and promote coherent, confident cells as persistent semantic landmarks. The system enables natural language queries to guide rover navigation via semantic heatmaps.
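The promotion-and-query idea can be sketched roughly as follows. Everything here is an assumption made for illustration: the cell layout, the `count` and `confidence` fields, the thresholds, and the use of plain cosine similarity on toy vectors standing in for CLIP embeddings. It is not the CrossMaps implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def promote(short_term, long_term, min_obs=3, min_conf=0.8):
    """Move cells seen often enough and confidently enough into long-term memory."""
    for cell, entry in list(short_term.items()):
        if entry["count"] >= min_obs and entry["confidence"] >= min_conf:
            long_term[cell] = short_term.pop(cell)


def heatmap(long_term, query_embedding):
    """Score every persistent cell against a query embedding."""
    return {cell: cosine(entry["embedding"], query_embedding)
            for cell, entry in long_term.items()}


# Toy map: two grid cells with 2-D stand-in embeddings.
short_term = {
    (0, 0): {"embedding": [1.0, 0.0], "count": 4, "confidence": 0.9},
    (0, 1): {"embedding": [0.0, 1.0], "count": 1, "confidence": 0.95},
}
long_term = {}
promote(short_term, long_term)
scores = heatmap(long_term, query_embedding=[1.0, 0.0])
```

In this toy run only cell (0, 0) is promoted, because cell (0, 1) has been observed just once; a navigation layer could then steer toward the highest-scoring cells.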

arXiv cs.LG · 9d ago

CircuitLasso: Scalable Circuit Learning for LLM Interpretability

CircuitLasso enables scalable circuit learning in large language models by using sparse linear regression. It recovers circuits with structural accuracy matching state-of-the-art methods at significantly lower computational cost, and demonstrates human-interpretable semantic propagation through model components. The learned circuits achieve comparable performance on a domain-generalization task with reduced cost.
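To make the sparse-regression idea concrete, here is a minimal lasso solver using cyclic coordinate descent in pure Python. It is a generic sketch of L1-regularized least squares, not the CircuitLasso procedure: the framing of columns as "component activations", the toy design matrix, and the choice of `alpha` are all assumptions for illustration.

```python
def soft_threshold(value, threshold):
    """Shrink `value` toward zero by `threshold`, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iters=100):
    """Minimise (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw, since w starts at zero
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iters):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no signal
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_wj
    return w


# Toy data: four mutually orthogonal +/-1 columns (hypothetical "component
# activations"); the target depends only on columns 0 and 2.
cols = [1, 2, 4, 7]
X = [[(-1.0) ** bin(i & c).count("1") for c in cols] for i in range(8)]
y = [2.0 * row[0] - 1.0 * row[2] for row in X]
weights = lasso_coordinate_descent(X, y, alpha=0.1)
```

On this toy problem the two irrelevant columns receive a weight of exactly zero, while the two relevant ones are recovered with a shrinkage of about `alpha`, giving roughly 1.9 and -0.9. That exact sparsity is what makes the nonzero weights readable as a candidate circuit.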

arXiv cs.LG · 9d ago

ROVE: Reinforcement Learning with Human Interventions for Humanoid Manipulation

ROVE enables humanoid Vision-Language-Action models to learn effective manipulation behaviors using imperfect human interventions. It combines a human-in-the-loop data collection pipeline with Optimistic Value Estimation and cross-embodiment supervision to prioritize high-value actions and improve robustness. ROVE outperforms baseline methods on real-world, contact-rich manipulation tasks through iterative rollout and intervention cycles.

arXiv cs.LG · 9d ago

Filtered Conformal Ellipsoids for Graph-Native Time Series

Filtered conformal ellipsoids provide prediction sets for multivariate time series: a frozen state-space filter generates predictive means and covariances, and split-conformal calibration is then applied to the resulting Mahalanobis scores. Coverage under dependence is obtained through contraction in an observable predictive-law quotient, with theoretical bounds derived under Gaussian-projection and observability conditions. On graph-native traffic benchmarks, the method yields sharper ellipsoids than static and non-filter baselines.
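The calibration step can be sketched as follows for the 2-D case, assuming the filter has already produced a predictive mean and covariance for each calibration point. This is textbook split-conformal calibration on squared Mahalanobis scores; it does not reproduce the paper's filter or its dependence analysis, and the function names and toy data are hypothetical.

```python
import math


def mahalanobis_sq(y, mean, cov):
    """Squared Mahalanobis distance of a 2-D point under a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = y[0] - mean[0], y[1] - mean[1]
    # The inverse of [[a, b], [c, d]] is (1 / det) * [[d, -b], [-c, a]].
    return (dx * (d * dx - b * dy) + dy * (-c * dx + a * dy)) / det


def conformal_radius(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1) * (1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(scores)[k - 1]


def in_ellipsoid(y, mean, cov, radius):
    """True if y lies in {y : (y - mean)^T cov^{-1} (y - mean) <= radius}."""
    return mahalanobis_sq(y, mean, cov) <= radius


# Toy calibration set: (observation, predictive mean, predictive covariance).
identity = ((1.0, 0.0), (0.0, 1.0))
calibration = [((float(k), 0.0), (0.0, 0.0), identity) for k in range(1, 8)]
scores = [mahalanobis_sq(y, m, s) for y, m, s in calibration]
radius = conformal_radius(scores, alpha=0.25)
```

Because the score rescales each residual by the filter's covariance, a single calibrated radius yields ellipsoids whose shape and size vary with the filter's state at each time step.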

arXiv cs.LG · 9d ago

A Mathematical Review of Shape Space Analysis in Machine Learning

This survey presents a mathematical framework for analyzing geometric data, integrating differential geometry, statistics, and machine learning. It outlines a unified pipeline for shape representation, geodesic metrics, statistical analysis, and geometry-aware learning, enabling the study of shape variability and structural trajectories across populations and time. Applications span biology, medicine, anthropology, and computer vision, highlighting challenges in handling nonlinear and unaligned geometric variation.