arXiv cs.AI · 6d ago

Lean as Process-Verified Reward Oracle in RL for Theorem Proving

This work shows that Lean can serve as a symbolic process oracle that provides fine-grained, verified feedback during reinforcement learning. The system parses each proof attempt into a tactic sequence and uses Lean's elaboration to mark which steps are sound and where the first failure occurs, yielding dense, type-theoretic reward signals. Experiments show that tactic-level supervision outperforms outcome-only methods on benchmarks such as MiniF2F and ProofNet, highlighting Lean's role as both evaluator and source of training reward.
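
As a rough illustration of how per-tactic verification could become a dense reward, the sketch below credits every sound step up to the first failure and penalizes the failing step. The function name `tactic_rewards`, the reward values, and the boolean input (standing in for the verifier's per-step verdicts) are assumptions for illustration, not the paper's actual reward scheme.

```python
from typing import List


def tactic_rewards(
    step_ok: List[bool],
    step_reward: float = 1.0,   # assumed credit for each verified tactic
    fail_penalty: float = -1.0,  # assumed penalty at the first failing tactic
) -> List[float]:
    """Turn per-tactic verdicts into one reward per tactic.

    `step_ok[i]` is a hypothetical stand-in for "the proof checker accepted
    tactic i". Tactics after the first failure receive 0.0 because they were
    never checked against a valid proof state.
    """
    rewards: List[float] = []
    for ok in step_ok:
        if ok:
            rewards.append(step_reward)
        else:
            rewards.append(fail_penalty)
            break  # stop at the first failure
    # Pad the unchecked suffix with zeros so the output aligns with the input.
    rewards.extend([0.0] * (len(step_ok) - len(rewards)))
    return rewards


if __name__ == "__main__":
    print(tactic_rewards([True, True, False, True]))
```

An outcome-only scheme would instead give this whole attempt a single failure signal; the per-step version still credits the two sound tactics.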

arXiv cs.AI · 6d ago

Dual-Agent Framework for Cross-Model Verified Translation

A dual-agent framework converts natural-language experiment protocols into executable commands for robotic lab platforms. A Parser Agent and a rule-based mapping engine translate each protocol, while a heterogeneous LLM Validation Agent checks the translation and triggers self-correction when it finds errors. The framework enables end-to-end autonomous execution of microplate-based experiments such as the Bradford assay.

arXiv cs.AI · 6d ago

ScaffoldAgent: Utility-Guided Dynamic Outline Optimization

ScaffoldAgent introduces a utility-guided framework for dynamic outline optimization in open-ended deep research. It models outline evolution through Expansion, Contraction, and Revision operations, guided by a feedback mechanism that evaluates retrieval gain, structural coherence, and generation quality. Experiments show it improves long-form report generation and factual grounding compared to existing agents.

arXiv cs.AI · 6d ago

MACR: Explicit Conflict Resolution for LLM Inference

MACR introduces a multi-agent reasoning framework to resolve knowledge conflicts in LLM inference by jointly assessing internal and external knowledge. It uses semantic entropy to measure confidence and employs three specialized agents to induce rules, detect conflicts, and resolve inconsistencies across contexts. Empirical results show MACR outperforms state-of-the-art methods and provides interpretable conflict resolutions.
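
For readers unfamiliar with semantic entropy, a minimal sketch follows. It assumes that several sampled answers have already been grouped into meaning-equivalent clusters (the clustering step, typically done with an entailment model, is not shown), and it computes the Shannon entropy of the cluster frequencies. The function name and the frequency-based estimate are illustrative assumptions; the summary does not say how MACR computes the quantity.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def semantic_entropy(cluster_ids: Sequence[Hashable]) -> float:
    """Entropy (in nats) over meaning clusters of sampled answers.

    `cluster_ids[i]` is the (assumed, precomputed) cluster label of the
    i-th sampled answer. Low entropy means the samples agree in meaning,
    which is commonly read as high confidence.
    """
    if not cluster_ids:
        raise ValueError("need at least one sampled answer")
    n = len(cluster_ids)
    entropy = 0.0
    for count in Counter(cluster_ids).values():
        p = count / n  # empirical probability of this meaning cluster
        entropy -= p * math.log(p)
    return entropy


if __name__ == "__main__":
    # Five samples: four share one meaning, one disagrees.
    print(semantic_entropy(["paris", "paris", "paris", "paris", "lyon"]))
```

When all samples fall in one cluster the entropy is zero; when they split evenly across k clusters it reaches its maximum of log k.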

arXiv cs.CL · 6d ago

Control-Window Law for Single-Neuron Steering in Language Models

A new framework defines when single-neuron interventions can coherently control model behavior without causing output collapse. The control window, based on alignment and norm ratios, predicts behavior triggers and collapse ceilings from forward-pass data, with high accuracy on held-out neurons. For refusal, control falls into distinct types: coherent bypass occurs without actionable content, while genuine actionable reach appears only in specific cases and at later rollout stages.

arXiv cs.CL · 6d ago

REDACT: Multilingual PII Benchmark with Systematic Control

REDACT introduces a systematically controlled multilingual benchmark for personally identifiable information detection, featuring 51 entity types, 4,127 surface-form patterns, and 25 languages. It evaluates five detectors across 1,000 records, revealing that rule-based models fail on high-stakes data while LLMs perform better, especially in high-sensitivity categories. A reference-free LLM assessment confirms sensitivity-tier assignment as the most challenging evaluation axis.

arXiv cs.LG · 7d ago

Discriminator-Guided RL Corrects Flow Matching with Data-Aligned Rewards

Discriminator-Guided RL (DRL) trains a discriminator in a pretrained representation space to separate real data from model-generated samples. The discriminator's logit then serves as the reward in KL-regularized RL, aligning model outputs with visual and semantic realism without human preference data. DRL improves FID and semantic FD across models such as SiT and JiT, and improves the Pareto frontier between preference and fidelity.
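
One common way to write a KL-regularized reward for a single sample is the raw reward minus a weighted log-ratio between the trained model and a frozen reference model. The sketch below uses that form with a discriminator logit as the raw reward. The function name, the `beta` weight, and the single-sample log-ratio penalty are assumptions for illustration; the summary does not specify the exact objective DRL optimizes.

```python
def kl_regularized_reward(
    disc_logit: float,   # discriminator logit for the generated sample
    logp_model: float,   # log-probability of the sample under the trained model
    logp_ref: float,     # log-probability under the frozen reference model
    beta: float = 0.1,   # assumed KL penalty weight
) -> float:
    """Discriminator reward with a per-sample KL penalty.

    The term (logp_model - logp_ref) is a single-sample estimate of the KL
    divergence from the reference model, so a larger `beta` keeps the
    trained model closer to where it started.
    """
    return disc_logit - beta * (logp_model - logp_ref)


if __name__ == "__main__":
    # A sample the discriminator finds realistic (positive logit) that has
    # drifted slightly from the reference model.
    print(kl_regularized_reward(disc_logit=2.0, logp_model=-1.0, logp_ref=-1.5))
```

Setting `beta` to zero recovers the pure discriminator reward, which makes the trade-off between realism and staying near the reference model easy to inspect.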

arXiv cs.AI · 7d ago

User as Engram: Local Parametric Edits for Personal Memory

User as Engram proposes storing per-user facts as surgical, hash-keyed edits to a memory table, leaving reasoning in a shared adapter. This design achieves 5.6x higher indirect-reasoning accuracy and maintains base-level reasoning performance, with a memory footprint 33,000x smaller than per-user LoRA. The approach enables disjoint user edits that compose losslessly, outperforming retrieval pipelines beyond 100 facts.
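
The idea of hash-keyed, per-user edits that compose without interference can be sketched with an ordinary dictionary standing in for the memory table. Everything here is an assumption for illustration: the names `slot_key` and `apply_edits`, the SHA-256 keying, and the use of plain strings as stored values (the paper stores parametric edits, which are not modeled here).

```python
import hashlib
from typing import Dict


def slot_key(user_id: str, fact_key: str) -> int:
    """Derive a deterministic table slot from (user, fact) via SHA-256.

    A 64-bit key space makes collisions between different users' facts
    extremely unlikely, which is what lets disjoint edits coexist.
    """
    digest = hashlib.sha256(f"{user_id}\x00{fact_key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def apply_edits(
    table: Dict[int, str], user_id: str, facts: Dict[str, str]
) -> Dict[int, str]:
    """Return a new table with one user's facts written to their hashed slots."""
    updated = dict(table)  # copy so the shared table is not mutated in place
    for fact_key, value in facts.items():
        updated[slot_key(user_id, fact_key)] = value
    return updated


if __name__ == "__main__":
    alice = {"city": "Paris"}
    bob = {"city": "Oslo", "pet": "cat"}
    table = apply_edits(apply_edits({}, "alice", alice), "bob", bob)
    print(table[slot_key("alice", "city")], table[slot_key("bob", "city")])
```

Because each user's facts land in their own slots, applying Alice's edits before or after Bob's yields the same table, which mirrors the lossless composition the summary describes.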