The Yuvion LLM is a new large language model designed to address safety failures by treating adversarial robustness and agentic capability as primary objectives. Its training pipeline combines adversarially aware data construction, knowledge-enhanced continued pretraining, and policy-grounded multi-task safety post-training.

  • The model uses risk-aware supervised fine-tuning and reinforcement-learning-based policy optimization for tool use and multi-step reasoning.
  • Yuvion LLM RiskEval (YLRE) introduces 93 benchmarks across four categories to evaluate safety, adversarial robustness, and real-world capabilities.
  • The Yuvion-8B variant outperforms state-of-the-art baselines, including larger models such as GPT-5.4 and Qwen3-MAX, on several safety tasks.

The aim is safety performance that holds up under realistic conditions, where the model faces deliberate, strategic attempts to evade its policies and not only naturally occurring inputs.