Keye-VL-2.0-30B-A3B is a 30B-parameter multimodal model designed for long-video understanding and agent functionality. It outperforms open-source rivals and matches Gemini-3-Flash in temporal grounding, supports up to 256K context with near-lossless reasoning, and includes built-in capabilities for code, tool, and web search agent workflows.
Keye-VL-2.0-30B-A3B Launches with Advanced Video Understanding and Agent Capabilities
from English