Mimo 2.5 stays fast at large context lengths on dual RTX Pro 6000 cards using a 5-to-1 local/global sliding-window attention pattern, similar to Gemma 3: five sliding-window (local) layers for every full-attention (global) layer. It completes tasks in about 4 minutes, roughly ten times faster than MiniMax M3, which takes around 40 minutes, even though the two models deliver similar quality when both are constrained to fit in the available VRAM.
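To illustrate why a 5-to-1 local/global layout helps at long context, here is a minimal sketch that counts how many token positions the KV cache must hold across all layers. The layer count (48), window size (1024), and the placement of the global layer at the end of each group of six are illustrative assumptions, not Mimo 2.5's published configuration.

```python
def layer_pattern(num_layers, local_per_global=5):
    """Label each layer 'local' or 'global'.

    Assumption: the global layer is the last one in each group of
    (local_per_global + 1) layers. Real models may place it differently.
    """
    period = local_per_global + 1
    return ["global" if (i + 1) % period == 0 else "local"
            for i in range(num_layers)]


def cached_positions(context_len, num_layers, window, local_per_global=5):
    """Total token positions held in the KV cache, summed over layers.

    Global layers keep the whole context; local layers keep at most
    `window` positions.
    """
    total = 0
    for kind in layer_pattern(num_layers, local_per_global):
        total += context_len if kind == "global" else min(context_len, window)
    return total


def full_attention_positions(context_len, num_layers):
    """Baseline: every layer keeps the whole context."""
    return context_len * num_layers


if __name__ == "__main__":
    # Hypothetical numbers chosen only for illustration.
    ctx, layers, window = 131072, 48, 1024
    hybrid = cached_positions(ctx, layers, window)
    full = full_attention_positions(ctx, layers)
    print(f"hybrid: {hybrid}, full: {full}, ratio: {full / hybrid:.2f}x")
```

Under these assumed numbers the hybrid layout caches several times fewer positions than full attention in every layer, and the gap widens as the context grows, since only one layer in six scales with context length.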
Mimo 2.5 is fast at large context on dual RTX Pro 6000