The Eagle3 has landed for Qwen

Eagle3 speculative decoding is now supported in llama.cpp's latest release via --spec-type draft-eagle3. It requires a separate Eagle3 draft model, such as Ex0bit-Qwen3.6-27B-PRISM-EAGLE3-GGUF, which is passed with -md (long form: --model-draft). Performance is comparable to draft-mtp, but tensor parallelism is not supported and VRAM usage is higher.
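
For anyone who wants to try it, here is a rough sketch of how the flags might be combined. It only assembles and prints a command line; it does not run anything. The --spec-type and --model-draft flags come from the post above. The llama-server binary name, the -m flag for the target model, and both .gguf file names are assumptions for illustration, so substitute your own binary and paths.

```python
import shlex


def build_eagle3_command(target_model, draft_model, binary="llama-server"):
    """Assemble an argv list for speculative decoding with an Eagle3 draft model.

    The binary name and the -m flag are assumptions; --model-draft and
    --spec-type draft-eagle3 are the flags named in the post.
    """
    return [
        binary,
        "-m", target_model,              # target (main) model, assumed flag
        "--model-draft", draft_model,    # Eagle3 draft model (short form: -md)
        "--spec-type", "draft-eagle3",   # select the Eagle3 speculative path
    ]


# Hypothetical file names; replace with the GGUF files you actually downloaded.
cmd = build_eagle3_command("qwen-target.gguf", "qwen-eagle3-draft.gguf")
print(shlex.join(cmd))
```

Keeping the arguments in a list rather than a single string makes it easy to pass the result to subprocess.run later without worrying about shell quoting.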