vLLM has introduced a new streaming parser for Qwen3+ available in its nightly build, addressing issues like mid-turn stopping and failed streaming tool calls due to chunk boundaries. The update reportedly resolves these problems in limited testing, improving reliability for agentic workflows.
vLLM releases new streaming parser for Qwen3+ in nightly
from English