Does llama.cpp's tensor split mode cause issues?
A user reports that running llama.cpp in tensor split mode causes the model to loop during tool calls and reasoning traces. The problem appears with Qwen 27B and Gemma 4 26B (MoE) models split across an RTX 5080 and two RTX 5060 Ti GPUs.