llama.cpp version b9673 introduces optional USM (Unified Shared Memory) system allocations for GPU buffers of 1 GB or larger, which allows VRAM overcommit on devices that support them. The feature is disabled by default and is enabled through the GGML_SYCL_USM_SYSTEM environment variable; if the device does not support it, llama.cpp falls back to regular allocations.
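The selection logic described above can be sketched roughly as follows. This is an illustration only, not llama.cpp's actual code: the names AllocKind, kUsmSystemMinBytes, usm_system_requested, and choose_alloc_kind are hypothetical, the threshold assumes "1GB" means 2^30 bytes, and the rule that any non-empty value other than "0" enables the feature is an assumption, since the release note does not say how the variable's value is parsed.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical names for illustration; not taken from llama.cpp.
enum class AllocKind { Regular, UsmSystem };

// Assumption: the 1 GB threshold is interpreted as 2^30 bytes.
constexpr std::size_t kUsmSystemMinBytes = std::size_t{1} << 30;

// Assumption: any non-empty value other than "0" counts as enabled.
// An unset variable leaves the feature off, matching "disabled by default".
inline bool usm_system_requested() {
    const char * value = std::getenv("GGML_SYCL_USM_SYSTEM");
    return value != nullptr && value[0] != '\0' && std::strcmp(value, "0") != 0;
}

// Pick the allocation kind for one buffer. USM system allocation is used only
// when it was requested, the device supports it, and the buffer is large
// enough; every other case falls back to a regular allocation.
inline AllocKind choose_alloc_kind(std::size_t bytes, bool requested, bool device_supports) {
    if (requested && device_supports && bytes >= kUsmSystemMinBytes) {
        return AllocKind::UsmSystem;
    }
    return AllocKind::Regular;
}
```

A caller would pass usm_system_requested() as the requested argument together with the result of a device capability query. Keeping the decision in a pure function makes the fallback path easy to check without a GPU present.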