All articles | korshunov.ai | ML news

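github llama.cpp · 16 days ago

llama.cpp releases b9670: fixes and new builds

llama.cpp release b9670 includes fixes for NVFP4 edge cases in llama-graph, such as moving the post-GEMM MUL operation and restricting build_ffn to supported combinations. The release provides binaries for macOS, Linux, Android, Windows, and openEuler, covering multiple architectures and backend options including CUDA, Vulkan, SYCL, and OpenVINO.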

github llama.cpp · 16 days ago

llama.cpp Release b9667 Adds Vulkan and CUDA Support

llama.cpp release b9667 introduces Vulkan support with S_v=16 via gated_delta_net. It includes binaries for macOS, Linux, Android, Windows, and openEuler across multiple architectures, with options for Vulkan, CUDA 12.4 and 13.3, ROCm, OpenVINO, and SYCL.

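github llama.cpp · 16 days ago

llama.cpp releases b9668, adding UMA host-visible memory and cross-platform binaries

llama.cpp release b9668 implements UMA host-visible memory buffers to improve performance on UMA devices, based on a suggestion from 0cc4m. The release includes binaries for macOS, Linux, Android, Windows, and openEuler, with support for CPU, Vulkan, ROCm, OpenVINO, SYCL, and HIP, along with a dedicated UI package.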

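github llama.cpp · 16 days ago

llama.cpp releases b9665, adding the --offline flag and new binary builds

llama.cpp release b9665 introduces a new --offline flag for benchmarking. The release includes binary builds for macOS, Linux, Android, Windows, and openEuler, supporting multiple architectures and hardware acceleration options including Vulkan, CUDA, ROCm, OpenVINO, and SYCL.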

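github llama.cpp · 16 days ago

llama.cpp b9663 adds SYCL support and new binary builds

llama.cpp release b9663 adds support for the EXPM1 op, along with all unit test cases for FLOOR, TRUNC, and ROUND. It includes updated binaries for macOS, Linux, Android, Windows, and openEuler, with support for SYCL (FP32 and FP16), Vulkan, CUDA 12.4 and 13.3, and ROCm 7.2, and it updates the UI.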

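github llama.cpp · 16 days ago

sycl: support reordered Q4_K/Q5_K/Q6_K MoE MUL_MAT_ID

The sycl update extends support for handling reordered expert tensors in MoE MUL_MAT_ID, covering Q4_K, Q5_K, and Q6_K. Unsupported 3D reorder cases now fall back instead of aborting.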

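github llama.cpp · 16 days ago

Vulkan adds the col2im_1d operation and supports multiple platforms

llama.cpp release b9661 adds GGML_OP_COL2IM_1D support for Vulkan, using a bounded gather loop instead of a full K scan with modulo. It returns nullptr for unsupported types and provides builds for macOS, Linux, Android, Windows, and openEuler, covering CPU, Vulkan, CUDA, and SYCL.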
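
To make the "bounded gather loop instead of a full K scan with modulo" phrasing concrete, here is a minimal Python sketch of a generic 1D col2im. It is not the Vulkan shader from the release. The function names, the row layout (`col[c * K + k][l]`), and the absence of padding and dilation are all assumptions made for illustration. Both functions compute the same output; the second visits only the input positions that can contribute to a given output element.

```python
def ceil_div(a, b):
    """Ceiling division for integers with b > 0 (also correct for negative a)."""
    return -(-a // b)


def col2im_1d_full_scan(col, K, stride, L_in):
    """Gather by scanning all K kernel taps and filtering with a modulo test.

    Assumed layout (hypothetical): col has C*K rows, row c*K + k holds the
    L_in values for channel c and kernel tap k. No padding, dilation 1.
    """
    C = len(col) // K
    L_out = (L_in - 1) * stride + K
    out = [[0] * L_out for _ in range(C)]
    for c in range(C):
        for t in range(L_out):
            acc = 0
            for k in range(K):
                d = t - k
                # Tap k contributes only if t - k is a multiple of the stride
                # and the resulting input index is in range.
                if d >= 0 and d % stride == 0:
                    l = d // stride
                    if l < L_in:
                        acc += col[c * K + k][l]
            out[c][t] = acc
    return out


def col2im_1d_bounded(col, K, stride, L_in):
    """Same result, but loop only over input positions l that can reach t.

    For output index t, contributions satisfy t = l * stride + k with
    0 <= k < K, so l ranges over [ceil((t - K + 1) / stride), t // stride],
    clamped to [0, L_in - 1].
    """
    C = len(col) // K
    L_out = (L_in - 1) * stride + K
    out = [[0] * L_out for _ in range(C)]
    for c in range(C):
        for t in range(L_out):
            l_min = max(0, ceil_div(t - K + 1, stride))
            l_max = min(L_in - 1, t // stride)
            acc = 0
            for l in range(l_min, l_max + 1):
                k = t - l * stride  # guaranteed to lie in [0, K) by the bounds
                acc += col[c * K + k][l]
            out[c][t] = acc
    return out
```

The bounded version does roughly K / stride iterations per output element rather than K, and it needs no modulo test in the inner loop.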