Pull requests: TheTom/llama-cpp-turboquant
forked from ggml-org/llama.cpp
sync it
Labels: AMD ZenDNN, Apple Metal, Ascend NPU, build, devops, documentation, examples, ggml, Hexagon, IBM zDNN, jinja parser, model, Nvidia GPU, OpenCL, OpenVINO, python, script, server/webui, server, SYCL, testing, Vulkan, WebGPU
#127 opened May 5, 2026 by cmeta
vendor: bump cpp-httplib to 0.43.2 (openssl 4.0.0 fix)
Labels: python, script
#121 opened May 4, 2026 by TheTom (Owner) · 1 of 3 tasks
HIP mixed TurboQuant vec FA on gfx900/gfx906
Labels: build, ggml, Nvidia GPU
#99 opened Apr 21, 2026 by 2bigO
perf: turbo VEC flash attention — +9% decode on CUDA via autoresearch
Labels: ggml, Nvidia GPU, script
#53 opened Apr 4, 2026 by signalnine · 7 tasks done
fix: HIP/ROCm compatibility — check cudaMemcpyToSymbol errors, guard …
Labels: ggml, Nvidia GPU
#41 opened Apr 1, 2026 by terrysimons · Draft