Pull requests: cactus-compute/cactus

Pull requests list

Add Gemma 4 pruning blog post
#606 opened Apr 23, 2026 by ncylich Collaborator
Apple GPU Support
#604 opened Apr 22, 2026 by justinl66 Member Draft
Karen/tq
#603 opened Apr 21, 2026 by kar-m Collaborator
Split native LLM ownership into Model and Context
#602 opened Apr 21, 2026 by aarnav-11
mlx added
#587 opened Apr 15, 2026 by kar-m Collaborator Draft
fix gemma4 audio/vision crash when NPU falls back to CPU
#586 opened Apr 15, 2026 by ncylich Collaborator
4 tasks done
Gemma sp tokenizer
#583 opened Apr 15, 2026 by aarnav-11
Graph remaining ops
#578 opened Apr 14, 2026 by cattermelon1234 Contributor
Turboquant attention kernel
#573 opened Apr 13, 2026 by jrajala6 Contributor
Follow-up: consolidate sampling APIs after #560
#569 opened Apr 10, 2026 by DuFanYin Contributor
Qualcomm NPU Support
#563 opened Apr 7, 2026 by justinl66 Member Draft
Structured Generation
#555 opened Apr 6, 2026 by mhayes853 Contributor
Stateful chunked TDT streaming transcription
#552 opened Apr 5, 2026 by rshemet Collaborator
3 of 4 tasks
Add IBM Granite 3.3 model support
#541 opened Mar 31, 2026 by vyomshah05 Contributor
Diarization
#537 opened Mar 26, 2026 by ParkiratS Collaborator Draft
Per-layer KV heads, attention logit capping, MoE per-expert scales, NPU multi-input
#526 opened Mar 19, 2026 by ncylich Collaborator
4 tasks done
Accelerate MatMul FP16 for Apple GPUs
#523 opened Mar 17, 2026 by aarav18 Contributor
reverting attn exp calculations to before 3n
#511 opened Mar 9, 2026 by ncylich Collaborator
Fix gemma multi tool call and logit biasing
#510 opened Mar 8, 2026 by lennartvoelz Contributor
new approximation for exponent on (0,1)
#500 opened Mar 6, 2026 by kar-m Collaborator
Optimized Attention
#480 opened Mar 2, 2026 by ncylich Collaborator
Axis reductions fixes
#473 opened Feb 28, 2026 by cattermelon1234 Contributor Draft
Benchmarking against other quantized kernels
#458 opened Feb 26, 2026 by ncylich Collaborator