Run AI models too large for your Mac's memory — at near-full speed. Intelligent expert caching, speculative execution, and 15+ research techniques for MoE inference on Apple Silicon.
Topics: python, macos, rust, machine-learning, inference, moe, quantization, mlx, speculative-execution, mixture-of-experts, memory-optimization, apple-silicon, llm, metal-gpu, ssd-streaming, expert-caching
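One of the techniques named above, expert caching, can be sketched in a few lines: in a mixture-of-experts model, only a few experts fire per token, so recently used expert weights can be kept in RAM while the rest stay on SSD. The sketch below is a minimal, illustrative LRU cache; `load_expert_from_ssd` and the class name are assumptions for illustration, not this repository's actual API.

```python
from collections import OrderedDict

# Hypothetical loader: a real system would read quantized expert
# weights from SSD. Name and return value are illustrative only.
def load_expert_from_ssd(expert_id):
    return f"weights-for-expert-{expert_id}"

class ExpertCache:
    """Tiny LRU cache keeping the most recently used experts in memory."""
    def __init__(self, capacity):
        self.capacity = capacity
        self._cache = OrderedDict()  # insertion order tracks recency
        self.hits = 0
        self.misses = 0

    def get(self, expert_id):
        if expert_id in self._cache:
            self._cache.move_to_end(expert_id)  # mark as recently used
            self.hits += 1
            return self._cache[expert_id]
        self.misses += 1
        weights = load_expert_from_ssd(expert_id)
        self._cache[expert_id] = weights
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return weights

cache = ExpertCache(capacity=2)
cache.get(0)  # miss: loaded from SSD
cache.get(1)  # miss
cache.get(0)  # hit: expert 0 served from memory
cache.get(2)  # miss: evicts expert 1 (least recently used)
print(cache.hits, cache.misses)  # → 1 3
```

Because token-level expert selection is highly skewed in practice, even a small in-memory cache like this can serve most lookups without touching the SSD.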
Updated Apr 1, 2026 - Python