AppMana/mps-fp8-for-torch-and-comfyui-python-package

Registers FP8 (float8_e4m3fn, float8_e5m2) and FP4 (float4_e2m1fn_x2) support for PyTorch's MPS backend on Apple Silicon. Once installed, import torch auto-loads the extension via the torch.backends entry point, enabling tensor.to(torch.float8_e4m3fn), torch._scaled_mm, and tensor.copy_ to work transparently on MPS through Metal shader kernels dispatched via torch.mps.compile_shader. The FP8 encode is tested byte-for-byte against all 254 representable values and their midpoints to match CPU PyTorch exactly; FP4 decode is verified exhaustively against all 256 packed byte patterns. 80 tests run on macOS MPS hardware in CI.
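To illustrate what the byte-for-byte FP8 verification covers, here is a minimal pure-Python sketch of the float8_e4m3fn encoding (a reference decoder for illustration only, not the extension's Metal kernel). The format has 1 sign bit, 4 exponent bits (bias 7), and 3 mantissa bits; the "fn" (finite) variant has no infinities, so only the all-ones pattern per sign encodes NaN, leaving 254 non-NaN byte patterns:

```python
def decode_e4m3fn(byte: int) -> float:
    """Decode one float8_e4m3fn byte (0..255) to a Python float.

    Layout: S EEEE MMM, exponent bias 7. No infinities; the single
    NaN pattern per sign is S.1111.111 (0x7F and 0xFF).
    """
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0x0F
    man = byte & 0x07
    if exp == 0x0F and man == 0x07:
        return float("nan")
    if exp == 0:
        # Subnormal: man/8 * 2^(1-7) = man * 2^-9
        return sign * man * 2.0 ** -9
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)


# All 256 byte patterns: 254 finite values plus two NaNs,
# with the largest finite magnitude being 448.
values = [decode_e4m3fn(b) for b in range(256)]
finite = [v for v in values if v == v]  # filter NaN (NaN != NaN)
```

This matches the counts quoted above: 256 byte patterns minus the two NaN encodings leaves 254 representable values, and the maximum finite value (exponent 1111, mantissa 110) is 1.75 × 2⁸ = 448.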

pip install fp4-fp8-for-torch-mps
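The FP4 type mentioned above, float4_e2m1fn_x2, packs two 4-bit e2m1 values into each byte, which is why the decode is verified against all 256 packed byte patterns. A minimal pure-Python sketch of the nibble format (1 sign bit, 2 exponent bits with bias 1, 1 mantissa bit; the nibble ordering below is an assumption for illustration):

```python
def decode_e2m1(nib: int) -> float:
    """Decode one float4_e2m1 nibble (0..15) to a Python float.

    Layout: S EE M, exponent bias 1. No infinities or NaN; the
    representable magnitudes are 0, 0.5, 1, 1.5, 2, 3, 4, 6.
    """
    sign = -1.0 if nib & 0x8 else 1.0
    exp = (nib >> 1) & 0x3
    man = nib & 0x1
    if exp == 0:
        # Subnormal: man/2 * 2^(1-1) = man * 0.5
        return sign * man * 0.5
    return sign * (1.0 + man / 2.0) * 2.0 ** (exp - 1)


def unpack_fp4_x2(byte: int) -> tuple[float, float]:
    # Assumed packing order: low nibble first (an assumption here,
    # not confirmed by the package).
    return decode_e2m1(byte & 0x0F), decode_e2m1(byte >> 4)
```

Every one of the 256 packed bytes therefore decodes to a pair drawn from the 15 distinct e2m1 values (±0 compare equal), with a maximum magnitude of 6.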

About

FP8 (float8_e4m3fn, float8_e5m2) and FP4 (float4_e2m1fn_x2) support for the Apple Silicon MPS backend.
