**MORI-EP**

- [x] Intra-node dispatch / combine kernel
- [x] High-bandwidth inter-node dispatch / combine kernel
- [x] Low-latency inter-node dispatch / combine kernel
- [x] Pybind & aiter integration
- [x] Sglang integration
- [x] FP8 support
- [ ] FP4 support

**MORI-IO**

- [x] Read / write APIs
- [x] Batch transfer APIs
- [x] Session transfer APIs
- [x] Multi-QP support
- [ ] Fault tolerant (Retry / TCP fallback)
- [ ] TCP / XGMI transport support
- [ ] IBGDA support
- [ ] GDS support

**Vendor support**

- [x] Mellanox
- [x] BRCM Thor2
- [ ] Pensando

**Performance**

- [x] IBGDA core primitive benchmark & optimization
- [ ] P2P core primitive benchmark & optimization
- [ ] PoC for fused communication & computation kernels

**Framework**

- [x] RDMA atomic ops
- [x] Topology detection
- [ ] Barrier abstraction
- [ ] Signal abstraction
- [x] Testing framework
- [ ] IBRC transport
- [ ] Refactor & better abstraction of core primitives

**MORI-CCL**

TBD