diff --git a/_posts/ai/2026-04-13-mooncake_store_mechanism.md b/_posts/ai/2026-04-13-mooncake_store_mechanism.md
new file mode 100644
index 0000000..ca8709b
--- /dev/null
+++ b/_posts/ai/2026-04-13-mooncake_store_mechanism.md
@@ -0,0 +1,32 @@
+---
+title: "Mooncake Store"
+subtitle: "An Analysis of the Storage Mechanism"
+layout: chirpy-post
+author: "Peter Lau"
+published: true
+header-style: text
+categories:
+  - AI
+tags:
+  - AI
+  - Engineering
+---
+
+## Mooncake Store
+
+**This analysis is based on Mooncake v0.3.9**
+
+### Overall Architecture
+
+
+ Mooncake Store Architecture +
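+
+
+## P2P Store
+
+P2P Store is mainly used to distribute large-model checkpoints and is built on top of Transfer Engine.
+
+Consider what happens if every GPU loads its weight shards from one fixed source at the same time: the source's bandwidth saturates immediately, and transfer performance cannot improve any further.
+
+What sets this design apart is that each GPU, once it has loaded a weight shard, forwards that shard to the other GPUs that also need it. This relieves bandwidth pressure on the source and improves overall transfer efficiency.
diff --git a/_posts/ai/2026-04-13-mooncake_transfer_engine_mechanism.md b/_posts/ai/2026-04-13-mooncake_transfer_engine_mechanism.md
new file mode 100644
index 0000000..ab71f19
--- /dev/null
+++ b/_posts/ai/2026-04-13-mooncake_transfer_engine_mechanism.md
@@ -0,0 +1,30 @@
+---
+title: "Mooncake Transfer Engine"
+subtitle: "An Analysis of the Transfer Mechanism"
+layout: chirpy-post
+author: "Peter Lau"
+published: true
+header-style: text
+categories:
+  - AI
+tags:
+  - AI
+  - Engineering
+---
+
+## Mooncake transfer engine
+
+**This analysis is based on Mooncake v0.3.9**
+
+### Transfer engine
+
+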
+ Transfer engine Architecture +
+
+In the figure above, **vRAM** denotes GPU memory, **DRAM** denotes CPU main memory, and **NVMe** (used with the NVMe-oF protocol) is external disk storage.
+
+
+### Related issues
+
+1. Prefill transfer failed for request rank xxx
diff --git a/img/mooncake/mooncake-arch.png b/img/mooncake/mooncake-arch.png
new file mode 100644
index 0000000..19d394a
Binary files /dev/null and b/img/mooncake/mooncake-arch.png differ
diff --git a/img/mooncake/mooncake-store-preview.png b/img/mooncake/mooncake-store-preview.png
new file mode 100644
index 0000000..afb4c79
Binary files /dev/null and b/img/mooncake/mooncake-store-preview.png differ
diff --git a/img/mooncake/transfer_engine_arch.png b/img/mooncake/transfer_engine_arch.png
new file mode 100644
index 0000000..22b6989
Binary files /dev/null and b/img/mooncake/transfer_engine_arch.png differ