Lord1Egypt/PtahCore

PtahCore 𓁰

An open-source FP8 tensor accelerator — SystemVerilog RTL → synthesis → place-and-route → 7nm GDSII, on a 100% open-source toolchain. Named after Ptah, the Egyptian creator god and patron of craftsmen & architects.

🟢 First GDS is out. A full multi-tile FP8 matmul runs end-to-end through real SystemVerilog (chip_top), bit-exact against a golden numpy model. The mac_cell tile has a complete, signoff-clean 7nm GDSII on ASAP7: 250 MHz with +1928 ps setup slack and +13 ps hold slack (zero violations, none masked), DRC clean, 750 µm². Phase 6 of 10 is done. Watch this repo: the fight with real silicon physics lands here as it happens.

[Figure: mac_cell routed 7nm layout]
The routed mac_cell macro, taken from RTL to GDSII on a laptop with 100% open tools (full story in docs/HARDENING.md)
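The timing numbers above can be read as a rough headroom estimate. The sketch below is only first-order arithmetic, not output from the signoff tools: it assumes the +1928 ps setup slack is measured against a single clock period at 250 MHz, and that setup time and clock uncertainty are already folded into the slack. The function name `timing_headroom` is hypothetical and not part of the repository.

```python
def timing_headroom(freq_mhz: float, setup_slack_ps: float):
    """First-order estimate of critical-path delay and maximum clock rate.

    Assumes slack = clock period - (path delay + setup overheads),
    so everything other than the period is lumped into one number.
    """
    period_ps = 1e6 / freq_mhz                 # 1 MHz period is 1e6 ps
    critical_path_ps = period_ps - setup_slack_ps
    fmax_mhz = 1e6 / critical_path_ps          # period at which slack hits zero
    return period_ps, critical_path_ps, fmax_mhz


period_ps, critical_path_ps, fmax_mhz = timing_headroom(250.0, 1928.0)
print(f"clock period      : {period_ps:.0f} ps")
print(f"critical path     : {critical_path_ps:.0f} ps")
print(f"zero-slack f_max  : {fmax_mhz:.1f} MHz (first-order estimate)")
```

Under these assumptions the 4000 ps period leaves roughly 2072 ps of path delay, so setup timing has substantial headroom. The +13 ps hold slack is a separate constraint: in the usual same-edge case a hold check does not depend on the clock period, so a faster clock would not change it.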

Docs

  • 📐 PLAN.md: roadmap contract covering architecture decisions, the 10 phases, and risks
  • STEPS.md: live execution checklist, ticked per PR
  • 🏛️ docs/ARCHITECTURE.md: block diagram, memory spaces, engines, module map
  • 📜 docs/ISA.md: the six instructions, barriers, canonical kernels
  • 🛠️ docs/DEVELOPMENT.md: read before contributing (workflow & numeric contracts)
  • 📊 docs/ENGINEERING.md: honest status + differentiators vs prior art

What's here so far

  • config.py: single source of truth for every design parameter
  • golden/: bit-exact fp8 (e4m3 and e5m2) encode/decode + matmul reference
  • pymodel/: full cycle-level machine; e2e matmuls bit-exact, REPEAT K-loops, async-STORE overlap proven
  • rtl/: 13 SystemVerilog modules, from the fp8/fp32 arithmetic leaves up to chip_top, every one verified bit-exact under Verilator + cocotb
  • chip_top.sv: the whole accelerator; push an instruction stream, and it runs a multi-tile matmul and writes results to DRAM, bit-exact vs golden

# Python side (no HW tools needed)
pip install numpy pytest && pytest          # 37 tests, < 1 s

# RTL side — 13 units incl. the full chip
sudo apt-get install verilator && pip install 'cocotb<2'
cd rtl/tb && make all_leaves                # 52 RTL tests

89 tests green (52 RTL + 37 Python).
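
To give a feel for what the fp8 decode half of a golden reference has to get right, here is a minimal sketch of decoding a single e4m3 byte. It is not the code in golden/. It assumes the common OCP FP8 E4M3 convention (1 sign bit, 4 exponent bits, 3 mantissa bits, bias 7, no infinities, a single NaN pattern per sign); the repository may use a different variant. The names `decode_e4m3` and `E4M3_BIAS` are hypothetical.

```python
import math

E4M3_BIAS = 7  # assumption: OCP FP8 E4M3 exponent bias


def decode_e4m3(byte: int) -> float:
    """Decode one e4m3 byte into a Python float (assumed OCP convention)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0xF   # 4 exponent bits
    man = byte & 0x7          # 3 mantissa bits
    if exp == 0xF and man == 0x7:
        # Assumed convention: all-ones exponent and mantissa is NaN,
        # and there is no encoding for infinity.
        return math.nan
    if exp == 0:
        # Subnormal: no implicit leading one, exponent fixed at 1 - bias.
        return sign * (man / 8.0) * 2.0 ** (1 - E4M3_BIAS)
    # Normal: implicit leading one.
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - E4M3_BIAS)


print(decode_e4m3(0x38))  # exponent field 7, mantissa 0
print(decode_e4m3(0x7E))  # largest finite magnitude under this convention
print(decode_e4m3(0x01))  # smallest positive subnormal
```

Every e4m3 value is exactly representable as a Python float, so a decoder like this can be checked exhaustively over all 256 byte patterns, which is one reason bit-exact comparison against RTL is practical at this width.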


Made with ❤️ by Lord1Egypt
