#45: delete BCAST_PIPE — broadcast pipeline lives in skew_lane chain#46

Open
npip99 wants to merge 1 commit into master from issue-45-delete-bcast-pipe

@npip99 npip99 commented Jun 2, 2026

Owner

Summary

  • Delete the parent-level BCAST_PIPE knob in compute_array.sv and all
    ~302 parent flops it carried (forward drain/scrub pipe + reverse
    output pipe). B6 (#40, "Full 32×32 compute_array abutment: re-harden
    cmd_unit + skew_lanes as abutment-ready tiles") added a per-skew
    abutment chain register in skew_lane_a/b that absorbed the broadcast
    pipeline natively, so the parent flops were dead weight that violated
    R1 ("parent design = macros + wires, no logic").
  • Strip BCAST_PIPE plumbing from pymodel/compute_array.py, the
    compute_array Makefile + cocotb env, sv2v sweep targets, and 14
    pure-BCAST_PIPE sweep configs.
  • Repoint production ORFS configs (compute_array, compute_array_abut,
    compute_array_tiny_bcast0) from chip_top_bcast1.v /
    compute_array_tiny_bcast1.v to the single chip_top.v /
    compute_array_tiny.v outputs.

Closes #45.

RTL functional check

| | cocotb compute_array | pymodel pytest |
|---|---|---|
| pre-deletion baseline (BCAST_PIPE=0) | PASS 2/2 (mma_done @ 1665 / 9200 ns) | 90 PASS |
| post-deletion | PASS 2/2 (mma_done @ 1665 / 9200 ns, bit-identical) | 90 PASS |

Cycle counts match the baseline exactly, confirming the deletion is
functionally a no-op wherever BCAST_PIPE was already 0 (the setting
chip_top.v already used, and the one cocotb / pymodel always tested
against).
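
The comparison above can be automated along these lines. This is a minimal sketch, not the repository's tooling: the `mma_done @ <t> ns` log format and both function names are assumptions made for illustration, and the real cocotb output may look different.

```python
import re

# Hypothetical log format: "... mma_done @ 1665 ns". Adjust to the real output.
MMA_DONE_RE = re.compile(r"mma_done\s*@\s*(\d+(?:\.\d+)?)\s*ns")


def mma_done_times(log_text: str) -> list[float]:
    """Return every mma_done timestamp (in ns) found in a simulation log."""
    return [float(t) for t in MMA_DONE_RE.findall(log_text)]


def same_timing(baseline_log: str, candidate_log: str) -> bool:
    """True when both logs report the same non-empty list of timestamps."""
    base = mma_done_times(baseline_log)
    return bool(base) and base == mma_done_times(candidate_log)
```

Requiring a non-empty baseline keeps two logs that both lack `mma_done` lines from being reported as a match.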

Acceptance criteria (issue #45)

  • compute_array.sv has zero flops that aren't part of a macro
    instantiation (R1 invariant satisfied)
  • cocotb compute_array PASS 2/2 (no functional regression)
  • compute_array_abut 32×32: 0 DRC, 0 setup violations, 0 hold
    violations — 40-min ORFS harden running separately
  • tech/INVARIANTS.md R1 violation list updated (mb_pipe/md_pipe
    entry struck)
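
As a rough pre-check for the first criterion, one could scan the parent file for sequential blocks. The sketch below is a textual heuristic under stated assumptions: it only looks for `always_ff` and `always @(posedge/negedge ...)`, the function names are hypothetical, and a synthesis report remains the authoritative flop count.

```python
import re

LINE_COMMENT = re.compile(r"//[^\n]*")
BLOCK_COMMENT = re.compile(r"/\*.*?\*/", re.DOTALL)
# Sequential constructs that would infer flops directly in the parent module.
SEQ_BLOCK = re.compile(
    r"\balways_ff\b|\balways\s*@\s*\(\s*(?:posedge|negedge)\b"
)


def strip_comments(sv_text: str) -> str:
    """Remove comments while keeping newlines so line numbers stay valid."""
    text = BLOCK_COMMENT.sub(lambda m: "\n" * m.group(0).count("\n"), sv_text)
    return LINE_COMMENT.sub("", text)


def sequential_block_lines(sv_text: str) -> list[int]:
    """Return 1-based line numbers where a sequential block starts."""
    text = strip_comments(sv_text)
    return [text.count("\n", 0, m.start()) + 1 for m in SEQ_BLOCK.finditer(text)]
```

An empty result for compute_array.sv is consistent with R1 but does not prove it, since the scan does not look inside included files or generated code.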

Risk / fallback

The harden is the issue's Option 1 gate. If compute_array_abut
fails to close at full 32×32 without the parent flop break on the
cmd_unit → skew_a[0] / skew_b[0] arc, Option 2 (absorb the pipes
inside cmd_unit) is a fresh patch on top. The cmd_unit→chain-head
arc is short by construction (cmd_unit and skew chain-heads abut at
the SW corner per compute_array_abut.macro_placement.tcl), so the
expectation is that it closes — but the empirical check is the gate.

Test plan

  • Local cocotb compute_array regression: PASS 2/2.
  • Local pymodel pytest: 90 PASS.
  • make -C tech/sky130 sv2v and the compute_array_tiny.v target
    build cleanly; grep confirms zero BCAST_PIPE symbols in the
    generated Verilog.
  • compute_array_tiny_bcast0 ORFS harden: 0 DRC, ≥0 setup/hold
    slack (sanity check before 32×32).
  • compute_array_abut 32×32 ORFS harden: 0 DRC, ≥0 setup/hold
    slack on cmd_unit → chain-head paths (the issue's acceptance).
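
The grep step in the test plan could be scripted as follows. This is an illustrative sketch only: `files_mentioning` is a hypothetical helper, and the directory holding the generated Verilog is left as a parameter because the PR does not name it.

```python
from pathlib import Path


def files_mentioning(root, symbol="BCAST_PIPE", suffixes=(".v", ".sv")):
    """Return the Verilog/SystemVerilog files under `root` containing `symbol`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # errors="replace" avoids crashing on stray non-UTF-8 bytes.
            if symbol in path.read_text(errors="replace"):
                hits.append(path)
    return hits
```

An empty list corresponds to the "zero BCAST_PIPE symbols" result; any returned path points at a stale generated file.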

🤖 Generated with Claude Code

compute_array.sv held ~302 parent flops on chip clk to register the
cmd_unit → cell broadcast (forward push_/drain_/scrub_ pipe) and the
symmetric cmd_unit → chip-external reverse pipe. B6 (#40)'s per-skew
abutment chain (skew_lane_a/b internal chain_w_s/chain_e_n register)
absorbed the forward broadcast pipeline natively, leaving the parent
flops as dead weight that violates R1 ("parent design = macros + wires,
no logic").

RTL:
- compute_array.sv: drop BCAST_PIPE parameter + macro, drop drain/scrub
  forward pipe, drop reverse output pipe, wire cmd_unit's already-
  registered status outputs (mma_busy/done, arrive_*, drain_*) and its
  combinational push_*/drain_en outputs directly to the parent ports
  and chain head.
- pymodel/compute_array.py: drop bcast_pipe= ctor arg, _u_* shadow
  registers, _fwd_pipe/_out_pipe shift registers — collapse to a
  single registered-output model.
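
To make the pymodel change concrete, here is a minimal sketch of what a broadcast pipe of configurable depth does to latency. It is not the repository's code: `PipedBroadcast`, `depth`, and `latency` are hypothetical names, and the only idea taken from the description above is that a shift register of BCAST_PIPE stages used to sit in the path.

```python
from collections import deque


class PipedBroadcast:
    """Hypothetical model of an N-stage broadcast pipe (N = the old BCAST_PIPE)."""

    def __init__(self, depth: int, reset_value: int = 0):
        if depth < 0:
            raise ValueError("depth must be non-negative")
        self.depth = depth
        # One slot per pipeline flop; the deque stays empty when depth == 0.
        self._stages = deque([reset_value] * depth, maxlen=depth or None)

    def step(self, value: int) -> int:
        """Advance one clock: push `value` in and return what falls out."""
        if self.depth == 0:
            return value  # plain wire, no added latency
        out = self._stages[0]
        self._stages.append(value)  # maxlen discards the oldest entry
        return out


def latency(depth: int) -> int:
    """Number of cycles before a single-cycle pulse reaches the output."""
    pipe = PipedBroadcast(depth)
    for cycle in range(depth + 1):
        if pipe.step(1 if cycle == 0 else 0):
            return cycle
    raise AssertionError("pulse never emerged")
```

In this sketch the deleted knob corresponds to `depth`; with the knob gone, only the `depth == 0` path remains, which is why the model can collapse to a single registered output.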

Test + build:
- cocotb compute_array regression: PASS 2/2 (same cycle counts as
  pre-deletion baseline, mma_done at 1665 ns / 9200 ns).
- pymodel pytest: 90 PASS.
- compute_array/Makefile + test_compute_array.py: drop BCAST_PIPE env.
- tech/sky130/Makefile: drop sv2v-bcast-sweep + sv2v-tiny-bcast-sweep
  targets, add compute_array_tiny.v target (MMA=4, no BCAST_PIPE).
- tech/asap7/orfs/compute_array{,_abut}.config.mk: repoint
  VERILOG_FILES from chip_top_bcast1.v to chip_top.v.
- tech/asap7/orfs/compute_array_tiny_bcast0.config.mk: repoint to
  compute_array_tiny.v (nickname preserved for downstream stability).
- tech/asap7/orfs/run.sh: refresh staleness-check fallback target.
- Delete 14 sweep configs whose only knob was BCAST_PIPE:
  compute_array_bcast{1,2,3}, compute_array_tiny_bcast{1,2},
  compute_array_tiny_slow{,bal,pipe2,uskew}.

Docs:
- tech/INVARIANTS.md: strike "mb_pipe/md_pipe etc." from R1 violation
  list, refresh B1/B2 sv2v staleness-check refs.
- tech/asap7/DESIGN.md: rewrite the A2 mitigation section so it
  describes 2500 ps SDC alone; retire the parent-flop framing.
- tech/asap7/problems/A2_hold_timing_rtl.md: closure note on the
  BCAST_PIPE deletion.
- tech/asap7/problems/B2_smem_hold_timing.md: update the
  compute_array-analog references.
- tech/RCA_DISCIPLINE.md: refresh prelaunch-checklist recipe to
  reference chip_top.v instead of chip_top_bcast1.v.

The 40-min compute_array_abut harden is the empirical setup-slack gate
the issue calls for (Option 1) — not run here. If it fails to close,
the fallback is Option 2 (absorb the pipes inside cmd_unit) which
would be a fresh patch on top.

Closes #45.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

compute_array: absorb parent BCAST_PIPE flops into cmd_unit OR delete BCAST_PIPE entirely